1
00:00:00,000 --> 00:00:03,220
So you've finally gotten your hot little hands

2
00:00:03,220 --> 00:00:09,000
on a shiny new graphics card for your PC, and you're confident that by pushing it to its limits,

3
00:00:09,000 --> 00:00:13,520
you'll be able to run any game at mouth-watering settings.

4
00:00:15,560 --> 00:00:18,840
But it turns out there's a long-standing bottleneck

5
00:00:18,840 --> 00:00:21,920
that limits how many frames your GPU can push.

6
00:00:21,920 --> 00:00:25,920
And we're not talking about your CPU or the PCI Express slot.

7
00:00:25,920 --> 00:00:31,040
It's actually the GPU's pipeline, the set of steps that visual data has to go through

8
00:00:31,040 --> 00:00:35,260
in order to become a fully rendered image. But how does that pipeline bottleneck

9
00:00:35,260 --> 00:00:39,840
your gaming experience? Let's find out by learning how a graphics pipeline works.

10
00:00:39,840 --> 00:00:44,680
And we'd like to extend our warm thanks to Jianye Liu, Senior Program Manager at Microsoft,

11
00:00:44,680 --> 00:00:49,780
for helping us understand the problem and the solution that's starting to gain adoption.

12
00:00:51,360 --> 00:00:56,160
The first three steps of the pipeline, called the geometry pipeline, are really important

13
00:00:56,160 --> 00:01:01,200
and actually end up being where the bottleneck is. Step one is when the raw visual data,

14
00:01:01,200 --> 00:01:07,160
basically just numbers, plain numbers, like mama used to make them, goes through an input assembler,

15
00:01:07,160 --> 00:01:11,360
which takes the vertices of the triangles that ultimately make up a finished image

16
00:01:11,360 --> 00:01:14,760
and organizes them so that they're pointing at each other correctly.

17
00:01:14,760 --> 00:01:20,000
Step two takes this organized set of vertices and puts them through a vertex shader,

18
00:01:20,000 --> 00:01:24,320
which raises or lowers the vertices to create a 3D mesh of sorts.

19
00:01:24,320 --> 00:01:29,600
For example, the vertex shader can create a bumpy texture on a wall or ripples on a pond.

20
00:01:29,600 --> 00:01:35,360
The third step is a rasterizer. This is the bit that puts pixels inside each triangle

21
00:01:35,360 --> 00:01:39,200
to fill out the image. Now, there are other steps beyond these three,

22
00:01:39,200 --> 00:01:42,240
namely pixel shading, which gives each pixel

23
00:01:42,240 --> 00:01:47,080
the appropriate color and lighting, and the output merger, which puts different visual elements

24
00:01:47,080 --> 00:01:51,240
together, such as ensuring a character standing in front of a wall is displayed properly

25
00:01:51,280 --> 00:01:54,920
so you only see the character and not the wall behind them.

26
00:01:54,920 --> 00:01:59,600
However, the big bottleneck we're talking about today involves those first three steps.

27
00:01:59,600 --> 00:02:03,720
But why are they such a bottleneck? I mean, the process seems pretty straightforward.

28
00:02:03,720 --> 00:02:08,920
Well, there are a few big reasons for this. One is that that first step, the input assembler,

29
00:02:08,920 --> 00:02:12,680
can only understand data that's organized in a very specific way.

30
00:02:12,680 --> 00:02:17,080
Typically, it can't accept compressed data that can be moved around more quickly.

31
00:02:17,080 --> 00:02:20,220
Or if a developer thinks of a more efficient way to organize their data,

32
00:02:20,220 --> 00:02:25,740
the input assembler simply won't be able to understand it. Two, there are actually several optional stages

33
00:02:25,740 --> 00:02:29,660
after the vertex shader. For example, one is called a geometry shader

34
00:02:29,660 --> 00:02:34,780
that can take a point and expand it out to a particular shape, such as a strand of hair.

35
00:02:34,780 --> 00:02:39,900
This is quicker than drawing a bunch of new triangles, but the geometry shader and other optional steps

36
00:02:39,900 --> 00:02:43,460
have been added over the years as games have become more complex.

37
00:02:43,460 --> 00:02:48,900
They're essentially glommed onto the pipeline in a rigid, sequential way that can't be processed

38
00:02:48,940 --> 00:02:53,860
in parallel, meaning they take longer. Three, because of this rigid sequence,

39
00:02:53,860 --> 00:02:57,740
you have to wait until you get to the rasterizer to start culling.

40
00:02:57,740 --> 00:03:01,780
Sounds dangerous. Culling means throwing away unused triangles that are way off

41
00:03:01,780 --> 00:03:06,820
in the distance or obscured by an object on the screen. This is a problem because by the time the data

42
00:03:06,820 --> 00:03:11,740
has gotten to the rasterizer phase of the pipeline, the GPU has already done a ton of legwork

43
00:03:11,740 --> 00:03:14,900
rendering unnecessary triangles.

44
00:03:14,900 --> 00:03:20,100
Sounds like my typical Wednesday evening. So because the geometry pipeline is so inflexible,

45
00:03:20,100 --> 00:03:25,340
the solution isn't to retool it. It's to replace it completely.

46
00:03:25,340 --> 00:03:30,860
This is where mesh shading comes in, one of the biggest features in the DirectX 12 Ultimate API.

47
00:03:30,860 --> 00:03:34,300
Instead of having discrete steps before the rasterizer,

48
00:03:34,300 --> 00:03:39,500
the mesh shader is one stage that can do a few really cool things.

49
00:03:39,500 --> 00:03:42,940
First, the data that you feed into it can be much more arbitrary

50
00:03:42,940 --> 00:03:47,940
so it can understand compressed data and other data sets an old-school input assembler couldn't.

51
00:03:47,940 --> 00:03:52,500
You can't handle this new data, old man. Essentially, the mesh shader is almost

52
00:03:52,500 --> 00:03:57,180
like a mini programmable computer. So if a developer wants to accomplish some rendering task

53
00:03:57,180 --> 00:04:02,300
more efficiently, they can just code it in. Each of the processes above can also intelligently

54
00:04:02,300 --> 00:04:06,580
communicate with each other within a mesh shader. So instead of waiting to cull triangles

55
00:04:06,580 --> 00:04:11,300
so late in the process, it can be done earlier. And the geometry can even be arranged in a way

56
00:04:11,300 --> 00:04:15,540
to make culling easier, demanding even less GPU power.

57
00:04:15,540 --> 00:04:20,540
Oh, and we haven't even talked about the whole reason it's called mesh shading in the first place.

58
00:04:20,540 --> 00:04:23,540
It's a really fun term, meshlets.

59
00:04:23,540 --> 00:04:27,980
Instead of working on one triangle at a time, your GPU can instead work on meshes

60
00:04:27,980 --> 00:04:33,900
of multiple triangles in parallel. So instead of making a decision about culling one triangle,

61
00:04:33,900 --> 00:04:38,340
your GPU can instead do them in batches, throwing out data it doesn't need to process

62
00:04:38,340 --> 00:04:43,900
and saving precious resources. Older vertex shaders only saw a soup of points

63
00:04:43,900 --> 00:04:47,940
instead of an actual mesh, but mesh shaders are much smarter

64
00:04:47,940 --> 00:04:51,620
and they can know exactly what they're working with much earlier in the rendering process.

65
00:04:51,620 --> 00:04:56,300
So what does all of this mean? Well, because your GPU doesn't have to work

66
00:04:56,300 --> 00:05:00,500
as hard for each frame, it means faster frame rates and more detailed environments.

67
00:05:00,500 --> 00:05:03,660
Ultimately, the things we all want from our graphics cards.

68
00:05:03,660 --> 00:05:08,220
Currently, many game developers are looking at the best ways to implement mesh shading into their titles.

69
00:05:08,260 --> 00:05:12,100
Because it's such a customizable tool and the old geometry pipeline

70
00:05:12,100 --> 00:05:16,580
has been around for such a long time, it might take a few years before we see widespread adoption

71
00:05:16,580 --> 00:05:20,140
of mesh shading in popular games. The good news, though, is that there's already

72
00:05:20,140 --> 00:05:23,260
hardware support for it with newer consumer graphics cards.

73
00:05:23,260 --> 00:05:28,300
So by the time we see these games on the market, chances are we'll all have next-gen graphics cards

74
00:05:28,300 --> 00:05:31,380
that support these new features. Every one of us.

75
00:05:33,380 --> 00:05:35,940
Yes! I can't wait.

76
00:05:36,940 --> 00:05:41,180
Well guys, that's a Techquickie video. Not sure if you've ever seen one before,

77
00:05:41,180 --> 00:05:45,220
but that's what it's like. Hey, speaking of like, you wanna like the video

78
00:05:45,220 --> 00:05:48,980
if you liked it? Hey, do you have a like? Do you have a like to give out?

79
00:05:48,980 --> 00:05:52,900
Please, we love your likes. Also check out other videos,

80
00:05:52,900 --> 00:05:57,300
comment below with video suggestions, and don't forget to subscribe and follow Techquickie.

81
00:05:57,300 --> 00:05:58,140
Love you.
