WEBVTT

00:00:00.240 --> 00:00:07.279
so you've finally gotten your hot little hands on a shiny new graphics card for

00:00:04.880 --> 00:00:12.240
your PC and you're confident that by pushing it to its limits you'll be able

00:00:09.440 --> 00:00:15.040
to run any game at mouth-watering settings

00:00:15.839 --> 00:00:22.480
but it turns out there's a long-standing bottleneck that limits how many frames

00:00:20.000 --> 00:00:28.400
your GPU can push and we're not talking about your CPU or the PCI Express slot

00:00:25.760 --> 00:00:32.239
it's actually the GPU's pipeline the set of steps that visual data has to go

00:00:30.480 --> 00:00:35.600
through in order to become a fully rendered image but how does that

00:00:34.000 --> 00:00:39.920
pipeline bottleneck your gaming experience let's find out by learning

00:00:37.840 --> 00:00:44.640
how a graphics pipeline works and we'd like to extend our warm thanks to gianlu

00:00:42.480 --> 00:00:48.800
senior program manager at microsoft for helping us understand the problem and

00:00:46.800 --> 00:00:53.039
the solution that's starting to gain adoption

00:00:51.200 --> 00:00:56.960
the first three steps of the pipeline called the geometry pipeline are really

00:00:55.440 --> 00:01:02.000
important and actually end up being where the bottleneck is step one is when

00:00:59.680 --> 00:01:06.479
the raw visual data basically just numbers plain numbers like mama used to

00:01:04.640 --> 00:01:10.320
make them goes through an input assembler which takes the vertices of

00:01:08.320 --> 00:01:13.840
the triangles that ultimately make up a finished image and organizes them so

00:01:12.400 --> 00:01:18.640
that they're pointing at each other correctly step two takes this organized

00:01:16.560 --> 00:01:24.159
set of vertices and puts them through a vertex shader which raises or lowers the

00:01:21.360 --> 00:01:28.640
vertices to create a 3d mesh of sorts for example the vertex shader can create

00:01:26.240 --> 00:01:34.400
a bumpy texture on a wall or ripples on a pond the third step is a rasterizer

00:01:32.159 --> 00:01:39.119
this is the bit that puts pixels inside each triangle to fill out the image now

00:01:36.960 --> 00:01:43.840
there are other steps beyond these three namely pixel shading which gives each

00:01:41.680 --> 00:01:47.439
pixel the appropriate color and lighting and the output merger which puts

00:01:45.920 --> 00:01:51.200
different visual elements together such as ensuring a character standing in

00:01:49.040 --> 00:01:55.759
front of a wall is displayed properly so you only see the character and not the

00:01:53.200 --> 00:01:59.600
wall behind them however the big bottleneck we're talking about today

00:01:57.360 --> 00:02:03.520
involves those first three steps but why are they such a bottleneck i mean the

00:02:01.200 --> 00:02:07.759
process seems pretty straightforward well there are a few big reasons for

00:02:05.119 --> 00:02:12.480
this one is that that first step the input assembler can only understand data

00:02:09.920 --> 00:02:16.160
that's organized in a very specific way typically it can't accept compressed

00:02:14.640 --> 00:02:19.520
data that can be moved around more quickly or if a developer thinks of a

00:02:18.239 --> 00:02:23.440
more efficient way to organize their data the input assembler simply won't be

00:02:21.520 --> 00:02:27.920
able to understand it two there are actually several optional stages after

00:02:25.920 --> 00:02:32.319
the vertex shader for example one is called a geometry shader that can take a

00:02:30.239 --> 00:02:36.480
point and expand it out to a particular shape such as a strand of hair this is

00:02:35.040 --> 00:02:40.319
quicker than drawing a bunch of new triangles but the geometry shader and

00:02:38.720 --> 00:02:44.720
other optional steps have been added over the years as games have become more

00:02:42.239 --> 00:02:49.840
complex they're essentially glommed onto the pipeline in a rigid sequential way

00:02:47.440 --> 00:02:54.239
that can't be processed in parallel meaning they take longer three because

00:02:52.400 --> 00:02:57.599
of this rigid sequence you have to wait until you get to the rasterizer to start

00:02:56.560 --> 00:03:01.760
culling sounds dangerous or throwing away unused

00:03:00.400 --> 00:03:06.000
triangles that are way off in the distance or obscured by an object on the

00:03:03.680 --> 00:03:09.680
screen this is a problem because by the time the data has gotten to the

00:03:07.360 --> 00:03:14.720
rasterizer phase of the pipeline the GPU has already done a ton of legwork

00:03:11.599 --> 00:03:16.879
rendering unnecessary triangles
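
NOTE
A rough sketch of why late culling is wasteful, not how any real GPU or driver works.
It assumes a made-up Triangle type with a hypothetical `visible` flag standing in for
the distance and occlusion tests, and it ignores vertices shared between triangles.
It compares how many vertices get shaded when culling waits for the rasterizer
versus when it happens before any shading.
```python
# Toy model only: real GPUs cull with hardware tests, not Python flags.
from dataclasses import dataclass
@dataclass
class Triangle:
    visible: bool  # hypothetical flag standing in for distance/occlusion tests
def late_cull_cost(triangles):
    # Rigid pipeline: every vertex is shaded first, hidden triangles are dropped afterwards.
    shaded = 3 * len(triangles)
    drawn = sum(1 for t in triangles if t.visible)
    return shaded, drawn
def early_cull_cost(triangles):
    # Cull first: only vertices of surviving triangles are shaded.
    kept = [t for t in triangles if t.visible]
    return 3 * len(kept), len(kept)
# Assumed scene: 1000 triangles, one in four actually visible.
scene = [Triangle(visible=(i % 4 == 0)) for i in range(1000)]
print(late_cull_cost(scene))   # (3000, 250)
print(early_cull_cost(scene))  # (750, 250)
```
Both paths draw the same 250 triangles, but in this toy scene the late path shades
four times as many vertices to get there.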

00:03:14.720 --> 00:03:21.680
sounds like my typical wednesday evening so because the geometry pipeline is so

00:03:18.879 --> 00:03:27.599
inflexible the solution isn't to retool it it's to replace it completely this is

00:03:25.519 --> 00:03:31.920
where mesh shading comes in one of the biggest features in the directx 12

00:03:29.440 --> 00:03:37.680
ultimate API instead of having discrete steps before the rasterizer the mesh

00:03:34.640 --> 00:03:39.360
shader is one stage that can do a few

00:03:37.680 --> 00:03:43.280
really cool things first the data that you feed into it can

00:03:41.280 --> 00:03:47.280
be much more arbitrary so it can understand compressed data and other

00:03:45.200 --> 00:03:52.080
data sets an old-school input assembler couldn't you can't handle this new data

00:03:49.519 --> 00:03:55.920
old man essentially the mesh shader is almost like a mini programmable computer

00:03:54.319 --> 00:03:59.920
so if a developer wants to accomplish some rendering task more efficiently

00:03:58.080 --> 00:04:03.680
they can just code it in each of the processes above can also intelligently

00:04:02.159 --> 00:04:08.000
communicate with each other within a mesh shader so instead of waiting to

00:04:05.599 --> 00:04:11.439
cull triangles so late in the process it can be done earlier and the geometry can

00:04:10.080 --> 00:04:17.120
even be arranged in a way to make culling easier demanding even less GPU

00:04:14.720 --> 00:04:20.799
power oh and we haven't even talked about the whole reason it's called mesh

00:04:18.720 --> 00:04:23.360
shading in the first place it's a really fun term

00:04:22.079 --> 00:04:27.759
meshlets instead of working on one triangle at

00:04:25.120 --> 00:04:32.320
once your GPU can instead work on meshes of multiple triangles in parallel so

00:04:30.880 --> 00:04:36.720
instead of making a decision about culling one triangle your GPU can

00:04:34.560 --> 00:04:41.520
instead do them in batches throwing out data it doesn't need to process and

00:04:38.479 --> 00:04:43.680
saving precious resources older vertex

00:04:41.520 --> 00:04:48.160
shaders only saw a soup of points instead of an actual mesh but mesh

00:04:45.919 --> 00:04:51.600
shaders are much smarter and they can know exactly what they're working with

00:04:49.600 --> 00:04:55.840
much earlier in the rendering process so what does all of this mean
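
NOTE
A minimal CPU-side sketch of the meshlet idea, assuming each triangle is reduced to a
single center point and that "too far away" is the only culling rule. Real mesh shaders
run on the GPU in a shading language; the function names, the meshlet size of 64 and the
distance limit here are all hypothetical, chosen only for illustration.
```python
import math
def build_meshlets(centers, size):
    # Group consecutive triangle centers into batches of `size` (a hypothetical meshlet size).
    return [centers[i:i + size] for i in range(0, len(centers), size)]
def bounding_sphere(meshlet):
    # Centroid of the batch plus the radius that reaches its farthest triangle center.
    n = len(meshlet)
    c = tuple(sum(p[k] for p in meshlet) / n for k in range(3))
    r = max(math.dist(c, p) for p in meshlet)
    return c, r
def visible_meshlets(meshlets, camera, max_distance):
    # One distance test per meshlet decides the fate of every triangle inside it.
    kept = []
    for m in meshlets:
        c, r = bounding_sphere(m)
        if math.dist(camera, c) - r <= max_distance:
            kept.append(m)
    return kept
# Assumed scene: 256 triangle centers laid out along the x axis.
centers = [(float(x), 0.0, 0.0) for x in range(256)]
meshlets = build_meshlets(centers, 64)
kept = visible_meshlets(meshlets, (0.0, 0.0, 0.0), 100.0)
print(len(meshlets), len(kept))  # 4 2
```
Four sphere tests stand in for 256 per-triangle decisions, and the two far batches are
thrown out before any of their triangles are touched.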

00:04:54.160 --> 00:04:59.600
well because your GPU doesn't have to work as hard for each frame it means

00:04:57.680 --> 00:05:03.440
faster frame rates and more detailed environments ultimately the things we

00:05:01.520 --> 00:05:06.560
all want from our graphics cards currently many game developers are

00:05:04.880 --> 00:05:10.639
looking at the best ways to implement mesh shading into their titles because

00:05:08.320 --> 00:05:14.560
it's such a customizable tool and the old geometry pipeline has been around

00:05:12.320 --> 00:05:18.240
for such a long time it might take a few years before we see widespread adoption

00:05:16.400 --> 00:05:21.680
of mesh shading in popular games the good news though is that there's already

00:05:20.000 --> 00:05:25.520
hardware support for it with newer consumer graphics cards so by the time

00:05:23.600 --> 00:05:30.000
we see these games on the market chances are we'll all have next-gen graphics

00:05:27.759 --> 00:05:32.880
cards that support these new features every one of us

00:05:33.280 --> 00:05:38.960
yes i can't wait

00:05:37.120 --> 00:05:43.680
i can't wait for you to check out our sponsor freshbooks they're an invoicing

00:05:42.080 --> 00:05:48.320
and accounting solution that's built for owners and their clients they state that

00:05:45.759 --> 00:05:54.080
the average user saves 46 hours a month gets paid 18 days faster and increases

00:05:50.960 --> 00:05:55.600
their ROI by 11 times using freshbooks

00:05:54.080 --> 00:05:58.720
freshbooks is a huge benefit for freelancers and small business owners

00:05:57.360 --> 00:06:02.639
who don't have time to waste on invoicing accounting and payment

00:06:00.400 --> 00:06:06.880
processing over 3000 business owners have rated freshbooks an average of four

00:06:04.560 --> 00:06:10.319
and a half out of five stars on get app and it's super easy to get up and

00:06:08.160 --> 00:06:14.639
running with award-winning support so you'll never be alone try freshbooks

00:06:12.560 --> 00:06:19.280
free for 30 days no credit card required go to freshbooks.com/techquickie and enter

00:06:17.520 --> 00:06:24.160
techquickie in the how did you hear about us section makes sense well guys that's

00:06:22.400 --> 00:06:28.160
a techquickie video not sure if you've ever seen one before but that's

00:06:26.240 --> 00:06:30.800
that's what it's like hey speaking of like you want to like the video if you

00:06:29.440 --> 00:06:35.680
liked it hey do you have a like do you have a like to give out

00:06:33.039 --> 00:06:38.319
please we love your likes also check out our other videos comment

00:06:37.120 --> 00:06:44.000
below with video suggestions and don't forget to subscribe and follow Techquickie

00:06:41.280 --> 00:06:44.000
love you
