WEBVTT

00:00:00.240 --> 00:00:05.600
No Doom Vulkan benchmarks, eh? No DX12 or Vulkan. Oh, I see you guys followed

00:00:03.760 --> 00:00:09.519
NVIDIA's review guide to a T. Sad to see Linus Tech Tips benchmarking an API

00:00:07.759 --> 00:00:14.559
which is close to its end of life. What about RX 480 Vulkan performance? Fine,

00:00:12.000 --> 00:00:18.480
let's talk about the new APIs, DirectX 12 and Vulkan. How they work, how they

00:00:16.480 --> 00:00:29.439
perform, what they mean for the future, everything.

00:00:29.439 --> 00:00:35.520
FreshBooks is the super simple invoicing solution that lets you get organized,

00:00:33.200 --> 00:00:40.160
save time, and get paid faster. Click now to try it for free. Now, what is an

00:00:38.399 --> 00:00:44.239
API or application programming interface? In layman's terms, it's a

00:00:42.480 --> 00:00:48.559
facilitator. It handles requests similarly to a waiter at a restaurant

00:00:46.480 --> 00:00:54.160
taking orders. You can think of the customer as system A. System A wants to

00:00:51.520 --> 00:00:59.520
access some functionality or information that system B, which we'll call the

00:00:56.399 --> 00:01:01.440
kitchen, controls. Now, system B doesn't

00:00:59.520 --> 00:01:07.200
want to just let system A directly access all of its information. There are

00:01:04.159 --> 00:01:09.520
security and liability issues here. But

00:01:07.200 --> 00:01:14.520
not only that, having all these different customers trying to talk to

00:01:11.520 --> 00:01:16.960
the cook directly unless timed

00:01:14.520 --> 00:01:22.080
near-perfectly has the potential to be more confusing than helpful. Enter the

00:01:19.520 --> 00:01:26.479
waiter, or in this analogy, the API. The waiter can serve the customer with a

00:01:23.840 --> 00:01:31.040
menu of subroutine definitions, protocols, and tools that the customer can use to

00:01:28.560 --> 00:01:35.920
achieve his task. The customer then sends that request through the waiter to

00:01:33.119 --> 00:01:40.880
the kitchen where system B will act out those instructions and then send a

00:01:38.240 --> 00:01:46.960
response back to system A through the API. Clean, simple, and delicious.
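
NOTE
A minimal sketch of the waiter analogy above, in Python. The Kitchen and Waiter
classes and their method names are hypothetical, made up for illustration; they
are not part of DirectX, Vulkan, or any real library. The point is only that
system A never touches system B directly and works from a fixed menu instead.
```python
class Kitchen:
    """System B: owns the recipes and does the actual work."""
    def __init__(self):
        self._recipes = {"chicken burger": "grilled chicken on a bun",
                         "yam fries": "fried yam strips"}
    def dishes(self):
        return sorted(self._recipes)
    def cook(self, dish):
        return self._recipes[dish]
class Waiter:
    """The API: the only route from a customer (system A) to the kitchen."""
    def __init__(self, kitchen):
        self._kitchen = kitchen
    def menu(self):
        # Publish what can be requested without exposing how it is made.
        return self._kitchen.dishes()
    def order(self, dish):
        # Validate the request before it ever reaches the kitchen.
        if dish not in self.menu():
            raise ValueError(f"{dish!r} is not on the menu")
        return self._kitchen.cook(dish)
waiter = Waiter(Kitchen())
meal = waiter.order("yam fries")
print(meal)
```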

00:01:44.880 --> 00:01:50.960
Theoretically, this is how it works, but APIs can be more bloated and therefore

00:01:49.200 --> 00:01:54.799
slow. Have you ever been to a restaurant that has like four different menus, one

00:01:53.200 --> 00:01:59.040
for just all the different kinds of freaking water with many pages

00:01:56.880 --> 00:02:03.520
containing a multitude of items which are essentially the same damn thing?

00:02:01.680 --> 00:02:08.160
Then you can relate to the frustration of not being able to find what you're

00:02:05.759 --> 00:02:11.520
looking for at all. Do you ever just want to throw the damn menus all in the

00:02:10.080 --> 00:02:16.239
air, stomp over to the kitchen, and scream, "Can you just render me a

00:02:13.680 --> 00:02:20.480
chicken burger and some yam fries? Is that too much to ask?"

00:02:18.640 --> 00:02:24.640
Okay, let's come back to that later. First, what are they used for? Lots of

00:02:22.640 --> 00:02:29.440
things. APIs can be used for web-based systems, operating systems, databases,

00:02:26.959 --> 00:02:33.599
software libraries, or in this case, computer hardware. But as important and

00:02:31.840 --> 00:02:38.319
ubiquitous as they are, they're not something the average user will ever

00:02:35.920 --> 00:02:43.360
interface with. So, why do we suddenly care so much about DirectX 12 and Vulkan

00:02:41.280 --> 00:02:48.000
as selling points for games and video cards? Well, hype machine aside,

00:02:45.440 --> 00:02:52.879
DirectX 11 and OpenGL, the predecessors to the shiny new graphics APIs, are

00:02:50.640 --> 00:02:57.400
rather bureaucratic. Like the complicated menu analogy, but worse

00:02:55.200 --> 00:03:02.480
because there are even more degrees of abstraction. There are many systems and

00:02:59.840 --> 00:03:08.480
devices talking to each other at a time, passing instructions around with many of

00:03:04.800 --> 00:03:10.720
them being redundant or outdated. It's

00:03:08.480 --> 00:03:15.680
just not as efficient. And there are other problems. For instance, one of

00:03:13.519 --> 00:03:21.440
your processor cores carries the burden of managing the vast majority of all of

00:03:19.040 --> 00:03:25.440
the critical time-sensitive tasks. One of the main ways that the new graphics

00:03:23.200 --> 00:03:30.319
APIs are more efficient is that they can use previously untapped hardware

00:03:27.680 --> 00:03:35.200
resources. Multi-core CPUs are a great example here. Look at this relatively

00:03:32.560 --> 00:03:41.040
realistic hypothetical scenario given to us by AMD. Thanks, guys. They show that

00:03:38.400 --> 00:03:46.879
OpenGL stacks all of the work of presenting to the user and processing

00:03:43.360 --> 00:03:49.680
the OpenGL driver onto core 1 along with

00:03:46.879 --> 00:03:54.799
a large portion of processing the OpenGL runtime and the largest portion of game

00:03:52.640 --> 00:03:59.040
code processing. This leaves little for cores 2 to 4 to work on, almost

00:03:56.720 --> 00:04:03.680
nothing for 5 and 6 to work on, and literally nothing for 7 and 8. And

00:04:01.439 --> 00:04:06.959
for your reference, DirectX 11 works in a very similar manner. Now, let's look at

00:04:05.519 --> 00:04:11.519
Vulkan. Though this is generally applicable to both of the new generation

00:04:09.040 --> 00:04:15.519
APIs, one of your processing cores is still responsible for a bigger part of

00:04:13.519 --> 00:04:20.400
the work than any of the others, but overall utilization is noticeably

00:04:18.079 --> 00:04:24.320
higher, even all the way down to core 8. Now, this is a hypothetical example, so

00:04:22.560 --> 00:04:28.800
I'm not going to spend time analyzing the exact improvements to the

00:04:26.400 --> 00:04:33.759
millisecond, but in a nutshell, Vulkan and DirectX 12 do a much better job

00:04:31.440 --> 00:04:38.720
overall of sharing the load across all cores. Another key objective is to shed

00:04:36.240 --> 00:04:42.160
old layers of unneeded abstraction. They've done this by simplifying

00:04:40.240 --> 00:04:47.840
protocol routes, minimizing graphical driver overhead, focusing heavily on

00:04:44.880 --> 00:04:52.080
preventing time-heavy draw calls from being sent linearly to a single core of the CPU,

00:04:50.240 --> 00:04:57.919
instead parallelizing draw call packages in order to ease up on CPU load

00:04:55.120 --> 00:05:02.800
and let the GPU function as it should. Kind of like multiple waiters attending

00:05:00.160 --> 00:05:07.919
to the needs of each diner at a table instead of having one waiter with many

00:05:05.440 --> 00:05:12.479
plates on his arm, dropping them off one at a time. Another key is that game

00:05:09.919 --> 00:05:17.440
developers can talk more directly to the GPU hardware. This is what is meant by

00:05:14.800 --> 00:05:22.240
the term "lower-level API." Think of it like having a manual transmission in a

00:05:19.680 --> 00:05:25.360
car instead of an automatic. But to all the gearheads out there watching this,

00:05:23.840 --> 00:05:30.320
don't just assume that all your DirectX 11 and OpenGL games are outdated

00:05:28.000 --> 00:05:36.479
junk at this point. Bear in mind that poor operation of a manual transmission

00:05:33.120 --> 00:05:39.680
can be worse than an automatic or simply

00:05:36.479 --> 00:05:42.160
not better. Just because a game has

00:05:39.680 --> 00:05:47.680
Vulkan or DirectX 12 support doesn't mean that efficiency and performance

00:05:44.759 --> 00:05:52.639
automatically improve. And this is true especially for games that were released

00:05:49.680 --> 00:05:58.160
with it early on. Though of course there are a few exceptions. For the best

00:05:55.120 --> 00:06:00.560
results, the new APIs need to be part of

00:05:58.160 --> 00:06:05.759
the core design of not only the game itself, but the underlying engine as

00:06:03.360 --> 00:06:12.080
well. And you do the math. It can take easily two, three, or even more years to

00:06:08.479 --> 00:06:13.680
make a good AAA title. And these APIs

00:06:12.080 --> 00:06:18.639
haven't even been available for that long. Meaning the implementation of

00:06:15.759 --> 00:06:24.000
these APIs was likely added to tick a marketing checkbox, appease a longtime

00:06:20.880 --> 00:06:25.919
partner, or hopefully, and very likely,

00:06:24.000 --> 00:06:30.880
just to gain the valuable experience that you can only get by working

00:06:27.919 --> 00:06:35.199
directly on them. But when more true DirectX 12 and Vulkan titles drop, this

00:06:33.039 --> 00:06:39.520
level of control for developers has the potential to be awesome, but a little

00:06:37.520 --> 00:06:44.400
scary, too, because we're putting a lot of responsibility on the game and more

00:06:42.000 --> 00:06:49.120
specifically the game engine developers. And this is a major shift. So, while I

00:06:46.960 --> 00:06:54.479
just spent like four minutes telling you how interesting and important the new

00:06:51.039 --> 00:06:56.240
APIs are, in reality, the truly

00:06:54.479 --> 00:07:01.039
interesting component of the equation is going to be the new game engines that

00:06:58.319 --> 00:07:07.120
arise because of the freedoms afforded to them by the APIs, not necessarily the

00:07:04.720 --> 00:07:11.520
APIs themselves. And from game engine creators who care, the John Carmacks and

00:07:09.360 --> 00:07:16.479
Tim Sweeneys of the world, you'll get amazing new features unlocked by this

00:07:13.599 --> 00:07:21.120
additional performance and flexibility. But back to the scary bit from before,

00:07:19.199 --> 00:07:26.160
we're also putting this greater degree of control and the responsibility that

00:07:23.759 --> 00:07:32.000
comes with it in the hands of gaming companies who tell us things like 30 fps

00:07:29.039 --> 00:07:36.639
is a good thing or that implement game physics effects that are reliant on the

00:07:34.479 --> 00:07:41.120
frame rate. That hasn't been a thing since like Intel 486, and that was

00:07:38.479 --> 00:07:45.360
arguable back then. That's inexcusable trash. Back to the APIs, though, since

00:07:43.120 --> 00:07:50.080
that's what we're really here for. So far, we've focused on similarities, but

00:07:47.440 --> 00:07:55.039
how do DX12 and Vulkan differ from each other? Let's start with DirectX 12. A big

00:07:52.240 --> 00:07:59.440
focus for Microsoft this time around has been the introduction of their new LDA

00:07:57.120 --> 00:08:04.560
and MDA modes for multiple graphics cards. For a rundown on how those work,

00:08:02.240 --> 00:08:09.599
check out this video on WTF is going on with SLI. Along with this, Microsoft's

00:08:07.120 --> 00:08:14.800
vision is to allow developers to support a mixture of graphics card models and even

00:08:12.240 --> 00:08:19.120
brands so the user can get as much power as possible out of their available

00:08:16.639 --> 00:08:23.680
hardware. Think of this more in terms of using your onboard graphics to get a

00:08:21.360 --> 00:08:29.039
little bit more oomph rather than as a compelling reason to build this monstrosity. One of

00:08:26.960 --> 00:08:34.159
the ways this could work is split frame rendering or SFR. This is when a portion

00:08:31.840 --> 00:08:38.399
of the screen is rendered by one GPU and the rest is rendered by a second. But

00:08:36.240 --> 00:08:42.959
we'll have to wait and see how well this performs in the real world. So that's

00:08:41.039 --> 00:08:48.640
cool. But being a Microsoft product, DirectX 12 will only function on Windows

00:08:45.519 --> 00:08:50.880
10 and Xbox. So Linux, OS X, Android

00:08:48.640 --> 00:08:55.120
enthusiasts, and actually even people who are still on Windows 7 and Windows 8

00:08:52.800 --> 00:08:59.200
are left out in the cold. Which brings us to Vulkan. Vulkan, brought to us by

00:08:57.360 --> 00:09:04.399
the Khronos Group, is the primary successor to OpenGL and is proudly

00:09:02.200 --> 00:09:09.360
cross-platform, working on everything we just mentioned and more. This is a huge

00:09:07.279 --> 00:09:14.320
deal for SteamOS in particular and Linux in general because it should bring

00:09:11.920 --> 00:09:20.240
with it stronger support for a wider range of titles, something Linux has

00:09:16.480 --> 00:09:22.320
struggled with ever since, well, ever.

00:09:20.240 --> 00:09:27.920
So, Vulkan is a big deal. Google has been using it as Android's low-level API

00:09:24.480 --> 00:09:30.240
since 2015. And Dan Ginsburg from Valve

00:09:27.920 --> 00:09:34.720
has talked about it on stage and said that Vulkan is the future. Although

00:09:32.959 --> 00:09:40.959
that's not surprising considering DirectX 12 doesn't work on SteamOS and

00:09:37.920 --> 00:09:43.279
the Valve-Microsoft relationship has

00:09:40.959 --> 00:09:47.600
been getting a little tense lately. Moving on, there was one last thing that

00:09:45.600 --> 00:09:51.560
they had in common. Asynchronous compute. Now, this has been a highly

00:09:49.440 --> 00:09:56.720
controversial subject, making it unfortunately outside the scope of this

00:09:53.760 --> 00:10:00.320
video. But if you guys want to see a similar video dedicated to it, let me

00:09:58.800 --> 00:10:04.240
know in the comments down below. In a nutshell, though, it allows additional

00:10:02.080 --> 00:10:08.240
lightweight work to run in parallel alongside the main graphics thread. So,

00:10:06.320 --> 00:10:13.600
a specific lighting technique or post-process anti-aliasing method like

00:10:11.000 --> 00:10:18.160
TSSAA. Now, this dynamic scheduling introduces some challenges for NVIDIA

00:10:15.680 --> 00:10:23.440
and AMD that did not exist in a more static ecosystem, but that's for them to

00:10:20.880 --> 00:10:28.320
worry about and for me to maybe cover in another video. That being said, it does

00:10:25.839 --> 00:10:32.720
seem that AMD has been pushing harder than NVIDIA, and it shows

00:10:30.279 --> 00:10:38.160
performance-wise as AMD is currently seeing more of a benefit than NVIDIA in

00:10:35.440 --> 00:10:43.200
the 3DMark Time Spy benchmark, which allows you to switch asynchronous compute on or

00:10:40.800 --> 00:10:47.200
off. When reviewing the numbers for these APIs, you'll notice that they're

00:10:45.040 --> 00:10:51.760
kind of all over the place. Looking at Rise of the Tomb Raider, a DirectX 12 game,

00:10:49.839 --> 00:10:57.760
we can see that while it does make a significant improvement at 1080p, once

00:10:54.480 --> 00:11:00.320
you step up a bit to 1440p or even 4K,

00:10:57.760 --> 00:11:04.079
things seem to fall apart a little bit, and

00:11:01.160 --> 00:11:07.920
DirectX 11 actually pulls ahead. Hitman, on the other hand, did not share this

00:11:05.760 --> 00:11:12.000
funky scaling pattern and seemed to improve when using DirectX 12, or at the

00:11:09.680 --> 00:11:16.160
very least stay the same across all the graphics cards I tested it with, which

00:11:13.760 --> 00:11:22.560
should represent GP104 with the GTX 1080, GP106 with the GTX 1060 6 GB

00:11:19.760 --> 00:11:27.920
edition for the NVIDIA side, and the power of AMD's new Polaris architecture

00:11:24.880 --> 00:11:30.000
with the RX 480. Moving on to Ashes of

00:11:27.920 --> 00:11:34.880
the Singularity. This game is like half benchmark, half game. So, it shows the

00:11:32.720 --> 00:11:39.760
biggest improvement by far when using DirectX 12 instead of DirectX 11. And for

00:11:37.839 --> 00:11:43.680
the Vulkan fans out there, I've heard there will be a patch coming for you as

00:11:41.519 --> 00:11:48.640
well. One great additional piece of insight that Ashes displays at the end

00:11:45.760 --> 00:11:53.360
of their benchmark is CPU frames versus GPU frames. What this essentially means

00:11:51.279 --> 00:11:58.720
is how many frames your CPU could potentially push compared to what your

00:11:55.800 --> 00:12:03.360
GPU actually pushed. So, if the CPU number is higher, you could get more

00:12:00.959 --> 00:12:06.800
overall FPS with a graphics card upgrade. Notice that when we run the

00:12:05.040 --> 00:12:11.279
more CPU-focused version of the benchmark, the numbers are at par

00:12:08.959 --> 00:12:17.760
because now you're intentionally CPU bottlenecked. Moving on to Vulkan. Doom

00:12:14.720 --> 00:12:20.320
numbers for the NVIDIA side are just a

00:12:17.760 --> 00:12:26.240
mess. For the GTX 1080, Vulkan was, similar to Tomb Raider, worse at 4K than

00:12:23.120 --> 00:12:27.680
OpenGL 4.5. Dropping down to 1080p,

00:12:26.240 --> 00:12:32.240
however, and it's a whole different story with massive improvements shown.

00:12:30.000 --> 00:12:37.440
The RX 480 does show the performance trend that we would expect, improving

00:12:34.880 --> 00:12:41.839
considerably when running the Vulkan API and utilizing asynchronous compute

00:12:39.440 --> 00:12:46.639
shaders. Good. So, the features sound pretty great. The performance, depending

00:12:44.160 --> 00:12:52.240
on your configuration, is probably good, but might not be great. What about

00:12:49.440 --> 00:12:56.480
supported games? On the DirectX 12 side of things, the list seems rather

00:12:54.079 --> 00:13:00.800
populated and there is a fair number of titles on the way that are claiming they

00:12:58.639 --> 00:13:06.800
will support it. But this doesn't mean that DirectX 12 integration hasn't been

00:13:03.519 --> 00:13:09.680
riddled with issues. Quantum Break was a

00:13:06.800 --> 00:13:13.839
mess. Tomb Raider's backend barely even functioned for a while to the point

00:13:11.519 --> 00:13:19.440
where they added a warning if you enable it. and the upcoming DSX Mankind

00:13:17.040 --> 00:13:23.279
Divided, which I think is out now. It was supposed to launch with support for

00:13:21.279 --> 00:13:26.959
it, and that's been pulled for a while, so they can fix it up. And the Vulkan side

00:13:25.279 --> 00:13:31.360
of things is rough, too. Their support list has four items on it. One of which

00:13:29.519 --> 00:13:37.320
is upcoming. One of which is Dota Freaking 2. Not sure about you, but I

00:13:34.399 --> 00:13:42.480
sure needed more FPS in Dota on my DirectX 12-capable hardware. Another is

00:13:40.560 --> 00:13:47.680
the rather wonderful single player game from 2014. Possibly not that many

00:13:45.360 --> 00:13:52.160
current players, unfortunately. And then lastly, we have Doom. A beautiful

00:13:50.079 --> 00:13:57.360
looking game with proper built-in support and asynchronous compute

00:13:54.160 --> 00:13:59.839
capabilities. Cool, but that's singular.

00:13:57.360 --> 00:14:03.839
So there you go. In conclusion, the future is bright with physics and AI

00:14:02.000 --> 00:14:08.079
heavy games being among the most exciting things we have to look forward

00:14:05.279 --> 00:14:12.160
to, which is awesome. But the present is more a light at the end of the tunnel

00:14:09.839 --> 00:14:16.720
situation. Get excited for new game engines that support these APIs and be

00:14:14.240 --> 00:14:23.440
on the lookout for awesome new game engine technology that will likely arise

00:14:20.079 --> 00:14:26.079
from the improvements made here. This

00:14:23.440 --> 00:14:31.120
may finally signal the return of our old "cores for gaming" episodes. Maybe we'll

00:14:28.720 --> 00:14:37.600
finally have an episode that doesn't end in "four cores is the best." Because by

00:14:34.480 --> 00:14:41.320
that time, your $1,700 10-core Extreme

00:14:37.600 --> 00:14:43.639
Edition will finally matter.

00:14:41.320 --> 00:14:48.959
Maybe. Today, we're highlighting the K7XX limited edition Ruby Red

00:14:46.240 --> 00:14:53.279
headphones, of course, from Massdrop. They also have a bunch of other cool

00:14:50.959 --> 00:14:57.199
things that you can buy like with other people so that the price goes down and

00:14:55.519 --> 00:15:00.959
if more people buy it, the price goes down further till a logical minimum. And

00:14:59.519 --> 00:15:04.959
it's really cool. You should check out Massdrop regardless. The

00:15:02.800 --> 00:15:09.600
product we're showcasing today is the same spec-wise as the K7XX headphones

00:15:08.079 --> 00:15:14.000
that Linus reviewed and you guys all liked sometime last year. You can check

00:15:11.360 --> 00:15:16.800
that video out here or all over my I don't know where it's going to be. The

00:15:15.440 --> 00:15:21.839
real difference, however, is that this run uses red accents on the ear cups and

00:15:19.880 --> 00:15:25.760
headband. Remember that this is a limited drop, so if you want a pair,

00:15:23.920 --> 00:15:31.199
you're going to have to act relatively fast. These headphones were configured

00:15:28.240 --> 00:15:35.440
by Massdrop and manufactured by AKG. Just a note for international orders, if

00:15:32.959 --> 00:15:39.360
you're outside of the US, there's a $25 shipping and handling fee put on top.

00:15:37.199 --> 00:15:43.920
And that's it. Check out Massdrop. Thanks for watching, guys. If this video

00:15:41.959 --> 00:15:47.360
sucked, but if you liked it, you know what to do. So, hit the like button, get

00:15:45.360 --> 00:15:49.680
subscribed, do all that kind of stuff. Uh, check out the link in the video

00:15:48.560 --> 00:15:53.759
description down below where you can buy like GPUs or something. I guess we talked about those in this video. Also,

00:15:52.800 --> 00:16:00.720
check out the link in the video description down below to buy a shirt and go to the forum to talk about

00:15:58.000 --> 00:16:04.639
everything I said in this video. There might be something wrong. Flame the crap

00:16:03.120 --> 00:16:09.920
out of me for that and then I'll learn and that'll be good.

00:16:06.839 --> 00:16:12.160
But, man, anyways, watch this video

00:16:09.920 --> 00:16:15.360
which is about WTF is going on with SLI because we're making a series of these

00:16:13.360 --> 00:16:19.680
now. Although continuing to do these might kill me, so I don't know.