WTF is going on with DX12 and Vulkan?
Linus Tech Tips
·2017-05-06
0:00
No Doom Vulkan benchmarks, eh? No DX12 or Vulkan. Oh, I see you guys followed
0:03
NVIDIA's review guide to a T. Sad to see Linus Tech Tips benchmarking APIs,
0:07
which is close to its end of life. What about RX 480 Vulkan performance? Fine,
0:12
let's talk about the new APIs, DirectX 12 and Vulkan. How they work, how they
0:16
perform, what they mean for the future, everything.
0:29
FreshBooks is the super simple invoicing solution that lets you get organized,
0:33
save time, and get paid faster. Click now to try it for free. Now, what is an
0:38
API or application programming interface? In layman's terms, it's a
0:42
facilitator. It handles requests similarly to a waiter at a restaurant
0:46
taking orders. You can think of the customer as system A. System A wants to
0:51
access some functionality or information that system B, which we'll call the
0:56
kitchen, controls. Now, system B doesn't
0:59
want to just let system A directly access all of its information. There are
1:04
security and liability issues here. But
1:07
not only that, having all these different customers trying to talk to
1:11
the cook directly, unless timed
1:14
near-perfectly, has the potential to be more confusing than helpful. Enter the
1:19
waiter or in this analogy, the API. The waiter can serve the customer with a
1:23
menu of subroutine definitions, protocols, and tools that the customer can use to
1:28
achieve his task. The customer then sends that request through the waiter to
1:33
the kitchen where system B will act out those instructions and then send a
1:38
response back to system A through the API. Clean, simple, and delicious.
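The waiter analogy above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any graphics API is actually written: the `Kitchen` and `Waiter` classes, the dish names, and the cooking times are all made up for the example. The customer never touches the kitchen directly; it only sees the menu and the `order` call that the waiter exposes.

```python
class Kitchen:
    """System B: owns the data and does the real work (hypothetical example)."""

    def __init__(self):
        # Made-up dishes and cooking times in minutes.
        self._recipes = {"chicken burger": 12, "yam fries": 5}

    def dishes(self):
        return sorted(self._recipes)

    def cook(self, dish):
        return f"{dish} (ready in {self._recipes[dish]} min)"


class Waiter:
    """The API: a small, well-defined set of requests the customer may make."""

    def __init__(self, kitchen):
        self._kitchen = kitchen

    def menu(self):
        # The customer learns what can be requested, not how it is made.
        return self._kitchen.dishes()

    def order(self, dish):
        # Requests are validated before they ever reach the kitchen.
        if dish not in self._kitchen.dishes():
            raise ValueError(f"{dish!r} is not on the menu")
        return self._kitchen.cook(dish)


if __name__ == "__main__":
    waiter = Waiter(Kitchen())   # system A only ever holds the waiter
    print(waiter.menu())
    print(waiter.order("yam fries"))
```

The validation step in `order` is the "security and liability" part of the analogy: a bad request is rejected by the waiter instead of confusing the cook.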
1:44
In theory, this is how it works, but real APIs can be more bloated and therefore
1:49
slow. Have you ever been to a restaurant that has like four different menus, one
1:53
for just all the different kinds of freaking water with many pages
1:56
containing a multitude of items which are essentially the same damn thing?
2:01
Then you can relate to the frustration of not being able to find what you're
2:05
looking for at all. Do you ever just want to throw the damn menus all in the
2:10
air, stomp over to the kitchen, and scream, "Can you just render me a
2:13
chicken burger and some yam fries? Is that too much to ask?"
2:18
Okay, let's come back to that later. First, what are they used for? Lots of
2:22
things. APIs can be used for web-based systems, operating systems, databases,
2:26
software libraries, or in this case, computer hardware. But as important and
2:31
ubiquitous as they are, they're not something the average user will ever
2:35
interface with. So, why do we suddenly care so much about DirectX 12 and Vulkan
2:41
as selling points for games and video cards? Well, hype machine aside,
2:45
DirectX 11 and OpenGL, the predecessors to the shiny new graphics APIs, are
2:50
rather bureaucratic. Like the complicated menu analogy, but worse
2:55
because there are even more degrees of abstraction. There are many systems and
2:59
devices talking to each other at a time, passing instructions around with many of
3:04
them being redundant or outdated. It's
3:08
just not as efficient. And there are other problems. For instance, one of
3:13
your processor cores carries the burden of managing the vast majority of all of
3:19
the critical time-sensitive tasks. One of the main ways that the new graphics
3:23
APIs are more efficient is that they can use previously untapped hardware
3:27
resources. Multi-core CPUs are a great example here. Look at this relatively
3:32
realistic hypothetical scenario given to us by AMD. Thanks, guys. They show that
3:38
OpenGL stacks all of the work of presenting to the user and processing
3:43
the OpenGL driver onto core 1 along with
3:46
a large portion of processing the OpenGL runtime and the largest portion of game
3:52
code processing. This leaves little for cores 2 to 4 to work on, almost
3:56
nothing for 5 and 6 to work on, and literally nothing for 7 and 8. And
4:01
for your reference, DirectX 11 works in a very similar manner. Now, let's look at
4:05
Vulkan, though this is generally applicable to both of the new-generation
4:09
APIs. One of your processing cores is still responsible for a bigger part of
4:13
the work than any of the others, but overall utilization is noticeably
4:18
higher, even all the way down to core 8. Now, this is a hypothetical example, so
4:22
I'm not going to spend time analyzing the exact improvements to the
4:26
millisecond, but in a nutshell, Vulkan and DirectX 12 do a much better job
4:31
overall of sharing the load across all cores. Another key objective is to shed
4:36
old layers of unneeded abstraction. They've done this by simplifying
4:40
protocol routes, minimizing graphics driver overhead, and focusing heavily on
4:44
preventing time-heavy draw calls from being sent linearly to a single core of the CPU,
4:50
instead parallelizing draw call packages in order to ease up on CPU load
4:55
and let the GPU function as it should. Kind of like multiple waiters attending
5:00
to the needs of each diner at a table instead of having one waiter with many
5:05
plates on his arm, dropping them off one at a time. Another key is that game
5:09
developers can talk more directly to the GPU hardware. This is what is meant by
5:14
the term lower-level API. Think of it like having a manual transmission in a
5:19
car instead of an automatic. But to all the gearheads out there watching this,
5:23
don't just assume that all your DirectX 11 and OpenGL games are outdated
5:28
junk at this point. Bear in mind that poor operation of a manual transmission
5:33
can be worse than an automatic or simply
5:36
not better. Just because a game has
5:39
Vulkan or DirectX 12 support doesn't mean that efficiency and performance
5:44
automatically improve. And this is true especially for games that were released
5:49
with it early on. Though of course there are a few exceptions. For the best
5:55
results, the new APIs need to be part of
5:58
the core design of not only the game itself, but the underlying engine as
6:03
well. And you do the math. It can take easily two, three, or even more years to
6:08
make a good AAA title. And these APIs
6:12
haven't even been available for that long. Meaning the implementation of
6:15
these APIs was likely added to tick a marketing checkbox, appease a longtime
6:20
partner, or hopefully, and very likely,
6:24
just to gain the valuable experience that you can only get by working
6:27
directly on them. But when more true DirectX 12 and Vulkan titles drop, this
6:33
level of control for developers has the potential to be awesome, but a little
6:37
scary, too, because we're putting a lot of responsibility on the game and more
6:42
specifically the game engine developers. And this is a major shift. So, while I
6:46
just spent like four minutes telling you how interesting and important the new
6:51
APIs are, in reality, the truly
6:54
interesting component of the equation is going to be the new game engines that
6:58
arise because of the freedoms afforded to them by the APIs, not necessarily the
7:04
APIs themselves. And from game engine creators who care, the John Carmacks and
7:09
Tim Sweeneys of the world, you'll get amazing new features unlocked by this
7:13
additional performance and flexibility. But back to the scary bit from before,
7:19
we're also putting this greater degree of control and the responsibility that
7:23
comes with it in the hands of gaming companies who tell us things like 30 fps
7:29
is a good thing or that implement game physics effects that are reliant on the
7:34
frame rate. That hasn't been a thing since like the Intel 486, and that was
7:38
arguable back then. That's inexcusable trash. Back to the APIs, though, since
7:43
that's what we're really here for. So far, we've focused on similarities, but
7:47
how do DX12 and Vulkan differ from each other? Let's start with DirectX 12. A big
7:52
focus for Microsoft this time around has been the introduction of their new LDA
7:57
and MDA modes for multiple graphics cards. For a rundown on how those work,
8:02
check out this video on WTF is going on with SLI. Along with this is Microsoft's
8:07
vision to allow developers to support a mixture of graphics card models and even
8:12
brands so the user can get as much power as possible out of their available
8:16
hardware. Think of this more in terms of using your onboard graphics to get a
8:21
little bit more oomph than as a compelling reason to build this monstrosity. One of
8:26
the ways this could work is split frame rendering or SFR. This is when a portion
8:31
of the screen is rendered by one GPU and the rest is rendered by a second. But
8:36
we'll have to wait and see how well this performs in the real world. So that's
8:41
cool. But being a Microsoft product, DirectX 12 will only function on Windows
8:45
10 and Xbox. So Linux, OS X, Android
8:48
enthusiasts, and actually even people who are still on Windows 7 and Windows 8
8:52
are left out in the cold. Which brings us to Vulkan. Vulkan, brought to us by
8:57
the Khronos Group, is the primary successor to OpenGL and is proudly
9:02
cross-platform, working on everything we just mentioned and more. This is a huge
9:07
deal for SteamOS in particular and Linux in general because it should bring
9:11
with it stronger support for a wider range of titles, something Linux has
9:16
struggled with ever since, well, ever.
9:20
So, Vulkan is a big deal. Google has backed it as Android's low-level API
9:24
since 2015. And Dan Ginsburg from Valve
9:27
has talked about it on stage and said that Vulkan is the future. Although
9:32
that's not surprising considering DirectX 12 doesn't work on SteamOS and
9:37
the Valve-Microsoft relationship has
9:40
been getting a little tense lately. Moving on, there is one last thing that
9:45
they have in common: asynchronous compute. Now, this has been a highly
9:49
controversial subject, making it unfortunately outside the scope of this
9:53
video. But if you guys want to see a similar video dedicated to it, let me
9:58
know in the comments down below. In a nutshell, though, it allows additional
10:02
lightweight work to run in parallel alongside the main graphics thread. So,
10:06
a specific lighting technique or post-process anti-aliasing method like
10:11
TSSAA. Now, this dynamic scheduling introduces some challenges for NVIDIA
10:15
and AMD that did not exist in a more static ecosystem, but that's for them to
10:20
worry about and for me to maybe cover in another video. That being said, it does
10:25
seem that AMD has been pushing harder than NVIDIA, and it shows
10:30
performance-wise as AMD is currently seeing more of a benefit than NVIDIA in
10:35
the 3DMark Time Spy benchmark, which allows you to switch asynchronous compute on or
10:40
off. When reviewing the numbers for these APIs, you'll notice that they're
10:45
kind of all over the place. Looking at Rise of the Tomb Raider, a DirectX 12 game,
10:49
we can see that while it does make a significant improvement at 1080p, once
10:54
you step up a bit to 1440p or even 4K,
10:57
things seem to fall apart a little bit, and
11:01
DirectX 11 actually pulls ahead. Hitman, on the other hand, did not share this
11:05
funky scaling pattern and seemed to improve when using DirectX 12, or at the
11:09
very least stay the same across all the graphics cards I tested it with, which
11:13
should represent GP104 with the GTX 1080, GP106 with the GTX 1060 6 GB
11:19
edition for the NVIDIA side, and the power of AMD's new Polaris architecture
11:24
with the RX 480. Moving on to Ashes of
11:27
the Singularity. This game is like half benchmark, half game. So, it shows the
11:32
biggest improvement by far when using DirectX 12 instead of DirectX 11. And for
11:37
the Vulkan fans out there, I've heard there will be a patch coming for you as
11:41
well. One great additional piece of insight that Ashes displays at the end
11:45
of its benchmark is CPU frames versus GPU frames. What this essentially means
11:51
is how many frames your CPU could potentially push compared to what your
11:55
GPU actually pushed. So, if the CPU number is higher, you could get more
12:00
overall FPS with a graphics card upgrade. Notice that when we run the
12:05
more CPU-focused version of the benchmark, the numbers are at par
12:08
because now you're intentionally CPU bottlenecked. Moving on to Vulkan. Doom
12:14
numbers for the NVIDIA side are just a
12:17
mess. For the GTX 1080, Vulkan was, similar to Tomb Raider, worse at 4K than
12:23
OpenGL 4.5. Drop down to 1080p,
12:26
however, and it's a whole different story with massive improvements shown.
12:30
The RX 480 does show the performance trend that we would expect, improving
12:34
considerably when running the Vulkan API and utilizing asynchronous compute
12:39
shaders. Good. So, the features sound pretty great. The performance, depending
12:44
on your configuration, is probably good, but might not be great. What about
12:49
supported games? On the DirectX 12 side of things, the list seems rather
12:54
populated and there is a fair number of titles on the way that are claiming they
12:58
will support it. But this doesn't mean that DirectX 12 integration hasn't been
13:03
riddled with issues. Quantum Break was a
13:06
mess. Tomb Raider's backend barely even functioned for a while to the point
13:11
where they added a warning if you enable it. And the upcoming Deus Ex: Mankind
13:17
Divided, which I think is out now, was supposed to launch with support for
13:21
it, and that's been pulled for a while so they can fix it up. And the Vulkan side
13:25
of things is rough, too. Their support list has four items on it. One of which
13:29
is upcoming. One of which is Dota Freaking 2. Not sure about you, but I
13:34
sure needed more FPS in Dota on my DirectX 12-capable hardware. Another is
13:40
the rather wonderful single-player game from 2014. Possibly not that many
13:45
current players, unfortunately. And then lastly, we have Doom. A beautiful
13:50
looking game with proper built-in support and asynchronous compute
13:54
capabilities. Cool, but that's singular.
13:57
So there you go. In conclusion, the future is bright, with physics-heavy and
14:02
AI-heavy games being among the most exciting things we have to look forward
14:05
to, which is awesome. But the present is more a light at the end of the tunnel
14:09
situation. Get excited for new game engines that support these APIs and be
14:14
on the lookout for awesome new game engine technology that will likely arise
14:20
from the improvements made here. This
14:23
may finally signal the return of our old "cores for gaming" episodes. Maybe we'll
14:28
finally have an episode that doesn't end in "four cores is the best." Because by
14:34
that time, your $1,700 10-core Extreme
14:37
Edition will finally matter.
14:41
Maybe. Today, we're highlighting the K7XX limited edition Ruby Red
14:46
headphones, of course, from Massdrop. They also have a bunch of other cool
14:50
things that you can buy like with other people so that the price goes down and
14:55
if more people buy it, the price goes down further till a logical minimum. And
14:59
it's really cool. You should check out Massdrop regardless. The
15:02
product we're showcasing today is the same spec-wise as the K7XX headphones
15:08
that Linus reviewed and you guys all liked sometime last year. You can check
15:11
that video out here or all over my I don't know where it's going to be. The
15:15
real difference, however, is that this run uses red accents on the ear cups and
15:19
headband. Remember that this is a limited drop, so if you want a pair,
15:23
you're going to have to act relatively fast. These headphones were configured
15:28
by Massdrop and manufactured by AKG. Just a note for international orders, if
15:32
you're outside of the US, there's a $25 shipping and handling fee put on top.
15:37
And that's it. Check out Massdrop. Thanks for watching, guys. If this video
15:41
sucked, but if you liked it, you know what to do. So, hit the like button, get
15:45
subscribed, do all that kind of stuff. Uh, check out the link in the video
15:48
description down below where you can buy like GPUs or something. I guess we talked about those in this video. Also,
15:52
check out the link in the video description down below to buy a shirt and go to the forum to talk about
15:58
everything I said in this video. There might be something wrong. Flame the crap
16:03
out of me for that and then I'll learn and that'll be good.
16:06
But, man, anyways, watch this video
16:09
which is about WTF is going on with SLI because we're making a series of these
16:13
now. Although continuing to do these might kill me, so I don't know.