WTF is going on with DX12 and Vulkan?

Linus Tech Tips · 2017-05-06 · 2,833 words · ~14 min read

Transcript

0:00 No Doom Vulkan benchmarks, eh? No DX12 or Vulkan. Oh, I see you guys followed
0:03 NVIDIA's review guide to a tee. Sad to see Linus Tech Tips benchmarking APIs,
0:07 which is close to its end of life. What about RX 480 Vulkan performance? Fine,
0:12 let's talk about the new APIs, DirectX 12 and Vulkan. How they work, how they
0:16 perform, what they mean for the future, everything.
0:29 FreshBooks is the super simple invoicing solution that lets you get organized,
0:33 save time, and get paid faster. Click now to try it for free. Now, what is an
0:38 API or application programming interface? In layman's terms, it's a
0:42 facilitator. It handles requests similarly to a waiter at a restaurant
0:46 taking orders. You can think of the customer as system A. System A wants to
0:51 access some functionality or information that system B, which we'll call the
0:56 kitchen, controls. Now, system B doesn't
0:59 want to just let system A directly access all of its information. There are
1:04 security and liability issues here. But
1:07 not only that, having all these different customers trying to talk to
1:11 the cook directly, unless timed
1:14 near-perfectly, has the potential to be more confusing than helpful. Enter the
1:19 waiter, or in this analogy, the API. The waiter can serve the customer with a
1:23 menu of subroutine definitions, protocols, and tools that the customer can use to
1:28 achieve his task. The customer then sends that request through the waiter to
1:33 the kitchen where system B will act out those instructions and then send a
1:38 response back to system A through the API. Clean, simple, and delicious.
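As a rough illustration of the waiter analogy (not something shown in the video), here is a minimal Python sketch. The class names Kitchen and Waiter, the dishes, and the prep times are all made up for the example. The only point is that the customer works from the waiter's small menu and never reaches into the kitchen directly.

```python
class Kitchen:
    """System B: owns the functionality. Customers never call it directly."""

    def __init__(self):
        # Hypothetical dishes and prep times in minutes, invented for this sketch.
        self._recipes = {"chicken burger": 12, "yam fries": 5}

    def dishes(self):
        return list(self._recipes)

    def cook(self, dish):
        return f"{dish}, ready in {self._recipes[dish]} min"


class Waiter:
    """The API: publishes a menu and forwards only valid requests."""

    def __init__(self, kitchen):
        self._kitchen = kitchen

    def menu(self):
        # The menu is the documented set of things a customer may ask for.
        return sorted(self._kitchen.dishes())

    def order(self, dish):
        # Reject anything off the menu before it ever reaches the kitchen.
        if dish not in self._kitchen.dishes():
            raise ValueError(f"{dish!r} is not on the menu")
        return self._kitchen.cook(dish)
```

A customer (system A) would create `Waiter(Kitchen())`, read `menu()`, and call `order("yam fries")`. Asking for something that is not on the menu raises an error instead of confusing the cook.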
1:44 Theoretically, this is how it works, but real APIs can be more bloated and therefore
1:49 slow. Have you ever been to a restaurant that has like four different menus, one
1:53 for just all the different kinds of freaking water, with many pages
1:56 containing a multitude of items which are essentially the same damn thing?
2:01 Then you can relate to the frustration of not being able to find what you're
2:05 looking for at all. Do you ever just want to throw the damn menus all in the
2:10 air, stomp over to the kitchen, and scream, "Can you just render me a
2:13 chicken burger and some yam fries? Is that too much to ask?"
2:18 Okay, let's come back to that later. First, what are they used for? Lots of
2:22 things. APIs can be used for web-based systems, operating systems, databases,
2:26 software libraries, or in this case, computer hardware. But as important and
2:31 ubiquitous as they are, they're not something the average user will ever
2:35 interface with. So, why do we suddenly care so much about DirectX 12 and Vulkan
2:41 as selling points for games and video cards? Well, hype machine aside,
2:45 DirectX 11 and OpenGL, the predecessors to the shiny new graphics APIs, are
2:50 rather bureaucratic. Like the complicated menu analogy, but worse
2:55 because there are even more degrees of abstraction. There are many systems and
2:59 devices talking to each other at a time, passing instructions around with many of
3:04 them being redundant or outdated. It's
3:08 just not as efficient. And there are other problems. For instance, one of
3:13 your processor cores carries the burden of managing the vast majority of all of
3:19 the critical time-sensitive tasks. One of the main ways that the new graphics
3:23 APIs are more efficient is that they can use previously untapped hardware
3:27 resources. Multi-core CPUs are a great example here. Look at this relatively
3:32 realistic hypothetical scenario given to us by AMD. Thanks, guys. They show that
3:38 OpenGL stacks all of the work of presenting to the user and processing
3:43 the OpenGL driver onto core 1 along with
3:46 a large portion of processing the OpenGL runtime and the largest portion of game
3:52 code processing. This leaves little for cores 2 to 4 to work on, almost
3:56 nothing for 5 and 6 to work on, and literally nothing for 7 and 8. And
4:01 for your reference, DirectX 11 works in a very similar manner. Now, let's look at
4:05 Vulkan. Though this is generally applicable to both of the new-generation
4:09 APIs, one of your processing cores is still responsible for a bigger part of
4:13 the work than any of the others, but overall utilization is noticeably
4:18 higher, even all the way down to core 8. Now, this is a hypothetical example, so
4:22 I'm not going to spend time analyzing the exact improvements to the
4:26 millisecond, but in a nutshell, Vulkan and DirectX 12 do a much better job
4:31 overall of sharing the load across all cores. Another key objective is to shed
4:36 old layers of unneeded abstraction. They've done this by simplifying
4:40 protocol routes, minimizing graphics driver overhead, and focusing heavily on
4:44 preventing time-heavy draw calls from being sent linearly to a single CPU core,
4:50 instead parallelizing draw call packages in order to ease up on CPU load
4:55 and let the GPU function as it should. Kind of like multiple waiters attending
5:00 to the needs of each diner at a table instead of having one waiter with many
5:05 plates on his arm, dropping them off one at a time. Another key is that game
5:09 developers can talk more directly to the GPU hardware. This is what is meant by
5:14 the term lower-level API. Think of it like having a manual transmission in a
5:19 car instead of an automatic. But to all the gearheads out there watching this,
5:23 don't just assume that all your DirectX 11 and OpenGL games are outdated
5:28 junk at this point. Bear in mind that poor operation of a manual transmission
5:33 can be worse than an automatic or simply
5:36 not better. Just because a game has
5:39 Vulkan or DirectX 12 support doesn't mean that efficiency and performance
5:44 automatically improve. And this is true especially for games that were released
5:49 with it early on. Though of course there are a few exceptions. For the best
5:55 results, the new APIs need to be part of
5:58 the core design of not only the game itself, but the underlying engine as
6:03 well. And you do the math. It can take easily two, three, or even more years to
6:08 make a good AAA title. And these APIs
6:12 haven't even been available for that long. Meaning the implementation of
6:15 these APIs was likely added to tick a marketing checkbox, appease a longtime
6:20 partner, or hopefully, and very likely,
6:24 just to gain the valuable experience that you can only get by working
6:27 directly on them. But when more true DirectX 12 and Vulkan titles drop, this
6:33 level of control for developers has the potential to be awesome, but a little
6:37 scary, too, because we're putting a lot of responsibility on the game and more
6:42 specifically the game engine developers. And this is a major shift. So, while I
6:46 just spent like four minutes telling you how interesting and important the new
6:51 APIs are, in reality, the truly
6:54 interesting component of the equation is going to be the new game engines that
6:58 arise because of the freedoms afforded to them by the APIs, not necessarily the
7:04 APIs themselves. And from game engine creators who care, the John Carmacks and
7:09 Tim Sweeneys of the world, you'll get amazing new features unlocked by this
7:13 additional performance and flexibility. But back to the scary bit from before,
7:19 we're also putting this greater degree of control and the responsibility that
7:23 comes with it in the hands of gaming companies who tell us things like 30 fps
7:29 is a good thing or that implement game physics effects that are reliant on the
7:34 frame rate. That hasn't been a thing since like the Intel 486, and that was
7:38 arguable back then. That's inexcusable trash. Back to the APIs, though, since
7:43 that's what we're really here for. So far, we've focused on similarities, but
7:47 how do DX12 and Vulkan differ from each other? Let's start with DirectX 12. A big
7:52 focus for Microsoft this time around has been the introduction of their new LDA
7:57 and MDA modes for multiple graphics cards. For a rundown on how those work,
8:02 check out this video on WTF is going on with SLI. Along with this comes Microsoft's
8:07 vision to allow developers to support a mixture of graphics card models and even
8:12 brands so the user can get as much power as possible out of their available
8:16 hardware. Think of this more in terms of using your onboard graphics to get a
8:21 little bit more oomph than as a compelling reason to build this monstrosity. One of
8:26 the ways this could work is split frame rendering or SFR. This is when a portion
8:31 of the screen is rendered by one GPU and the rest is rendered by a second. But
8:36 we'll have to wait and see how well this performs in the real world. So that's
8:41 cool. But being a Microsoft product, DirectX 12 will only function on Windows
8:45 10 and Xbox. So Linux, OS X, Android
8:48 enthusiasts, and actually even people who are still on Windows 7 and Windows 8
8:52 are left out in the cold. Which brings us to Vulkan. Vulkan, brought to us by
8:57 the Khronos Group, is the primary successor to OpenGL and is proudly
9:02 cross-platform, working on everything we just mentioned and more. This is a huge
9:07 deal for SteamOS in particular and Linux in general because it should bring
9:11 with it a stronger support for a wider range of titles, something Linux has
9:16 struggled with ever since, well, ever.
9:20 So, Vulkan is a big deal. Google has been using it as Android's low-level API
9:24 since 2015. And Dan Ginsburg from Valve
9:27 has talked about it on stage and said that Vulkan is the future. Although
9:32 that's not surprising considering DirectX 12 doesn't work on SteamOS and
9:37 the Valve-Microsoft relationship has
9:40 been getting a little tense lately. Moving on, there is one last thing that
9:45 they have in common: asynchronous compute. Now, this has been a highly
9:49 controversial subject, making it unfortunately outside the scope of this
9:53 video. But if you guys want to see a similar video dedicated to it, let me
9:58 know in the comments down below. In a nutshell, though, it allows additional
10:02 lightweight work to run in parallel alongside the main graphics thread. So,
10:06 a specific lighting technique or post-process anti-aliasing method like
10:11 TSSAA. Now, this dynamic scheduling introduces some challenges for NVIDIA
10:15 and AMD that did not exist in a more static ecosystem, but that's for them to
10:20 worry about and for me to maybe cover in another video. That being said, it does
10:25 seem that AMD has been pushing harder than NVIDIA, and it shows
10:30 performance-wise as AMD is currently seeing more of a benefit than NVIDIA in
10:35 the 3DMark Time Spy benchmark, which allows you to switch asynchronous compute on or
10:40 off. When reviewing the numbers for these APIs, you'll notice that they're
10:45 kind of all over the place. Looking at Rise of the Tomb Raider, a DirectX 12 game,
10:49 we can see that while it does make a significant improvement at 1080p, once
10:54 you step up a bit to 1440p or even 4K,
10:57 things seem to fall apart a little bit, and
11:01 DirectX 11 actually pulls ahead. Hitman, on the other hand, did not share this
11:05 funky scaling pattern and seemed to improve when using DirectX 12, or at the
11:09 very least stay the same across all the graphics cards I tested it with, which
11:13 should represent GP104 with the GTX 1080, GP106 with the GTX 1060 6 GB
11:19 edition for the NVIDIA side, and the power of AMD's new Polaris architecture
11:24 with the RX 480. Moving on to Ashes of
11:27 the Singularity. This game is like half benchmark, half game. So, it shows the
11:32 biggest improvement by far when using DirectX 12 instead of DirectX 11. And for
11:37 the Vulkan fans out there, I've heard there will be a patch coming for you as
11:41 well. One great additional piece of insight that Ashes displays at the end
11:45 of its benchmark is CPU frames versus GPU frames. What this essentially means
11:51 is how many frames your CPU could potentially push compared to what your
11:55 GPU actually pushed. So, if the CPU number is higher, you could get more
12:00 overall FPS with a graphics card upgrade. Notice that when we run the
12:05 more CPU focused version of the benchmark, the numbers are at par
12:08 because now you're intentionally CPU bottlenecked. Moving on to Vulkan. Doom
12:14 numbers for the NVIDIA side are just a
12:17 mess. For the GTX 1080, Vulkan was, similar to Tomb Raider, worse at 4K than
12:23 OpenGL 4.5. Drop down to 1080p,
12:26 however, and it's a whole different story with massive improvements shown.
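One hedged way to read results like these is to borrow the CPU-frames-versus-GPU-frames idea from the Ashes benchmark and treat the delivered frame rate as whichever side is slower. The Python sketch below is a toy model: the function names are hypothetical and the numbers in the usage note are invented, not data from these tests. It only shows why a lower-overhead API can help a lot when the CPU is the limit and do nothing when the GPU is. It does not explain an outright regression like the GTX 1080 at 4K.

```python
def delivered_fps(cpu_fps: float, gpu_fps: float) -> float:
    """Simplifying assumption: the slower of the two sides sets the frame rate."""
    return min(cpu_fps, gpu_fps)


def bottleneck(cpu_fps: float, gpu_fps: float) -> str:
    """Name the side that limits the frame rate under the model above."""
    if cpu_fps < gpu_fps:
        return "CPU"
    if gpu_fps < cpu_fps:
        return "GPU"
    return "balanced"


def fps_gain_from_cpu_speedup(cpu_fps: float, gpu_fps: float, cpu_speedup: float) -> float:
    """FPS gained if the CPU side could prepare frames `cpu_speedup` times faster.

    This stands in for an API cutting driver overhead; the GPU side is unchanged.
    """
    before = delivered_fps(cpu_fps, gpu_fps)
    after = delivered_fps(cpu_fps * cpu_speedup, gpu_fps)
    return after - before
```

With made-up figures, `fps_gain_from_cpu_speedup(90.0, 150.0, 1.5)` models a CPU-bound 1080p case and reports a gain, while `fps_gain_from_cpu_speedup(90.0, 50.0, 1.5)` models a GPU-bound 4K case and reports no gain at all.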
12:30 The RX 480 does show the performance trend that we would expect, improving
12:34 considerably when running the Vulkan API and utilizing asynchronous compute
12:39 shaders. Good. So, the features sound pretty great. The performance, depending
12:44 on your configuration, is probably good, but might not be great. What about
12:49 supported games? On the DirectX 12 side of things, the list seems rather
12:54 populated and there is a fair number of titles on the way that are claiming they
12:58 will support it. But this doesn't mean that DirectX 12 integration hasn't been
13:03 riddled with issues. Quantum Break was a
13:06 mess. Tomb Raider's backend barely even functioned for a while to the point
13:11 where they added a warning if you enable it. And the upcoming Deus Ex: Mankind
13:17 Divided, which I think is out now, was supposed to launch with support for
13:21 it, and that's been pulled for a while so they can fix it up. And the Vulkan side
13:25 of things is rough, too. Their support list has four items on it. One of which
13:29 is upcoming. One of which is Dota freaking 2. Not sure about you, but I
13:34 sure needed more FPS in Dota on my DirectX 12 capable hardware. Another is
13:40 the rather wonderful single-player game from 2014. Possibly not that many
13:45 current players, unfortunately. And then lastly, we have Doom. A beautiful
13:50 looking game with proper built-in support and asynchronous compute
13:54 capabilities. Cool, but that's singular.
13:57 So there you go. In conclusion, the future is bright, with physics- and
14:02 AI-heavy games being among the most exciting things we have to look forward
14:05 to, which is awesome. But the present is more a light at the end of the tunnel
14:09 situation. Get excited for new game engines that support these APIs and be
14:14 on the lookout for awesome new game engine technology that will likely arise
14:20 from the improvements made here. This
14:23 may finally signal the return of our old "cores for gaming" episodes. Maybe we'll
14:28 finally have an episode that doesn't end in "four cores is the best." Because by
14:34 that time, your $1,700 10-core Extreme
14:37 Edition will finally matter.
14:41 Maybe. Today, we're highlighting the K7XX limited edition Ruby Red
14:46 headphones, of course, from Massdrop. They also have a bunch of other cool
14:50 things that you can buy like with other people so that the price goes down and
14:55 if more people buy it, the price goes down further till a logical minimum. And
14:59 it's really cool. You should check out Massdrop regardless. The
15:02 product we're showcasing today is the same spec-wise as the K7XX headphones
15:08 that Linus reviewed and you guys all liked sometime last year. You can check
15:11 that video out here or all over my I don't know where it's going to be. The
15:15 real difference, however, is that this run uses red accents on the ear cups and
15:19 headband. Remember that this is a limited drop, so if you want a pair,
15:23 you're going to have to act relatively fast. These headphones were configured
15:28 by Massdrop and manufactured by AKG. Just a note for international orders, if
15:32 you're outside of the US, there's a $25 shipping and handling fee put on top.
15:37 And that's it. Check out Massdrop. Thanks for watching, guys. If this video
15:41 sucked, you know what to do, but if you liked it, hit the like button, get
15:45 subscribed, do all that kind of stuff. Uh, check out the link in the video
15:48 description down below where you can buy like GPUs or something. I guess we talked about those in this video. Also,
15:52 check out the link in the video description down below to buy a shirt and go to the forum to talk about
15:58 everything I said in this video. There might be something wrong. File the crap
16:03 out of me for that and then I'll learn and that'll be good.
16:06 But, man, anyways, watch this video
16:09 which is about WTF is going on with SLI because we're making a series of these
16:13 now. Although continuing to do these might kill me, so I don't know.