WEBVTT

00:00:00.000 --> 00:00:04.560
It's here, it is ripping fast, and it's 2,000 US dollars.

00:00:04.560 --> 00:00:07.800
But in exchange for your least favorite of kidneys,

00:00:07.800 --> 00:00:11.960
NVIDIA promises that their brand new GeForce RTX 5090

00:00:11.960 --> 00:00:17.480
will deliver a level of performance that obliterates their only real competition.

00:00:17.480 --> 00:00:20.560
NVIDIA. More GPU cores, boom.

00:00:20.560 --> 00:00:23.760
More VRAM, and faster VRAM, boom, boom.

00:00:23.760 --> 00:00:31.800
Enhanced RT and AI cores, wider memory bus, and PCIe Gen 5, boom, boom, boom.

00:00:31.800 --> 00:00:37.000
On top of that, NVIDIA has packed in a deep learning super shed load of new features that I would

00:00:37.000 --> 00:00:41.560
love to tell you about. But unless NVIDIA also invented AI teeth extraction,

00:00:41.560 --> 00:00:46.520
I won't be able to. So instead, I leave this review in your capable hands.

00:00:46.520 --> 00:00:47.400
See you later.

00:00:49.920 --> 00:00:52.920
I got it, I got it, I got it. First up, graphics performance.

00:00:52.920 --> 00:00:54.000
Let's cut to the chase.

00:00:56.520 --> 00:01:04.760
Let's get right to raw gaming results.

00:01:04.760 --> 00:01:08.360
No ray tracing, no upscaling, and we're starting with 1440p.

00:01:08.360 --> 00:01:13.000
Across our suite of games at 1440p, the 5090 never falls flat on its face, obviously,

00:01:13.000 --> 00:01:16.120
but still manages to be underwhelming.

00:01:16.120 --> 00:01:21.520
In the Vulkan-based Red Dead Redemption 2, we see less than a 10% improvement over the 4090,

00:01:21.520 --> 00:01:24.880
and that lackluster uplift is repeated in F1 24.

00:01:24.920 --> 00:01:30.960
More problematic is the embarrassingly small 30% lead over the 7900 XTX, which frequently

00:01:30.960 --> 00:01:34.520
goes for a little over two-fifths of the price.

00:01:34.520 --> 00:01:40.040
Oof. And Returnal doesn't bring better news. But as we move on to newer, more graphically intensive games,

00:01:40.040 --> 00:01:44.640
the 5090 does start to pull away from the pack. In the gorgeous thriller, Alan Wake 2,

00:01:44.640 --> 00:01:52.520
it beats the 3090 Ti by more than double and looks great in Black Myth: Wukong, beating the 4090 by 27%.

00:01:52.560 --> 00:01:56.800
Cyberpunk is another strong point compared to the previous generations, but as low as they are

00:01:56.800 --> 00:02:02.240
on the chart, it's worth noting AMD's strong performance per dollar in this game, at least when ray tracing

00:02:02.240 --> 00:02:07.960
isn't enabled. We'll get to that later. For now, this might be obvious, but if you are a 1440p player,

00:02:07.960 --> 00:02:11.560
the 5090 is overkill with the current crop of CPUs.

00:02:11.560 --> 00:02:15.960
If you're on the latest 9800X3D, you might find that the 5090 exerts a little bit more

00:02:15.960 --> 00:02:19.000
of a commanding lead, but I think that anyone with this setup

00:02:19.000 --> 00:02:22.720
should be putting their money into a new monitor rather than a new CPU.

00:02:22.720 --> 00:02:27.320
Let's move on to 4K testing, where the CPU bottlenecks are less likely to rear their ugly heads.

00:02:27.320 --> 00:02:30.400
In Cyberpunk, the 5090 is the first card

00:02:30.400 --> 00:02:33.560
to ever crack triple digits at our ultra preset,

00:02:33.560 --> 00:02:37.520
scoring a 30 FPS lead over the 4090.

00:02:37.520 --> 00:02:42.440
In Alan Wake 2, the story remains largely the same, offering a noticeable difference in playability

00:02:42.440 --> 00:02:47.040
compared to any previous flagship. Black Myth: Wukong at cinematic settings, however,

00:02:47.040 --> 00:02:50.520
is the Everest-like summit, where even the 5090

00:02:50.560 --> 00:02:56.440
falls short of 60 FPS average. Perhaps an overclocking Sherpa could get us to the peak,

00:02:56.440 --> 00:03:02.000
but that's a subject for another day. In Red Dead Redemption 2, the 5090 does not impress,

00:03:02.000 --> 00:03:06.160
especially when you consider its price. And in F1 24, the 5090 continues to operate

00:03:06.160 --> 00:03:11.360
at that kind of level of performance that no one else can touch. The problem is that for all the hype,

00:03:11.360 --> 00:03:18.320
the performance bump is roughly on par with the price bump, making the 5090 look less like a truly next-generation product

00:03:18.360 --> 00:03:22.040
and more like a 4090 Super GT Zikaiburkei.

00:03:22.040 --> 00:03:27.360
But what's the deal? I thought that Blackwell was supposed to be the giant leap forward with all that flip metering

00:03:27.360 --> 00:03:30.800
and neural rendering and increased ray-triangle intersections.

00:03:30.800 --> 00:03:34.800
What the fuck? Whoa, whoa, whoa, whoa, all right, hold on. That's a lot of words,

00:03:34.800 --> 00:03:37.920
but to understand how they're going to impact performance,

00:03:37.920 --> 00:03:41.480
and they will, we need to understand what they mean.

00:03:41.480 --> 00:03:45.360
See, Blackwell brings so many new enhancements

00:03:45.400 --> 00:03:50.000
that NVIDIA marketing doesn't even call it a GPU architecture.

00:03:50.000 --> 00:03:54.200
No, it's called a neural rendering architecture.

00:03:54.200 --> 00:03:58.240
So what is that? As far as we can tell,

00:03:58.240 --> 00:04:02.200
it's about equal parts genuine innovation and marketing fluff.

00:04:02.200 --> 00:04:07.520
We'll start with the innovation. Up until this point, NVIDIA's AI accelerating tensor cores

00:04:07.520 --> 00:04:12.480
could not be accessed by a graphics API like Vulkan or DirectX.

00:04:12.480 --> 00:04:18.040
But through collaboration with Microsoft, DirectX now has the Cooperative Vectors API,

00:04:18.040 --> 00:04:21.440
which means that games can use neural shaders.

00:04:21.440 --> 00:04:26.400
Like typical shaders, this allows geometry to be imbued with extra properties.

00:04:26.400 --> 00:04:30.200
But now that extra property could be a small neural network

00:04:30.200 --> 00:04:34.840
that could generate more geometry or help ease ray tracing calculations.

00:04:34.840 --> 00:04:40.000
For instance, Mega Geometry. This one allows for real-time generation

00:04:40.000 --> 00:04:44.640
of level of detail steps without requiring any normal maps.

00:04:44.640 --> 00:04:49.400
Think of it like UE5's Nanite, which helps ease jarring LOD change effects

00:04:49.400 --> 00:04:52.880
and saves developer time, but with, you know, AI.

00:04:52.880 --> 00:04:57.240
To take advantage of these features, NVIDIA loaded the 5090 with the hardware it needs

00:04:57.240 --> 00:05:00.880
to accelerate them. It's got fifth gen tensor cores,

00:05:00.880 --> 00:05:07.440
which drastically reduce memory usage for simpler AI models that don't require high precision.

00:05:07.440 --> 00:05:10.960
As for the non-AI stuff, we get upgraded fourth gen RT cores,

00:05:10.960 --> 00:05:16.480
which now double the ray-triangle intersection rate with just 75% of the memory footprint.

00:05:16.480 --> 00:05:21.280
And as for the regular old CUDA cores, well, those just don't seem to have changed very much.

00:05:21.280 --> 00:05:24.640
So far, the 5090 has managed a best case scenario

00:05:24.640 --> 00:05:31.720
of about 30% faster than its predecessor, seemingly entirely thanks to the 33% higher GPU core count.

00:05:31.720 --> 00:05:35.160
This, combined with their reuse of TSMC's 4N process node

00:05:35.160 --> 00:05:40.080
from last gen, explains why the new chip is so big and why NVIDIA had to sacrifice some clock speed

00:05:40.120 --> 00:05:44.440
to keep their yields, and therefore pricing, still attainable to the 1%.

00:05:44.440 --> 00:05:49.720
GDDR7, on the other hand, is kind of a big deal. It boasts double the data rate of GDDR6

00:05:49.720 --> 00:05:54.400
while using half as much power per bit. This is in large part thanks to the shift

00:05:54.400 --> 00:05:59.100
to PAM3 signaling. PAM, short for pulse amplitude modulation,

00:05:59.100 --> 00:06:02.520
is akin to how we store data in multi-level cell flash storage.

00:06:02.520 --> 00:06:05.800
GDDR6X uses PAM4, meaning that each clock

00:06:05.800 --> 00:06:09.520
can be encoded for four different states, rather than just two.

00:06:09.520 --> 00:06:15.120
But it came with a big trade-off, the error rate, since the signals are so similar in amplitude

00:06:15.120 --> 00:06:18.720
that sometimes they can be hard to tell apart, especially when there's interference.

00:06:18.720 --> 00:06:23.240
PAM3 improves the situation by just trying to handle three states instead of four,

00:06:23.240 --> 00:06:27.320
giving a little bit more room between each of them. This improves signal integrity,

00:06:27.320 --> 00:06:30.480
allowing GDDR7 to run at higher frequency

00:06:30.480 --> 00:06:37.040
while consuming less power to make up for the trade-offs. And let's not forget that we finally got 32 gigs of VRAM.
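
NOTE
Editor's sketch of the PAM trade-off described above, not something shown in the video.
It assumes an idealized signal whose voltage swing is normalized to 1.0 and whose levels
are evenly spaced; pam_properties is a hypothetical helper name, and real GDDR7 encoding
and error handling are more involved than this.
```python
import math
def pam_properties(levels: int) -> dict:
    """Ideal properties of an N-level PAM signal with its swing normalized to 1.0."""
    return {
        "levels": levels,
        # Information carried by one symbol if every level is equally likely.
        "bits_per_symbol": math.log2(levels),
        # Gap between adjacent levels: fewer levels leave more room for noise.
        "level_spacing": 1.0 / (levels - 1),
    }
nrz = pam_properties(2)
pam3 = pam_properties(3)
pam4 = pam_properties(4)
for p in (nrz, pam3, pam4):
    print(f"PAM{p['levels']}: {p['bits_per_symbol']:.3f} bits/symbol, spacing {p['level_spacing']:.3f}")
```
Under these assumptions PAM3 gives up some bits per symbol compared to PAM4 in exchange
for wider gaps between levels, which is the extra room the narration refers to.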

00:06:37.040 --> 00:06:42.040
This will be a huge jump for AI dorks, and maybe gamers someday.

00:06:42.040 --> 00:06:46.960
But there are some other cool things, like NVIDIA's new ninth-gen NVENC hardware video encoders,

00:06:46.960 --> 00:06:50.160
which support higher-quality 4:2:2 10-bit video.

00:06:50.160 --> 00:06:53.320
This, for the right people, is a huge deal,

00:06:53.320 --> 00:06:57.960
and might make Blackwell a must-have upgrade. And for the folks out there who own monitors,

00:06:57.960 --> 00:07:03.520
hi, Plouffe, we finally get a card that can actually take advantage of DP 2.1 UHBR20,

00:07:03.520 --> 00:07:07.280
a new DisplayPort standard that can drive 4K 240Hz

00:07:07.280 --> 00:07:11.280
without display-stream compression. And all of this while talking to your computer

00:07:11.280 --> 00:07:16.940
at PCIe Gen 5. It's 2025, and ray-tracing is no longer an afterthought,

00:07:16.940 --> 00:07:20.680
or even a choice in some cases, with the new Indiana Jones being the first game

00:07:20.680 --> 00:07:24.580
to outright require support. So let's talk about it.

00:07:24.580 --> 00:07:30.000
For RT testing, we use the highest settings, starting at 1440p, and I want to get this out of the way.

00:07:30.000 --> 00:07:34.200
AMD does not ray-trace well. Alan Wake 2 makes for a very playable experience

00:07:34.200 --> 00:07:37.600
on the 5090, with 1% lows well above 60 FPS.

00:07:37.600 --> 00:07:41.580
Numbers it can't quite hit yet on the absolutely brutal Black Myth: Wukong,

00:07:41.580 --> 00:07:45.440
though it is playable, unlike the poor 7900 XTX.

00:07:45.440 --> 00:07:50.800
Ouch! In Cyberpunk, the 5090 has just a 20% lead over the 4090,

00:07:50.800 --> 00:07:54.820
but compared to the 4080 Super, it maintains its price-to-performance ratio,

00:07:54.820 --> 00:07:58.480
which I generally consider to be pretty darn acceptable for a halo-class card.

00:07:58.520 --> 00:08:03.320
In the lightly ray-traced F1 24, AMD comes back to life a little,

00:08:03.320 --> 00:08:07.520
performing well against the 4080 Super, and the same can be said for Returnal,

00:08:07.520 --> 00:08:12.520
but there's no question that the 5090 is king for RT at 1440p,

00:08:13.600 --> 00:08:19.160
with a crown that only gets more dazzling at 4K. Black Myth: Wukong falls below what we consider playable

00:08:19.160 --> 00:08:23.000
for an intense action game. Don't worry, we'll talk about AI upscaling later,

00:08:23.000 --> 00:08:28.320
because first, dang, look at this thing! Maintaining performance in the 50s at these settings,

00:08:28.480 --> 00:08:32.160
this economy? Dang, NVIDIA, that's pretty impressive.

00:08:32.160 --> 00:08:35.300
And if you care more about absolute cinema than framerate,

00:08:35.300 --> 00:08:40.480
well, it holds above 30 FPS in Alan Wake 2, which should go great with your popcorn.

00:08:40.480 --> 00:08:44.080
F1 24 and Returnal are similar stories to the 1440p results,

00:08:44.080 --> 00:08:48.880
just with more pixels and fewer FPS. All of this taken together means we're looking at

00:08:48.880 --> 00:08:53.600
a greater than 30% lead over last gen at a 25% higher price,

00:08:53.600 --> 00:08:57.020
meaning the new RT cores are providing some benefit,

00:08:57.020 --> 00:09:01.420
but it's pretty small compared to the impact of NVIDIA just plunking in more of them.

00:09:01.420 --> 00:09:04.540
This is obviously a downer compared to the good old days

00:09:04.540 --> 00:09:09.140
when we used to get yearly GPU refreshes with dramatic improvements to performance per dollar.

00:09:09.140 --> 00:09:13.840
But it's clear that unless cutting-edge semiconductor manufacturing miraculously gets cheaper,

00:09:13.840 --> 00:09:18.940
those days are never coming back. So if we compare this more to, say,

00:09:18.940 --> 00:09:23.240
adding a second card in SLI, a feature NVIDIA no longer supports,

00:09:23.240 --> 00:09:29.540
then the glass-half-full take is, hey, at least it costs less than two 4090s.

00:09:29.540 --> 00:09:32.900
But NVIDIA still has some tricks up their sleeve.

00:09:32.900 --> 00:09:37.060
Holy heck, that's one tight leather jacket. How the heck did you fit that stuff in those sleeves?

00:09:37.060 --> 00:09:41.820
With DLSS4 and multi-frame gen, by making use of flip metering and swapping them

00:09:41.820 --> 00:09:46.020
from convolutional neural networks to transformer-based models.

00:09:46.020 --> 00:09:49.900
There's a lot of words again, lots of words again. Let's break it down.

00:09:49.900 --> 00:09:53.980
DLSS4 is NVIDIA's latest suite of AI enhancements,

00:09:53.980 --> 00:09:58.260
and it's the biggest change in years. Previous versions of DLSS included

00:09:58.260 --> 00:10:01.460
a convolutional neural network or CNN.

00:10:01.460 --> 00:10:04.940
A CNN can be thought of as a series of filters

00:10:04.940 --> 00:10:08.740
that look for specific details. When used for image processing,

00:10:08.740 --> 00:10:15.220
one layer could be looking for vertical edges, one for horizontal edges, and one for contrast, et cetera.
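
NOTE
Editor's sketch of the "series of filters" idea, not NVIDIA's DLSS code.
It assumes a tiny hypothetical grayscale patch and hand-picked Sobel-style kernels;
a real CNN learns its kernel values during training, and convolve2d is an illustrative name.
```python
def convolve2d(image, kernel):
    """Slide a small kernel over a 2D list and return the 'valid' filter response."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[y + i][x + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]
# Hypothetical 4x4 patch: dark on the left, bright on the right (one vertical edge).
image = [[0, 0, 9, 9] for _ in range(4)]
# One filter responds to vertical edges, the other to horizontal edges.
vertical_edges = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
horizontal_edges = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
v_response = convolve2d(image, vertical_edges)
h_response = convolve2d(image, horizontal_edges)
print("vertical filter:", v_response)
print("horizontal filter:", h_response)
```
The vertical-edge filter responds strongly to this patch while the horizontal one stays
at zero, which is the kind of per-filter evidence the later layers then combine.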

00:10:15.220 --> 00:10:19.980
The neural network then observes the results from the filters and can use that information

00:10:19.980 --> 00:10:24.340
to identify things, like whether an image contains a dog or a stop sign.

00:10:24.340 --> 00:10:27.340
If that seems convoluted, well, it literally is.

00:10:27.340 --> 00:10:30.860
On 40-series GPUs, this information was combined

00:10:30.860 --> 00:10:34.780
with an optical flow accelerator that interpreted the motion in the scene

00:10:34.780 --> 00:10:38.460
to upscale or generate frames. So why the switch?

00:10:38.460 --> 00:10:41.940
Scaling. Each filter can only scan and compute

00:10:41.940 --> 00:10:46.340
a small number of pixels at a time. When you have millions of pixels,

00:10:46.340 --> 00:10:50.140
dozens of times per second, increasing performance can be tough.

00:10:50.140 --> 00:10:54.940
So DLSS 4 uses a new transformer model, which, as NVIDIA explains,

00:10:54.940 --> 00:11:00.140
allows them to evaluate the relative importance of each pixel across an entire frame

00:11:00.140 --> 00:11:04.580
and over multiple frames to achieve a deeper understanding of the scenes

00:11:04.580 --> 00:11:08.580
that offers greater stability, reduced ghosting, higher detail in motion,

00:11:08.580 --> 00:11:11.780
and smoother edges. They also scale better,

00:11:11.780 --> 00:11:16.660
which is part of why they have become so heavily used in things like large language models.

00:11:16.660 --> 00:11:22.500
The transformer is the T in ChatGPT, so while a CNN can see this picture and say,

00:11:22.500 --> 00:11:25.820
there is a cat, there is a product from LTTstore.com.

00:11:25.820 --> 00:11:31.340
A transformer might say, there is a cat enjoying the premium CRT-themed pet cave

00:11:31.340 --> 00:11:36.380
from LTTstore.com. However, while a transformer can process

00:11:36.380 --> 00:11:40.340
complete images faster, it does require more data for training,

00:11:40.340 --> 00:11:46.100
and honestly, with the side-by-side comparisons, it is tough to tell the difference in image quality

00:11:46.100 --> 00:11:49.500
between the two models, like really tough.

00:11:49.500 --> 00:11:53.540
There are clear benefits in specific areas that NVIDIA points out,

00:11:53.540 --> 00:11:56.580
things like fences, power lines, and barbed wires,

00:11:56.580 --> 00:12:01.620
but there's still obvious artifacts when dealing with semi-transparent objects,

00:12:01.620 --> 00:12:07.300
or just very busy scenes. A lot of the artifacts are different than DLSS3,

00:12:07.300 --> 00:12:11.660
but are still present. On the bright side, at least on high-end cards,

00:12:11.660 --> 00:12:15.900
the transformer models don't show a substantial performance hit.

00:12:15.900 --> 00:12:19.300
Enough theory, I wanna talk about MF-ing G.

00:12:19.300 --> 00:12:23.580
Multi-frame Gen is perhaps the most game-changing tech landing with these new cards.

00:12:23.580 --> 00:12:27.940
Like the previous version of Framegen, it uses AI to generate in-between frames

00:12:27.940 --> 00:12:32.700
based on optical flow data and rendered frames, but Multi-frame Gen now allows users

00:12:32.700 --> 00:12:35.980
to generate up to three in-betweens, rather than just one,

00:12:35.980 --> 00:12:39.300
boosting frame rates to up to four times native.

00:12:39.300 --> 00:12:43.980
Does it work? Well, according to our charts, yes, very yes.

00:12:43.980 --> 00:12:49.420
The numbers double, triple, and quadruple, and make the 5090 look absolutely ridiculous,

00:12:49.420 --> 00:12:52.620
at least in the charts. But as big as that bar is,

00:12:52.620 --> 00:12:56.420
the real frames haven't changed. So what's the deal?

00:12:56.420 --> 00:13:01.540
Well, MFG's pretty wild. DLSS3 Framegen required specific optical flow

00:13:01.540 --> 00:13:05.380
accelerating hardware on GPUs, and combined that with the game data,

00:13:05.380 --> 00:13:09.660
like depth and motion vectors to generate in-between frames. And it was an okay solution,

00:13:09.660 --> 00:13:13.900
but you had to have two bits of hardware processing each frame,

00:13:13.900 --> 00:13:17.520
and that's just inefficient, and could even cause the GPU to throttle,

00:13:17.520 --> 00:13:20.620
resulting in a lower base frame rate to multiply off of.

00:13:20.660 --> 00:13:24.380
That's why you didn't just see a straight doubling of frame rate when you turned Framegen on.

00:13:24.380 --> 00:13:28.500
The 5090 and Multi Frame Gen eschew Ada's optical flow accelerator,

00:13:28.500 --> 00:13:32.620
and instead utilize tightly integrated Tensor and CUDA cores in Blackwell

00:13:32.620 --> 00:13:35.660
to run a lightweight AI optical flow model.

00:13:35.660 --> 00:13:39.980
Not hardware accelerated, it's just an AI model. This means that single Framegen

00:13:39.980 --> 00:13:43.980
should now run 40% faster while using 30% less VRAM.

00:13:43.980 --> 00:13:48.700
We know it works, but how does it look? Well, it depends on who you ask.

00:13:48.700 --> 00:13:51.820
If you want to see your FPS number much higher, then it works great.

00:13:51.820 --> 00:13:56.900
Nothing else short of hacking your FPS counter will let you get nearly 600 FPS in Cyberpunk.

00:13:56.900 --> 00:14:00.860
But visually, it's not perfect, and Framegen weirdness still persists.

00:14:00.860 --> 00:14:05.260
Look at the combing on these crosswalks in Cyberpunk, or this bottle phasing in and out in the benchmark,

00:14:05.260 --> 00:14:09.140
or the doubling of the fan blades in these large HVAC units. In Alan Wake 2,

00:14:09.140 --> 00:14:12.900
there's obvious artifacting around the player model and around the edge of your flashlight,

00:14:12.900 --> 00:14:17.500
which is sadly exactly where you will be looking 100% of the time.

00:14:17.500 --> 00:14:22.100
And it's worth noting that these artifacts are not present when we're just using DLSS for upscaling.

00:14:22.100 --> 00:14:26.780
Curiously, while Cyberpunk and Alan Wake were both updated with explicit support for multi Framegen,

00:14:26.780 --> 00:14:30.780
the feature can be forced in any DLSS3 single Framegen

00:14:30.780 --> 00:14:36.340
supported game through the NVIDIA driver. And the game that fared the best was Dragon Age: The Veilguard.

00:14:36.340 --> 00:14:39.740
The world's full of magic, so who's to say exactly what a vortex of shadow

00:14:39.740 --> 00:14:44.220
is supposed to look like? And since it already ran at a solid 70 FPS

00:14:44.220 --> 00:14:48.820
with all the settings cranked, no Framegen, it kept input latency manageable,

00:14:48.820 --> 00:14:52.860
which is really important, because if your base FPS is only 30 frames,

00:14:52.860 --> 00:14:57.220
well, Framegen will make it look smooth, but you'll observe many visual anomalies,

00:14:57.220 --> 00:15:01.060
and latency is still dictated by your true frame rate,

00:15:01.060 --> 00:15:05.460
meaning that the game feels far less responsive than it looks like it should be.
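
NOTE
Editor's sketch of why a bigger FPS number does not make the game feel faster.
It assumes an idealized model with zero frame-generation overhead, and it treats the
rendered frame time as only a floor on responsiveness (total click-to-photon latency
includes more than this). frame_gen_estimate is a hypothetical helper, not an NVIDIA API.
```python
def frame_gen_estimate(base_fps: float, generated_per_rendered: int) -> dict:
    """Idealized model: every rendered frame is followed by N generated frames."""
    if base_fps <= 0 or generated_per_rendered < 0:
        raise ValueError("base_fps must be positive and generated_per_rendered non-negative")
    return {
        # What the FPS counter shows.
        "displayed_fps": base_fps * (1 + generated_per_rendered),
        # How often the game actually renders a real frame that reflects new input.
        "rendered_frame_time_ms": 1000.0 / base_fps,
    }
for base in (30, 70):
    for generated in (0, 1, 3):
        est = frame_gen_estimate(base, generated)
        print(base, generated, est["displayed_fps"], round(est["rendered_frame_time_ms"], 1))
```
In this model a 30 FPS base with three generated frames displays 120 FPS, yet real
frames still arrive about every 33 ms, which matches the mismatch described in the narration.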

00:15:05.460 --> 00:15:09.460
The good news, says NVIDIA, is that MFG at least isn't adding any latency

00:15:09.460 --> 00:15:13.500
compared to the base frame rate, but we felt like it would be a good idea to verify that.

00:15:13.500 --> 00:15:17.700
And verify we did, using our trusty LDAT, our click to photon test results

00:15:17.700 --> 00:15:22.260
showed that Framegen does not increase latency over native with Reflex on.

00:15:22.260 --> 00:15:25.300
In fact, it seems to actually lower the latency slightly.

00:15:25.300 --> 00:15:28.420
It doesn't really make sense to us, so we're gonna chalk that up to sampling error,

00:15:28.420 --> 00:15:32.580
but if there is any effect on latency, it's so minor that it's not noticeable

00:15:32.580 --> 00:15:36.180
compared to our total system latency, and that is mighty impressive.

00:15:36.180 --> 00:15:40.260
But the issue remains, you can't beat your base frame rate's latency,

00:15:40.260 --> 00:15:44.860
and when it's disconnected from the motion you see on screen, it almost makes it worse.

00:15:44.860 --> 00:15:48.340
Perhaps the situation will improve with Reflex 2, but that's not here yet,

00:15:48.340 --> 00:15:51.500
so we aren't gonna dwell on it, even if the tech is really cool.

00:15:51.500 --> 00:15:55.620
Our take on multi-frame gen right now is that it has the same key flaws as before.

00:15:55.620 --> 00:15:58.740
It's a win-more feature that works the best

00:15:58.740 --> 00:16:03.420
when it makes the least sense to use, which means it is definitely not the silver bullet

00:16:03.420 --> 00:16:06.940
that NVIDIA's graphs make it out to be. With all this power,

00:16:06.940 --> 00:16:11.820
it would make a lot more sense to use the 5090 to make some money, right? So let's talk productivity.

00:16:11.820 --> 00:16:16.340
Hey, you might not know me, but I do this kind of thing and this kind of thing.

00:16:16.340 --> 00:16:19.340
And NVIDIA's new architecture has benefits for me too.

00:16:19.340 --> 00:16:22.660
The encoders and decoders provide support for 4:2:2 chroma sampling,

00:16:22.660 --> 00:16:27.740
which will make working with high-end video files much faster, especially for multi-camera video edits.

00:16:27.740 --> 00:16:31.220
The encoders also provide better quality at smaller file sizes.

00:16:31.220 --> 00:16:36.460
Sadly, we can't verify that for you today as we are currently re-evaluating our encoding benchmarks,

00:16:36.460 --> 00:16:40.220
but the new media engine is almost certainly playing a role in PugetBench,

00:16:40.220 --> 00:16:43.540
where we see a nice 9% bump in Premiere Pro performance

00:16:43.540 --> 00:16:47.500
and an even nicer, nearly 20% improvement in DaVinci Resolve

00:16:47.500 --> 00:16:51.140
when compared to the 4090. In Blender, NVIDIA has us considering

00:16:51.140 --> 00:16:55.140
finding a new benchmark as the 5090 has brought Barbershop render times

00:16:55.140 --> 00:16:59.100
to less than half a minute, more than double the speed of the 3090 Ti.

00:16:59.100 --> 00:17:03.820
Nice. Overwrought editing transition here. Double the 3090 Ti, you say?

00:17:03.820 --> 00:17:08.900
AI nerds, rejoice! If you're like me and for your sake, I hope you're not,

00:17:08.900 --> 00:17:12.380
you've been dying for NVIDIA to release a new 32-gig consumer card.

00:17:12.380 --> 00:17:17.700
So with all the bragging NVIDIA's been doing about AI TOPS, I'm expecting some big numbers.

00:17:17.700 --> 00:17:22.340
And in the Procyon text benchmarks, what the fuck? Number is not big.

00:17:22.340 --> 00:17:27.260
Sure, the 5090 is still the best card on the charts, but I was expecting more than this.

00:17:27.260 --> 00:17:30.620
We see roughly 20 to 30% improvement over the 4090,

00:17:30.620 --> 00:17:35.180
depending on the benchmark, and 60 to 70% over the 3090 Ti.

00:17:35.180 --> 00:17:40.060
I can see why they keep talking about AI TOPS and not specific performance.

00:17:40.060 --> 00:17:44.580
In MLPerf, the story remains largely the same in the time to first token

00:17:44.580 --> 00:17:50.020
and the token generation rate benchmarks. For image generation, our preferred Procyon benchmark

00:17:50.020 --> 00:17:54.620
doesn't support the 5090 yet. So we tested using the benchmark provided by NVIDIA.

00:17:54.620 --> 00:18:00.260
Prepare your salt grains for the taking. In the Procyon Flux FP8 image generation,

00:18:00.260 --> 00:18:03.780
the 5090 leads by a margin in line with the rest of our benchmarks.

00:18:03.780 --> 00:18:08.500
But when we switch to FP4 precision, the 5090 shows how powerful the native hardware support

00:18:08.500 --> 00:18:13.420
can be, taking less than one quarter of the time to generate images compared to the 4090.

00:18:13.420 --> 00:18:17.900
We'd love to see how this fares against the 3090 Ti, but this NVIDIA-provided benchmark

00:18:17.900 --> 00:18:21.440
doesn't support older cards. I was expecting AI to be the place

00:18:21.440 --> 00:18:24.620
where this card really shines, but I guess potential buyers will have to settle

00:18:24.620 --> 00:18:29.140
for just having the best consumer-grade card for AI.

00:18:29.140 --> 00:18:34.460
We'll call it the nifty feinty. Somehow there is still more review to go.

00:18:34.460 --> 00:18:39.660
Good thing this is so digestible and uncomplicated so far, right? Blackwell's main efficiency improvements over Ada

00:18:39.660 --> 00:18:42.660
seem to come from what they're calling Max-Q functionality.

00:18:42.660 --> 00:18:46.220
It boils down to a few small but significant changes.

00:18:46.220 --> 00:18:50.360
Improvements to power gating, thanks to an additional power rail and improved logic

00:18:50.360 --> 00:18:54.080
allow more of the GPU to switch to a low power state more rapidly.

00:18:54.080 --> 00:18:57.660
In CPU-bound or frame-capped scenarios, this could help save some power,

00:18:57.660 --> 00:19:01.540
especially on mobile chips. But in GPU bound full load scenarios,

00:19:01.580 --> 00:19:06.520
this monster will absolutely draw its fully rated 575 watts,

00:19:06.520 --> 00:19:10.940
including transient spikes of as high as 637 watts.

00:19:10.940 --> 00:19:14.900
And you can see even in real world gaming, it will pull space heater levels of power

00:19:14.900 --> 00:19:18.620
with an average of 554 watts in F1 24.

00:19:18.620 --> 00:19:22.460
And to manage all that power, they had to make a unique cooler design.

00:19:22.460 --> 00:19:26.100
While everyone else was making four slot behemoths, NVIDIA built something very different.

00:19:26.100 --> 00:19:30.660
And man, does it look good. Not only is it classy, it's also innovative.

00:19:30.700 --> 00:19:34.700
Over ambitious even. The main board of the video card is just the middle section

00:19:34.700 --> 00:19:39.500
and the outputs and PCIe connector are on daughter boards connected via what they call a flexible PCB.

00:19:39.500 --> 00:19:44.220
It's like a stiff ribbon cable, I guess. This allows for a double flow through design

00:19:44.220 --> 00:19:50.380
with fans blowing through dense heat sinks. Anecdotally, the fans run quiet, too quiet even.

00:19:50.380 --> 00:19:54.660
So like the 4090, most of the time, the loudest thing about the card will be its coil whine.

00:19:54.660 --> 00:20:00.420
It's not the worst we've heard, but it's noticeable. And if you expected it to run cooler than the 4090 founders,

00:20:00.420 --> 00:20:05.580
it doesn't. But given how much smaller it is, not to mention the enormous thermal load it's dealing with,

00:20:05.580 --> 00:20:10.060
I'd say it's doing a great job. But the new cooler style brings new build considerations.

00:20:10.060 --> 00:20:13.540
We know that flow through coolers can have a noticeable impact on CPU temps,

00:20:13.540 --> 00:20:17.300
especially for those using tower heat sinks. So is a double flow-through double bad? Well,

00:20:17.300 --> 00:20:21.340
we took the 5090 and the 4090 FE and put them in a Corsair 4000D.

00:20:21.340 --> 00:20:24.500
And even with both running at 450 watts to control the experiment,

00:20:24.500 --> 00:20:29.100
our poor Noctua NH-D15 saw CPU temps that are roughly three degrees higher

00:20:29.140 --> 00:20:34.300
in both synthetic and gaming workloads. A CPU cooler upgrade, perhaps an intake-mounted radiator,

00:20:34.300 --> 00:20:38.220
could be in order for some folks. Wow, that was a lot to talk about.

00:20:38.220 --> 00:20:41.500
So what's our conclusion? Well, if you're a professional game developer

00:20:41.500 --> 00:20:45.780
or any other professional, or basically anyone who can use the new performance

00:20:45.780 --> 00:20:48.820
and especially the new features to make money,

00:20:48.820 --> 00:20:52.900
it's a no-brainer. As for the gamers, well, if the 4090 was stupid,

00:20:52.900 --> 00:20:57.540
stupid price, but stupid performance, the 5090 is stupider.

00:20:57.540 --> 00:21:03.660
It provides 30-ish percent more performance by using 33% more hardware and 30-ish percent more power,

00:21:03.660 --> 00:21:08.980
and it gets a roughly 25% price increase. It's a 4090 plus plus.
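
NOTE
Editor's sketch of the value math behind the "4090 plus plus" verdict.
It uses only the round figures quoted in the narration (about 30% more performance,
about 25% higher price); value_change is a hypothetical helper name, and actual street
prices and per-game results will move the answer.
```python
def value_change(perf_gain: float, price_gain: float) -> float:
    """Relative change in performance per dollar, given fractional gains (0.30 means +30%)."""
    return (1 + perf_gain) / (1 + price_gain) - 1
# Round figures quoted in the review: ~30% more performance for ~25% more money.
change = value_change(0.30, 0.25)
print(f"Performance per dollar changes by {change:+.1%}")
```
With those inputs performance per dollar improves by only about 4% generation over
generation, which is why the uplift reads as more hardware rather than better value.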

00:21:08.980 --> 00:21:13.060
On the one hand, you could look at this and say, wow, this doesn't look good

00:21:13.060 --> 00:21:16.380
for the rest of the 50 series lineup, but it's also worth considering

00:21:16.380 --> 00:21:19.380
that the 4090 was kind of an outlier for 40 series,

00:21:19.380 --> 00:21:24.340
offering a huge boost over its predecessor with the rest offering smaller upgrades.

00:21:24.340 --> 00:21:28.820
So if you're on 30 series and still not in the mega, mega baller income bracket,

00:21:28.820 --> 00:21:34.180
I guess all we can do is wait and see. See if Linus has the strength to do the segue.

00:21:34.180 --> 00:21:39.180
Who are you? If you like this video,

00:21:39.180 --> 00:21:43.540
check out the one you get on ACCS chronging,

00:21:43.540 --> 00:21:45.740
watch this good nugget for now.

00:21:46.980 --> 00:21:49.660
I wish I could eat a nugget right now.