WEBVTT

00:00:00.240 --> 00:00:05.680
surely a shiny new pci express gen 4

00:00:03.600 --> 00:00:09.360
system is going to absolutely blow the doors off your old gen 3 one right

00:00:08.559 --> 00:00:13.920
well yes but maybe not for the reason that you would

00:00:12.960 --> 00:00:19.920
think MSI sponsored this video where we are going to be digging deep into the

00:00:17.359 --> 00:00:24.000
performance impact of turbo charging your pci express speed

00:00:22.080 --> 00:00:28.160
why did they sponsor that when their marketing so prominently features pci

00:00:26.320 --> 00:00:34.160
express gen 4 you might ask cause MSI don't give a what PCIe

00:00:30.400 --> 00:00:36.719
gen you buy as long as it's MSI so lol

00:00:34.160 --> 00:00:36.719
here we go

00:00:43.440 --> 00:00:48.320
if you've never heard of our Techquickie channel and you're wondering what the sun

00:00:51.440 --> 00:00:56.640
i think this backplate actually took the brunt of it oh nice if you've never

00:00:55.440 --> 00:01:03.440
heard of our Techquickie channel and you're wondering what the sam hill pci express is the tldr is it's the

00:01:01.280 --> 00:01:08.560
method that modern cpus use to communicate with virtually everything

00:01:05.840 --> 00:01:13.920
else in your pc and the total bandwidth available to it for high speed graphics

00:01:11.360 --> 00:01:18.240
or storage or networking is determined by two things

00:01:15.360 --> 00:01:22.560
the number of lanes the CPU can allocate to a device and the speed at which those

00:01:21.040 --> 00:01:25.520
lanes can operate think of pci express kind of like a

00:01:24.560 --> 00:01:29.680
highway adding more lanes is challenging because

00:01:27.759 --> 00:01:35.360
it requires more interconnects which makes the components larger and more

00:01:31.759 --> 00:01:37.520
complex adding cost by contrast if you

00:01:35.360 --> 00:01:41.520
could just make all the cars drive faster that's way more economical

00:01:40.400 --> 00:01:46.240
as long as you can keep them from crashing so engineers at pci sig and its

00:01:44.240 --> 00:01:51.439
members have been working hard to improve data integrity allowing these

00:01:48.479 --> 00:01:56.720
pci express links or lanes to reach higher and higher clock speeds each

00:01:54.399 --> 00:02:01.520
generation since the first has roughly doubled the speed of each individual

00:01:58.880 --> 00:02:07.920
lane with pci express gen 4 offering nearly 2 gigabytes per second per lane

00:02:05.840 --> 00:02:14.959
that means that our featured card here an MSI GeForce rtx 3090 supreme x can be

00:02:11.440 --> 00:02:16.720
fed at nearly 32 gigabytes per second on

00:02:14.959 --> 00:02:21.120
a compliant motherboard and good thing right because it not only

00:02:18.959 --> 00:02:25.840
boasts the most powerful GPU on the market it's factory overclocked to boot

00:02:24.000 --> 00:02:31.120
so armed with only the information you have so far you'd probably think that

00:02:28.400 --> 00:02:36.000
pci express gen 4 is an absolute necessity for gamers and power users

00:02:33.519 --> 00:02:42.080
everywhere but a theory is just that a theory we need to put it to the test so

00:02:38.720 --> 00:02:45.280
we'll be using a ryzen 9 5950x and a top

00:02:42.080 --> 00:02:47.040
of the line MSI x570 godlike motherboard

00:02:45.280 --> 00:02:51.760
to remove as many other system bottlenecks as possible we're also going

00:02:48.959 --> 00:02:56.560
to throw an rx 6900xt into the mix for team red representation starting with

00:02:53.680 --> 00:03:00.879
NVIDIA our rtx 3090 pulls roughly the same numbers in shadow of the tomb

00:02:57.920 --> 00:03:05.599
raider with a slight dip in gen 3's 95th percentile minimums what that means is

00:03:03.440 --> 00:03:09.840
that the experience is generally the same except that when the game hitches

00:03:07.680 --> 00:03:13.760
or stutters it's a little bit more severe on gen 3 though

00:03:12.159 --> 00:03:19.519
we couldn't detect it with the human eye in this case in fact the difference wasn't much

00:03:16.959 --> 00:03:24.480
beyond our margin of error in any of the tests that we ran including

00:03:21.280 --> 00:03:26.959
forza horizon 4 f1 2020 and red dead

00:03:24.480 --> 00:03:31.360
redemption 2 where gen 3 frame rates are consistently a frame or two behind gen 4

00:03:29.519 --> 00:03:34.799
but no more where the pattern doesn't hold is in cs

00:03:33.840 --> 00:03:40.799
go check out that 60 FPS minimum frame rate

00:03:37.360 --> 00:03:42.799
drop and 50 FPS average loss for

00:03:40.799 --> 00:03:48.000
whatever reason this game seems to absolutely hammer the pci express bus

00:03:45.760 --> 00:03:52.159
compared to other newer titles that we tested so the faster these transfers can

00:03:50.319 --> 00:03:56.480
happen the better on AMD the story is much the same shadow

00:03:54.799 --> 00:04:01.599
of the tomb raider doesn't see any variation and we're only a frame off in

00:03:58.720 --> 00:04:06.400
forza horizon 4 f1 2020 and red dead redemption 2 also show only a modest

00:04:04.000 --> 00:04:10.560
difference on gen 4 versus gen 3 but then again

00:04:07.680 --> 00:04:17.519
look at those cs go scores we've gone from over 400 FPS average with gen 4 to

00:04:13.680 --> 00:04:20.479
just 311 in gen 3 that's a difference

00:04:17.519 --> 00:04:25.759
of almost 30 percent from these results then it's clear that these kinds of

00:04:22.479 --> 00:04:28.400
gains are rare but they are out there if

00:04:25.759 --> 00:04:32.800
you're running the right applications also we expect this speed to become more

00:04:31.040 --> 00:04:36.080
important as direct storage makes its way into pc games

00:04:34.400 --> 00:04:40.320
if you haven't heard of direct storage it is functionally similar to the way

00:04:37.919 --> 00:04:45.040
that the playstation 5 and the xbox series x allow the graphics chip to

00:04:42.800 --> 00:04:49.680
bypass other system bottlenecks and directly access your game storage drive

00:04:47.360 --> 00:04:54.240
for faster loading times and sony says with their implementation even real-time

00:04:51.840 --> 00:04:59.360
asset streaming that's still a ways off on the pc though so in the meantime is

00:04:57.120 --> 00:05:02.400
there anything that can reliably push gen 4 today

00:05:00.880 --> 00:05:06.560
productivity maybe on the NVIDIA side it starts out pretty

00:05:04.639 --> 00:05:11.680
bleak blender sees only a modest improvement in cuda mode while optix is

00:05:08.720 --> 00:05:16.560
a complete wash but then we quickly found an improvement in v-ray where we

00:05:13.759 --> 00:05:21.039
saw measurable bumps in both cuda and rtx enabled workflows davinci resolve

00:05:19.120 --> 00:05:25.280
too gives us a modest little performance increase and the same is true for adobe

00:05:22.960 --> 00:05:28.560
photoshop although that last one in particular was somewhat unexpected

00:05:27.199 --> 00:05:32.240
things start to peter out again with luxmark4 though where we're looking at a

00:05:30.320 --> 00:05:36.479
roughly three percent improvement and octane bench doesn't meaningfully

00:05:33.840 --> 00:05:41.120
improve at all as for specviewperf 2020 most of this is so close that we

00:05:39.280 --> 00:05:46.240
might as well call it a tie outside of maya's stand out

00:05:43.520 --> 00:05:50.960
one percent increase over gen 3 moving over to AMD the blender opencl result is

00:05:49.039 --> 00:05:55.680
basically the same between gen 3 and gen 4 but unlike NVIDIA davinci resolve and

00:05:53.840 --> 00:06:00.160
adobe photoshop don't see a major improvement in performance either

00:05:57.759 --> 00:06:06.319
curiously though where NVIDIA stumbled at luxmark AMD shines with a hefty 27 percent

00:06:04.479 --> 00:06:10.880
improvement in the food render and eleven percent in the hall bench that is

00:06:09.120 --> 00:06:14.800
definitely nothing to sneeze at if you use luxcorerender

00:06:12.720 --> 00:06:18.240
finally specviewperf brings AMD back to

00:06:15.919 --> 00:06:21.199
they're the same picture territory with little to show for the extra bandwidth

00:06:19.680 --> 00:06:25.360
that gen 4 offers again maya stands out as

00:06:23.280 --> 00:06:31.280
maybe slightly better but it's nothing that you're going to want to replace your motherboard for these numbers both

00:06:28.800 --> 00:06:35.919
agree with and also contradict some of the earlier results found not only by us

00:06:33.520 --> 00:06:40.720
when we reviewed the Radeon rx 5700 series but also by Gamers Nexus who did a

00:06:38.720 --> 00:06:44.800
full video on gen4 performance back in september what this shows us is that

00:06:42.639 --> 00:06:48.960
there's been some quiet movement between then and now with advances in CPU

00:06:47.039 --> 00:06:55.919
performance on the platforms that support gen 4 perhaps being the biggest

00:06:52.319 --> 00:06:59.280
we used a ryzen 5950x today that is a

00:06:55.919 --> 00:07:02.000
significantly faster CPU than the 3900x

00:06:59.280 --> 00:07:06.800
that we used last time and the 3900xt that Gamers Nexus used in their video so

00:07:04.800 --> 00:07:10.800
could it be that Intel's upcoming rocket lake cpus could hold the key to even

00:07:08.880 --> 00:07:13.919
better gen 4 performance we will be checking it out so get

00:07:12.479 --> 00:07:18.400
subscribed to make sure you don't miss it whatever Intel has up its sleeve

00:07:15.680 --> 00:07:24.960
though pci express gen 4 and up for that matter really weren't made for consumers

00:07:21.039 --> 00:07:26.800
anyway at least not directly i mean it's

00:07:24.960 --> 00:07:31.840
great for hooking up more devices to a motherboard chipset a faster link here

00:07:29.919 --> 00:07:38.319
means faster storage ports faster network ports more usb ports etc etc

00:07:35.599 --> 00:07:42.960
but for most people the benefit of gen 4 is pretty small compared to simply

00:07:40.240 --> 00:07:47.759
having a faster CPU it's just that you can't really get a faster CPU without

00:07:45.280 --> 00:07:51.840
going gen 4 no the real reason pci express keeps getting faster is because

00:07:49.759 --> 00:07:56.080
of the data center and then we just get their trickle-down technology

00:07:54.000 --> 00:07:59.280
in a data center it's not so that they can just

00:07:57.199 --> 00:08:04.479
shove faster devices into their racks i mean that is a thing but it's more about

00:08:02.160 --> 00:08:09.919
being able to shove more devices into them and using features like bifurcation

00:08:06.879 --> 00:08:11.759
and devices like pci express switches to

00:08:09.919 --> 00:08:16.160
split the lanes if you've got faster lanes you not only

00:08:14.240 --> 00:08:20.720
can have faster devices for the same number of lanes you can have more

00:08:18.560 --> 00:08:25.440
devices with the same number of lanes without sacrificing the speed that you

00:08:22.720 --> 00:08:30.080
already have incidentally that's exactly what we're doing with our high speed

00:08:27.120 --> 00:08:36.000
NVMe storage server we've got a gen 4 carrier card that has eight gen 3 ssds

00:08:33.279 --> 00:08:39.839
on it running at full speed if we put that card into a gen 3 slot they would

00:08:38.159 --> 00:08:45.040
end up being bottlenecked because natively that slot only has enough lanes

00:08:42.560 --> 00:08:48.240
for four ssds which actually got us thinking

00:08:46.000 --> 00:08:52.560
what if i'm not just a gamer what if i want to use my system as a workstation

00:08:50.240 --> 00:08:56.399
by day and gaming rig by night and plug in like two graphics cards for some

00:08:54.560 --> 00:09:00.800
heavy workloads ah okay

00:08:57.920 --> 00:09:05.440
so some motherboards like our MSI x570 godlike are capable of bifurcating pci

00:09:03.519 --> 00:09:10.560
express lanes that means they can take a 16 lane slot like this one up here and

00:09:08.080 --> 00:09:14.720
split those lanes out to multiple physical slots allowing more than one

00:09:12.880 --> 00:09:19.440
device to share the bandwidth that otherwise would have gone to a single

00:09:16.880 --> 00:09:24.240
device to find out if this helps we installed a second rtx 3090 in our

00:09:21.920 --> 00:09:29.680
system and ran our benchmarks at least the ones that benefit from dual GPU in

00:09:26.560 --> 00:09:30.880
gen 3 and gen 4 modes again

00:09:29.680 --> 00:09:34.800
and turns out that there's not much benefit

00:09:32.800 --> 00:09:39.040
here right now either the few benchmarks where there were

00:09:36.640 --> 00:09:43.519
improvements with gen 4 were v-ray by about two or three percent and luxmark

00:09:41.600 --> 00:09:48.160
by about a single percentage point that's it a little anticlimactic but if

00:09:46.399 --> 00:09:52.720
you think about it these kinds of workloads aren't going to scale linearly

00:09:50.399 --> 00:09:56.640
in terms of bandwidth required anyway and where we'd be more likely to see a

00:09:54.320 --> 00:10:00.800
significant improvement is in crunching deep learning data sets but

00:09:58.959 --> 00:10:05.519
that's a workload that does skew quite a bit closer to the data center than to

00:10:02.720 --> 00:10:11.279
the desktop also in the data center they might have quite a lot more than two

00:10:07.680 --> 00:10:13.200
devices per 16 lanes anyway don't fret

00:10:11.279 --> 00:10:17.360
it took a while for gen 3 to be fully utilized when that took over from gen 2

00:10:15.200 --> 00:10:20.640
as well so now that AMD has had a generation of support and Intel is

00:10:19.279 --> 00:10:24.560
bringing it with their upcoming rocket lake cpus we are headed in the right

00:10:22.720 --> 00:10:28.240
direction just in time for direct storage to start making proper use of it

00:10:27.120 --> 00:10:34.320
we hope and besides a few percent improvement is

00:10:31.200 --> 00:10:37.519
nothing to sneeze at you start with some

00:10:34.320 --> 00:10:39.920
gen 4 PCIe toss in some resizable bar

00:10:37.519 --> 00:10:43.839
support sprinkle on faster RAM and a pinch of community optimizations and all

00:10:42.399 --> 00:10:47.360
of a sudden you've got a pretty compelling generational upgrade so you

00:10:46.079 --> 00:10:53.519
know what i say keep it coming just like i keep the sponsors coming big

00:10:51.440 --> 00:10:57.920
thanks to MSI for providing us all the equipment and sponsoring this deep dive

00:10:55.519 --> 00:11:01.839
into pci express gen 3 vs gen 4 performance go check them out we used

00:11:00.160 --> 00:11:06.320
their top end stuff here but their gaming x trio cards are also great for

00:11:04.079 --> 00:11:12.480
performance and they have a wide range of motherboards x570 b550 to satisfy

00:11:09.519 --> 00:11:16.720
your ryzen fix oh and of course their z490 boards are already wired for gen 4

00:11:15.200 --> 00:11:20.320
when rocket lake launches so we're going to have all that linked down below

00:11:18.800 --> 00:11:24.959
thanks for watching guys if you enjoyed this one maybe go check out our video on

00:11:22.640 --> 00:11:29.680
the server that Liqid sent us for a taste of just how fast pci express can

00:11:27.760 --> 00:11:32.320
go when you load it up with enough devices
