WEBVTT

00:00:00.000 --> 00:00:03.320
Your PC probably has a slot that looks like this.

00:00:03.320 --> 00:00:08.200
It's a PCI Express X16 slot, and it's commonly used for graphics cards,

00:00:08.200 --> 00:00:14.640
as it's one of the fastest connections on your motherboard. Even the chonkiest, most powerful GPUs can be handled

00:00:14.640 --> 00:00:20.240
by a single one of these slots, no problem. But there is another.

00:00:20.240 --> 00:00:23.640
Did you know that PCI Express X32

00:00:23.640 --> 00:00:29.400
is a real thing that actually exists? What the heck would you do with a slot that's twice as long?

00:00:29.400 --> 00:00:33.600
Well, to be clear, it's very difficult to find slots physically bigger

00:00:33.600 --> 00:00:37.360
than that standard X16 size. Remember that those numbers,

00:00:37.360 --> 00:00:41.440
when we're talking about PCI Express, refer to the number of lanes,

00:00:41.440 --> 00:00:48.320
not necessarily the physical size of the slot. A tiny M.2 SSD often uses four PCI Express lanes,

00:00:48.320 --> 00:00:53.800
even though the connector is quite a bit smaller than a regular PCIe X4-sized slot.

00:00:53.800 --> 00:00:57.360
How exactly does PCI Express X32 work then?

00:00:57.360 --> 00:01:03.480
Well, it turns out that the PCI Express standard doesn't support a single link greater than X16.

00:01:03.480 --> 00:01:09.200
The reason is that it's very difficult to actually implement links this wide in hardware.

00:01:09.200 --> 00:01:12.200
You see, when you send data down a PCI Express link,

00:01:12.200 --> 00:01:17.040
say to a graphics card, that data is striped across multiple lanes,

00:01:17.040 --> 00:01:21.280
and doing this is not trivial. When the data arrives wherever it's going,

00:01:21.280 --> 00:01:25.040
it has to be deskewed, meaning synchronized.

00:01:25.040 --> 00:01:29.800
Although PCI Express is a serial interface that doesn't require the data on each lane

00:01:29.800 --> 00:01:35.880
to arrive at exactly the same time, synchronizing the data still involves hardware overhead.

00:01:35.880 --> 00:01:40.160
And once you start getting past 16 lanes, it's just too much to keep up with.

00:01:40.160 --> 00:01:45.120
So instead of having one big slot, these higher PCI Express connections

00:01:45.120 --> 00:01:50.400
use a trick called driver binding. Essentially, this allows multiple PCIe devices

00:01:50.400 --> 00:01:53.720
to talk to each other and coordinate their traffic

00:01:53.760 --> 00:02:01.000
so they can act as one big device. So a PCIe X32 link is actually two X16 links

00:02:01.440 --> 00:02:07.440
mashed together in software, with the devices installed in two normal X16 slots.

00:02:07.440 --> 00:02:12.720
The performance overhead involved with driver binding for X32 isn't too bad.

00:02:12.720 --> 00:02:16.580
But if you were to theoretically go up to say X64,

00:02:16.580 --> 00:02:20.440
you'd likely need more hardware, as at that point you just have too many transactions

00:02:20.440 --> 00:02:24.640
for your poor system to handle. But wait a sec, why the heck would you need

00:02:24.640 --> 00:02:29.840
that much bandwidth to begin with? Now the appeal of having this many PCI Express lanes

00:02:29.840 --> 00:02:33.000
isn't so you can do something like squeeze more performance

00:02:33.000 --> 00:02:38.400
out of your graphics card. Even an RTX 4090 sees barely any improvement going from eight lanes

00:02:38.400 --> 00:02:41.400
to 16 lanes on a PCIe 4.0 slot.

00:02:41.400 --> 00:02:47.560
Instead, driver binding is used in applications where all the bandwidth you can get is appealing.

00:02:47.560 --> 00:02:51.160
You often see X32 links in certain kinds of networking cards,

00:02:51.200 --> 00:02:57.200
mostly for server and data center use. Although NVIDIA is obviously known more for GPUs and AI,

00:02:57.200 --> 00:03:00.680
they make a line of network adapters that supports both Ethernet

00:03:00.680 --> 00:03:04.640
and another high speed networking protocol called Infiniband.

00:03:04.640 --> 00:03:09.360
This product, for example, goes into a standard PCI Express X16 slot,

00:03:09.360 --> 00:03:13.520
but it comes with a second auxiliary card with the same connector.

00:03:13.520 --> 00:03:18.360
They work in tandem to provide an X32 connection for extra bandwidth in case you're using

00:03:18.360 --> 00:03:22.160
a previous revision of PCI Express that doesn't provide enough bandwidth

00:03:22.160 --> 00:03:27.360
to take full advantage of the network adapter's speed. There's also a special cable connecting them

00:03:27.360 --> 00:03:33.000
to allow them to share data while taking some of the pressure off the PCI Express bus itself.

00:03:33.000 --> 00:03:37.920
And connecting multiple machines with crazy amounts of bandwidth isn't even a rare use case.

00:03:37.920 --> 00:03:40.960
With how much growth we've seen in data-hungry cloud services

00:03:40.960 --> 00:03:44.840
for applications like AI, gaming, and ultra-high-def video,

00:03:44.840 --> 00:03:50.400
data centers are already moving towards 400 gigabit connections or even beyond.

00:03:50.400 --> 00:03:53.600
That's 400 times faster than what you'll find

00:03:53.600 --> 00:03:59.720
in most garden-variety desktop PCs, which is also part of why you simply don't need

00:03:59.720 --> 00:04:02.720
PCI Express X32 in your personal rig.

00:04:02.720 --> 00:04:07.040
Although I'm sure some of you are already thinking of ways you're gonna justify your purchase.

00:04:07.040 --> 00:04:11.280
Hey, thanks for watching this video. Like it if you liked it. Dislike it if you disliked it.

00:04:11.280 --> 00:04:14.680
Check out our other video on PCIe 6.0.

00:04:14.680 --> 00:04:19.560
Comment below with video suggestions and don't forget to subscribe and follow TechWiki,

00:04:19.560 --> 00:04:23.320
the channel that's all about doing the tech real quick.
