WEBVTT

00:00:00.000 --> 00:00:09.660
what gets FPS gpus what gets more FPS

00:00:06.299 --> 00:00:11.340
more gpus obviously everybody thinks so

00:00:09.660 --> 00:00:16.619
and manufacturers have been working on this technology since the late 90s so

00:00:13.920 --> 00:00:19.859
then what the heck happened to it I'm gonna let you know a little secret

00:00:17.760 --> 00:00:23.279
come here hide

00:00:21.720 --> 00:00:27.060
future now that might seem like a bold

00:00:25.140 --> 00:00:31.740
prediction given that NVIDIA just completely removed the multi-GPU links

00:00:29.580 --> 00:00:36.480
from their latest generation even on the premium workstation cards but there is a

00:00:34.320 --> 00:00:42.239
clear course set for better performance through multi-GPU Graphics Solutions and

00:00:40.020 --> 00:00:46.680
the most bizarre part is that it's Apple who ultimately did it to properly see

00:00:44.820 --> 00:00:53.219
where we're going sometimes we have to visit the past which is why we got our

00:00:48.960 --> 00:00:57.180
hands on this legendary piece of Kit the

00:00:53.219 --> 00:01:00.539
quantum 3D Mercury brick is an 8 GPU

00:00:57.180 --> 00:01:02.460
design from 25 years ago that has a lot

00:01:00.539 --> 00:01:07.200
more in common with your future gaming rig than you probably realize like you

00:01:05.460 --> 00:01:12.060
realize I'm about to tell you about our sponsor Zoho One unify all your business

00:01:10.080 --> 00:01:16.200
processes into one comprehensive platform and improve productivity with

00:01:14.159 --> 00:01:20.460
Zoho One you can lay all concerns about integration and compatibility to rest

00:01:18.420 --> 00:01:24.740
follow the link below for a free 30-day trial Our Story begins today with the

00:01:22.619 --> 00:01:28.080
quantum 3D obsidian 100sb-440v that Ross tregamba

00:01:26.880 --> 00:01:33.060
thoughtfully brought along when we shot our Voodoo 5 6000 video the name might

00:01:30.900 --> 00:01:38.159
not sound like much but it's a great example of something that the original

00:01:34.740 --> 00:01:41.880
3dfx Voodoo one was capable of even if

00:01:38.159 --> 00:01:44.579
it rarely did it dual GPU operation in

00:01:41.880 --> 00:01:49.079
this case on a single board now these ones were intended for desktop PCs but

00:01:47.100 --> 00:01:54.899
the primary purpose of cards like this was to drive arcade machines more on

00:01:52.259 --> 00:02:00.479
that later because first I want to talk about the brick this extremely rare

00:01:58.140 --> 00:02:07.320
piece of computing history uses a four card setup with each card equipped with

00:02:03.360 --> 00:02:08.880
two Voodoo 2 gpus that is eight gpus

00:02:07.320 --> 00:02:13.980
running all together on the same workload now I know what you're thinking

00:02:11.720 --> 00:02:19.020
that looks like it would have a tough time fitting in a PC that also needs

00:02:16.319 --> 00:02:25.140
dedicated cards for sound networking and yes 2D Graphics to run the desktop

00:02:22.020 --> 00:02:27.480
because the voodoo 2 was 3D only and

00:02:25.140 --> 00:02:31.739
you'd be right about that but again more on that later for now

00:02:29.220 --> 00:02:35.099
it's time to see this thing in action I was going to be the one to install the

00:02:33.360 --> 00:02:40.020
brick in this simulator chassis from 1999 but understandably the owner didn't

00:02:37.560 --> 00:02:44.879
trust me to do it without dropping it so um I didn't but we've got some b-roll of

00:02:42.180 --> 00:02:49.140
how it goes in it's wild I mean all four cards need to go into their own PCI

00:02:46.739 --> 00:02:54.000
slots then it's got to be wired up to a VGA out and in then it's got to be wired

00:02:52.080 --> 00:02:57.480
up to a VGA pass-through at the back of the chassis then these two cables are

00:02:56.099 --> 00:03:03.360
wired up to the back for a feature called swap lock that allows more than

00:03:00.239 --> 00:03:07.019
one of these more than twenty thousand dollar

00:03:03.360 --> 00:03:08.940
machines to be utilized at once the rest

00:03:07.019 --> 00:03:12.959
of these specs are high end even if relatively mundane we've got dual

00:03:10.800 --> 00:03:18.900
Pentium III 500 megahertz processors what appears to be an Intel first party

00:03:15.300 --> 00:03:21.060
extended motherboard a mere 425 watt

00:03:18.900 --> 00:03:25.860
power supply from the legendary PC power and cooling we've got 256 Megs of RAM a

00:03:24.000 --> 00:03:31.019
gigantic sound card that's as big as some gpus today and then a ton of

00:03:28.739 --> 00:03:35.760
cooling especially for the time dual 92 millimeter fans and then a couple

00:03:33.420 --> 00:03:40.560
more 80 mils that blow directly down onto the heatsinks over the gpus with

00:03:37.920 --> 00:03:43.459
this what looks very handmade shroud let's fire it up

00:03:43.680 --> 00:03:46.680
oh

00:03:46.799 --> 00:03:53.220
this is part of the simulator experience it needs to sound like it's taking off

00:03:51.299 --> 00:03:57.000
wait there's no vent for those 80 millimeter fans

00:03:55.379 --> 00:04:02.819
they're just yeah it's got enough spacing in there for just okay all right

00:04:00.239 --> 00:04:06.659
Windows NT let's go we have a few options for things we can run we could

00:04:04.620 --> 00:04:09.959
play games on it which obviously is the first thing I would have done if I

00:04:07.920 --> 00:04:12.959
worked at a military installation in one of these showtimes how fast does it run

00:04:11.700 --> 00:04:16.739
Half-Life but we actually also have some simulator

00:04:15.120 --> 00:04:22.139
software cool it's so weird seeing a Sierra

00:04:18.780 --> 00:04:24.540
splash screen when you fire up Half-Life

00:04:22.139 --> 00:04:33.800
you mean valve right right about okay all the geometry what the what that

00:04:29.580 --> 00:04:36.419
stop stop buddy check out the pixelation

00:04:33.800 --> 00:04:41.340
whoa the first thing I noticed was that compared to the voodoo 5 6000 the frame

00:04:38.820 --> 00:04:47.460
rate is obviously lower what I didn't think to look at

00:04:42.900 --> 00:04:52.080
is how smooth it is what the

00:04:47.460 --> 00:04:54.360
this was possible in 1999

00:04:52.080 --> 00:04:59.400
I think that's functionally 16 times anti-aliasing this looks

00:04:57.300 --> 00:05:03.000
modern cleaner than modern

00:05:01.259 --> 00:05:09.000
it looks like a much higher resolution than it is that is wild

00:05:06.479 --> 00:05:13.620
like I'm specifically going out of my way to put these straight lines at just

00:05:11.820 --> 00:05:18.060
you know ever so slight an angle so that I can see the stairs

00:05:15.300 --> 00:05:22.380
it's just not there it really makes the crappy old texture stand out doesn't it

00:05:19.979 --> 00:05:27.479
oh yeah like if this had the horsepower to run modern games with modern

00:05:24.419 --> 00:05:29.720
resolution textures it would look so

00:05:27.479 --> 00:05:29.720
good

00:05:30.500 --> 00:05:33.500
blah blah

00:05:34.139 --> 00:05:40.620
blah blah got him it's Hardware level anti-aliasing okay that's pretty wild I

00:05:39.180 --> 00:05:44.759
think I do kind of need to see it without the hardware anti-aliasing then

00:05:42.960 --> 00:05:49.800
okay so how do you uh is that something you can change while the game is running

00:05:46.320 --> 00:05:52.740
yes oh no way yeah so no wait wait what

00:05:49.800 --> 00:05:57.600
so this is a Medusa cable uh-huh so it's got a 26-pin guy here two VGAs so what

00:05:56.100 --> 00:06:03.240
we're gonna do is we're gonna bypass this by just plugging into one of the

00:06:00.300 --> 00:06:08.940
boards individually so in this case each board is a dual GPU GPU and the other

00:06:06.840 --> 00:06:14.100
three boards are just contributing to more anti-aliasing correct yeah so the

00:06:12.120 --> 00:06:18.600
bridge is basically combining the image it's basically four instances of voodoo

00:06:16.440 --> 00:06:25.440
2 SLI okay basically what it is so I'll get the same FPS correct but it'll look

00:06:22.020 --> 00:06:27.360
like crap exactly

00:06:25.440 --> 00:06:34.740
we're both going pretty much the same place slightly different stages I can confirm

00:06:31.800 --> 00:06:39.720
anti-aliasing no longer working now it still looks smoother than it

00:06:36.960 --> 00:06:44.460
would on an LCD just because CRTs have they're not really pixels they're more

00:06:42.000 --> 00:06:48.120
like dots so they're they're not squared at the edges

00:06:45.600 --> 00:06:52.860
but you can really see the difference here oh look especially on objects that

00:06:51.060 --> 00:06:57.960
are farther away where the level of detail is lower like those picture

00:06:54.300 --> 00:06:59.759
frames yeah that's a big yikes

00:06:57.960 --> 00:07:05.160
and the shimmering as I'm moving back and forth that is a big difference not

00:07:03.780 --> 00:07:09.900
the kind of difference the average home user would pay for but a difference

00:07:07.440 --> 00:07:14.180
nonetheless let's try some simulation stuff it's a non-interactive right I

00:07:12.419 --> 00:07:18.800
guess I wouldn't have whatever control cockpit they would use for something

00:07:16.259 --> 00:07:18.800
like this

00:07:20.039 --> 00:07:23.539
and that is smooth

00:07:23.639 --> 00:07:30.300
60fps in 1999

00:07:27.840 --> 00:07:35.759
that's the thing we got to remember hold on I'm not I'm not done I'm not done

00:07:32.639 --> 00:07:38.900
checking out how smooth the terrain

00:07:35.759 --> 00:07:41.520
looks like yeah we're seeing that

00:07:38.900 --> 00:07:47.039
whether that's like a frame pacing issue or whether that's just a vsync issue

00:07:45.240 --> 00:07:52.500
isn't it pretty incredible how well the AA works in this thing

00:07:48.740 --> 00:07:54.960
anti-aliasing is unreal

00:07:52.500 --> 00:07:58.440
like look at these particle effects like yeah they're not amazing or

00:07:56.940 --> 00:08:02.639
whatever but they exist

00:08:01.020 --> 00:08:08.220
there's just no other way to put it this machine is mind-blowing but what's most

00:08:05.759 --> 00:08:13.979
mind-blowing about it at least to me is that the maker of the brick Quantum 3D

00:08:10.740 --> 00:08:15.900
still exists as a producer of simulation

00:08:13.979 --> 00:08:19.620
hardware and software if you remember the Evans and Sutherland Sim Fusion card

00:08:18.240 --> 00:08:22.860
that we looked at a couple of years ago these guys probably would have been in

00:08:21.479 --> 00:08:29.580
competition with that at some point going back to the days of the brick though they sometimes sold enhanced

00:08:26.759 --> 00:08:34.979
versions of 3dfx's single GPU cards to enthusiasts with extras like additional

00:08:32.520 --> 00:08:39.000
video memory and texture units but their main business was to act as 3dfx's

00:08:37.020 --> 00:08:43.200
arcade and simulation division until they were spun off as their own company

00:08:40.500 --> 00:08:47.339
in 1997. with that history it makes a lot of sense that they had the know-how

00:08:44.700 --> 00:08:52.140
to create Exotic Solutions like these out of voodoo gpus the first known use

00:08:49.980 --> 00:08:57.240
of a Quantum 3D card was Ice Home Run Derby in 1996 and a lot of high-profile

00:08:55.320 --> 00:09:01.800
games like San Francisco Rush would go on to use them too these were not cheap

00:08:59.640 --> 00:09:06.720
solutions ending up in systems costing anywhere from over a thousand to

00:09:04.019 --> 00:09:10.440
thousands to in excess of ten thousand dollars but while it would have been

00:09:08.580 --> 00:09:14.580
Unthinkable for a home user to spend more than a few hundred dollars on a

00:09:12.360 --> 00:09:19.620
gaming device at the time at the high end Quantum 3D's devices were used for

00:09:16.740 --> 00:09:23.580
military simulation and even down at the low end they could be expected to

00:09:21.060 --> 00:09:26.940
deliver tens of thousands of dollars of revenue for their owners quarter by

00:09:25.200 --> 00:09:31.560
quarter over the lifetime of the machine and it's that lifetime that was one of

00:09:29.580 --> 00:09:36.899
their big advantages because before Quantum 3D it was quite common for

00:09:33.839 --> 00:09:40.080
arcade games to use bespoke Hardware or

00:09:36.899 --> 00:09:42.720
at best a platform like the Neo Geo by

00:09:40.080 --> 00:09:47.220
comparison Quantum 3D's platform was more or less a PC in all but form factor

00:09:45.060 --> 00:09:51.420
which made changing over to a new game even one that requires a hardware

00:09:48.959 --> 00:09:55.920
upgrade much simpler and more cost effective than it had been in the past

00:09:52.860 --> 00:09:58.740
often game data was even stored on a bog

00:09:55.920 --> 00:10:03.300
standard IDE hard drive and because the Glide API that the gpus ran on was

00:10:00.899 --> 00:10:09.240
readily available and had a similar syntax to OpenGL games were much easier

00:10:06.300 --> 00:10:13.260
to prototype and develop then as now the hardware is only part of the battle and

00:10:11.339 --> 00:10:17.160
the real work is in making it easier for developers to get the most out of it and

00:10:15.180 --> 00:10:21.540
get the most out of it they did the additional power on tap from Quantum 3D

00:10:19.500 --> 00:10:27.660
cards could be used in one of two ways either as a traditional SLI setup where

00:10:24.540 --> 00:10:30.120
the additional GPU or gpus could be used

00:10:27.660 --> 00:10:34.140
to push resolutions or frame rates higher than they would be able to

00:10:31.440 --> 00:10:39.740
otherwise or as was common in the simulator space to have multiple cards

00:10:36.600 --> 00:10:42.240
generate the same image but with rotated

00:10:39.740 --> 00:10:46.440
anti-aliasing patterns this was a way for them to Output ridiculously smooth

00:10:44.220 --> 00:10:50.399
Graphics in the absence of resolution that was high enough to prevent aliasing
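
NOTE
Editor's sketch of the idea described in the cues above: several cards render the same scene, each sampling at a different subpixel position, and the results are blended. This is only a rough illustration, assuming the combining stage is a plain per-pixel average. The sample offsets and the names render, combine and OFFSETS are hypothetical and are not taken from Quantum 3D hardware or documentation.
```python
# Illustrative only: each simulated "card" point-samples the same slanted edge
# with its own subpixel offset, then the images are averaged.
def render(width, height, offset):
    """Return 1.0 where the sample lies below the line y = 0.3*x + 1, else 0.0."""
    dx, dy = offset
    return [[1.0 if (y + dy) > 0.3 * (x + dx) + 1.0 else 0.0 for x in range(width)]
            for y in range(height)]
def combine(images):
    """Average the per-card images pixel by pixel (assumed stand-in for compositing)."""
    n = len(images)
    return [[sum(img[y][x] for img in images) / n for x in range(len(images[0][0]))]
            for y in range(len(images[0]))]
# Four hypothetical sample positions inside a pixel, laid out like a rotated grid.
OFFSETS = [(0.375, 0.125), (0.875, 0.375), (0.125, 0.625), (0.625, 0.875)]
single = render(8, 4, (0.5, 0.5))                       # one card: hard 0/1 edge
blended = combine([render(8, 4, o) for o in OFFSETS])   # four cards: softened edge
```
With one card every pixel is fully on or fully off, which shows up as stair steps. With four offset renders averaged, pixels along the edge take in-between values, which is the smoothing effect discussed here.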

00:10:48.300 --> 00:10:55.260
that is how the Mercury brick would have been used and this was actually very

00:10:52.860 --> 00:11:00.240
important because in a simulator hard pixel edges can be very distracting

00:10:57.360 --> 00:11:06.420
unlike our new circuit desk pad on LTT Store stealthy AF now of course with the

00:11:03.480 --> 00:11:10.740
demise of 3dfx came the demise of SLI as the world knew it then but that didn't

00:11:08.459 --> 00:11:14.399
mean that companies like Quantum 3D and Evans and Sutherland were left without

00:11:12.300 --> 00:11:18.839
Hardware to sell multi-GPU development was seen as essential in the Sim Space

00:11:16.680 --> 00:11:23.160
and continued in earnest even while there wasn't yet an official way to make

00:11:20.820 --> 00:11:28.140
it work for major players be they NVIDIA or ATI in fact Alienware even had a

00:11:26.459 --> 00:11:32.940
project in the works to run multiple gpus before New Age SLI and Crossfire

00:11:31.260 --> 00:11:37.260
came along when those Technologies finally hit the scene though multi-GPU

00:11:35.040 --> 00:11:40.800
came back to the masses in a big way with some very interesting

00:11:38.579 --> 00:11:45.839
implementations early Crossfire for example required a master GPU that

00:11:43.980 --> 00:11:50.220
stitched together the output from multiple gpus rendering different parts

00:11:48.120 --> 00:11:54.180
of the scene with an external dongle this method is called split frame

00:11:52.140 --> 00:11:58.500
rendering and is similar in concept to how old school SLI used to work it

00:11:56.700 --> 00:12:03.000
wasn't ideal though even aside from the hardware complexity this setup limited

00:12:00.540 --> 00:12:06.660
the maximum resolution there was also software Crossfire at the time which

00:12:04.800 --> 00:12:11.579
allowed dongle free operation directly over the PCI Express bus but PCIe wasn't

00:12:09.899 --> 00:12:16.740
as fast then as it is now and it was only really suitable for low end gpus

00:12:13.680 --> 00:12:18.839
new age SLI and CrossFireX meanwhile

00:12:16.740 --> 00:12:23.579
used Bridges between the Cards themselves which is actually quite

00:12:20.940 --> 00:12:27.959
similar to 3dfx's SLI the difference though was that rather than stitching

00:12:25.620 --> 00:12:32.160
together scan lines or quadrants of the screen the cards could communicate with

00:12:29.760 --> 00:12:37.440
each other over the bridge and divide up their work by each rendering alternating

00:12:34.380 --> 00:12:40.920
frames leading to you guessed it a

00:12:37.440 --> 00:12:44.579
return to multi-GPU cards like the GTX

00:12:40.920 --> 00:12:46.680
590 and Radeon 6990 even three-way and

00:12:44.579 --> 00:12:51.360
4-way configurations were possible using multiple Bridges these setups were

00:12:48.839 --> 00:12:55.440
gargantuan and power hungry but they represented the best performance you

00:12:52.980 --> 00:12:59.459
could get at the time and unlike in the past they were available to the general

00:12:57.180 --> 00:13:03.660
consumer now while it's obviously simpler to implement the alternate frame

00:13:01.800 --> 00:13:07.740
rendering approach unfortunately led to the micro stuttering that we became

00:13:05.279 --> 00:13:12.480
familiar with in the 2010s since each GPU might complete its frame in a

00:13:10.200 --> 00:13:16.260
slightly different amount of time the pacing of frame delivery could be all

00:13:14.399 --> 00:13:20.940
over the place resulting in impressive benchmarks but a really poor gaming

00:13:18.660 --> 00:13:24.839
experience some attempts were made to smooth out frame delivery NVIDIA for

00:13:23.100 --> 00:13:28.200
example released a high bandwidth version of the SLI bridge that allowed

00:13:26.579 --> 00:13:32.820
the cards to communicate more quickly but it just wasn't able to keep up with

00:13:30.660 --> 00:13:36.899
advancements in monitor technology both in terms of resolution and refresh rate
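
NOTE
Editor's sketch of the micro-stutter problem described a few cues earlier: with alternate frame rendering, two GPUs can deliver the same average frame rate while the gaps between frames swing between short and long. This is a toy timing model under stated assumptions (fixed render time per frame, a fixed start offset between the two GPUs). The function names and the numbers are hypothetical, not measurements.
```python
# Illustrative only: two GPUs alternate frames; GPU 1 starts gpu_offset_ms after GPU 0.
def afr_present_times(frame_ms, gpu_offset_ms, frames):
    """Return the time (in ms) at which each frame finishes and is presented."""
    times = []
    for i in range(frames):
        gpu = i % 2          # which GPU renders this frame
        k = i // 2           # how many frames that GPU has already rendered
        times.append((k + 1) * frame_ms + (gpu_offset_ms if gpu else 0.0))
    return times
def intervals(times):
    """Gaps between consecutive presented frames."""
    return [b - a for a, b in zip(times, times[1:])]
even = intervals(afr_present_times(20.0, 10.0, 8))    # GPUs perfectly staggered
uneven = intervals(afr_present_times(20.0, 2.0, 8))   # GPUs finish almost together
```
In both cases two frames arrive every 20 ms, so a benchmark would report the same average. In the uneven case the gaps alternate between 2 ms and 18 ms, which is the kind of pacing that looks like stutter on screen.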

00:13:35.279 --> 00:13:42.600
these Niche projects were taking considerable resources from both AMD and

00:13:39.779 --> 00:13:46.139
NVIDIA that along with tepid enthusiasm from game developers who really prefer

00:13:44.399 --> 00:13:51.000
to Target the lowest common denominator ultimately led to the demise of consumer

00:13:48.839 --> 00:13:55.139
multi-GPU it's not that NVIDIA gave up without a fight though they actually

00:13:52.620 --> 00:13:59.100
quietly made an attempt to resurrect SLI in the RTX 2000 generation when they

00:13:57.420 --> 00:14:03.120
explored checkerboard frame rendering unlike split frame rendering which

00:14:01.079 --> 00:14:07.920
would have both gpus contributing to each displayed frame but would often

00:14:05.279 --> 00:14:12.839
have an ugly seam down the middle CFR would divide the frame into small chunks

00:14:10.260 --> 00:14:17.339
for each card to render then composite the result into a single image in theory

00:14:15.720 --> 00:14:21.600
it could have been the best of all worlds and even avoid NVIDIA having to

00:14:19.680 --> 00:14:27.120
specifically optimize a given game for multi-GPU but they ultimately abandoned

00:14:24.899 --> 00:14:31.019
it for a few reasons first the processing overhead was significant

00:14:28.860 --> 00:14:36.480
which could increase latency and second DirectX 12's explicit multi-GPU model

00:14:34.139 --> 00:14:41.160
showed up around that time which handed multi-GPU control over to the game

00:14:38.519 --> 00:14:45.899
developer entirely offloading it from the driver which game developers

00:14:43.079 --> 00:14:51.120
obviously haven't done anything with but that doesn't mean the multi-GPU dream is

00:14:48.120 --> 00:14:53.639
dead and of all people it's Apple

00:14:51.120 --> 00:14:59.699
showing us the way you see the M1 series of socs has far more in common with our

00:14:56.940 --> 00:15:03.000
Quantum 3D brick than meets the eye have a look at these die shots and you'll see

00:15:01.139 --> 00:15:08.639
what I mean we could think of each of these GPU core blocks on the M1 Pro as

00:15:05.880 --> 00:15:13.139
similar to the GPU chips on our brick with a fast enough interconnect they can

00:15:11.040 --> 00:15:18.420
act as one to render a single output like we see in the M1 Max but then okay

00:15:16.680 --> 00:15:23.279
nothing that we're looking at here is really multi-GPU it's just a bigger

00:15:21.660 --> 00:15:27.959
single die GPU it's only when we move to the M1 Ultra

00:15:25.560 --> 00:15:35.100
that things get really interesting you see the 64 core GPU in the M1 Ultra acts

00:15:32.399 --> 00:15:40.320
for all intents and purposes like a single die solution but Apple disclosed

00:15:38.040 --> 00:15:44.220
in their launch event that it is actually a dual die design that uses a

00:15:42.540 --> 00:15:52.260
game-changing packaging solution that they call UltraFusion so it's two gpus

00:15:47.420 --> 00:15:54.420
acting as one it's SLI but with

00:15:52.260 --> 00:15:59.399
absolutely none of the downsides how on Earth did they do it well you see the

00:15:56.760 --> 00:16:04.860
problem has always been the interconnect speed see traditionally once you go off

00:16:02.519 --> 00:16:09.120
a single die your bandwidth drops dramatically that's why NVIDIA couldn't

00:16:06.899 --> 00:16:14.699
achieve vram coherency across multiple cards via their SLI Bridges the links

00:16:11.760 --> 00:16:18.839
were just too slow it's also why tricks like split frame or alternate frame

00:16:16.980 --> 00:16:23.760
rendering have always been necessary they couldn't just pool together the GPU

00:16:21.839 --> 00:16:28.440
resources because the left hand would have no idea what the right hand was

00:16:25.260 --> 00:16:31.560
doing in real time but Apple did it and

00:16:28.440 --> 00:16:33.899
both AMD and NVIDIA have also talked

00:16:31.560 --> 00:16:39.120
publicly about the possibility of future GPU designs that use multiple chiplets

00:16:36.839 --> 00:16:44.339
with high-speed interconnects in order to economically scale performance it's

00:16:41.820 --> 00:16:50.940
coming back baby but how does this factor in well the thing is back in 1998

00:16:47.759 --> 00:16:53.940
the bandwidth requirements for inter-GPU

00:16:50.940 --> 00:16:56.279
communication weren't as high so your

00:16:53.940 --> 00:17:00.240
UltraFusion interface could look a lot more like this

00:16:58.560 --> 00:17:03.660
it's really funny isn't it how the more things change the more they stay the

00:17:01.860 --> 00:17:08.760
same just like I always do the same segue to our sponsor Vessi it's

00:17:06.839 --> 00:17:12.000
springtime which means rain can still show up and ruin your day like an

00:17:10.439 --> 00:17:16.100
annoying neighbor but that's fine because hello we have Vessi shoes

00:17:13.860 --> 00:17:21.360
Vessi claims their shoes are 100% waterproof so we can go step into as

00:17:18.600 --> 00:17:25.260
many puddles as we want their Dyma-tex technology keeps their shoes light and

00:17:23.280 --> 00:17:29.220
breathable while still being stretchy and comfortable also all of their

00:17:27.240 --> 00:17:33.059
products are vegan and cruelty free maybe you're planning a trip to get away

00:17:31.260 --> 00:17:37.500
from some of this miserable weather don't forget to pack a pair of Vessis

00:17:35.160 --> 00:17:42.299
their stretchiness means they can easily pack into your luggage or carry-on rain

00:17:39.960 --> 00:17:48.240
or shine here or there it's always a great occasion to wear a pair of Vessis

00:17:43.980 --> 00:17:50.400
mon frere go to vessi.com/linustechtips

00:17:48.240 --> 00:17:54.419
and get 15 off your purchase with code Linus Tech tips if you guys enjoyed this

00:17:52.919 --> 00:18:00.179
video go check out our coverage of the Evans and Sutherland Sim Fusion 6500

00:17:57.120 --> 00:18:04.100
Cube for more multi-GPU Madness that

00:18:00.179 --> 00:18:04.100
thing is really cool
