WEBVTT

00:00:00.120 --> 00:00:09.480
that is fast this GPU has four NVMe SSDs on it it's a

00:00:07.379 --> 00:00:14.340
trick that AMD pulled to make the PCI Express slot go a little further than it

00:00:11.519 --> 00:00:20.279
usually does meet the Radeon Pro SSG the world's first graphics card with storage

00:00:17.340 --> 00:00:26.400
but this came out five years ago and it's still the only graphics card with

00:00:23.160 --> 00:00:29.039
storage so how the heck does it work and

00:00:26.400 --> 00:00:33.420
why didn't this take off first I'll take off to talk about our sponsor

00:00:30.900 --> 00:00:38.460
SmartDeploy zero touch zero headache PC management for IT deploy Windows images

00:00:35.520 --> 00:00:41.760
apps and drivers from the cloud no VPN required get your free subscription

00:00:40.020 --> 00:00:45.540
worth over five hundred dollars at smartdeploy.com/Linus

00:00:53.160 --> 00:01:00.539
on the surface the Radeon Pro SSG is nothing special basically a Vega 64 with

00:00:58.140 --> 00:01:04.339
double the HBM2 memory but if we take off the cooler we can get a look at its

00:01:02.520 --> 00:01:10.680
real party trick look at that GPU business in the front

00:01:07.260 --> 00:01:14.460
SSD party in the back these guys are

00:01:10.680 --> 00:01:17.159
literally 2280 M.2 NVMe SSDs Samsung OEM

00:01:14.460 --> 00:01:21.360
each is 512 gigs and each shows up separately in Windows although the

00:01:19.260 --> 00:01:25.560
default setup is a Windows Dynamic disk stripe which is effectively a two

00:01:23.159 --> 00:01:30.479
terabyte RAID 0 so it's super cool but uh what will we use it for

00:01:27.860 --> 00:01:35.520
sequential and random writes are far beyond even our PCI Express Gen 4 SSD

00:01:33.240 --> 00:01:40.259
thanks to the 16 lanes of PCI Express bandwidth available to the storage even

00:01:37.560 --> 00:01:46.020
though it can only use Gen 3 speeds but only sequential reads scale this way and

00:01:43.439 --> 00:01:50.820
random reads are far slower than our Gen 4 SSD my best guess for why is that

00:01:49.079 --> 00:01:55.500
unless there's something wrong with the SSDs on the card read operations might

00:01:53.040 --> 00:01:58.740
simply be redirecting through the GPU itself which is increasing request

00:01:57.180 --> 00:02:03.540
latency and therefore reducing throughput for random but not for

00:02:01.020 --> 00:02:07.439
sequential reads testing latency with performance tests lends credibility to

00:02:05.640 --> 00:02:11.580
that assumption but hey since the data is going through the GPU anyway this

00:02:09.899 --> 00:02:17.280
kind of reminds me of Microsoft's DirectStorage API which enables an NVMe SSD to

00:02:14.700 --> 00:02:21.599
load data directly to the GPU instead of going through the CPU first now will we

00:02:19.560 --> 00:02:26.879
see a performance improvement then if we run games straight off of the SSG let's

00:02:24.360 --> 00:02:31.800
give it a shot I've got F1 2021 installed on both the internal SSD and

00:02:30.000 --> 00:02:37.500
the SSG itself so we'll see now if there's any

00:02:34.620 --> 00:02:41.160
improvement whatsoever give me a sec I gotta we got an Ethernet cable

00:02:41.160 --> 00:02:48.959
as it turns out data loaded from the SSG is still processed by the CPU unless

00:02:46.019 --> 00:02:54.239
it's using AMD's SSG API so we're in fact losing some performance due to the

00:02:51.239 --> 00:02:56.160
SSD array and GPU sharing bandwidth what

00:02:54.239 --> 00:03:01.800
a missed opportunity as for the API itself well maybe we can test that out

00:02:59.040 --> 00:03:06.720
when the SSG first launched AMD touted its ability to work with massive data

00:03:03.900 --> 00:03:11.280
sets and well yeah two terabytes is massive but unfortunately for AMD

00:03:08.720 --> 00:03:15.239
massive data sets that benefit from a direct connection to the GPU like this

00:03:13.260 --> 00:03:19.379
are not super common the example they came up with was running 8K raw footage

00:03:17.280 --> 00:03:22.980
in real time off of the SSG's internal storage and they helpfully created a

00:03:21.120 --> 00:03:27.420
sample video player to handle exactly that it requires processing the video

00:03:25.019 --> 00:03:32.760
files with FFmpeg first to decompress the raw video frames so we'll do that

00:03:29.159 --> 00:03:38.180
now open the terminal here I've already

00:03:32.760 --> 00:03:41.159
converted the files and yes

00:03:38.180 --> 00:03:45.959
what okay let's try directly off of the SSG

00:03:43.260 --> 00:03:48.959
hmm I've tried everything I can think of at this point from running the player

00:03:47.280 --> 00:03:54.060
itself off of the SSG storage to compiling it myself same thing always

00:03:51.900 --> 00:03:57.959
happens looks like something changed between the time they wrote it and today

00:03:55.799 --> 00:04:02.519
and this whole thing is just broken now seeing as the original SDK was already

00:04:00.180 --> 00:04:07.080
tough to find let alone find an updated version you can guess it wasn't too

00:04:04.799 --> 00:04:11.939
popular unlike our brand new cable ties at lttstore.com check this out we've got

00:04:09.540 --> 00:04:16.440
15 new colors there is one ray of hope for us Adobe

00:04:14.939 --> 00:04:21.120
added support for it in Premiere Pro back in 2018 and as far as I can tell

00:04:18.600 --> 00:04:25.560
they're the only ones so let's run PugetBench's 8K RED test and see what we get

00:04:23.340 --> 00:04:30.120
alright so this is straight off of the SSD

00:04:27.060 --> 00:04:32.759
I'm just gonna run the live playback 8K

00:04:30.120 --> 00:04:38.540
and this will take a few minutes oh Adobe why are you so slow

00:04:35.580 --> 00:04:45.240
success look at that playback is nearly 69 times nicer than running from

00:04:42.300 --> 00:04:50.820
our Gen 4 SSD and all I did was copy files to the SSG mind you it's still not

00:04:48.960 --> 00:04:56.639
full speed we're looking at less than 10 FPS at full 8K res and less than seven

00:04:53.580 --> 00:04:58.860
at half res but that's way better than

00:04:56.639 --> 00:05:02.880
the less than one FPS we got before it freaking works but why did AMD think to

00:05:01.740 --> 00:05:08.520
do something like this in the first place the Radeon Pro SSG is a solution

00:05:05.639 --> 00:05:14.340
to a uniquely 2017 problem NVIDIA had the Quadro P6000 a GPU with 24 gigs of

00:05:11.780 --> 00:05:18.660
GDDR5X memory AMD wanted to make a card with even more memory to compete but

00:05:16.259 --> 00:05:23.100
DRAM prices were going through the roof at the time and VRAM prices weren't far

00:05:20.580 --> 00:05:27.180
behind their then-new Vega 64 graphics cards launched in August bringing with

00:05:24.840 --> 00:05:31.280
them eight gigabytes of HBM2 memory so you might think well how about just

00:05:28.620 --> 00:05:37.500
expanding that well that's tough because HBM2 is connected via a ridiculously wide

00:05:34.400 --> 00:05:39.780
2048-bit memory bus so not only is it

00:05:37.500 --> 00:05:43.860
very expensive per gigabyte it's also very difficult to scale it without

00:05:41.520 --> 00:05:48.660
making the silicon itself much larger to accommodate the extra direct connections

00:05:45.539 --> 00:05:51.120
to the GPU enter solid state Graphics

00:05:48.660 --> 00:05:56.039
NAND flash the stuff that SSDs are made of may not be nearly as fast but it's

00:05:53.699 --> 00:06:01.139
pretty affordable compared to VRAM with Radeon SSG AMD now had a whole two

00:05:59.039 --> 00:06:07.699
terabytes of memory to work with your move NVIDIA or actually

00:06:04.380 --> 00:06:07.699
our move

00:06:08.780 --> 00:06:16.139
and they even have standard Phillips screws although they're not ferrous I

00:06:13.919 --> 00:06:21.900
can't pick it up with a magnetic tip oh well yeah it just comes right out

00:06:18.660 --> 00:06:23.580
and a new one goes right on in now this

00:06:21.900 --> 00:06:27.060
is tricky because it doesn't have magnetic

00:06:25.199 --> 00:06:31.800
properties but then it goes all right one down

00:06:29.400 --> 00:06:35.419
three to go there we're done

00:06:35.520 --> 00:06:40.860
this is hilarious you run your system without any

00:06:39.479 --> 00:06:46.139
additional storage as long as you're okay with the overhead of sharing bandwidth with a GPU that's not its

00:06:44.699 --> 00:06:50.880
intended purpose of course putting the ssds so close to the GPU means that like

00:06:48.360 --> 00:06:55.139
with Microsoft's DirectStorage API it doesn't have to go through the CPU or

00:06:52.620 --> 00:06:59.699
really any other part of the system if it's loading via the API straight to the

00:06:57.539 --> 00:07:04.319
GPU get subscribed by the way because the minute Microsoft gives us something

00:07:01.860 --> 00:07:07.680
to test DirectStorage with we are on it this arrangement tremendously speeds up

00:07:06.300 --> 00:07:11.639
access times which is a crucial component to anything a GPU does which

00:07:09.539 --> 00:07:15.539
is why it's such a hot topic today even the High Bandwidth Cache Controller AMD

00:07:13.500 --> 00:07:20.340
included on the Vega GPUs that allows them to extend VRAM with main system

00:07:18.240 --> 00:07:25.919
memory is slow since it has to go through the CPU first as we've seen the

00:07:23.039 --> 00:07:30.539
SSG concept is sound but only in a world where that much memory is necessary only

00:07:28.620 --> 00:07:36.000
in a world where people are willing to use the API and most unfortunately for

00:07:33.120 --> 00:07:41.340
AMD in 2017 only in a world where people work in 8K raw and are not using

00:07:39.180 --> 00:07:45.840
lower resolution proxy clips especially considering it was rocking Vega 64 specs

00:07:43.680 --> 00:07:49.800
alongside an asking price of seven thousand dollars when it was new

00:07:47.400 --> 00:07:53.819
although it quickly fell to $4,600 a few months after release because it's such a

00:07:51.900 --> 00:07:57.599
niche within a niche within a niche use case I haven't been able to find any

00:07:55.919 --> 00:08:02.039
examples of it being used in the real world but let's say demand for this

00:07:59.580 --> 00:08:06.479
feature does come back well the promise of DirectStorage doing basically the

00:08:04.259 --> 00:08:10.199
same thing makes this arrangement costly and inefficient I mean there's no reason

00:08:08.220 --> 00:08:14.039
DaVinci Resolve or Adobe Creative Cloud couldn't use DirectStorage if games can

00:08:12.300 --> 00:08:18.960
to its credit you're getting a lot of M.2 slots and a GPU all in one package

00:08:16.560 --> 00:08:23.220
but the bridge chip and controller hardware do not come cheap and you're

00:08:21.300 --> 00:08:28.259
giving up bandwidth whenever the SSD is accessed by something that's not using

00:08:24.660 --> 00:08:29.699
the API the Radeon Pro SSG is truly a

00:08:28.259 --> 00:08:34.800
product of its time and I doubt we'll see such a beast again

00:08:32.279 --> 00:08:37.979
anytime soon but I know we'll see more of our sponsor Vessi thanks to Vessi

00:08:36.659 --> 00:08:43.919
Footwear for sponsoring today's video Vessi Footwear is known for being lightweight easy to pack comfortable and

00:08:41.880 --> 00:08:47.399
most importantly waterproof designed to keep you moving Vessi released their new

00:08:45.779 --> 00:08:51.360
Everyday Move shoes with enhanced breathability and added support their

00:08:49.320 --> 00:08:54.720
style is perfect for the adventurous or those looking for something sportier

00:08:52.920 --> 00:08:58.800
featuring a pull tab to take them off and put them back on with ease vegan

00:08:56.700 --> 00:09:02.700
suede lace cages extra midsole cushioning and the same waterproof

00:09:00.720 --> 00:09:05.640
Dyma-tex technology you want to wear them everywhere the dual climate knit

00:09:04.320 --> 00:09:09.480
material keeps your feet warm during winter and cool during the summer stay

00:09:07.560 --> 00:09:14.339
dry and get your Vessi shoes today at vessi.com/LinusTechTips and get 25

00:09:11.940 --> 00:09:18.060
off using code LinusTechTips at checkout thanks for watching guys go

00:09:16.200 --> 00:09:22.380
check out our recent video on the NVIDIA A100 for another example of a high-end

00:09:20.100 --> 00:09:26.300
piece of kit that has questionable practicality it's a good one
