WEBVTT

00:00:00.440 --> 00:00:06.200
behind me right now is the biggest

00:00:03.640 --> 00:00:11.160
supercomputer in the country it'll be serving researchers across Canada

00:00:09.240 --> 00:00:16.440
studying the human genome in bioinformatics particle physics

00:00:13.320 --> 00:00:19.039
materials research even humanities research

00:00:16.440 --> 00:00:23.480
it's called Cedar it cost the federal government through the Canadian

00:00:20.439 --> 00:00:26.560
Foundation for Innovation over $16

00:00:23.480 --> 00:00:28.960
million and we get to be the first to

00:00:26.560 --> 00:00:31.960
unbox this beast

00:00:40.320 --> 00:00:49.399
Savage Jerky is created without the use of nitrates or preservatives use offer

00:00:46.000 --> 00:00:50.600
code LTT to save 10% at the link in the

00:00:49.399 --> 00:00:59.199
video description so Cedar is a big data

00:00:54.760 --> 00:01:01.640
machine it takes up a quarter of the 5,000

00:00:59.199 --> 00:01:07.360
sq ft data center it occupies meaning actually that there's room for it to

00:01:03.480 --> 00:01:11.080
grow but right now it has

00:01:07.360 --> 00:01:18.080
27,000 Intel Xeon processing cores

00:01:11.080 --> 00:01:24.000
190 terabytes of RAM 64 petabytes of storage

00:01:18.080 --> 00:01:26.159
584 GPUs and a total power draw of

00:01:24.000 --> 00:01:32.399
560,000 watts though with that said its

00:01:29.360 --> 00:01:37.200
efficiency is a shocking

00:01:32.399 --> 00:01:39.280
1.07 on the PUE scale where one would be

00:01:37.200 --> 00:01:44.520
perfect and a typical data center would be 1.5 to 2 we'll get into how
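
NOTE
Editor's sketch of the PUE arithmetic mentioned here. PUE is total facility power divided by IT equipment power, so a PUE of 1.0 means zero overhead. The video does not say whether the 560,000 W figure is IT load or total draw; this sketch assumes it is IT load, and the function name is made up for illustration.
```python
# PUE = total facility power / IT equipment power.
# Assumption: the 560,000 W quoted in the video is treated as IT load.
def overhead_watts(it_power_watts, pue):
    """Power spent on everything other than the IT equipment itself."""
    return it_power_watts * (pue - 1.0)
IT_POWER_WATTS = 560_000
for pue in (1.07, 1.5, 2.0):
    kilowatts = overhead_watts(IT_POWER_WATTS, pue) / 1000
    print(f"PUE {pue}: roughly {kilowatts:.0f} kW of overhead")
```
Under that assumption the overhead at 1.07 is a small fraction of what a facility at 1.5 or 2 would spend on the same load.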

00:01:42.280 --> 00:01:49.439
they did that a little bit later though so our tour starts right here behind me

00:01:47.079 --> 00:01:55.920
are what they call the high availability racks so everything back there has dual

00:01:53.520 --> 00:02:02.079
power supplies for redundancy with a battery backup for that and a diesel

00:01:58.600 --> 00:02:04.680
generator to back up that everything back

00:02:02.079 --> 00:02:11.480
here is mission critical things like networking login servers and management

00:02:07.880 --> 00:02:14.680
servers are all here and this is also

00:02:11.480 --> 00:02:17.160
where you'll find the bulk of Cedar's

00:02:14.680 --> 00:02:22.959
storage let's get in for a closer look at Cedar's connection to the outside

00:02:19.440 --> 00:02:27.680
world this networking appliance from

00:02:22.959 --> 00:02:32.599
Huawei has a street price of around a

00:02:27.680 --> 00:02:35.480
million dollars and right here this is

00:02:32.599 --> 00:02:41.800
where it gets really bananas these guys are Cedar's

00:02:37.400 --> 00:02:44.599
dual 100 gigabit connections through

00:02:41.800 --> 00:02:52.080
Vancouver and then as if that wasn't enough these orange ones here are dual

00:02:48.680 --> 00:02:54.680
40 gigabit connections through nearby Surrey

00:02:52.080 --> 00:02:58.400
just in case somebody puts a backhoe through one of these other fiber lines

00:02:56.920 --> 00:03:00.879
and they would have otherwise lost their internet connectivity I mean that's

00:03:00.080 --> 00:03:08.000
their backup backup but Ethernet is not really the

00:03:05.920 --> 00:03:17.799
way you want to connect high performance computing nodes this right here is

00:03:12.080 --> 00:03:21.560
the true networking heart of Cedar these

00:03:17.799 --> 00:03:23.319
are 48-port Omni-Path switches and

00:03:21.560 --> 00:03:32.519
they're configured in what's called an island topology so the island is in

00:03:26.959 --> 00:03:34.760
almost all cases 32 compute nodes each

00:03:32.519 --> 00:03:42.640
of those compute nodes is connected to one of 32 ports on one of these switches in its

00:03:38.680 --> 00:03:46.720
rack then the remaining 16 ports come

00:03:42.640 --> 00:03:51.239
back to here that means that every

00:03:46.720 --> 00:03:54.519
island gets a dedicated line to each of

00:03:51.239 --> 00:03:57.159
the core switches giving you failover

00:03:54.519 --> 00:04:04.280
and massive bandwidth each one of these fiber links

00:04:00.640 --> 00:04:07.640
right here is capable of 100 gigabits per

00:04:04.280 --> 00:04:11.920
second so even though between islands we

00:04:07.640 --> 00:04:13.560
are let's say bottlenecked by our 16

00:04:11.920 --> 00:04:19.720
connections so that's only half the total theoretical speed within an island

00:04:15.959 --> 00:04:22.600
we're still talking over 100 gigabytes per

00:04:19.720 --> 00:04:29.280
second so it's not really an issue okay now let's move on to SFU and
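
NOTE
Editor's sketch of the island uplink arithmetic described above, using the figures from the video (48-port switches, 32 nodes per island, 100 gigabit links). It assumes 8 bits per byte and ignores protocol and encoding overhead, so the result is a theoretical ceiling rather than a measured number.
```python
# Island topology arithmetic; figures come from the video, overheads are ignored.
PORTS_PER_SWITCH = 48
NODES_PER_ISLAND = 32
LINK_GBIT_PER_S = 100
uplinks = PORTS_PER_SWITCH - NODES_PER_ISLAND           # ports left for the core
oversubscription = NODES_PER_ISLAND / uplinks            # node links per uplink
uplink_gbyte_per_s = uplinks * LINK_GBIT_PER_S / 8       # gigabits to gigabytes
print(f"{uplinks} uplinks, {oversubscription:.0f}:1 oversubscription")
print(f"about {uplink_gbyte_per_s:.0f} GB/s between an island and the core")
```
The 2:1 ratio is the "half the total theoretical speed" from the narration, and 16 links at 100 gigabits each is consistent with "over 100 gigabytes per second".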

00:04:26.360 --> 00:04:35.160
Compute Canada's version of Petabyte Project spoiler alert theirs is better

00:04:32.240 --> 00:04:43.080
in every conceivable way so in the five cabinets behind me we've got Cedar's 50

00:04:39.199 --> 00:04:46.240
petabyte IBM tape library system they

00:04:43.080 --> 00:04:50.560
have a 40 gigabit link to the rest of the

00:04:46.240 --> 00:04:53.320
supercomputer and each of the 5,000 10

00:04:50.560 --> 00:05:00.039
terabyte magnetic tapes inside can be grabbed out of storage moved with like a

00:04:57.039 --> 00:05:01.960
robotic arm into a reader and the data

00:05:00.039 --> 00:05:05.680
can then be accessed when needed and this is done
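
NOTE
Editor's sketch checking that the tape count and per-tape capacity quoted here add up to the 50 petabyte figure. It assumes decimal storage units (1 PB = 1,000 TB), which is how tape and drive capacities are usually marketed.
```python
# Tape library capacity check; assumes decimal units (1 PB = 1,000 TB).
TAPES = 5_000
TB_PER_TAPE = 10
total_pb = TAPES * TB_PER_TAPE / 1_000
print(f"{TAPES} tapes x {TB_PER_TAPE} TB = {total_pb:.0f} PB")
```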

00:05:03.639 --> 00:05:12.000
automatically cool right okay yeah but due to the slowness

00:05:08.840 --> 00:05:15.199
of that swapping process this is still

00:05:12.000 --> 00:05:16.919
what we would consider to be cold or

00:05:15.199 --> 00:05:24.039
archival storage next up here is general purpose

00:05:20.440 --> 00:05:27.080
storage land where any data that's being

00:05:24.039 --> 00:05:30.720
used for any current research project

00:05:27.080 --> 00:05:35.360
would be housed so here here they're

00:05:30.720 --> 00:05:37.080
using off-the-shelf 5U racks each of

00:05:35.360 --> 00:05:43.039
which contains let's see if we can crack one open here a total of two kind of

00:05:40.400 --> 00:05:53.120
trays here and 84 8-terabyte uh let's have a look here

00:05:48.680 --> 00:05:55.639
Enterprise capacity SAS drives from

00:05:53.120 --> 00:06:02.120
Seagate but there's actually more to this system than meets the eye every

00:05:58.720 --> 00:06:05.639
four of these storage nodes requires

00:06:02.120 --> 00:06:08.039
two nodes of what they're calling object

00:06:05.639 --> 00:06:15.280
storage servers these act as a high-speed cache with their SAS 10,000

00:06:11.080 --> 00:06:18.400
RPM drives as well as kind of like a

00:06:15.280 --> 00:06:21.919
traffic cop for everything behind

00:06:18.400 --> 00:06:24.840
it so every single read or write to

00:06:21.919 --> 00:06:30.759
these hard drives actually goes through these nodes so right now general storage

00:06:28.360 --> 00:06:38.720
land is 10 petabytes but in the near to mid future it will be expanding to 20

00:06:35.080 --> 00:06:41.479
now that DIY approach to storage is

00:06:38.720 --> 00:06:45.680
great for scaling up at a low cost but when it comes to Performance they went

00:06:43.440 --> 00:06:54.400
for this DataDirect Networks storage appliance because it has got the real

00:06:50.000 --> 00:06:58.759
goods now in the rack next to this brain

00:06:54.400 --> 00:07:01.479
you'll find a mere 4 petabytes of actual

00:06:58.759 --> 00:07:06.000
storage due to its higher cost but thanks to its proprietary hardware

00:07:03.400 --> 00:07:13.919
custom software and solid state burst buffers this thing can handle up to 40

00:07:10.319 --> 00:07:16.960
gigabytes per second of sustained throughput

00:07:13.919 --> 00:07:19.759
making it perfect for data intensive

00:07:16.960 --> 00:07:26.960
applications that rely on humongous data sets now let's get into

00:07:23.879 --> 00:07:30.199
compute there are about half a dozen

00:07:26.960 --> 00:07:32.800
different types of compute nodes all

00:07:30.199 --> 00:07:38.080
connected to the same high-speed Omni-Path network backbone that are

00:07:34.879 --> 00:07:41.720
optimized for different types of

00:07:38.080 --> 00:07:45.240
research we'll begin with the base

00:07:41.720 --> 00:07:49.479
compute node there are a whopping

00:07:45.240 --> 00:07:52.159
576 of these each of these is a computer

00:07:49.479 --> 00:07:59.840
so there's actually four in a single 2U shell each of which contains two

00:07:55.039 --> 00:08:02.800
Xeon E5-2683 16-core processors 128 gigs

00:07:59.840 --> 00:08:10.800
of RAM and about a terabyte of RAID 0 SSD storage for scratch so each rack

00:08:06.879 --> 00:08:14.599
here contains two islands so that's a

00:08:10.800 --> 00:08:15.840
total of 64 compute nodes giving us a

00:08:14.599 --> 00:08:23.840
whopping 2,048 compute units per rack so these
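
NOTE
Editor's sketch of where the per-rack figure comes from, using only numbers quoted in the video (32 nodes per island, two islands per rack, two 16-core processors per node, 576 base nodes). Reading the 2,048 "compute units" as CPU cores is an assumption, though the arithmetic matches exactly.
```python
# Per-rack core count for the base compute nodes; all inputs are from the video.
NODES_PER_ISLAND = 32
ISLANDS_PER_RACK = 2
SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 16
BASE_NODES = 576
nodes_per_rack = NODES_PER_ISLAND * ISLANDS_PER_RACK
cores_per_node = SOCKETS_PER_NODE * CORES_PER_SOCKET
cores_per_rack = nodes_per_rack * cores_per_node
base_node_cores = BASE_NODES * cores_per_node
print(f"{nodes_per_rack} nodes x {cores_per_node} cores = {cores_per_rack} cores per rack")
print(f"{BASE_NODES} base nodes hold {base_node_cores} cores in total")
```
The base nodes alone account for most of the roughly 27,000 cores quoted at the start of the tour.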

00:08:20.400 --> 00:08:26.520
nodes are the basic workhorse of Cedar

00:08:23.840 --> 00:08:32.000
handling everything from Monte Carlo simulations for materials science to simulating

00:08:29.639 --> 00:08:38.719
dynamic processes in nature with a high degree of randomness like snowfall

00:08:34.880 --> 00:08:42.000
or rainfall they would also be used in

00:08:38.719 --> 00:08:44.480
any highly parallelized workload because

00:08:42.000 --> 00:08:49.480
if you need you know 10,000 CPU cores for one job there

00:08:47.399 --> 00:08:55.800
aren't enough cores in any other class of server to handle that kind of

00:08:52.839 --> 00:09:01.880
load moving right on up we've got the big memory nodes there are 48 of these

00:08:59.079 --> 00:09:09.600
and half of them are just like the basic nodes except with 512 gigs of RAM while

00:09:06.240 --> 00:09:15.200
the other half of them these puppies

00:09:09.600 --> 00:09:18.399
have 1 and a half terabytes of system

00:09:15.200 --> 00:09:21.399
memory these ones take up twice as much

00:09:18.399 --> 00:09:24.680
rack space though each of these 1Us

00:09:21.399 --> 00:09:27.079
is a single dual socket system because

00:09:24.680 --> 00:09:35.320
you know what there just wasn't enough gosh darn room for all 24 64-gig memory

00:09:32.120 --> 00:09:37.560
modules that are required for that much

00:09:35.320 --> 00:09:43.959
RAM first world problem
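
NOTE
Editor's sketch confirming that 24 modules of 64 gigs each give the one and a half terabytes quoted for these nodes. It assumes binary units for memory (1 TB = 1,024 GB), which is the usual convention for RAM.
```python
# Memory per big-memory node; assumes binary units (1 TB = 1,024 GB).
DIMMS_PER_NODE = 24
GB_PER_DIMM = 64
total_gb = DIMMS_PER_NODE * GB_PER_DIMM
total_tb = total_gb / 1024
print(f"{DIMMS_PER_NODE} x {GB_PER_DIMM} GB = {total_gb} GB = {total_tb} TB")
```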

00:09:39.760 --> 00:09:47.480
yes these guys are really special these

00:09:43.959 --> 00:09:50.320
are the aptly named 3 terabyte nodes there

00:09:47.480 --> 00:09:58.959
are only a handful of them but these are quad socket machines with Xeon

00:09:53.600 --> 00:10:01.640
4809 v4s four of them but wait a tick

00:09:58.959 --> 00:10:06.920
those are only eight core processors these don't even have more processing

00:10:04.560 --> 00:10:13.640
cores than those little tiny ones that take up half a U what's the deal here

00:10:10.600 --> 00:10:16.160
well it turns out that some

00:10:13.640 --> 00:10:22.200
bioinformatics workloads like genome sequencing don't actually scale very

00:10:19.000 --> 00:10:24.720
well with more processors they just need

00:10:22.200 --> 00:10:30.519
massive amounts of memory to hold the data sets that they need to work on so

00:10:28.360 --> 00:10:37.000
while the team here probably isn't super stoked on using up 4Us just so they

00:10:34.399 --> 00:10:41.720
can stuff more memory into the system until Optane reaches a higher

00:10:39.639 --> 00:10:47.440
level of maturity this is the only choice they have now finally we're

00:10:44.720 --> 00:10:56.800
getting to my favorite nodes the most expensive nodes these are the GPU nodes

00:10:53.720 --> 00:10:59.320
and while they're actually quite similar

00:10:56.800 --> 00:11:04.560
to the base nodes with respect to their CPU and RAM configurations what's got

00:11:02.959 --> 00:11:09.560
the researchers in the fields of molecular dynamics AI and machine

00:11:07.360 --> 00:11:16.519
learning all amped up about these are the quad NVIDIA Tesla P100 graphics

00:11:13.399 --> 00:11:19.120
cards that they have crammed into each one

00:11:16.519 --> 00:11:24.240
I mean seriously with 1,500 watts of power being consumed by

00:11:22.480 --> 00:11:29.920
each one of these it is an engineering marvel that they've crammed

00:11:27.040 --> 00:11:34.680
enough power and cooling to make this whole thing work so

00:11:32.680 --> 00:11:40.680
actually now that you think about it how exactly did they do that so the Keen ey

00:11:38.360 --> 00:11:46.440
among you might have already caught a couple of hints earlier in this video

00:11:42.680 --> 00:11:49.959
but the secret lies in the rear doors on

00:11:46.440 --> 00:11:54.680
the server racks look how thick this is

00:11:49.959 --> 00:11:58.200
yes my friends this entire door is a

00:11:54.680 --> 00:12:00.800
gigantic heat exchanger so their servers

00:11:58.200 --> 00:12:04.880
don't actually have have water blocks that would be more expensive what

00:12:03.000 --> 00:12:08.920
they're doing is they've just got the fronts of the racks all sealed up so

00:12:06.839 --> 00:12:14.519
there's no backdraft pressure and they've got normal air cooled servers

00:12:12.200 --> 00:12:19.199
that pass the air from the front where they just draw in room temperature air

00:12:16.360 --> 00:12:24.560
in here and it comes out hot like 30-plus degrees and push it through this heat

00:12:21.800 --> 00:12:29.959
exchanger where it is actually cool to my skin that's how efficient these are

00:12:28.279 --> 00:12:35.440
and that cooling system is massively expandable too you can

00:12:32.240 --> 00:12:37.920
actually see above me I am standing

00:12:35.440 --> 00:12:46.120
where we've got a blue and green cooling pipe connected to a whole bunch of quick

00:12:42.040 --> 00:12:49.079
release fittings ready to add more racks

00:12:46.120 --> 00:12:52.360
right here but to see what they actually do with the heat we're actually going to

00:12:50.920 --> 00:13:01.880
have to go upstairs where we'll find the final and

00:12:57.279 --> 00:13:05.639
perhaps the uh coolest stop in our tour

00:13:01.880 --> 00:13:10.160
here this is the mechanical room where

00:13:05.639 --> 00:13:13.480
the pumps and these freaking pipes take

00:13:10.160 --> 00:13:17.120
all the water from downstairs and dump

00:13:13.480 --> 00:13:20.120
it into three cooling towers outside the

00:13:17.120 --> 00:13:21.880
building now right now the weather is

00:13:20.120 --> 00:13:28.880
favorable to cooling the ambient temperature is quite low so it's just

00:13:25.040 --> 00:13:31.480
operating as gigantic radiators but get

00:13:28.880 --> 00:13:37.440
this when the conditions become less favorable in the summer they kick things

00:13:34.399 --> 00:13:41.399
into high gear with an automated system

00:13:37.440 --> 00:13:44.320
that sprays water onto the fins of the

00:13:41.399 --> 00:13:48.760
radiators in the cooling towers and if you watched our bong cooling video which

00:13:47.079 --> 00:13:53.920
you can check out right here you'll be familiar with this concept already but

00:13:51.079 --> 00:13:59.639
this is called evaporative cooling and by these means even in ambient

00:13:55.959 --> 00:14:04.279
temperatures up to 30° C

00:13:59.639 --> 00:14:07.560
they can achieve the 17° coolant levels

00:14:04.279 --> 00:14:10.880
that they need to without employing the

00:14:07.560 --> 00:14:15.759
massive chiller unit that they have over

00:14:10.880 --> 00:14:15.759
on the other side of the

00:14:17.199 --> 00:14:23.839
room Squarespace is the way to build a

00:14:21.519 --> 00:14:29.399
website whether it's for your small business or for your you know uh local

00:14:27.759 --> 00:14:35.000
freaking book club it does doesn't matter if you want a web presence

00:14:31.800 --> 00:14:36.880
affordably and quickly Squarespace gets

00:14:35.000 --> 00:14:42.120
it done for you it starts at just 12 bucks a month and you get a free domain

00:14:39.440 --> 00:14:47.680
if you buy Squarespace for the year you just pick one of their templates and

00:14:44.800 --> 00:14:53.920
boom you upload some pictures you fill in some text it's all cloud-based and

00:14:50.680 --> 00:14:57.440
your website will simple as that look

00:14:53.920 --> 00:14:59.959
great on any device every website comes

00:14:57.440 --> 00:15:05.279
with a free online store and their cover pages feature allows you to set up a

00:15:02.320 --> 00:15:10.759
beautiful one-page online presence in just minutes so start a trial with no

00:15:08.720 --> 00:15:16.560
credit card required and start building your website today then when you decide

00:15:14.120 --> 00:15:23.000
to sign up for Squarespace don't forget head over to squarespace.com/LTT

00:15:19.160 --> 00:15:24.399
and use offer code LTT to get 10%

00:15:23.000 --> 00:15:30.759
off your first purchase so a massive thank you to SFU

00:15:27.839 --> 00:15:35.079
and Compute Canada for allowing us to uh run amok in their data center thanks

00:15:33.120 --> 00:15:38.160
to you guys for watching if you dislike this video you know what to do but if

00:15:36.399 --> 00:15:41.839
you liked it hit that like button get subscribed maybe consider checking out

00:15:39.639 --> 00:15:45.399
where to buy the stuff we featured at the link in the video description also

00:15:44.120 --> 00:15:50.759
down there you'll find a link to our merch store which has cool shirts like this one and our community Forum which

00:15:48.720 --> 00:15:53.759
you should totally join
