1
00:00:00,000 --> 00:00:08,960
165,000 CPU cores, 20 million dollars of GPUs, and a cool petabyte of RAM.

2
00:00:08,960 --> 00:00:16,000
I wouldn't normally describe myself as a furry, but the new Fir supercomputer has definitely

3
00:00:16,000 --> 00:00:19,840
awakened some feelings that I can't say I've ever felt before.

4
00:00:19,840 --> 00:00:26,960
Feelings like wanting to go deep inside it to gently remove its panels and maybe some light

5
00:00:27,040 --> 00:00:33,120
screwing? And thanks to our friends here at Simon Fraser University in beautiful British Columbia,

6
00:00:33,120 --> 00:00:39,920
we're going to be doing just that, going deep under the hood of the CPU and GPU compute deployment

7
00:00:39,920 --> 00:00:44,800
that is going to be serving tens of thousands of scientists and researchers in fields all the

8
00:00:44,800 --> 00:00:49,040
way from AI to zoology all over the country for years to come.

9
00:00:49,040 --> 00:00:54,960
This will be our first up close look at a real world deployment that uses direct die liquid

10
00:00:54,960 --> 00:01:00,640
cooling to increase cooling efficiency from about 30 percent to over 90 percent.

11
00:01:00,640 --> 00:01:03,600
Or at least it'll be the first data center grade deployment.

12
00:01:04,240 --> 00:01:07,760
Mine doesn't count, and it doesn't look nearly as sexy.
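
To make that 30-to-90-percent figure a little more concrete, here's a rough sketch that reads it as the share of a rack's heat captured directly into the liquid loop rather than dumped into room air. That interpretation, and the use of the roughly 70 kW per-rack figure quoted later in the tour, are assumptions on my part, not something stated here.

```python
# Sketch: how much heat goes straight into liquid at 30% vs 90% capture.
# The 70 kW per-rack figure is quoted later in the tour; the "capture share"
# reading of the percentages is an assumption.
rack_power_kw = 70
for capture in (0.30, 0.90):
    to_liquid = rack_power_kw * capture
    print(f"{capture:.0%} capture: {to_liquid:.0f} kW to liquid, "
          f"{rack_power_kw - to_liquid:.0f} kW left for air cooling")
```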

13
00:01:08,480 --> 00:01:12,720
But what is sexy is this segue to our sponsor.

14
00:01:12,720 --> 00:01:31,360
In the row behind me are 640 NVIDIA H100 80-gigabyte GPUs, each with an estimated cost of around

15
00:01:31,360 --> 00:01:33,680
31,000 US dollars.
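
A quick sanity check on the intro's "20 million dollars of GPUs" claim, using only the two numbers quoted here:

```python
# Back-of-the-envelope GPU spend from the figures quoted in the tour.
gpu_count = 640             # H100 GPUs in the row
cost_per_gpu_usd = 31_000   # estimated cost per GPU
print(f"${gpu_count * cost_per_gpu_usd:,}")  # $19,840,000 -- right around "$20 million of GPUs"
```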

16
00:01:34,400 --> 00:01:41,200
Even at less than half of the maximum density, just 20 nodes per rack, the team here had to

17
00:01:41,200 --> 00:01:47,280
reroute power from elsewhere in the building and significantly upgrade the building's cooling system

18
00:01:47,280 --> 00:01:52,880
just to accommodate the incredible power requirements of these NVIDIA Hoppers.

19
00:01:52,880 --> 00:01:57,600
This is actually a common theme that I hear from basically anyone in the data center space.

20
00:01:57,600 --> 00:02:02,160
I mean we tried to build for the future, but we couldn't have possibly seen this coming,

21
00:02:02,800 --> 00:02:06,560
and there's no sign of things slowing down. We'll get to that later though.

22
00:02:07,440 --> 00:02:15,840
First, the most exciting part of the tour. They pulled one of their spares out of the rack for us to crack open and get up close and personal

23
00:02:15,840 --> 00:02:20,640
with, and oh my god, look at this thing.

24
00:02:21,840 --> 00:02:30,800
It's heavy. I guess when you've got this much Hopper in you... Like, wow, it's kind of scary handling it.

25
00:02:30,800 --> 00:02:34,480
I mean, this one alone is worth more than my car,

26
00:02:34,480 --> 00:02:36,960
and a rack of these is worth more than my house.

27
00:02:38,320 --> 00:02:41,520
It's a little sketchy, but I want you guys to be able to see it.

28
00:02:41,520 --> 00:02:49,520
The CPUs are EPYC Genoa, so last-generation Zen 4 based, but Genoa still supports up to 12

29
00:02:49,520 --> 00:02:57,280
channel DDR5 memory and 128 lanes of PCIe Gen5, which is plenty to keep these GPU cores fed.

30
00:02:58,000 --> 00:03:03,600
If more CPU compute is needed, clearly there is support for dual CPU sockets,

31
00:03:03,600 --> 00:03:10,560
but the team at SFU found that 8 CPU cores per GPU was plenty for their purposes,

32
00:03:10,560 --> 00:03:18,240
and they opted for a single 48 core CPU and 1.152 terabytes of RAM in each of their nodes.

33
00:03:19,040 --> 00:03:26,320
Now for a closer look at the GPUs. Unfortunately, I'm not allowed to take the coolers off them, but under each of these

34
00:03:26,320 --> 00:03:37,120
4 cold plates is an NVIDIA H100 SXM5 80 gig GPU giving us a total of 320 gigabytes of VRAM per node.
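
Working through the per-node numbers quoted so far, as a quick sketch. The total node count below is an inference from 640 GPUs at four per node, not a figure given on the tour:

```python
# Per-node arithmetic from figures quoted in the tour (node count inferred).
gpus_per_node = 4
vram_per_gpu_gb = 80
cpu_cores_per_node = 48
host_ram_per_node_tb = 1.152
total_gpus = 640

print(gpus_per_node * vram_per_gpu_gb)        # 320 GB of VRAM per node, as quoted
print(cpu_cores_per_node / gpus_per_node)     # 12 CPU cores per GPU, above the ~8 SFU said they need
gpu_nodes = total_gpus // gpus_per_node        # ~160 GPU nodes (inferred)
print(gpu_nodes, gpu_nodes * host_ram_per_node_tb)  # ~184 TB of host RAM across the GPU partition (inferred)
```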

35
00:03:37,680 --> 00:03:44,240
And guys, that's not just any VRAM, that is HBM3 running on a 5120 bit bus

36
00:03:44,240 --> 00:03:49,840
for a total bandwidth per GPU of 3.36 terabytes per second.

37
00:03:50,400 --> 00:03:58,000
For context, a top of the line consumer card, the RTX 5090, achieves just over half of that bandwidth.
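
For anyone who wants to see where those bandwidth numbers come from, here's the arithmetic. The per-pin data rates (roughly 5.2 Gb/s for HBM3 on the H100, and 28 Gb/s GDDR7 on a 512-bit bus for the 5090) are assumptions pulled from public spec sheets, not figures given on the tour:

```python
# Memory bandwidth = bus width (bits) x per-pin data rate (Gb/s) / 8 bits per byte.
h100_bus_bits, hbm3_gbps = 5120, 5.2      # H100 SXM5 bus width as quoted; pin rate assumed
rtx5090_bus_bits, gddr7_gbps = 512, 28    # RTX 5090 figures assumed from public specs

h100_tb_s = h100_bus_bits * hbm3_gbps / 8 / 1000        # ~3.3 TB/s (quoted as 3.36 TB/s)
rtx5090_tb_s = rtx5090_bus_bits * gddr7_gbps / 8 / 1000  # ~1.79 TB/s
print(h100_tb_s, rtx5090_tb_s, rtx5090_tb_s / h100_tb_s)  # the 5090 lands at just over half
```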

38
00:03:58,720 --> 00:04:03,520
This kind of power does come with drawbacks however, like for example heat.

39
00:04:04,240 --> 00:04:10,960
Each of these is rated for 700 watts of power consumption through the SXM socket that's

40
00:04:10,960 --> 00:04:16,400
underneath them. And that is where the incredible cooling solution in this Lenovo node comes in.

41
00:04:16,400 --> 00:04:21,280
As a liquid cooling nerd, I gotta say guys, this is the coolest part for me.

42
00:04:21,280 --> 00:04:26,960
I mean, did you notice that there isn't a single fan in sight anywhere in this machine?

43
00:04:27,680 --> 00:04:37,040
That is because everything, CPUs, GPUs, VRMs, network interface, SSD caddy, even the system

44
00:04:37,040 --> 00:04:42,960
memory is directly liquid cooled. All of it. This feels a little bit like doing the maze in

45
00:04:42,960 --> 00:04:49,040
Highlights magazine. So here's our inlet over here, which splits into two main loops that go

46
00:04:49,040 --> 00:04:54,400
through the system. The primary loop, which we can tell because it has a thicker pipe coming off of it,

47
00:04:54,400 --> 00:05:00,960
goes straight to the middle of our four GPUs, where this manifold splits fresh incoming water

48
00:05:00,960 --> 00:05:07,840
out to our four GPUs. Two of them just feed right back into the outlet here, while the other two

49
00:05:07,920 --> 00:05:14,480
run up to this networking board and then consolidate back to the outlet. That's our primary loop.

50
00:05:14,480 --> 00:05:20,640
Our secondary loop comes through here, handling some of the power delivery, and then carries over to...

51
00:05:21,840 --> 00:05:30,960
Interesting. It splits out, doing the RAM next. I am not 100% sure what to make of that because I

52
00:05:30,960 --> 00:05:36,880
would think RAM would be a tertiary priority in terms of cooling. But that's what they've done.

53
00:05:36,880 --> 00:05:42,560
We go through the RAM, splitting into three different tubes that sit between our DIMMs down

54
00:05:42,560 --> 00:05:49,600
both rows. Then one side handles this network caddy here and the other side handles our SSD caddy.

55
00:05:50,160 --> 00:05:56,960
Then each of those come back to one of the CPUs, which come out into the middle here and then run

56
00:05:56,960 --> 00:06:02,560
back to the outlet here. Maybe not the way I would have laid it out. There's a lot of 90 degree

57
00:06:02,560 --> 00:06:07,520
turns in here, meaning a lot of restriction, but I'm sure the engineers at Lenovo know what

58
00:06:07,520 --> 00:06:12,080
they're doing. There's a ton of other cool stuff to unpack here too. You probably noticed there's

59
00:06:12,080 --> 00:06:18,480
no power supply. That's because it uses these chonky connectors here at the back to plug into a

60
00:06:18,480 --> 00:06:25,040
backplane in the back of the rack. As for the cooling connections, well, according to the manufacturer,

61
00:06:25,040 --> 00:06:32,240
these do have a little bit of natural leakage, but it's on the order of molecules, which is pretty

62
00:06:32,240 --> 00:06:38,800
damn impressive. There are sensors all over the motherboard to detect any kind of leakage,

63
00:06:38,800 --> 00:06:45,280
and there is grounding throughout the system in the form of these little copper, what look like

64
00:06:45,280 --> 00:06:51,040
solder wick pieces. That's to prevent what's called stray current corrosion, which can be

65
00:06:51,040 --> 00:06:57,280
caused by a current that's accidentally induced in the coolant, which can lead to massive corrosion,

66
00:06:57,280 --> 00:07:03,760
which can lead to leaks, which I know from experience. Now, the team here wasn't sure about

67
00:07:03,760 --> 00:07:08,640
the exact chemistry of the coolant they're using, but they did tell me that it has antimicrobial

68
00:07:08,640 --> 00:07:13,440
properties to prevent anything from growing in the loop. There's some other fun stuff.

69
00:07:13,440 --> 00:07:19,680
There's a little stylus in here. Apparently, this is meant to assist in removing memory,

70
00:07:19,680 --> 00:07:22,640
which is great. I'd actually love to see more gaming motherboards come with that.

71
00:07:23,120 --> 00:07:32,080
I thought this lone 7.68 terabyte NVMe drive was interesting too. I mean, the networking is 400

72
00:07:32,080 --> 00:07:39,840
gigabit per second times two, to the two petabytes of NVMe storage, not to mention 49 petabytes of

73
00:07:39,840 --> 00:07:45,840
spinning rust that's right over there, but according to the team here, occasionally they need some local

74
00:07:45,840 --> 00:07:51,520
storage to improve GPU performance a little bit. So you'd never boot off of this or anything,

75
00:07:51,520 --> 00:07:57,440
but it's nice to have there as a scratch. Also, the button cell in here is mounted in a vertical

76
00:07:57,440 --> 00:08:04,000
caddy because the density is so high in this one, you know, that they just couldn't give up the space

77
00:08:04,000 --> 00:08:09,840
that it would have taken to mount it parallel to the board. I also spotted a micro SD header.

78
00:08:11,120 --> 00:08:16,000
If anyone out there works in the data center space and knows what that's for, I haven't seen it before,

79
00:08:16,000 --> 00:08:22,000
and Gem and I just assumed that I typoed. Oh, there was one other thing we wanted to look at,

80
00:08:22,000 --> 00:08:28,720
these big power bad boys. We couldn't see them until we got that shroud off. So these,

81
00:08:28,720 --> 00:08:34,000
they're just bus bars. They're going from power supply here, which is a DC to DC power supply.

82
00:08:35,520 --> 00:08:41,760
And they're going over to our GPUs. Damn. What's interesting to me that I just noticed

83
00:08:41,760 --> 00:08:47,280
is that there's a clear delineation between the NVIDIA-engineered parts of this, with the black

84
00:08:47,280 --> 00:08:53,280
PCB, which are completely separate from everything else, and the Lenovo-engineered parts of this.

85
00:08:53,280 --> 00:08:58,960
So Lenovo is acting like more of a system integrator around this compute block here.

86
00:08:58,960 --> 00:09:05,520
Like you can even see the silk screening on the PCB is distinctly NVIDIA and Lenovo is just doing

87
00:09:05,520 --> 00:09:13,040
their DC to DC power. So it's just power in here and then PCIe in here in the form of these four

88
00:09:13,040 --> 00:09:20,240
MCIO connectors right here. This is essentially like plugging a GPU into your Legion gaming PC.

89
00:09:21,280 --> 00:09:29,120
The GPU house. With extra steps. Yeah. Before we poke around in one of the 192 core CPU nodes that

90
00:09:29,120 --> 00:09:36,800
they've got, let's take a look at one of the racks that these boys slide into. They're still using a

91
00:09:36,800 --> 00:09:43,280
very similar rear door chilled liquid rack like we saw with their air cooled nodes when we did a tour

92
00:09:44,240 --> 00:09:53,040
over eight years ago. Anywho, the point is that chilled, 16-and-a-half-degree water comes from

93
00:09:53,040 --> 00:09:59,840
the evaporative cooling towers outside. Then hot air from the power supplies and any of the

94
00:09:59,840 --> 00:10:07,040
networking equipment that's in the rack runs through here and wow is that ever hot. Then it

95
00:10:07,040 --> 00:10:14,320
spits out nice comfortable room temperature air on the other side. Each of these racks is fed by

96
00:10:14,400 --> 00:10:23,440
dual three phase 60 amp feeds for a total of about 70,000 watts per rack. Now if SFU had the power and

97
00:10:23,440 --> 00:10:30,960
cooling in this 1960s bunker, they could juice these up to 180,000 watts per rack, but they don't.
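
One way the roughly 70,000-watt figure can pencil out, with heavy caveats: the line-to-line voltage and the 80 percent continuous-load derating below are assumptions on my part, not details confirmed on the tour.

```python
import math

# Three-phase power: P = sqrt(3) * V_line-to-line * I.
volts_ll = 415    # ASSUMED line-to-line voltage; not stated in the video
amps = 60         # per feed, as quoted
feeds = 2         # dual feeds, as quoted
derate = 0.8      # typical continuous-load derating (assumed)

per_feed_w = math.sqrt(3) * volts_ll * amps
print(round(feeds * per_feed_w * derate / 1000))  # ~69 kW, in line with "about 70,000 watts per rack"
```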

98
00:10:30,960 --> 00:10:38,800
Hence the empty rack space. Since we have this open. Oh wow. That is a big difference between the

99
00:10:38,800 --> 00:10:45,040
cold side and the hot side going into the back of these back planes for the servers. I don't have

100
00:10:45,040 --> 00:10:50,800
to ask which one's the supply. That's the cold side. Which, since we're on the subject, this is a

101
00:10:50,800 --> 00:10:56,720
perfect time to look at the cooling distribution system. This is the Liebert XDU from Vertiv. It

102
00:10:56,720 --> 00:11:04,640
can do 600,000 watts of cooling capacity per one of these cooling distribution units or CDUs. Water

103
00:11:04,640 --> 00:11:11,680
comes in the supply side here. This thick boy. Ha, she's chilly. That's coming from the cooling

104
00:11:11,680 --> 00:11:18,960
towers outside. Then that runs all the way down to the bottom here to the heat exchanger in the front.

105
00:11:18,960 --> 00:11:24,160
This liquid to liquid heat exchanger does exactly what it says on the tin. Taking that cold water

106
00:11:24,160 --> 00:11:30,560
from the primary loop that goes outside and using it to chill the warm water that is coming directly

107
00:11:30,560 --> 00:11:37,760
off of the blocks that are going to our nodes. This unit uses dual redundant pumps and if we go

108
00:11:37,760 --> 00:11:44,400
around to the back, it uses these manifolds and valves to control flow to up to six different racks.
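
A quick check on the CDU sizing against the quoted figures: 600 kW shared across up to six racks works out to 100 kW of headroom per rack, comfortably above the roughly 70 kW each rack is fed today. The headroom framing is mine; the 600 kW and six-rack numbers are from the tour.

```python
# CDU headroom check from quoted figures.
cdu_capacity_kw = 600   # cooling capacity per CDU
racks_per_cdu = 6       # racks fed per CDU
rack_power_kw = 70      # per-rack power, from earlier in the tour

headroom = cdu_capacity_kw / racks_per_cdu
print(headroom, headroom - rack_power_kw)  # 100.0 kW per rack, ~30 kW of margin
```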

109
00:11:44,400 --> 00:11:49,680
And it's very easy to tell which is the cold side that's being chilled and which is the hot side here.

110
00:11:50,720 --> 00:11:58,160
Wow. I want one. Vaughn, can I have one? Probably the coolest part is this little touchscreen display

111
00:11:58,160 --> 00:12:02,640
on the front that much more succinctly illustrates what I just said. Here's your primary loop,

112
00:12:02,640 --> 00:12:07,360
here's your secondary loop, here's all your flow rates, all your temperatures, and here's an alarm

113
00:12:07,360 --> 00:12:14,720
that they assure me is totally fine. This data is super important because if they accidentally add

114
00:12:14,720 --> 00:12:19,760
water that is too cool going into the servers behind me, then they could end up with condensation,

115
00:12:19,760 --> 00:12:25,200
which hopefully I don't have to explain why that's super, super bad. Everything's hooked up using

116
00:12:25,200 --> 00:12:30,480
Aquatherm tubing from Germany. The admins here spoke with some other facilities that used stainless

117
00:12:30,480 --> 00:12:35,760
steel and one of them got rust in their cooling system. It was a big, big mess. They've been really,

118
00:12:35,760 --> 00:12:40,720
really happy with their Aquatherm. Now let's go check out the CPU nodes. Contrary to what NVIDIA

119
00:12:40,720 --> 00:12:46,240
would like everyone to believe, not everything runs best on a GPU even today and that's where these

120
00:12:46,240 --> 00:12:56,960
come in. Each of these 1Us contains two nodes, and each node contains 192 Zen 5

121
00:12:56,960 --> 00:13:06,880
EPYC Turin cores with 768 gigs of memory. So that's a total of nearly 400 cores in each of these

122
00:13:06,880 --> 00:13:16,720
1Us. Holy freaking... For networking, they actually don't go as heavy on these using 200 gig connections

123
00:13:16,720 --> 00:13:23,200
and NDR to dynamically share that 200 gigabit link between the two nodes depending on their needs.

124
00:13:23,760 --> 00:13:28,720
This approach does have the drawback of meaning that if the primary node goes down,

125
00:13:28,720 --> 00:13:33,440
we lose network connection to the secondary one, but I have to assume that the cost savings

126
00:13:33,520 --> 00:13:39,520
outweigh the disadvantages in this case. In terms of loop layout, this one is much

127
00:13:39,520 --> 00:13:46,400
simpler coming in to both sides and then out of both sides, but just like the GPU nodes,

128
00:13:46,400 --> 00:13:51,840
the goal here is to get a water tube up against pretty much anything in the server that generates

129
00:13:51,840 --> 00:13:58,400
heat because there are no fans whatsoever. One cool thing we missed on the GPU node was we never

130
00:13:58,400 --> 00:14:03,280
got a look under the little cooling plates that the SSDs and network cards sit on. So here's what

131
00:14:03,280 --> 00:14:09,840
it looks like. It pretty much looks like a heat pipe, except instead of being full of what is usually

132
00:14:09,840 --> 00:14:15,200
a vapor and sometimes a liquid that circulates just within itself, it's just full of water or

133
00:14:15,200 --> 00:14:20,320
other coolant that will circulate into an external system. Now let's take a look at the racks that

134
00:14:20,320 --> 00:14:28,000
these live in. Each of these racks contains 72 of the nodes that I just showed you guys, top to

135
00:14:28,080 --> 00:14:39,520
freaking bottom, with roughly 13,824 cores. Each group of racks is an island with a non-blocking 800

136
00:14:39,520 --> 00:14:46,720
gig connection between islands, so 41,000 cores can represent a single job with no blocking.
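
The core counts multiply out cleanly from the quoted figures; the racks-per-island number is my own inference from the roughly 41,000-core job size, not something stated explicitly.

```python
# Core-count arithmetic for the CPU partition.
nodes_per_rack = 72     # as quoted: 72 nodes top to bottom per rack
cores_per_node = 192    # 192 Zen 5 EPYC cores per node

cores_per_rack = nodes_per_rack * cores_per_node
print(cores_per_rack)               # 13,824 cores per rack, as quoted
print(41_000 / cores_per_rack)      # ~3 racks per island for a ~41,000-core non-blocking job (inferred)
```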

137
00:14:46,720 --> 00:14:51,600
They have some other specialized nodes like the storage ones, including the ones on the other

138
00:14:51,600 --> 00:14:56,560
side of the aisle that hold data for our local particle collider, TRIUMF. We've got a whole

139
00:14:56,560 --> 00:15:02,000
video about that, along with some eight terabyte RAM nodes, which are, I think, pretty self-explanatory.

140
00:15:02,000 --> 00:15:06,880
They're for jobs that would overflow on a regular node, as long as you don't mind them having a few

141
00:15:06,880 --> 00:15:14,800
bugs. And finally, a single AMD MI300X node, too. I don't know, what, keeping NVIDIA on their toes, or

142
00:15:16,240 --> 00:15:21,680
we could do it. We could buy more than one of these. You better not charge too much, especially

143
00:15:21,680 --> 00:15:28,160
when you factor in modern security needs. There are six zones of security to get to some of the cages

144
00:15:28,160 --> 00:15:32,720
that actually have biometric locks on them, where not only do you need to know the pin code,

145
00:15:32,720 --> 00:15:38,240
but you have to put your hand under it and it will check if that hand is attached to a living person.

146
00:15:39,120 --> 00:15:43,280
There's cameras everywhere in the data center with full visibility in all directions,

147
00:15:43,280 --> 00:15:47,600
and our tour guide today actually said that there's someone monitoring them so often

148
00:15:47,600 --> 00:15:52,480
that it's become a bit of a game where they'll send non-flattering pictures of him moving around in

149
00:15:52,480 --> 00:15:59,040
the data center just to make sure he knows they're watching. Cooling everything are these evaporative

150
00:15:59,040 --> 00:16:03,920
cooling towers behind me. The three that are closest to the building, they were there the last

151
00:16:03,920 --> 00:16:08,400
time we were here, but I couldn't show them to you for reasons that involve red tape and approvals.

152
00:16:08,400 --> 00:16:13,680
So here they are. I still can't get any closer to them for reasons that involve red tape and

153
00:16:13,680 --> 00:16:19,120
approvals, but hey, we can check out the acoustic damping that's on these ones on the other side.

154
00:16:19,120 --> 00:16:24,080
That's impressive. Here I am right at the intake next to these sound baffles.

155
00:16:24,080 --> 00:16:30,720
And for context, here's the untreated ones. The total cooling capacity is about 4.7 megawatts,

156
00:16:30,720 --> 00:16:36,800
which is way more than what's needed for the machines inside. But just like power and

157
00:16:36,800 --> 00:16:41,600
storage, you want to have some extra for resiliency in the event of an equipment failure.

158
00:16:41,600 --> 00:16:44,720
Hey, what's one of these worth? Maybe I'll pick one up for the office.

159
00:16:44,720 --> 00:16:51,600
$1.2 million each. Oh, never mind. Frankly, I'd rather have one of these anyway.

160
00:16:52,160 --> 00:16:58,880
To augment the original pumps for Cedar, which could do 800 gallons per minute of cooling,

161
00:16:58,880 --> 00:17:06,800
they added these two new ones that do 1500 gallons per minute. They also now have two

162
00:17:06,800 --> 00:17:12,080
mechanical chillers, which can be useful during the times of year that we get outside temperatures

163
00:17:12,080 --> 00:17:17,840
above 33 Celsius. So for maybe six hours a day, they'll switch over to mechanical chilling to

164
00:17:17,840 --> 00:17:21,840
help out their evaporative cooling tower. Probably the coolest thing about this gear, though,

165
00:17:21,840 --> 00:17:27,200
is how smart it is. They've got telemetry capture for things like temperature and flow rates,

166
00:17:27,200 --> 00:17:31,920
and it all feeds into a third party called Kaizen that helps with logging and determining if

167
00:17:31,920 --> 00:17:37,440
something's gone wrong with the system. Fun fact, by the way, the two-foot thick concrete floor

168
00:17:37,440 --> 00:17:43,120
that I'm standing on is so burdened by all of this heavy equipment and coolant that it actually

169
00:17:43,120 --> 00:17:51,680
deflects half an inch in the center. Is that okay? Am I going to break it? I mean, the place used to

170
00:17:51,680 --> 00:17:56,720
be the power distribution center for the southern half of our province, and it's built like a bunker.

171
00:17:57,520 --> 00:18:03,600
But enough about that. Who paid for it all? Fir had a total budget of about $82 million, which,

172
00:18:04,160 --> 00:18:10,480
oh, I assume, is Canadian rubles. So a little under 60 million US dollars,

173
00:18:10,480 --> 00:18:14,080
and that came from a combination of the Digital Research Alliance of Canada,

174
00:18:14,080 --> 00:18:19,440
BCKDF, and vendor-in-kind contributions, which I just learned are a vendor giving

175
00:18:19,440 --> 00:18:23,680
significant discounts. How do I get signed up for that program? Is that only for educational

176
00:18:23,680 --> 00:18:29,040
institutions? Anyway, they did ask us to shout out a couple of companies who helped them out.

177
00:18:29,040 --> 00:18:35,040
Lenovo, DDN, and Vertiv on the cooling side. And they didn't ask us to shout these guys out,

178
00:18:35,040 --> 00:18:39,680
but we're going to do it anyway. A shout out for our sponsor. If you guys enjoyed this video,

179
00:18:39,680 --> 00:18:44,000
why not check out the tour we did of TRIUMF, the particle accelerator that is just down the road.

180
00:18:45,280 --> 00:18:51,200
Down the hill, down the really long road. Canada's only road. It's pretty long.
