1
00:00:00,000 --> 00:00:06,200
When I was a young tinkerer, one of my favorite tools was Super Pi, a benchmark and stability test

2
00:00:06,200 --> 00:00:10,440
that calculates pi. No, not this kind of pie.

3
00:00:10,440 --> 00:00:15,440
But rather, the mathematical constant that describes the ratio between a circle's circumference

4
00:00:15,440 --> 00:00:19,840
and its diameter. Pi is an irrational number,

5
00:00:19,840 --> 00:00:22,880
meaning that there is an infinite number of digits

6
00:00:22,880 --> 00:00:26,960
after the decimal point. And with my overclocked Athlon,

7
00:00:26,960 --> 00:00:30,320
I could calculate 32 million of those digits

8
00:00:30,320 --> 00:00:33,320
in half an hour or so. Girls dug it.

9
00:00:33,320 --> 00:00:38,000
But I dreamed of more. I wanted to calculate more digits of pi

10
00:00:38,000 --> 00:00:41,400
than any man had ever calculated before. And why shouldn't I?

11
00:00:41,400 --> 00:00:46,720
I'm Linus motherfucking Tech Tips. So the current record, set by Jordan Ranous last year,

12
00:00:46,720 --> 00:00:49,760
is 202 trillion digits. Trillion?

13
00:00:49,760 --> 00:00:53,360
Yeah, you say. That's a lot of Athlons. It's more than that.

14
00:00:53,360 --> 00:00:56,520
See, at a certain point, it's not even just the computation

15
00:00:56,560 --> 00:01:00,000
that becomes the problem. But rather, it's the storage.

16
00:01:00,000 --> 00:01:03,760
Fortunately for us, both have gotten a little beefier.

17
00:01:03,760 --> 00:01:08,320
And thanks to our friends over at Kioxia, who sponsored this entire project,

18
00:01:08,320 --> 00:01:12,840
providing over two petabytes of Gen 4 NVMe storage,

19
00:01:12,840 --> 00:01:20,640
we were able to smash that record, calculating nearly 100 trillion more digits.

20
00:01:21,120 --> 00:01:24,480
The process of getting this was an absolute cluster.

21
00:01:24,480 --> 00:01:29,600
But hey, that's content, baby. And here it is, our verified Guinness World Record

22
00:01:29,600 --> 00:01:34,200
for calculating pi to an astonishing 300 trillion digits.

23
00:01:34,200 --> 00:01:37,840
Holy shit. Let's talk about how we got here.

24
00:01:37,840 --> 00:01:42,840
Let's put the scale of 300 trillion digits into perspective.

25
00:01:50,480 --> 00:01:53,840
At size-four Arial font, which is still readable,

26
00:01:53,840 --> 00:01:58,520
but pushing the limits, you can fit around 25 and a half thousand digits

27
00:01:58,520 --> 00:02:03,800
on a normal sheet of paper. That means that my childhood dreams of 32 million digits

28
00:02:03,800 --> 00:02:08,240
could fit on around 1,250 sheets of paper.
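
A quick sanity check of that page math, as a minimal Python sketch. The 25,500 digits-per-page figure is the one quoted above; the record-run line foreshadows the page count discussed next:

    # Page math for printing pi at ~25,500 digits per sheet (size-4 Arial).
    DIGITS_PER_PAGE = 25_500

    childhood_run = 32_000_000          # Super Pi's 32-million-digit run
    record_run = 300 * 10**12           # this video's 300 trillion digits

    print(childhood_run / DIGITS_PER_PAGE)   # ~1,255 sheets
    print(record_run / DIGITS_PER_PAGE)      # ~11.8 billion sheets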

29
00:02:08,240 --> 00:02:13,560
But if we wanted to print out the number of digits that we just calculated...

30
00:02:13,560 --> 00:02:19,480
We should totally do that. No, why not? Because it would be literally billions of pages.

31
00:02:19,480 --> 00:02:23,400
It's only paper, Linus. What could it cost, $10?

32
00:02:24,040 --> 00:02:28,920
No. Let's take a look at the hardware.

33
00:02:28,920 --> 00:02:34,160
Yes, my friends. This is the super secret project that we teased

34
00:02:34,160 --> 00:02:38,360
last time we were working on thermal management for the million dollar PC.

35
00:02:38,360 --> 00:02:42,440
You've seen this six-server cluster here before,

36
00:02:42,440 --> 00:02:45,760
but while originally it had about a petabyte

37
00:02:45,760 --> 00:02:49,560
of stupid fast Kioxia Gen 4 NVMe storage,

38
00:02:49,560 --> 00:02:52,960
it has grown a little bit since then.

39
00:02:52,960 --> 00:02:57,040
The 72 15-terabyte drives that made up the original pool

40
00:02:57,040 --> 00:03:01,960
would have almost been enough space to reach my original goal of 200 trillion digits.

41
00:03:01,960 --> 00:03:05,640
That was double the standing record at the time, set by Emma Haruka Iwao at Google.

42
00:03:05,640 --> 00:03:10,400
But it wasn't anywhere near enough to beat the 202 trillion digit record

43
00:03:10,400 --> 00:03:16,200
that popped up a few months into the planning and testing for this project. So I had to call in like just a couple favors.

44
00:03:16,200 --> 00:03:21,280
Specifically from our friends over at Gigabyte, who sent this lovely 1U dual-socket EPYC chassis,

45
00:03:21,680 --> 00:03:25,280
an R183. From Kioxia, who lent us an older Tyan server

46
00:03:25,280 --> 00:03:30,520
from their test bed lab. And finally from AMD, who provided this Titanite reference platform

47
00:03:30,520 --> 00:03:34,520
for the launch of EPYC Genoa. That gave us a total of nine servers.

48
00:03:34,520 --> 00:03:38,200
Why so many servers? I mean, couldn't we just pack all the storage in one?

49
00:03:38,200 --> 00:03:42,120
I mean, yeah, but here's the thing. We already had the million dollar PC,

50
00:03:42,120 --> 00:03:46,400
and those nodes, they're already full. So if we wanted to expand the storage,

51
00:03:46,400 --> 00:03:50,680
we either needed to throw away that existing petabyte we already had,

52
00:03:50,680 --> 00:03:53,720
or we need to expand the cluster with more machines.

53
00:03:53,720 --> 00:03:58,640
It's definitely not the most power or space-efficient way to get two petabytes of NVMe,

54
00:03:58,640 --> 00:04:01,720
but you work with what you got. That's why we ended up with

55
00:04:01,720 --> 00:04:06,440
kind of a mix of drives as well. See, every one of our servers

56
00:04:06,440 --> 00:04:09,640
needs to have the same amount of storage space.

57
00:04:09,640 --> 00:04:15,760
Otherwise, it's lowest common denominator, and you're gonna waste any extra capacity

58
00:04:15,760 --> 00:04:20,760
that's in the higher capacity nodes. But the issue is that some of our machines

59
00:04:20,760 --> 00:04:26,520
are expansion slot-challenged. So Kioxia had to send over a bucket

60
00:04:26,520 --> 00:04:33,600
of their 30-terabyte drives, allowing us to stuff a whopping 245 terabytes

61
00:04:33,760 --> 00:04:41,200
in each of these nine servers, totaling 2.2 petabytes of raw combined Gen4 storage.
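
For the spreadsheet-inclined, the capacity math checks out like this. The drives-per-server line is illustrative only, since the video describes a mix of drive models across nodes:

    # Cluster capacity sanity check.
    SERVERS = 9
    TB_PER_SERVER = 245

    print(SERVERS * TB_PER_SERVER)   # 2205 TB, i.e. ~2.2 PB raw
    # With 30 TB drives, 245 TB per server is roughly 245 / 30 ≈ 8 drives,
    # though the actual mix of drive capacities varies node to node.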

62
00:04:41,400 --> 00:04:46,320
So that takes care of the capacity, but if my count is correct,

63
00:04:46,320 --> 00:04:50,720
there's another server here that we haven't mentioned yet, or at least it looks like a server,

64
00:04:50,720 --> 00:04:56,400
but, I don't know, half the front panels are missing. Those are speed holes, brother.

65
00:04:56,400 --> 00:05:00,440
It's for performance. All right, let's get it out of here and take a look.

66
00:05:00,440 --> 00:05:04,440
No problem to get this removed. I've just got to battle the rack in here.

67
00:05:04,440 --> 00:05:06,920
Do you want to mark any of these? Nope, doesn't matter.

68
00:05:08,920 --> 00:05:13,320
This is the compute node, the machine that actually did the crunching.

69
00:05:13,320 --> 00:05:19,040
This Gigabyte R283-Z96 has been through some modifications. Let's go.

70
00:05:19,040 --> 00:05:23,520
It started life as a 24-bay storage server, actually.

71
00:05:23,520 --> 00:05:26,960
So most of the 160 PCIe Gen 5 lanes

72
00:05:26,960 --> 00:05:30,320
from its dual 96-core Epic processors

73
00:05:30,320 --> 00:05:35,200
were allocated to storage upfront, but our machine, it doesn't really need storage anymore.

74
00:05:35,200 --> 00:05:40,880
All that storage is in everything else. What it needs now, or needed anyways,

75
00:05:40,880 --> 00:05:45,960
was networking, a lot of it. Specifically, four of these NVIDIA Mellanox

76
00:05:45,960 --> 00:05:49,200
200-gigabit ConnectX-7 network cards.

77
00:05:49,200 --> 00:05:54,800
These things are wild. Oh yeah. They have two of those 200 gigabit ports per card,

78
00:05:54,800 --> 00:05:57,840
and thanks to the insane bandwidth

79
00:05:57,840 --> 00:06:02,240
of their PCIe Gen 5 x16 slot, which is good for, I don't know,

80
00:06:02,240 --> 00:06:08,320
64 gigabytes a second both ways, these can saturate both of those ports at the same time.

81
00:06:08,320 --> 00:06:13,520
So if you put them together, that's... It's 1.6 terabits of throughput.

82
00:06:13,520 --> 00:06:17,120
That's a lot of terabits. Yeah, it's actually around 100 gigabytes per second

83
00:06:17,120 --> 00:06:20,440
to each of our 96-core CPUs, which yes,

84
00:06:20,440 --> 00:06:25,000
can actually handle that, thanks to the 24 128-gig sticks

85
00:06:25,000 --> 00:06:29,640
of DDR5 ECC that Micron sent over. That's three terabytes of RAM.

86
00:06:29,640 --> 00:06:33,160
The more the better. This is the moreest I could get. As for the storage nodes,

87
00:06:33,160 --> 00:06:36,480
those are using dual ConnectX-6 200 gig cards

88
00:06:36,480 --> 00:06:41,880
from the original setup. So each storage server has 400 gig,

89
00:06:41,880 --> 00:06:45,720
or about a dual-layer Blu-ray per second.
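
Here is the napkin math behind all of those bandwidth numbers, as a small Python sketch using the figures quoted above:

    # NIC and PCIe bandwidth math for the compute node.
    GBIT_PER_PORT = 200
    PORTS_PER_CARD = 2
    CARDS = 4

    total_gbit = GBIT_PER_PORT * PORTS_PER_CARD * CARDS
    print(total_gbit)             # 1600 Gbit/s, i.e. 1.6 Tbit/s
    print(total_gbit / 8)         # 200 GB/s total, ~100 GB/s per socket

    # PCIe Gen 5 x16 is ~64 GB/s per direction, so one slot can feed both
    # of a card's 200 Gbit ports (2 x 25 GB/s = 50 GB/s) with headroom.
    # Each storage node has two 200 Gbit ports:
    print(2 * GBIT_PER_PORT / 8)  # 50 GB/s, ~one 50 GB dual-layer Blu-ray/s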

90
00:06:45,720 --> 00:06:49,000
But how does the whole thing work together? Well, we gotta put it back in the rack before

91
00:06:49,000 --> 00:06:53,320
I can show you that. Yeah, that'll probably help me. Okay, there we go.

92
00:06:53,320 --> 00:06:56,980
All right, it should be back up. With the magic of WekaFS,

93
00:06:56,980 --> 00:07:00,600
which is the same clustered file system that we've been using ever since we first set up

94
00:07:00,600 --> 00:07:05,080
the million dollar PC, we're able to use all that network speed we just talked about

95
00:07:05,080 --> 00:07:10,560
to run one combined file system off of all nine of our storage servers,

96
00:07:10,560 --> 00:07:14,720
completely transparently to any application, including y-cruncher,

97
00:07:14,720 --> 00:07:19,700
the application we're using to calculate pi. When I say the same though, I don't mean the same, same.

98
00:07:19,700 --> 00:07:25,600
I mean, the same old array did work, but that version of Weka was years out of date

99
00:07:25,600 --> 00:07:29,120
and running on an operating system that is now completely end of life.

100
00:07:29,120 --> 00:07:34,360
Luckily for us, the Weka folks helped us nuke the old installs and install their custom image,

101
00:07:34,360 --> 00:07:39,080
which comes with everything pretty much ready to roll. Massive shout out to Josh and Bob, you guys rock.

102
00:07:39,080 --> 00:07:42,960
Thank you for helping us achieve this silly goal. And the rest of the folks at Weka

103
00:07:42,960 --> 00:07:47,480
for allowing us to use your software in a very, very unsupported, unconventional way.

104
00:07:47,480 --> 00:07:51,240
Oh, oh yeah. I basically had to trick Weka into thinking

105
00:07:51,240 --> 00:07:57,040
that each server is actually two servers. That way we could use the most space-efficient stripe width,

106
00:07:57,040 --> 00:08:01,440
which for Weka means each chunk of data gets split into 16 pieces

107
00:08:01,440 --> 00:08:06,760
with two pieces of parity data calculated. 16 plus two is 18, which is also what nine servers

108
00:08:06,760 --> 00:08:12,280
times two instances of Weka gets you. For reference, in a supported configuration,

109
00:08:12,280 --> 00:08:15,280
we would have needed 19 discrete servers

110
00:08:15,280 --> 00:08:21,000
in order to accomplish this stripe width, including two for parity and one as a hot spare,

111
00:08:21,000 --> 00:08:26,600
which is fantastic for an enterprise environment like where Weka is meant to be deployed,

112
00:08:26,600 --> 00:08:29,960
but very expensive for us. Anyway, enough of that.
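
The stripe-width arithmetic Jake just walked through, written out as a sketch of the counting (not Weka's actual configuration syntax):

    # Weka 16+2 stripe math across 9 servers running 2 instances each.
    data_pieces = 16
    parity_pieces = 2
    stripe_width = data_pieces + parity_pieces   # 18 failure domains

    servers = 9
    instances_per_server = 2                     # the "two servers" trick
    print(servers * instances_per_server)        # 18 -- matches the stripe

    print(data_pieces / stripe_width)            # ~0.89 usable fraction
    # A supported layout wants one real server per stripe member plus a
    # hot spare: 16 + 2 + 1 = 19 discrete machines.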

113
00:08:29,960 --> 00:08:32,920
How fast is it? We haven't even tuned it yet. What do you mean?

114
00:08:33,320 --> 00:08:36,880
Okay, well, we can tune it. First, we did tune it. The biggest hurdle on the storage side

115
00:08:36,880 --> 00:08:41,280
was finding a way to limit the amount of data that flowed between the two CPUs,

116
00:08:41,280 --> 00:08:46,320
which may be a bit counterintuitive, but as soon as, say, the left CPU wants to send data

117
00:08:46,320 --> 00:08:49,360
via the network cards that are connected to the right CPU,

118
00:08:49,360 --> 00:08:54,440
that's a ton of latency. That's a lot of hops. And on top of that, there's a limited amount of bandwidth.

119
00:08:54,440 --> 00:08:58,320
And since this is such a memory intensive calculation, that's why we need so much RAM,

120
00:08:58,320 --> 00:09:02,640
and that's why we need all this storage, we don't wanna waste memory bandwidth.

121
00:09:02,680 --> 00:09:07,600
So we set up two Weka client containers, which is just their application that runs on a computer

122
00:09:07,600 --> 00:09:11,760
and allows you to access the storage. Each of those containers got 12 cores assigned to it,

123
00:09:11,760 --> 00:09:14,760
one core per chiplet on our giant CPUs.

124
00:09:14,760 --> 00:09:19,240
So we can maximize the turbo speed? No, actually the reason for that is the cache.

125
00:09:19,240 --> 00:09:24,120
So those are 3D V-Cache CPUs. That gives us a certain amount of cache per chiplet,

126
00:09:24,120 --> 00:09:28,080
and we didn't want the buffers of y-cruncher, which is like the amount of space it uses

127
00:09:28,080 --> 00:09:31,640
to like hold stuff in flight to spill out of that cache.

128
00:09:31,640 --> 00:09:36,440
Because as soon as you do, now it's in memory, more memory copies, more wasted bandwidth.
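
To put numbers on that cache budget: the figures below assume 9684X-class Genoa-X parts, since the video names 96-core 3D V-Cache CPUs but not the exact SKU:

    # Cache-per-chiplet math, assuming 9684X-class Genoa-X CPUs.
    ccds_per_socket = 12      # one pinned Weka core per CCD, per above
    l3_per_ccd_mb = 96        # 32 MB on-die + 64 MB stacked V-Cache
    print(ccds_per_socket * l3_per_ccd_mb)   # 1152 MB of L3 per socket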

129
00:09:36,440 --> 00:09:42,400
And I tested a lot, which we'll get into a bit. But first, why don't we look at how fast it goes?

130
00:09:42,400 --> 00:09:47,520
Final setup, underscore final, underscore for real. These are just scripts to like make the Weka containers.

131
00:09:47,520 --> 00:09:51,200
Look at the cores, those ones are at 100% usage, those individual ones,

132
00:09:51,200 --> 00:09:57,040
those are all Weka I/O cores, basically. It's a lot of compute that needs to be reserved.

133
00:09:57,040 --> 00:10:00,400
But when you're talking like 100 plus gigabytes a second,

134
00:10:00,440 --> 00:10:03,480
which theoretically we are, but you haven't actually shown me that yet.

135
00:10:03,480 --> 00:10:07,800
Here's our little script. The interesting thing about those cores being used

136
00:10:07,800 --> 00:10:11,200
is that while you can just run an app and hope that it ignores them,

137
00:10:11,200 --> 00:10:15,520
the Linux scheduler is not always the best at that. So there's this command called taskset,

138
00:10:15,520 --> 00:10:19,280
which allows you to like map whatever command or application you're running

139
00:10:19,280 --> 00:10:23,000
to only run on specific cores. Core one is a Weka core, we're skipping that one.

140
00:10:23,000 --> 00:10:26,640
Core nine, we're skipping that one. And then this is running two separate tasks,

141
00:10:26,640 --> 00:10:30,240
one for each of our mounts. And you can see it only has the CPU cores

142
00:10:30,240 --> 00:10:34,280
from CPU one or CPU two, depending on the mount folder we're using.
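
For the curious, the same pinning can be done from Python: os.sched_setaffinity is the Linux facility that taskset wraps. The core IDs below are made up for illustration; on the real box you would exclude whichever cores the Weka containers reserve, like the cores 1 and 9 mentioned above:

    import os
    import subprocess

    # Cores this process may run on: everything except the reserved
    # Weka I/O cores (IDs are illustrative, not the real layout).
    allowed = set(range(96)) - {1, 9}

    # Pin the current process (and its children) to those cores...
    os.sched_setaffinity(0, allowed)

    # ...or launch a command pinned the way the video does with taskset:
    core_list = ",".join(str(c) for c in sorted(allowed))
    subprocess.run(["taskset", "-c", core_list, "echo", "pinned"])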

143
00:10:34,280 --> 00:10:37,640
Let me run over to Weka. Cute little dashboard.

144
00:10:37,640 --> 00:10:40,800
Woo! It's not a hundred, but it is pretty nice.

145
00:10:40,800 --> 00:10:44,160
That's writing? This is a write? That's right. That's just setting up.

146
00:10:44,160 --> 00:10:46,640
You said you were doing a read. It is a read test, but it's setting up the files.

147
00:10:48,240 --> 00:10:53,960
I was telling Jake as we were working on the review for this script, I was like, man, I've gotten kind of numb to these numbers.

148
00:10:53,960 --> 00:10:57,720
You know, after all the iterations of Whonnock and all that. You know, a hundred gigabytes a second,

149
00:10:58,440 --> 00:11:01,480
this is over the network. It never gets old.

150
00:11:01,480 --> 00:11:06,320
Actually, you know, you're numb to the numbers until the numbers you're looking at are like 200 gigabytes a second or something.

151
00:11:06,320 --> 00:11:10,320
But like, no, but dude, like the first time we cracked a hundred gigabytes a second.

152
00:11:10,320 --> 00:11:15,040
It was all installed locally. And that was no file system, all local.

153
00:11:15,040 --> 00:11:19,200
This is over a network. With a file system. With a functioning file system.

154
00:11:19,200 --> 00:11:22,440
Real ass actual copying data. Yeah.

155
00:11:22,440 --> 00:11:27,280
That's crazy. It is crazy. And look at the read. The latency is two milliseconds.

156
00:11:27,280 --> 00:11:31,120
It's because I'm like oversaturating this. So down here, you see the front end usage.

157
00:11:31,120 --> 00:11:34,120
That's the cores on this machine. They're being utilized a hundred percent.

158
00:11:34,120 --> 00:11:39,120
I have the system set to have four NUMA nodes per CPU because that made y-cruncher a little bit happier.

159
00:11:39,120 --> 00:11:45,280
If I turn that off and do one NUMA node per socket, I was able to get this up to like 150 gigabytes a second.

160
00:11:45,280 --> 00:11:50,400
At the time, I actually set the record for the fastest single client usage.

161
00:11:50,400 --> 00:11:54,680
According to the Weka guys, they've since broken that with like GPUDirect Storage or whatever.

162
00:11:54,680 --> 00:11:58,360
But this isn't even with RDMA. This is just good code. Built for NVMe.

163
00:11:58,360 --> 00:12:02,160
It's also good SSDs. Oh brother. Yeah. Look at this.

164
00:12:02,160 --> 00:12:05,560
The average usage of the drives in the array right now is 23%.

165
00:12:05,560 --> 00:12:09,400
That's nothing. Shout-out Kioxia. We're running a mix of their CD

166
00:12:09,400 --> 00:12:13,120
and CM series Gen 4 drives. These things are super fast

167
00:12:13,120 --> 00:12:18,760
with individual drive read speeds that are in excess of five gigabytes per second.

168
00:12:18,760 --> 00:12:25,120
And that's not even the fastest they have. You step up to their Gen 5 drives and you're talking like 12, 13, 14 gigabytes a second.

169
00:12:25,120 --> 00:12:29,520
They're available in self-encrypting SKUs. They have die failure recovery, power loss protection.

170
00:12:29,520 --> 00:12:32,640
They're perfect for your next server or data center deployment.

171
00:12:32,640 --> 00:12:38,120
Yeah. And this entire time running this application, not a single drive had a single issue.

172
00:12:38,120 --> 00:12:42,040
You got a spreadsheet for tuning y-cruncher? Dude, dude.

173
00:12:42,040 --> 00:12:45,680
When he adjusts the glasses, you know it's getting real. Okay. Here's y-cruncher.

174
00:12:45,680 --> 00:12:49,200
Let's just do a normal pi run. 32 million, 25.

175
00:12:49,200 --> 00:12:53,880
Let's go. So this would have taken about half an hour on my old Athlon.

176
00:12:53,880 --> 00:12:57,600
It took 0.2 seconds to compute. Really? Yeah.

177
00:12:57,600 --> 00:13:01,080
What? For the uninitiated, y-cruncher is the software we use to do this run.

178
00:13:01,080 --> 00:13:05,760
It was developed by a guy named Alexander Yee. Super nice guy, helped us do some messing about.

179
00:13:05,760 --> 00:13:10,000
Also, didn't help me that much. Honestly, I asked a lot of questions he didn't answer,

180
00:13:10,000 --> 00:13:13,880
but I figured it out anyways, I guess. To be clear, the storage, we've talked about,

181
00:13:13,880 --> 00:13:18,520
oh man, we need a lot of storage. It's because y-cruncher uses the storage like RAM

182
00:13:18,520 --> 00:13:23,440
because the output of digits, like that 300 trillion, is only about 120 terabytes compressed.

183
00:13:23,440 --> 00:13:27,360
Wow, it's not that much. Could we make that available to people? Oh, God.

184
00:13:27,360 --> 00:13:31,000
I don't wanna think about that. We'll try. Maybe we'll do a torrent or something. Oh, God.
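
That 120 TB figure is roughly what information theory predicts: a decimal digit carries log2(10) ≈ 3.32 bits, so a near-optimal encoding of 300 trillion digits needs about that many bits each. A quick check:

    import math

    digits = 300 * 10**12
    bits_per_digit = math.log2(10)           # ~3.32 bits per decimal digit
    total_tb = digits * bits_per_digit / 8 / 1e12
    print(total_tb)                          # ~124.6 TB -- right ballpark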

185
00:13:31,000 --> 00:13:34,360
But when you're setting up for a run, it actually tells you how much storage you need

186
00:13:34,360 --> 00:13:37,880
for this swap space, which is basically just like RAM, except slower. That's what it's using it for.

187
00:13:37,880 --> 00:13:42,160
That's slower. That's what it's using it for. For us, it was like, yeah, we're probably gonna use

188
00:13:42,160 --> 00:13:45,600
like 1.5 petabytes of space at peak.

189
00:13:45,600 --> 00:13:48,640
It's pretty crazy. Okay. But what did you tune, Jake?

190
00:13:48,640 --> 00:13:52,600
You wanna see the tuning? Oh boy. So this is like some of the tests I did.

191
00:13:52,600 --> 00:13:56,600
So, y-cruncher was built for direct-attached storage.

192
00:13:56,600 --> 00:14:01,720
And in fact, it doesn't even want you to use like a RAID controller or software RAID.

193
00:14:01,720 --> 00:14:05,280
It does its own internal RAID. And then on top of that,

194
00:14:05,280 --> 00:14:10,560
it also has things you can tune, like what multi-threading algorithm do you use?

195
00:14:10,560 --> 00:14:14,360
And like how many threads? And what size are your IO buffers?

196
00:14:14,360 --> 00:14:19,460
How much memory? And how many bytes can we read per seek?

197
00:14:19,460 --> 00:14:22,880
Ideally, because if you're using hard drives or SSDs, it's different.
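
As a flavor of the kind of testing Jake describes, here is a toy sweep that times sequential reads at different request sizes to find a good bytes-per-seek for a given device. This is not y-cruncher's actual tuning interface, just an illustration of the idea, and /path/to/testfile is a placeholder:

    import time

    def read_throughput(path, chunk_size, total=1 << 30):
        # Read up to `total` bytes in `chunk_size` requests; return GB/s.
        done = 0
        start = time.perf_counter()
        with open(path, "rb", buffering=0) as f:
            while done < total:
                buf = f.read(chunk_size)
                if not buf:
                    break
                done += len(buf)
        return done / (time.perf_counter() - start) / 1e9

    for size in (4096, 65536, 1 << 20, 16 << 20):
        print(size, round(read_throughput("/path/to/testfile", size), 2))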

198
00:14:22,880 --> 00:14:27,160
Got it. It was built in an older time and the code base is huge.

199
00:14:27,160 --> 00:14:30,240
And Alex just does it in his spare time as far as I'm aware.

200
00:14:30,240 --> 00:14:34,040
So no shade, super cool project. But at some point in the future,

201
00:14:34,040 --> 00:14:37,760
technology has gotten good enough that we can just rely on the operating system to do this.

202
00:14:37,760 --> 00:14:41,400
Like that Weka speed test we just did. Let's hope for that. One day.

203
00:14:41,400 --> 00:14:45,000
Anyway, with everything dialed in, on August 1st, 2024...

204
00:14:45,000 --> 00:14:49,760
Yes. It was a while ago. Yes. Jake finally hit enter on his command prompt

205
00:14:49,760 --> 00:14:53,200
and began our glorious journey to nerd glory.

206
00:14:53,200 --> 00:14:57,080
Yeah, for 12 days. And then it stopped thanks to a multi-day power outage

207
00:14:57,080 --> 00:15:00,680
while I was on vacation. And it was so early in the process that I said,

208
00:15:00,680 --> 00:15:06,120
f*** that s***. Let's just start it again. I want to get a clean run with no outages.

209
00:15:06,120 --> 00:15:09,480
But it was smooth sailing from then on.

210
00:15:09,480 --> 00:15:13,520
No, it wasn't. See, even with a cluster this chonk,

211
00:15:13,520 --> 00:15:16,580
calculations like this take a lot of time.

212
00:15:16,580 --> 00:15:21,320
The previous 202 trillion digit record took a hundred days just to compute.

213
00:15:21,320 --> 00:15:24,360
And whether it's bad luck or user error.

214
00:15:24,360 --> 00:15:27,720
I think there's a little bit of user error. Finding a space in our facilities

215
00:15:27,720 --> 00:15:30,960
where a machine like that can operate completely uninterrupted.

216
00:15:30,960 --> 00:15:33,960
I have no idea what I just unplugged.

217
00:15:33,960 --> 00:15:37,880
How's your edit going? I'm holding your server. What's the challenge?

218
00:15:37,880 --> 00:15:43,200
At first things were pretty okay in the lab server room here. We had our air conditioning working to keep things cool.

219
00:15:43,200 --> 00:15:46,200
We had our battery backup to keep the digits flowing

220
00:15:46,200 --> 00:15:50,360
during a short outage or a brownout. It's just that over the course of this run,

221
00:15:50,360 --> 00:15:54,160
we had multiple other power outages and none of them were small.

222
00:15:54,160 --> 00:15:57,720
So each time our calculation had to stop and restart.

223
00:15:57,720 --> 00:16:01,600
And the same goes for when the cooling failed multiple times.

224
00:16:01,600 --> 00:16:06,040
It's pretty mild now though. The AC is fixed, and that, with our sick water door

225
00:16:06,040 --> 00:16:10,480
that is definitely not going to leak, keeps the room at around 22, 23 degrees.

226
00:16:10,480 --> 00:16:14,440
And the cluster is still running. Fortunately, y-cruncher makes checkpoints

227
00:16:14,440 --> 00:16:20,340
which allow resuming the calculation. But it does mean our record could have been done much faster.

228
00:16:20,340 --> 00:16:26,060
Like, based on the logs, somewhere in the neighborhood of 30, 40, 50 days faster.

229
00:16:26,060 --> 00:16:29,100
The 300 trillionth digit of pi is five.

230
00:16:29,100 --> 00:16:33,940
Really? Ha ha ha ha ha. It's done baby.

231
00:16:33,940 --> 00:16:38,580
Wow. It only took way longer than it should have.

232
00:16:38,580 --> 00:16:43,620
190 days. Speaking of the logs, Jake's got them here right now.

233
00:16:43,620 --> 00:16:48,420
100 gigabytes a second read, and then you're writing at like 30, 40 gigabytes a second.

234
00:16:48,420 --> 00:16:52,660
So that's as it's, what? Pulling in data from the swap space

235
00:16:52,660 --> 00:16:56,700
which is our NVMe drives and bringing it into the three terabytes of RAM

236
00:16:56,700 --> 00:17:00,060
that are in the system. So the crunch, crunch, crunch, crunch, crunch, crunch, crunch

237
00:17:00,060 --> 00:17:03,820
and huck it back over there. Write some data over there, yeah. What I don't see in the logs here

238
00:17:03,820 --> 00:17:07,220
is how much power this consumed. How much did this cost?

239
00:17:07,220 --> 00:17:12,340
I actually haven't done the math on that. I think it roughly draws around 8,000 watts

240
00:17:12,380 --> 00:17:15,100
which means 24 hours a day for a year.

241
00:17:16,300 --> 00:17:19,100
Yeah, it was like 10 grand. That's Canadian.

242
00:17:20,140 --> 00:17:23,900
You know, that's, are you kidding me right now? No. Like just CPUs.

243
00:17:23,900 --> 00:17:29,940
We don't even have GPUs in this thing. It's like 1,500 Watts in SSDs alone.
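
The electricity math, with an assumed rate, since the video doesn't give one; ~$0.14/kWh CAD is just a plausible figure that makes "like 10 grand" land:

    DRAW_KW = 8
    HOURS_PER_YEAR = 24 * 365
    RATE_CAD_PER_KWH = 0.14          # assumption, not from the video

    kwh = DRAW_KW * HOURS_PER_YEAR
    print(kwh)                       # 70,080 kWh per year
    print(kwh * RATE_CAD_PER_KWH)    # ~$9,800 CAD -- "like 10 grand"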

244
00:17:29,940 --> 00:17:34,700
Okay, but hey, that means our record should be safe for a while then, right?

245
00:17:34,700 --> 00:17:37,820
Well, it's possible, maybe even probable

246
00:17:37,820 --> 00:17:41,580
that someone is already working on a run that would beat this record

247
00:17:42,020 --> 00:17:45,020
and they could probably even do it on a single machine. They totally could.

248
00:17:45,020 --> 00:17:49,260
But that's how it is with computing and they can never take that piece of paper away.

249
00:17:49,260 --> 00:17:52,740
I can, I'm taking this one home. And don't forget about the other pieces of paper.

250
00:17:52,740 --> 00:17:56,140
Okay, real talk though. In school, we're taught that two digits of pi

251
00:17:56,140 --> 00:18:01,020
is enough to approximate most calculations, but obviously, depending on what you're doing,

252
00:18:01,020 --> 00:18:06,020
you could need a few more. Is there, in your mind, any practical use

253
00:18:06,020 --> 00:18:10,020
for 300 trillion digits? No, I mean other than for this.
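
To make "two digits is enough" concrete: computing Earth's circumference with pi rounded to 3.14, versus full double precision, is only off by about 20 km, and NASA's JPL reportedly gets by with around 15 digits for interplanetary work:

    import math

    r_earth_m = 6_371_000                    # mean Earth radius in meters
    approx = 2 * 3.14 * r_earth_m
    precise = 2 * math.pi * r_earth_m
    print((precise - approx) / 1000, "km")   # ~20 km of error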

254
00:18:10,020 --> 00:18:14,260
But it was fun. It's about the journey, not the destination, Linus.

255
00:18:14,260 --> 00:18:19,180
It's about doing something cool with the help of Kioxia, who builds high-quality, high-performance storage

256
00:18:19,180 --> 00:18:24,580
for the data center, and who we'll have linked down below. It's about Weka and their crazy software.

257
00:18:24,580 --> 00:18:27,740
It's about y-cruncher. It's about because we fucking could.

258
00:18:27,740 --> 00:18:30,900
Because we fucking can. Just like we should also shout out

259
00:18:30,900 --> 00:18:35,100
some of the other folks who helped us. Yeah, Josh and Bob again. Thank you so much from Weka.

260
00:18:35,100 --> 00:18:39,180
Gigabyte, for sending us that server even though I haven't made content about it in like four years.

261
00:18:39,180 --> 00:18:43,500
Just, just thank you. AMD, AMD sent the CPUs for the compute node

262
00:18:43,500 --> 00:18:48,180
like three years ago. Finally, thank you. I swore I was gonna make this video

263
00:18:48,180 --> 00:18:51,940
and it happened. It just took longer than I thought. Thank you, James, the writing manager

264
00:18:51,940 --> 00:18:56,580
for being patient with this project. And hey, if you guys wanna check out

265
00:18:56,580 --> 00:19:01,060
more Linus and Jake shenanigans, how about the high availability cheapo computers?

266
00:19:01,060 --> 00:19:04,420
That was fun. Cheapo computers? Yeah, I remember. Oh, that was cool.

267
00:19:04,420 --> 00:19:07,500
Yeah, that was super cool. I don't know, we didn't actually do the cheapo one. We did it, we just did the demo.

268
00:19:07,500 --> 00:19:10,020
Just for the intro. I know, but it was cool. Yeah, yeah, that was cool.

269
00:19:10,700 --> 00:19:14,380
Yeah, this was fun. I don't think I ever wanna do this again.

270
00:19:14,380 --> 00:19:15,220
And cut.
