1
00:00:00,080 --> 00:00:08,360
When I was a young tinkerer, one of my favorite tools was Super Pi, a benchmark

2
00:00:05,200 --> 00:00:11,360
and stability test that calculates

3
00:00:08,360 --> 00:00:13,519
pi. Not this kind of pie, but rather the

4
00:00:11,360 --> 00:00:18,480
mathematical constant that describes the ratio between a circle's circumference

5
00:00:15,440 --> 00:00:20,560
and its diameter. As far as we know, pi

6
00:00:18,480 --> 00:00:25,600
is an irrational number, meaning that there is an infinite number of digits

7
00:00:22,960 --> 00:00:30,960
after the decimal place. And with my overclocked Athlon, I could calculate 32

8
00:00:28,960 --> 00:00:37,040
million of those digits in half an hour or so. Girls dug it. But I dreamed

9
00:00:34,079 --> 00:00:41,440
of more. I wanted to calculate more digits of pi than any man had ever

10
00:00:39,360 --> 00:00:46,160
calculated before. And why shouldn't I? I'm Linus motherfucking Tech Tips. Uh, so

11
00:00:44,000 --> 00:00:51,680
the current record, set by Jordan Ranous last year, is 202 trillion digits.

12
00:00:48,960 --> 00:00:55,760
Trillion? Yeah. You say that's a lot of Athlons. It's more than that. See, at a

13
00:00:54,079 --> 00:01:00,559
certain point, it's not even just the computation that becomes the problem,

14
00:00:57,680 --> 00:01:05,360
but rather it's the storage. Fortunately for us, both have gotten a little

15
00:01:03,199 --> 00:01:10,960
beefier. And thanks to our friends over at Kioxia, who sponsored this entire

16
00:01:07,760 --> 00:01:14,240
project, providing over two petabytes of

17
00:01:10,960 --> 00:01:17,479
Gen 4 NVMe storage, we were able to

18
00:01:14,240 --> 00:01:21,920
smash that record, calculating nearly

19
00:01:17,479 --> 00:01:24,479
100 trillion more digits. The process of

20
00:01:21,920 --> 00:01:29,200
getting this was an absolute cluster. But hey, that's content, baby. And here

21
00:01:26,880 --> 00:01:35,000
it is. Our verified Guinness World Record for calculating pi to an

22
00:01:31,079 --> 00:01:40,280
astonishing 300 trillion digits.

23
00:01:35,000 --> 00:01:40,280
Holy. Let's talk about how we got here.

24
00:01:46,720 --> 00:01:54,880
To put the scale of 300 trillion digits into perspective, at size-4 Arial font,

25
00:01:52,399 --> 00:01:59,360
which is still readable, but pushing the limits, you can fit around

26
00:01:57,079 --> 00:02:04,479
25,000 digits on a normal sheet of paper. That means that my childhood

27
00:02:01,759 --> 00:02:09,840
dreams of 32 million digits could fit on around

28
00:02:06,119 --> 00:02:11,840
1,280 sheets of paper. But if we wanted

29
00:02:09,840 --> 00:02:16,760
to print out the number of digits that we just calculated, we should totally do

30
00:02:14,560 --> 00:02:23,520
that. No. Why not? Because it would be literally billions of pages. It's only

31
00:02:20,160 --> 00:02:25,400
paper, Linus. What could it cost? $10?

32
00:02:23,520 --> 00:02:32,080
Take. No. Let's take a look at the hardware.

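A quick sketch of that printing math in Python, taking the video's own estimate of roughly 25,000 digits per sheet at size-4 Arial as the assumption:

```python
# Page math for printing digits of pi, using the video's estimate
# of ~25,000 digits per sheet at size-4 Arial.
DIGITS_PER_PAGE = 25_000

superpi_digits = 32_000_000            # the childhood Super Pi run
record_digits = 300_000_000_000_000    # the 300 trillion digit record

superpi_pages = superpi_digits // DIGITS_PER_PAGE
record_pages = record_digits // DIGITS_PER_PAGE

print(f"{superpi_pages:,} sheets")     # 1,280 sheets
print(f"{record_pages:,} pages")       # 12,000,000,000 pages
```

So the full record really would be literally billions of pages, as claimed.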
33
00:02:28,959 --> 00:02:34,720
Yes, my friends. This is the super

34
00:02:32,080 --> 00:02:40,080
secret project that we teased last time we were working on thermal management

35
00:02:36,480 --> 00:02:43,040
for the million-dollar PC. You've seen this

36
00:02:40,080 --> 00:02:48,800
six-server cluster here before. But while originally it had about a petabyte

37
00:02:45,760 --> 00:02:52,239
of stupid fast Kioxia Gen 4 NVMe

38
00:02:48,800 --> 00:02:55,599
storage, it has uh grown a little bit

39
00:02:52,239 --> 00:02:57,360
since then. The 72 15 TB drives that

40
00:02:55,599 --> 00:03:02,000
made up the original pool would have almost been enough space to reach

41
00:02:59,840 --> 00:03:06,000
my original goal of 200 trillion digits. That was double the standing record at

42
00:03:03,680 --> 00:03:10,959
the time, set by Emma Haruka Iwao at Google. But it wasn't anywhere near enough to beat the

43
00:03:08,159 --> 00:03:13,760
202-trillion-digit record that popped up a few months into the planning and

44
00:03:12,239 --> 00:03:17,760
testing for this project. So, I had to call in like just a couple favors,

45
00:03:16,159 --> 00:03:23,599
specifically from our friends over at Gigabyte, who sent this lovely dual-

46
00:03:20,080 --> 00:03:25,440
socket EPYC chassis, an R183; from Kioxia,

47
00:03:23,599 --> 00:03:29,360
who lent us an older Tyan server from their test bed lab, and finally from

48
00:03:27,599 --> 00:03:33,840
AMD, who provided this Titanite reference platform for the launch of

49
00:03:31,040 --> 00:03:37,920
EPYC Genoa. That gave us a total of nine servers. Why so many servers? I mean,

50
00:03:36,239 --> 00:03:42,159
couldn't we just pack all the storage in one? I mean, yeah, but here's the thing.

51
00:03:40,000 --> 00:03:46,400
We already had the million-dollar PC, and those nodes, they're already full.

52
00:03:44,720 --> 00:03:51,280
So, if we wanted to expand the storage, we either needed to throw away that

53
00:03:48,640 --> 00:03:55,040
existing petabyte we already had, or we needed to expand the cluster with more

54
00:03:52,879 --> 00:03:59,360
machines. It's definitely not the most power- or space-efficient way to get two

55
00:03:57,120 --> 00:04:04,879
petabytes of NVMe, but you work with what you got. That's why we ended up with

56
00:04:01,680 --> 00:04:07,360
kind of a mix of drives as well. See,

57
00:04:04,879 --> 00:04:11,519
every one of our servers needs to have the same amount of storage space.

58
00:04:09,680 --> 00:04:16,320
Otherwise, it's lowest common denominator, and you're going to waste

59
00:04:13,680 --> 00:04:22,160
any extra capacity that's in the higher-capacity nodes. But the issue is

60
00:04:19,120 --> 00:04:25,120
that some of our machines are expansion

61
00:04:22,160 --> 00:04:31,800
slot-challenged. So, Kioxia had to send over a bucket of 30 TB drives,

62
00:04:28,639 --> 00:04:36,160
allowing us to stuff a whopping

63
00:04:31,800 --> 00:04:40,720
245 TB in each of these nine servers,

64
00:04:36,160 --> 00:04:42,960
totaling 2.2 petabytes of raw combined Gen

65
00:04:40,720 --> 00:04:47,520
4 storage. So, that takes care of the capacity. But um if my counter is

66
00:04:45,520 --> 00:04:51,840
correct, there's another server here that we haven't mentioned yet. Or at

67
00:04:49,199 --> 00:04:57,240
least it looks like a server, but I don't know, half the front is missing. So those

68
00:04:54,560 --> 00:05:01,280
are speed holes, brother. It's for performance. All right, let's get it out

69
00:04:59,520 --> 00:05:04,960
of here and take a look. No problem to get this removed. I just got to battle

70
00:05:03,040 --> 00:05:11,919
the rack in here. Do you want to mark any of these or— No, doesn't matter.

71
00:05:08,880 --> 00:05:13,520
This is the compute node, the machine

72
00:05:11,919 --> 00:05:20,080
that actually did the crunching. This Gigabyte R283-Z96 has been through some

73
00:05:17,520 --> 00:05:26,560
modifications. Let's— It started life as a 24-bay storage server. Actually,

74
00:05:22,880 --> 00:05:29,520
wait. So, most of the 160 PCIe Gen 5

75
00:05:26,560 --> 00:05:34,080
lanes from its dual 96-core EPYC processors were allocated to storage up

76
00:05:31,600 --> 00:05:37,759
front. But our machine, it doesn't really need storage anymore. All that

77
00:05:35,520 --> 00:05:43,840
storage is in everything else. What it needs now or needed anyways was

78
00:05:41,199 --> 00:05:49,680
networking. A lot of it. Specifically, four of these NVIDIA Mellanox 200 Gb

79
00:05:47,520 --> 00:05:55,759
ConnectX-7 network cards. These things are wild. Oh yeah. They have two of

80
00:05:52,240 --> 00:05:59,039
those 200 Gb ports per card. And thanks

81
00:05:55,759 --> 00:06:01,680
to the insane bandwidth of their x16

82
00:05:59,039 --> 00:06:07,280
PCIe Gen 5 slot, which is good for, I don't know, 64 GB a second both ways,

83
00:06:04,960 --> 00:06:12,880
these can saturate both of those ports at the same time. So, if you put them

84
00:06:09,039 --> 00:06:14,639
together, that's 1.6 terabits of

85
00:06:12,880 --> 00:06:19,680
throughput. That's a lot of terabytes. Yeah, it's actually around 100 GB per

86
00:06:16,800 --> 00:06:25,759
second to each of our 96-core CPUs, which, yes, can actually handle that

87
00:06:21,680 --> 00:06:28,960
thanks to the 24 128 GB sticks of DDR5

88
00:06:25,759 --> 00:06:30,720
ECC that Micron sent over. That's 3 TB

89
00:06:28,960 --> 00:06:35,600
of RAM. The more the better. This is the more-est I could get. As for the storage

90
00:06:32,479 --> 00:06:38,000
nodes, those are using dual ConnectX-6

91
00:06:35,600 --> 00:06:45,680
200 gig cards from the original setup. So each storage server has 400 gig or

92
00:06:42,560 --> 00:06:46,960
about a dual layer Blu-ray per second.

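The bandwidth claims above can be sanity-checked with simple unit math. A sketch using the figures quoted in the video (ports, link speeds, and the bits-to-bytes conversion):

```python
# Compute node: four ConnectX-7 cards, two 200 Gb ports per card.
compute_gbit = 4 * 2 * 200          # 1,600 Gb/s, i.e. 1.6 Tb/s aggregate
compute_gbyte = compute_gbit / 8    # 200 GB/s, or ~100 GB/s per socket

# Storage nodes: two 200 Gb ConnectX-6 links each.
storage_gbit = 2 * 200              # 400 Gb/s
storage_gbyte = storage_gbit / 8    # 50 GB/s, one dual-layer Blu-ray per second

print(compute_gbit, compute_gbyte, storage_gbyte)  # 1600 200.0 50.0
```

The Blu-ray comparison checks out: a dual-layer disc holds 50 GB, and 400 Gb/s is 50 GB/s.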
93
00:06:45,680 --> 00:06:52,400
But how does the whole thing work together? Well, we got to put it back in the rack before I can show you that.

94
00:06:49,600 --> 00:06:57,120
Yeah, that'll probably help me. Okay, there we go. All right, it should be

95
00:06:54,160 --> 00:07:00,400
back up. With the magic of WekaFS, which is the same clustered file system that

96
00:06:58,800 --> 00:07:04,479
we've been using ever since we first set up the million-dollar PC, we're able to

97
00:07:02,800 --> 00:07:09,919
use all that network speed we just talked about to run one combined file

98
00:07:07,039 --> 00:07:14,880
system off of all nine of our storage servers completely transparently to any

99
00:07:12,400 --> 00:07:18,319
application, including y-cruncher, the application we're using to calculate pi.

100
00:07:16,720 --> 00:07:23,599
When I say the same though, I don't mean the same same. I mean, the same old

101
00:07:21,120 --> 00:07:28,000
array did work, but uh that version of Weka was years out of date and running

102
00:07:26,240 --> 00:07:31,840
on an operating system that is now completely end of life. Luckily for us,

103
00:07:29,840 --> 00:07:35,759
the Weka folks helped us nuke the old installs and install their custom image,

104
00:07:34,319 --> 00:07:39,440
which comes with everything pretty much ready to roll. Massive shout out to Josh

105
00:07:37,599 --> 00:07:43,039
and Bob. You guys rock. Thank you for helping us achieve this silly goal. And

106
00:07:41,520 --> 00:07:47,280
the rest of the folks at Weka for allowing us to use your software in a

107
00:07:44,960 --> 00:07:51,680
very, very unsupported, unconventional way. Oh. Oh, yeah. I basically had to

108
00:07:49,680 --> 00:07:56,400
trick Weka into thinking that each server is actually two servers. That way

109
00:07:54,160 --> 00:08:01,360
we could use the most space efficient stripe width, which for Weta means each

110
00:07:58,720 --> 00:08:05,680
chunk of data gets split into 16 pieces with two pieces of parity data

111
00:08:02,960 --> 00:08:10,800
calculated. 16 + 2 is 18, which is also what nine servers times two instances of

112
00:08:07,919 --> 00:08:15,759
Weka gets you. For reference, in a supported configuration, we would have

113
00:08:12,720 --> 00:08:18,080
needed 19 discrete servers in order to

114
00:08:15,759 --> 00:08:23,280
accomplish this stripe width, including two for parity and one as a hot spare,

115
00:08:21,039 --> 00:08:28,479
which is fantastic for an enterprise environment like where Weka is meant to

116
00:08:25,280 --> 00:08:31,520
be deployed, but very expensive for us.

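The stripe arithmetic above can be laid out explicitly. A minimal sketch, assuming the 16+2 stripe and the 2.2 PB raw figure from the video, and ignoring file-system overheads:

```python
# A 16+2 stripe writes 18 pieces for every 16 pieces of real data,
# so usable capacity is 16/18 of raw (before file-system overheads).
data_pieces, parity_pieces = 16, 2
stripe_width = data_pieces + parity_pieces     # 18

servers = 9
instances_per_server = 2                       # the "each server is two servers" trick
assert servers * instances_per_server == stripe_width

raw_pb = 2.2
usable_pb = raw_pb * data_pieces / stripe_width
print(f"~{usable_pb:.2f} PB usable")           # ~1.96 PB
```

That usable figure is why 2.2 PB raw was needed to cover the roughly 1.5 PB peak swap footprint mentioned later, with headroom.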
117
00:08:28,479 --> 00:08:33,599
Enough already. How fast is it? We haven't

118
00:08:31,520 --> 00:08:36,640
even tuned it yet. What do you Oh, okay. Well, we can tune it first. We can tune

119
00:08:35,200 --> 00:08:42,399
it. The biggest hurdle on the storage side was finding a way to limit the amount of data that flowed between the

120
00:08:40,560 --> 00:08:46,800
two CPUs, which may be a bit counterintuitive, but as soon as say the

121
00:08:44,560 --> 00:08:52,080
left CPU wants to send data via the network cards that are connected to the

122
00:08:48,320 --> 00:08:53,680
right CPU, that's a ton of latency, and on

123
00:08:52,080 --> 00:08:57,440
top of that there's a limited amount of bandwidth. And since this is such a

124
00:08:55,519 --> 00:09:01,200
memory intensive calculation, that's why we need so much RAM and that's why we

125
00:08:58,880 --> 00:09:05,760
need all this storage. We don't want to waste memory bandwidth. So we set up two

126
00:09:03,360 --> 00:09:09,600
Weka client containers, which is just their application that runs on a computer and

127
00:09:07,760 --> 00:09:13,519
allows you to access the storage. Each of those containers got 12 cores

128
00:09:11,200 --> 00:09:18,160
assigned to it, one per chiplet on our giant CPUs, so we can maximize the turbo

129
00:09:16,480 --> 00:09:22,880
speed. Uh no actually the reason for that is the cache. So those are 3D

130
00:09:20,240 --> 00:09:26,560
V-Cache CPUs. That gives us a certain amount of cache per chiplet. And we

131
00:09:24,480 --> 00:09:30,480
didn't want the buffers of y-cruncher, which is like the amount of space it

132
00:09:27,760 --> 00:09:34,880
uses to like hold stuff in flight to spill out of that cache cuz as soon as

133
00:09:32,240 --> 00:09:38,399
you do, now it's in memory, more memory copies, more wasted bandwidth. And I

134
00:09:36,800 --> 00:09:43,320
tested a lot, which we'll get into a bit. But first, why don't we look at how

135
00:09:41,360 --> 00:09:47,600
fast it goes? Sure. final_setup_final_real. Yeah. These are just

136
00:09:45,519 --> 00:09:51,519
scripts to, like, make the Weka containers. Look at the cores; those ones are at

137
00:09:49,519 --> 00:09:56,160
100% usage. Those individual ones, those are all Weka I/O cores. Basically, it's a

138
00:09:54,640 --> 00:10:00,800
lot of compute that needs to be reserved. But when you're talking like

139
00:09:58,399 --> 00:10:03,839
100 plus GB a second, which theoretically we are, but you haven't

140
00:10:02,399 --> 00:10:10,000
actually shown me that yet. Here's our little script. The interesting thing about those cores being used is that

141
00:10:08,160 --> 00:10:14,160
while you can just run an app and hope that it ignores them, the Linux scheduler is not

142
00:10:12,560 --> 00:10:18,079
always the best for that. So there's this command called taskset, which

143
00:10:15,680 --> 00:10:22,079
allows you to like map whatever command or application you're running to only

144
00:10:19,839 --> 00:10:24,959
run on specific cores. Core 1 is a Weka core. We're skipping that one. Core 9,

145
00:10:23,440 --> 00:10:28,880
we're skipping that one. And then this is running two separate tasks. One for

146
00:10:27,120 --> 00:10:33,839
each of our mounts. And you can see it only has the CPU cores from CPU 1 or CPU

147
00:10:31,760 --> 00:10:39,519
2, depending on the mapped folder we're using. Let me run over to Weka. Cute

148
00:10:36,240 --> 00:10:40,959
little dashboard. There we go. Woo. It's

149
00:10:39,519 --> 00:10:43,440
not 100, but it is pretty nice. That's writing. This is a write. That's right.

150
00:10:42,560 --> 00:10:49,839
That's just setting up. I thought you said you were doing read. It is a read test, but it's setting up the files.

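The taskset trick from a moment ago can also be done from inside a program. A minimal Python sketch of the same idea using os.sched_setaffinity (Linux only); the reserved core numbers here are illustrative, not the actual Weka core layout:

```python
import os

# taskset(1) pins a command to a set of cores from the shell; the same
# affinity control is available in-process via os.sched_setaffinity.
reserved = {1, 9}                          # hypothetical Weka I/O cores to avoid
all_cores = set(range(os.cpu_count()))
allowed = all_cores - reserved             # everything except the reserved cores

if hasattr(os, "sched_setaffinity"):       # Linux only
    os.sched_setaffinity(0, allowed)       # 0 = the current process
    allowed = os.sched_getaffinity(0)

print(sorted(allowed))
```

Either way, the effect is the same: the heavy application never gets scheduled onto the cores the storage client needs.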
151
00:10:48,160 --> 00:10:53,360
I was telling Jake as we were working on the review for this script. I was like,

152
00:10:51,600 --> 00:10:56,240
man, I've gotten kind of numb to these numbers, you know, after all the

153
00:10:54,640 --> 00:11:00,560
iterations of Weka and all that, you know, 100 GB a second. You're like,

154
00:10:58,000 --> 00:11:03,600
whatever. This is over the network. It never gets old. Actually, I, you know,

155
00:11:02,240 --> 00:11:08,160
you're numb to the numbers until the numbers you're looking at are like 200 gigabytes a second or something like But

156
00:11:06,480 --> 00:11:12,560
like, no, but dude, like the first time we cracked 100 gigabytes a second, it

157
00:11:10,480 --> 00:11:17,920
was all installed locally and that was no file system. All local. This is over

158
00:11:15,920 --> 00:11:23,200
a network with a file system with a functioning file system. Like real ass

159
00:11:20,079 --> 00:11:25,360
actual copying data. Yeah, that's crazy.

160
00:11:23,200 --> 00:11:29,680
It is crazy. And look at the read. The latency is 2 milliseconds. It's because

161
00:11:27,680 --> 00:11:33,200
I'm like oversaturating this. So down here you see the front end usage. That's

162
00:11:31,360 --> 00:11:37,120
the cores on this machine. They're being utilized 100%. I have the system set to

163
00:11:35,120 --> 00:11:40,399
have four NUMA nodes per CPU because that made y-cruncher a little bit

164
00:11:38,560 --> 00:11:45,680
happier. If I turn that off and do one NUMA node per socket, I was able to get

165
00:11:42,399 --> 00:11:48,160
this up to like 150 GB a second. At the

166
00:11:45,680 --> 00:11:52,480
time, I actually set the record for the fastest single client usage according to

167
00:11:50,720 --> 00:11:56,160
the Weka guys. They've since broken that with, like, GPU Direct Storage or

168
00:11:54,399 --> 00:12:00,959
whatever, but this isn't even with RDMA. This is just good code built for NVMe.

169
00:11:58,320 --> 00:12:05,760
It's also good SSDs. Oh, brother. Yeah, look at this. The average usage of the

170
00:12:03,519 --> 00:12:11,040
drives in the array right now is 23%. Nothing. Shout out Kioxia. We're running

171
00:12:07,760 --> 00:12:13,440
a mix of their CD and CM series Gen 4

172
00:12:11,040 --> 00:12:18,880
drives. These things are super fast with individual drive read speeds that are in

173
00:12:15,600 --> 00:12:20,160
excess of 5 gigabytes per second. And

174
00:12:18,880 --> 00:12:26,079
that's not even the fastest they have. You step up to their Gen 5 drives and you're talking like 12, 13, 14 GB a

175
00:12:24,800 --> 00:12:29,839
second. They're available in self-encrypting SKUs. They have die failure

176
00:12:28,000 --> 00:12:33,680
recovery, power loss protection. They're perfect for your next server or data

177
00:12:31,839 --> 00:12:38,160
center deployment. Yeah. And this entire time running this application, I didn't

178
00:12:35,839 --> 00:12:42,639
have a single drive have a single issue. You got a spreadsheet for tuning y-

179
00:12:40,160 --> 00:12:46,639
cruncher. Dude. Dude, when he adjusts the glasses, you know, it's getting real.

180
00:12:44,480 --> 00:12:51,839
Okay, here's y-cruncher. Let's just do a normal pi run. 32 million digits. Let's go.

181
00:12:50,399 --> 00:12:58,160
So, this would have taken about half an hour on my old Athlon. It took 2 seconds to

182
00:12:55,440 --> 00:13:01,279
compute. Really? Yeah. What? For the uninitiated, y-cruncher is the

183
00:12:59,680 --> 00:13:04,880
software we used to do this run. It was developed by a guy named Alexander Yee.

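The transcript doesn't name the formula, but y-cruncher's pi computations are built around the Chudnovsky series. A deliberately naive, pure-Python sketch of that series follows; real record runs add binary splitting, FFT multiplication, and the petabytes of disk swap discussed in this video, so this toy only scales to a few thousand digits:

```python
from decimal import Decimal, getcontext

def pi_chudnovsky(n):
    """Pi to roughly n digits via the Chudnovsky series (naive loop,
    no binary splitting, so only practical for modest digit counts)."""
    getcontext().prec = n + 10                 # working precision + guard digits
    C = 426880 * Decimal(10005).sqrt()
    K, M, L, X = 6, 1, 13591409, 1
    S = Decimal(L)
    for i in range(1, n // 14 + 2):            # each term adds ~14.18 digits
        M = (K**3 - 16 * K) * M // i**3
        L += 545140134
        X *= -262537412640768000
        S += Decimal(M * L) / X
        K += 12
    return C / S

print(str(pi_chudnovsky(50))[:52])             # 3.14159265358979323846...
```

The series converges at about 14 digits per term, which is why it (and not, say, slowly converging arctangent formulas) underpins modern records.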
184
00:13:02,720 --> 00:13:08,560
Super nice guy. Helped us do some messing about. Also, didn't help me that

185
00:13:07,120 --> 00:13:12,240
much. Honestly, I asked a lot of questions you didn't answer, but I

186
00:13:10,320 --> 00:13:15,600
figured it out. Anyways, I guess to be clear, the storage we've talked about,

187
00:13:13,839 --> 00:13:19,680
oh man, we need a lot of storage. It's because y-cruncher uses the storage

188
00:13:17,680 --> 00:13:24,240
like RAM because the output of digits like that 300 trillion is only about 120

189
00:13:22,560 --> 00:13:27,839
TB compressed. Oh, it's not that much. Could we make that available to people?

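That ~120 TB figure lines up with information theory: a decimal digit carries log2(10) ≈ 3.32 bits, and the digits of pi look statistically random, so they can't be compressed much below that limit. A quick check:

```python
import math

# A random-looking decimal digit needs log2(10) ≈ 3.32 bits, so the
# digit stream is essentially incompressible past this bound.
digits = 300_000_000_000_000
bits = digits * math.log2(10)
terabytes = bits / 8 / 1e12

print(f"~{terabytes:.0f} TB minimum")   # ~125 TB, close to the quoted 120 TB
```

In other words, the stored output is already near the theoretical floor.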
190
00:13:25,680 --> 00:13:31,200
Oh god, I don't want to think about that. We'll try. Maybe we'll do a

191
00:13:29,200 --> 00:13:34,560
torrent or something. Oh god. But when you're setting up for a run, it actually

192
00:13:32,560 --> 00:13:39,200
tells you how much storage you need for this swap space, which is just basically

193
00:13:36,800 --> 00:13:42,720
like RAM, but slower. That's what it's using it for. For us, it was like,

194
00:13:40,959 --> 00:13:48,000
yeah, we're probably going to use like 1.5 petabytes of space at peak. It's uh

195
00:13:46,079 --> 00:13:51,839
it's pretty crazy. Okay, but what did you tune with? Did you see the tuning?

196
00:13:49,680 --> 00:13:56,959
Oh boy. So, this is like some of the tests I did. So y-cruncher was built

197
00:13:54,079 --> 00:14:01,760
for direct attached storage. And in fact, it doesn't even want you to use

198
00:13:59,279 --> 00:14:06,560
like a RAID controller or software RAID. It does its own internal RAID. And then

199
00:14:04,240 --> 00:14:11,519
on top of that, it also has things you can tune like what multi-threading

200
00:14:08,800 --> 00:14:15,760
algorithm do you use and like how many threads and what size are your IO

201
00:14:13,760 --> 00:14:21,120
buffers, how much memory, and how many bytes can we read

202
00:14:18,800 --> 00:14:25,680
per seek ideally because if you're using hard drives or SSDs, it's different. Got

203
00:14:23,040 --> 00:14:29,600
it. It was built in an older time and the code base is huge and Alex just does

204
00:14:28,000 --> 00:14:34,079
it in his spare time as far as I'm aware. So, no shade. Super cool project.

205
00:14:32,560 --> 00:14:38,720
But at some point in the future, technology will have gotten good enough that we can just rely on the operating system

206
00:14:36,959 --> 00:14:42,880
to do this. Like that Weka speed test we just did. Let's hope for that one

207
00:14:40,720 --> 00:14:48,480
day. Anyway, with everything dialed in on August 1st, 2024. Yes. Which was a

208
00:14:45,680 --> 00:14:53,440
while ago. Yes. Jake finally hit enter on his command prompt and began our

209
00:14:50,480 --> 00:14:57,519
glorious journey to nerd glory. Yeah. For 12 days. And then it stopped thanks

210
00:14:55,680 --> 00:15:02,240
to a multi-day power outage while I was on vacation. And it was so early in the

211
00:14:59,600 --> 00:15:07,680
process that I said, "Fuck that. Let's just start it again. I want to get a

212
00:15:04,320 --> 00:15:09,920
clean run with no outages." But it was

213
00:15:07,680 --> 00:15:15,600
smooth sailing from then on. No, it wasn't. See, even with a cluster this

214
00:15:13,040 --> 00:15:21,360
chunk, calculations like this take a lot of time. The previous 202-trillion-digit

215
00:15:18,480 --> 00:15:25,279
record took 100 days just to compute. And whether it's bad luck or user error,

216
00:15:24,240 --> 00:15:29,680
I think there's a little bit of user error. Finding a space in our facilities

217
00:15:27,600 --> 00:15:34,720
where a machine like that can operate completely uninterrupted. I have no idea

218
00:15:32,399 --> 00:15:38,800
what I just unplugged. How's your edit going? I'm holding your server. What's a

219
00:15:36,320 --> 00:15:42,560
challenge? At first, things were pretty okay in the lab server room here. We had

220
00:15:41,040 --> 00:15:46,639
our air conditioning working to keep things cool. We had our battery back up

221
00:15:44,320 --> 00:15:51,120
to keep the digits flowing during a short outage or a brownout. It's just

222
00:15:48,560 --> 00:15:55,199
that over the course of this run, we had multiple other power outages and none of

223
00:15:53,440 --> 00:15:59,360
them were small. So each time our calculation had to stop and restart. And

224
00:15:57,839 --> 00:16:04,240
the same goes for when the cooling failed. Multiple times. It's pretty mint

225
00:16:02,240 --> 00:16:08,240
now though. The AC is fixed, and that, with our sick water door that is

226
00:16:06,480 --> 00:16:13,440
definitely not going to leak, has the room at around 22-23°C, and the cluster is

227
00:16:11,120 --> 00:16:17,839
still running. Fortunately, y-cruncher makes checkpoints, which allows resuming

228
00:16:15,759 --> 00:16:22,399
the calculation but it does mean our record could have been done much faster

229
00:16:20,320 --> 00:16:28,240
like, based on the logs, somewhere in the neighborhood of 30, 40, 50 days

230
00:16:25,279 --> 00:16:31,600
faster. The 300 trillionth digit of pi is five. Really?

231
00:16:31,920 --> 00:16:38,920
It's done, baby. Wow. It only

232
00:16:36,040 --> 00:16:44,079
took way longer than it should have. 190 days. Speaking of the logs,

233
00:16:42,079 --> 00:16:48,320
Jake's got them here right now. 100 gigabytes a second read and then you're

234
00:16:46,639 --> 00:16:52,639
writing, like, 30 to 40 gigabytes a second. So that's it

235
00:16:50,880 --> 00:16:57,440
pulling in data from the swap space which is our NVMe drives and bringing it

236
00:16:55,199 --> 00:17:00,800
into the 3 TB of RAM that are in the system. So, crunch crunch crunch crunch

237
00:16:59,040 --> 00:17:04,319
crunch crunch crunch and chuck it back out. Write some data over there. Yeah.

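Those log numbers also hint at why runs like this take months: even one full pass over the peak swap footprint at full read speed is hours of work, and the computation makes many such passes. A rough sketch with the round numbers from the video:

```python
# One full pass over the peak swap footprint at the logged read rate.
swap_bytes = 1.5e15     # ~1.5 PB peak swap usage
read_rate = 100e9       # ~100 GB/s reads from the Weka cluster

seconds = swap_bytes / read_rate
print(f"{seconds:,.0f} s = {seconds / 3600:.1f} h per pass")  # 15,000 s = 4.2 h
```

Multiply hours-per-pass by the many passes the algorithm needs, and a triple-digit day count stops looking surprising.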
238
00:17:02,399 --> 00:17:08,000
What I don't see in the logs here is how much power this consumed. How much did

239
00:17:06,559 --> 00:17:13,679
this cost? I actually haven't done the math on that. I think it roughly draws

240
00:17:10,400 --> 00:17:16,160
around 8,000 watts, which means 24 hours

241
00:17:13,679 --> 00:17:20,079
a day for a year. Yeah, it was like 10 grand. That's

242
00:17:18,400 --> 00:17:24,079
Canadian, in electricity. Are you kidding me right

243
00:17:21,919 --> 00:17:31,280
now? No. Like just CPUs. We don't even have GPUs in this thing. It's like 1,500

244
00:17:27,760 --> 00:17:32,880
watts in SSDs alone. Okay. But hey, that

245
00:17:31,280 --> 00:17:38,640
means our record should be safe for a while then, right? Well, it's possible,

246
00:17:36,559 --> 00:17:42,720
maybe even probable, that someone is already working on a run that would beat

247
00:17:40,880 --> 00:17:45,760
this record. And they could probably even do it on a single machine. They

248
00:17:44,400 --> 00:17:50,000
totally could. But that's how it is with computing. And they can never take that

249
00:17:48,080 --> 00:17:53,200
piece of paper away. I can't. I'm taking this one home. And don't forget about

250
00:17:51,360 --> 00:17:56,720
the other pieces of paper. Okay, real talk though. In school, we're taught

251
00:17:54,799 --> 00:18:01,039
that two digits of pi is enough to approximate most calculations. But

252
00:17:59,039 --> 00:18:06,799
obviously, depending on what you're doing, you could need a few more. Is there

253
00:18:04,080 --> 00:18:12,080
in your mind any practical use for 300 trillion digits? No. I mean, other than

254
00:18:09,039 --> 00:18:14,240
for this, but it was fun. It's about the

255
00:18:12,080 --> 00:18:18,240
journey, not the destination, Linus. It's about doing something cool with the

256
00:18:16,000 --> 00:18:21,200
help of Kioxia, who builds high-quality, high-performance storage for the data

257
00:18:19,679 --> 00:18:25,840
center, and who we'll have linked down below. It's about Weka and their crazy

258
00:18:24,160 --> 00:18:30,080
software. It's about y-cruncher. It's about because we could. Because

259
00:18:28,080 --> 00:18:33,440
we can. Just like we could. Also, shout out some of the other folks

260
00:18:31,600 --> 00:18:36,799
who helped us. Oh yeah. Josh and Bob again. Thank you so much from Weka.

261
00:18:35,120 --> 00:18:41,520
Gigabyte for sending us that server and I haven't made content about it in like

262
00:18:38,160 --> 00:18:43,919
4 years. Just, thank you, AMD. AMD

263
00:18:41,520 --> 00:18:48,480
sent the CPUs for the compute node like 3 years ago. Finally. Thank you. I swore

264
00:18:46,880 --> 00:18:53,880
I was going to make this video and it happened. It just took longer than I thought. Thank you, James, the writing

265
00:18:51,600 --> 00:18:57,440
manager for being patient with this project. And uh hey, if you guys want to

266
00:18:56,240 --> 00:19:03,120
check out more Linus and Jake shenanigans, how about the high availability uh cheapo computers? That

267
00:19:01,200 --> 00:19:07,440
was fun. Cheapo computer. Yeah, remember we Oh, that was cool. I don't know. But

268
00:19:05,039 --> 00:19:09,679
we didn't actually do the cheapo one just for the intro. I know, but it was

269
00:19:08,559 --> 00:19:13,039
cool. Yeah. Yeah, that was cool. It works on it. I um Yeah, this is this was

270
00:19:11,760 --> 00:19:17,039
fun. I don't think I ever want to do this again. And cut.

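One footnote on the earlier "how many digits do you actually need" question: error from truncating pi falls off very fast. A quick check against an Earth-sized circle (the ~6,371 km radius is a standard approximation, not a figure from the video):

```python
import math

# Circumference of an Earth-sized circle computed with truncated pi,
# compared against full double-precision math.pi.
radius_m = 6_371_000
exact = 2 * math.pi * radius_m

for places, approx in [(2, 3.14), (4, 3.1416), (9, 3.141592653)]:
    error_m = abs(2 * approx * radius_m - exact)
    print(f"pi to {places} decimal places -> off by ~{error_m:,.3f} m")
```

Two decimal places is off by tens of kilometers at planetary scale, nine decimal places by millimeters, and a few dozen digits covers any physically meaningful calculation, which is why 300 trillion is, as Jake says, about the journey.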