WEBVTT

00:00:00.000 --> 00:00:06.200
When I was a young tinkerer, one of my favorite tools was Super PI, a benchmark and stability test

00:00:06.200 --> 00:00:10.440
that calculates pi. No, not this kind of pi.

00:00:10.440 --> 00:00:15.440
But rather, the mathematical constant that describes the ratio between a circle's circumference

00:00:15.440 --> 00:00:19.840
and its diameter. As far as we know, pi is an irrational number,

00:00:19.840 --> 00:00:22.880
meaning that there is an infinite number of digits

00:00:22.880 --> 00:00:26.960
after the decimal point. And with my overclocked Athlon,

00:00:26.960 --> 00:00:30.320
I could calculate 32 million of those digits

00:00:30.320 --> 00:00:33.320
in half an hour or so. Girls dug it.

00:00:33.320 --> 00:00:38.000
But I dreamed of more. I wanted to calculate more digits of pi

00:00:38.000 --> 00:00:41.400
than any man had ever calculated before. And why shouldn't I?

00:00:41.400 --> 00:00:46.720
I'm Linus motherfucking Tech Tips. So the current record, set by Jordan Ranous last year,

00:00:46.720 --> 00:00:49.760
is 202 trillion digits. Trillion?

00:00:49.760 --> 00:00:53.360
Yeah, you say. That's a lot of Athlons. It's more than that.

00:00:53.360 --> 00:00:56.520
See, at a certain point, it's not even just the computation

00:00:56.560 --> 00:01:00.000
that becomes the problem. But rather, it's the storage.

00:01:00.000 --> 00:01:03.760
Fortunately for us, both have gotten a little beefier.

00:01:03.760 --> 00:01:08.320
And thanks to our friends over at Kioxia, who sponsored this entire project,

00:01:08.320 --> 00:01:12.840
providing over two petabytes of Gen 4 NVMe storage,

00:01:12.840 --> 00:01:20.640
we were able to smash that record, calculating nearly 100 trillion more digits.

00:01:21.120 --> 00:01:24.480
The process of getting this was an absolute cluster.

00:01:24.480 --> 00:01:29.600
But hey, that's content, baby. And here it is, our verified Guinness World Record

00:01:29.600 --> 00:01:34.200
for calculating pi to an astonishing 300 trillion digits.

00:01:34.200 --> 00:01:37.840
Holy shit. Let's talk about how we got here.

00:01:37.840 --> 00:01:42.840
Let's put the scale of 300 trillion digits into perspective.

00:01:50.480 --> 00:01:53.840
At size 4 Arial font, which is still readable,

00:01:53.840 --> 00:01:58.520
but pushing the limits, you can fit around 25 and a half thousand digits

00:01:58.520 --> 00:02:03.800
on a normal sheet of paper. That means that my childhood dreams of 32 million digits

00:02:03.800 --> 00:02:08.240
could fit on around 1,250 sheets of paper.

00:02:08.240 --> 00:02:13.560
But if we wanted to print out the number of digits that we just calculated.

00:02:13.560 --> 00:02:19.480
We should totally do that. No, why not? Because it would be literally billions of pages.

00:02:19.480 --> 00:02:23.400
It's only paper, Linus. What could it cost, $10?

00:02:24.040 --> 00:02:28.920
No. Let's take a look at the hardware.

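NOTE
A minimal Python sketch of the page math above, using the per-page figure quoted in the video; exact counts depend on font metrics and margins, so treat these as ballpark numbers.
    # Back-of-envelope page counts from the figures quoted above (approximate).
    DIGITS_PER_PAGE = 25_500                      # ~size-4 Arial on one sheet, per the video
    super_pi_digits = 32_000_000                  # the old Super PI run
    record_digits = 300_000_000_000_000           # 300 trillion digits
    print(super_pi_digits / DIGITS_PER_PAGE)      # ~1,255 sheets, i.e. "around 1,250"
    print(record_digits / DIGITS_PER_PAGE / 1e9)  # ~11.8 billion sheets, "literally billions of pages"
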
00:02:28.920 --> 00:02:34.160
Yes, my friends. This is the super secret project that we teased

00:02:34.160 --> 00:02:38.360
last time we were working on thermal management for the million dollar PC.

00:02:38.360 --> 00:02:42.440
You've seen this six server cluster here before,

00:02:42.440 --> 00:02:45.760
but while originally it had about a petabyte

00:02:45.760 --> 00:02:49.560
of stupid fast Kioxia Gen 4 NVMe storage,

00:02:49.560 --> 00:02:52.960
it has grown a little bit since then.

00:02:52.960 --> 00:02:57.040
The 72 15-terabyte drives that made up the original pool

00:02:57.040 --> 00:03:01.960
would have like almost been enough space to reach my original goal of 200 trillion digits.

00:03:01.960 --> 00:03:05.640
That was double the standing record at the time set by Emma at Google.

00:03:05.640 --> 00:03:10.400
But it wasn't anywhere near enough to beat the 202 trillion digit record

00:03:10.400 --> 00:03:16.200
that popped up a few months into the planning and testing for this project. So I had to call in like just a couple favors.

00:03:16.200 --> 00:03:21.280
Specifically from our friends over at Gigabyte, who sent this lovely 1U dual-socket EPYC chassis,

00:03:21.680 --> 00:03:25.280
the R183; from Kioxia, who lent us an older Tyan server

00:03:25.280 --> 00:03:30.520
from their test lab. And finally from AMD, who provided this Titanite reference platform

00:03:30.520 --> 00:03:34.520
for the launch of EPYC Genoa. That gave us a total of nine servers.

00:03:34.520 --> 00:03:38.200
Why so many servers? I mean, couldn't we just pack all the storage in one?

00:03:38.200 --> 00:03:42.120
I mean, yeah, but here's the thing. We already had the million dollar PC,

00:03:42.120 --> 00:03:46.400
and those nodes, they're already full. So if we wanted to expand the storage,

00:03:46.400 --> 00:03:50.680
we either needed to throw away that existing petabyte we already had,

00:03:50.680 --> 00:03:53.720
or we needed to expand the cluster with more machines.

00:03:53.720 --> 00:03:58.640
It's definitely not the most power or space-efficient way to get two petabytes of NVMe,

00:03:58.640 --> 00:04:01.720
but you work with what you got. That's why we ended up with

00:04:01.720 --> 00:04:06.440
kind of a mix of drives as well. See, every one of our servers

00:04:06.440 --> 00:04:09.640
needs to have the same amount of storage space.

00:04:09.640 --> 00:04:15.760
Otherwise, it's lowest common denominator, and you're gonna waste any extra capacity

00:04:15.760 --> 00:04:20.760
that's in the higher capacity nodes. But the issue is that some of our machines

00:04:20.760 --> 00:04:26.520
are expansion slot-challenged. So Kioxia had to send over a bucket

00:04:26.520 --> 00:04:33.600
of the 30 terabyte drives, allowing us to stuff a whopping 245 terabytes

00:04:33.760 --> 00:04:41.200
in each of these nine servers, totaling 2.2 petabytes of raw combined Gen4 storage.

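NOTE
A quick Python sanity check of the capacity figures mentioned here; decimal terabytes are assumed, and the per-node number is the 245 TB quoted above.
    # Raw-capacity check using the numbers quoted in the video (decimal TB assumed).
    original_pool_tb = 72 * 15          # the original pool: "about a petabyte"
    per_node_tb = 245                   # per server after the 30 TB drives went in
    nodes = 9
    print(original_pool_tb)             # 1080 TB
    print(nodes * per_node_tb / 1000)   # 2.205 PB raw, i.e. the "2.2 petabytes" quoted
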
00:04:41.400 --> 00:04:46.320
So that takes care of the capacity, but if my count is correct,

00:04:46.320 --> 00:04:50.720
there's another server here that we haven't mentioned yet, or at least it looks like a server,

00:04:50.720 --> 00:04:56.400
but I don't know, half the front's missing. So those are speed holes, brother.

00:04:56.400 --> 00:05:00.440
It's for performance. All right, let's get it out of here and take a look.

00:05:00.440 --> 00:05:04.440
No problem to get this removed. I just gotta battle the rack in here.

00:05:04.440 --> 00:05:06.920
Do you want to mark any of these? Nope, doesn't matter.

00:05:08.920 --> 00:05:13.320
This is the compute node, the machine that actually did the crunching.

00:05:13.320 --> 00:05:19.040
This Gigabyte R283-Z96 has been through some modifications, let's go.

00:05:19.040 --> 00:05:23.520
It started life as a 24-bay storage server, actually.

00:05:23.520 --> 00:05:26.960
So most of the 160-ish PCIe Gen5 lanes

00:05:26.960 --> 00:05:30.320
from its dual 96-core Epic processors

00:05:30.320 --> 00:05:35.200
were allocated to storage upfront, but our machine, it doesn't really need storage anymore.

00:05:35.200 --> 00:05:40.880
All that storage is in everything else. What it needs now, or needed anyways,

00:05:40.880 --> 00:05:45.960
was networking, a lot of it. Specifically, four of these NVIDIA Mellanox

00:05:45.960 --> 00:05:49.200
200-gigabit ConnectX-7 network cards.

00:05:49.200 --> 00:05:54.800
These things are wild. Oh yeah. They have two of those 200 gigabit ports per card,

00:05:54.800 --> 00:05:57.840
and thanks to the insane bandwidth

00:05:57.840 --> 00:06:02.240
of their x16 PCIe Gen5 slot, which is good for, I don't know,

00:06:02.240 --> 00:06:08.320
64 gigabytes a second both ways, these can saturate both of those ports at the same time.

00:06:08.320 --> 00:06:13.520
So if you put them together, that's... It's 1.6 terabits of throughput.

00:06:13.520 --> 00:06:17.120
That's a lot of terabits. Yeah, it's actually around 100 gigabytes per second

00:06:17.120 --> 00:06:20.440
to each of our 96-core CPUs, which yes,

00:06:20.440 --> 00:06:25.000
can actually handle that, thanks to the 24 128-gig sticks

00:06:25.000 --> 00:06:29.640
of DDR5 ECC that Micron sent over. That's three terabytes of RAM.

00:06:29.640 --> 00:06:33.160
The more the better. This is the moreest I could get. As for the storage nodes,

00:06:33.160 --> 00:06:36.480
those are using dual ConnectX-6 200-gig cards

00:06:36.480 --> 00:06:41.880
from the original setup. So each storage server has 400 gig,

00:06:41.880 --> 00:06:45.720
or about a dual layer Blu-ray per second.

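NOTE
Roughly how the throughput figures above fall out of the port counts, sketched in Python; these are line rates only and ignore protocol overhead.
    # Line-rate math behind the bandwidth claims above (no protocol overhead).
    GBIT = 1e9
    compute_node_bps = 4 * 2 * 200 * GBIT       # four dual-port 200 Gb ConnectX-7 cards
    print(compute_node_bps / 1e12)              # 1.6 Tb/s aggregate
    print(compute_node_bps / 8 / 1e9)           # ~200 GB/s total, ~100 GB/s per CPU socket
    storage_node_bps = 400 * GBIT               # 400 Gb of ConnectX-6 per storage server
    bluray_bytes = 50e9                         # ~50 GB on a dual-layer Blu-ray
    print(storage_node_bps / 8 / bluray_bytes)  # ~1 disc worth of data per second
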
00:06:45.720 --> 00:06:49.000
But how does the whole thing work together? Well, we gotta put it back in the rack before

00:06:49.000 --> 00:06:53.320
I can show you that. Yeah, that'll probably help me. Okay, there we go.

00:06:53.320 --> 00:06:56.980
All right, it should be back up. With the magic of Weka FS,

00:06:56.980 --> 00:07:00.600
which is the same clustered file system that we've been using ever since we first set up

00:07:00.600 --> 00:07:05.080
the million dollar PC, we're able to use all that network speed we just talked about

00:07:05.080 --> 00:07:10.560
to run one combined file system off of all nine of our storage servers,

00:07:10.560 --> 00:07:14.720
completely transparently to any application, including Y Cruncher,

00:07:14.720 --> 00:07:19.700
the application we're using to calculate Pi. When I say the same though, I don't mean the same, same.

00:07:19.700 --> 00:07:25.600
I mean, the same old array did work, but that version of Weka was years out of date

00:07:25.600 --> 00:07:29.120
and running on an operating system that is now completely end of life.

00:07:29.120 --> 00:07:34.360
Luckily for us, the Weka folks helped us nuke the old installs and install their custom image,

00:07:34.360 --> 00:07:39.080
which comes with everything pretty much ready to roll. Massive shout out to Josh and Bob, you guys rock.

00:07:39.080 --> 00:07:42.960
Thank you for helping us achieve this silly goal. And the rest of the folks at Weka

00:07:42.960 --> 00:07:47.480
for allowing us to use your software in a very, very unsupported unconventional way.

00:07:47.480 --> 00:07:51.240
Oh, oh yeah. I basically had to trick Weka into thinking

00:07:51.240 --> 00:07:57.040
that each server is actually two servers. That way we could use the most space efficient stripe width,

00:07:57.040 --> 00:08:01.440
which for Weka means each chunk of data gets split into 16 pieces

00:08:01.440 --> 00:08:06.760
with two pieces of parity data calculated. 16 plus two is 18, which is also what nine servers

00:08:06.760 --> 00:08:12.280
times two instances of Weka gets you. For reference, in a supported configuration,

00:08:12.280 --> 00:08:15.280
we would have needed 19 discrete servers

00:08:15.280 --> 00:08:21.000
in order to accomplish this stripe width, including two for parity and one as a hot spare,

00:08:21.000 --> 00:08:26.600
which is fantastic for an enterprise environment like where Weka is meant to be deployed,

00:08:26.600 --> 00:08:29.960
but very expensive for us. Enough to ever gather.

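NOTE
The stripe-width arithmetic above, spelled out in Python; the efficiency number is just data shares over total shares and ignores any filesystem overhead.
    # Stripe-width arithmetic for the 16+2 layout described above.
    data_shares, parity_shares = 16, 2
    servers, containers_per_server = 9, 2                # each box pretends to be two Weka backends
    print(servers * containers_per_server)               # 18, exactly data + parity shares
    print(data_shares / (data_shares + parity_shares))   # ~0.89 of raw space is usable
    print(data_shares + parity_shares + 1)               # 19 boxes in a supported config (incl. hot spare)
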
00:08:29.960 --> 00:08:32.920
How fast is it? We haven't even tuned it yet, by the way.

00:08:33.320 --> 00:08:36.880
Okay, well, we can tune it. We did tune it. The biggest hurdle on the storage side

00:08:36.880 --> 00:08:41.280
was finding a way to limit the amount of data that flowed between the two CPUs,

00:08:41.280 --> 00:08:46.320
which may be a bit counterintuitive, but as soon as say the left CPU wants to send data

00:08:46.320 --> 00:08:49.360
via the network cards that are connected to the right CPU,

00:08:49.360 --> 00:08:54.440
that's a ton of latency. That's a lot of hops. And on top of that, there's a limited amount of bandwidth.

00:08:54.440 --> 00:08:58.320
And since this is such a memory intensive calculation, that's why we need so much RAM,

00:08:58.320 --> 00:09:02.640
and that's why we need all this storage, we don't wanna waste memory bandwidth.

00:09:02.680 --> 00:09:07.600
So we set up two Weka client containers, which is just their application that runs on a computer

00:09:07.600 --> 00:09:11.760
and allows you to access the storage. Each of those containers got 12 cores assigned to it,

00:09:11.760 --> 00:09:14.760
one per chiplet on our giant CPUs.

00:09:14.760 --> 00:09:19.240
So we can maximize the turbo speed? No, actually the reason for that is the cache.

00:09:19.240 --> 00:09:24.120
So those are 3D V-Cache CPUs. That gives us a certain amount of cache per chiplet,

00:09:24.120 --> 00:09:28.080
and we didn't want the buffers of Y Cruncher, which is like the amount of space it uses

00:09:28.080 --> 00:09:31.640
to like hold stuff in flight to spill out of that cache.

00:09:31.640 --> 00:09:36.440
Because as soon as you do, now it's in memory, more memory copies, more wasted bandwidth.

00:09:36.440 --> 00:09:42.400
And I tested a lot, which we'll get into a bit. But first, why don't we look at how fast it goes?

00:09:42.400 --> 00:09:47.520
Final setup, underscore final, underscore for real. These are just scripts to like make the Weka containers.

00:09:47.520 --> 00:09:51.200
Look how the cores, those ones are at 100% usage, those individual ones,

00:09:51.200 --> 00:09:57.040
those are all Weka IO cores basically. It's a lot of compute that needs to be reserved.

00:09:57.040 --> 00:10:00.400
But when you're talking like 100 plus gigabytes a second,

00:10:00.440 --> 00:10:03.480
which theoretically we are, but you haven't actually shown me that yet.

00:10:03.480 --> 00:10:07.800
Here's our little script. The interesting thing about those cores being used

00:10:07.800 --> 00:10:11.200
is that while you can just run an app and hope that it ignores them,

00:10:11.200 --> 00:10:15.520
the Linux scheduler, not always the best for that. So there's this command called taskset,

00:10:15.520 --> 00:10:19.280
which allows you to like map whatever command or application you're running

00:10:19.280 --> 00:10:23.000
to only run on specific cores. Core one is a Weka core, we're skipping that one.

00:10:23.000 --> 00:10:26.640
Core nine, we're skipping that one. And then this is running two separate tasks,

00:10:26.640 --> 00:10:30.240
one for each of our mounts. And you can see it only has the CPU cores

00:10:30.240 --> 00:10:34.280
from CPU one or CPU two, dependent on the map folder we're using.

00:10:34.280 --> 00:10:37.640
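NOTE
A hedged Python sketch of the same pinning idea; the per-socket core list is illustrative rather than the real topology of this machine, and os.sched_setaffinity is Linux-only.
    # Keep work off the cores reserved for the Weka client containers, either by
    # shelling out to `taskset` (as in the video) or by setting affinity directly.
    import os, subprocess
    RESERVED_FOR_WEKA = {1, 9}                              # the Weka I/O cores skipped in the video
    socket0_cores = set(range(0, 12)) - RESERVED_FOR_WEKA   # illustrative cores local to the first CPU
    mask = ",".join(str(c) for c in sorted(socket0_cores))
    subprocess.run(["taskset", "-c", mask, "echo", "benchmark for mount 1"])
    os.sched_setaffinity(0, socket0_cores)                  # or pin this process before launching work
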
Let me run over to Weka. Cute little dashboard.

00:10:37.640 --> 00:10:40.800
Woo! It's not a hundred, but it is pretty nice.

00:10:40.800 --> 00:10:44.160
That's writing. This is a write. That's right. That's just setting.

00:10:44.160 --> 00:10:46.640
You said you were doing a read. It is a read test, but it's setting up the files.

00:10:48.240 --> 00:10:53.960
I was telling Jake as we were working on the review for this script, I was like, man, I've gotten kind of numb to these numbers.

00:10:53.960 --> 00:10:57.720
You know, after all the iterations of Whonnock and all that. You know, a hundred gigabytes a second,

00:10:58.440 --> 00:11:01.480
this is over the network. It never gets old.

00:11:01.480 --> 00:11:06.320
Actually, you know, you're numb to the numbers until the numbers you're looking at are like 200 gigabytes a second or something.

00:11:06.320 --> 00:11:10.320
But like, no, but dude, like the first time we cracked a hundred gigabytes a second.

00:11:10.320 --> 00:11:15.040
It was all installed locally. And that was no file system, all local.

00:11:15.040 --> 00:11:19.200
This is over a network. With a file system. With a functioning file system.

00:11:19.200 --> 00:11:22.440
Real ass actual copying data. Yeah.

00:11:22.440 --> 00:11:27.280
That's crazy. It is crazy. And look at the read. The latency is two milliseconds.

00:11:27.280 --> 00:11:31.120
It's because I'm like oversaturating this. So down here, you see the front end usage.

00:11:31.120 --> 00:11:34.120
That's the cores on this machine. They're being utilized a hundred percent.

00:11:34.120 --> 00:11:39.120
I have the system set to have four NUMA nodes per CPU because that made Y Cruncher a little bit happier.

00:11:39.120 --> 00:11:45.280
If I turn that off and do one NUMA node per socket, I was able to get this up to like 150 gigabytes a second.

00:11:45.280 --> 00:11:50.400
At the time, I actually set the record for the fastest single client usage.

00:11:50.400 --> 00:11:54.680
According to the Weka guys, they've since broken that with like GPUDirect Storage or whatever.

00:11:54.680 --> 00:11:58.360
But this isn't even with RDMA. This is just good code. Built for NVMe.

00:11:58.360 --> 00:12:02.160
It's also good SSDs. Oh brother. Yeah. Look at this.

00:12:02.160 --> 00:12:05.560
The average usage of the drives in the array right now is 23%.

00:12:05.560 --> 00:12:09.400
Wait, nothing. Shout out Kioxia. We're running a mix of their CD

00:12:09.400 --> 00:12:13.120
and CM series Gen 4 drives. These things are super fast

00:12:13.120 --> 00:12:18.760
with individual drive read speeds that are in excess of five gigabytes per second.

00:12:18.760 --> 00:12:25.120
And that's not even the fastest they have. You step up to their Gen 5 drives and you're talking like 12, 13, 14 gigabytes a second.

00:12:25.120 --> 00:12:29.520
They're available in self-encrypting SKUs. They have die failure recovery, power loss protection.

00:12:29.520 --> 00:12:32.640
They're perfect for your next server or data center deployment.

00:12:32.640 --> 00:12:38.120
Yeah. And this entire time running this application, not a single drive had a single issue.

00:12:38.120 --> 00:12:42.040
You got a spreadsheet for tuning Y Cruncher? Dude, dude.

00:12:42.040 --> 00:12:45.680
When he adjusts the glasses, you know it's getting real. Okay. Here's Y Cruncher.

00:12:45.680 --> 00:12:49.200
Let's just do a normal pi run. 32 million, 25.

00:12:49.200 --> 00:12:53.880
Let's go. So this would have taken about half an hour on my old Athlon.

00:12:53.880 --> 00:12:57.600
It took 0.2 seconds to compute. Really? Yeah.

00:12:57.600 --> 00:13:01.080
What? For the uninitiated, Y Cruncher is the software we use to do this run.

00:13:01.080 --> 00:13:05.760
It was developed by a guy named Alexander Yee. Super nice guy, helped us do some messing about.

00:13:05.760 --> 00:13:10.000
Also, didn't help me that much. Honestly, I asked a lot of questions he didn't answer,

00:13:10.000 --> 00:13:13.880
but I figured it out anyways, I guess. To be clear, the storage, we've talked about,

00:13:13.880 --> 00:13:18.520
oh man, we need a lot of storage. It's because Y Cruncher uses the storage like RAM

00:13:18.520 --> 00:13:23.440
because the output of digits, like that 300 trillion, is only about 120 terabytes compressed.

00:13:23.440 --> 00:13:27.360
Wow, it's not that much. Could we make that available to people? Oh, God.

00:13:27.360 --> 00:13:31.000
I don't wanna think about that. We'll try. Maybe we'll do a torrent or something. Oh, God.

00:13:31.000 --> 00:13:34.360
But when you're setting up for a run, it actually tells you how much storage you need

00:13:34.360 --> 00:13:37.880
for this swap space, which is just basically like RAM plus.

00:13:37.880 --> 00:13:42.160
That's slower. That's what it's using it for. For us, it was like, yeah, we're probably gonna use

00:13:42.160 --> 00:13:45.600
like 1.5 petabytes of space at peak.

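NOTE
A rough Python sketch of why the packed output lands around 120 TB (each decimal digit needs about log2(10) ≈ 3.32 bits); the ~1.5 PB peak-swap figure is simply the number quoted above, not something this derives.
    # Packed size of the decimal output versus the peak swap space quoted above.
    import math
    digits = 300e12
    bytes_per_digit = math.log2(10) / 8       # ~0.415 bytes per decimal digit when bit-packed
    output_tb = digits * bytes_per_digit / 1e12
    print(output_tb)                          # ~124.6 TB, close to the "about 120 TB compressed"
    print(1500 / output_tb)                   # peak swap (~1.5 PB) is roughly 12x the final output
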
00:13:45.600 --> 00:13:48.640
It's pretty crazy. Okay. But what did you tune, Jake?

00:13:48.640 --> 00:13:52.600
You wanna see the tuning? Oh boy. So this is like some of the tests I did.

00:13:52.600 --> 00:13:56.600
So Y Cruncher was built for direct-attached storage.

00:13:56.600 --> 00:14:01.720
And in fact, it doesn't even want you to use like a RAID controller or software RAID.

00:14:01.720 --> 00:14:05.280
It does its own internal RAID. And then on top of that,

00:14:05.280 --> 00:14:10.560
it also has things you can tune like, what multi-threading algorithm do you use?

00:14:10.560 --> 00:14:14.360
And like how many threads? And what size are your IO buffers?

00:14:14.360 --> 00:14:19.460
How much memory? How much memory and how many bytes can we read per seek?

00:14:19.460 --> 00:14:22.880
Ideally, because if you're using hard drives or SSDs, it's different.

00:14:22.880 --> 00:14:27.160
Got it. It was built in an older time and the code base is huge.

00:14:27.160 --> 00:14:30.240
And Alex just does it in his spare time as far as I'm aware.

00:14:30.240 --> 00:14:34.040
So no shade, super cool project. But at some point in the future,

00:14:34.040 --> 00:14:37.760
technology has gotten good enough that we can just rely on the operating system to do this.

00:14:37.760 --> 00:14:41.400
Like that Weka speed test we just did. Let's hope for that. One day.

00:14:41.400 --> 00:14:45.000
Anyway, with everything dialed in on August 1st, 2024.

00:14:45.000 --> 00:14:49.760
Yes. It was a while ago. Yes. Jake finally hit enter on his command prompt

00:14:49.760 --> 00:14:53.200
and began our glorious journey to nerd glory.

00:14:53.200 --> 00:14:57.080
Yeah, for 12 days. And then it stopped thanks to a multi-day power outage

00:14:57.080 --> 00:15:00.680
while I was on vacation. And it was so early in the process that I said,

00:15:00.680 --> 00:15:06.120
f*** that s***. Let's just start it again. I want to get a clean run with no outages.

00:15:06.120 --> 00:15:09.480
But it was smooth sailing from then on.

00:15:09.480 --> 00:15:13.520
No, it wasn't. See, even with a cluster this chonk,

00:15:13.520 --> 00:15:16.580
calculations like this take a lot of time.

00:15:16.580 --> 00:15:21.320
The previous 202 trillion digit record took a hundred days just to compute.

00:15:21.320 --> 00:15:24.360
And whether it's bad luck or user error.

00:15:24.360 --> 00:15:27.720
I think there's a little bit of user error. Finding a space in our facilities

00:15:27.720 --> 00:15:30.960
where a machine like that can operate completely uninterrupted.

00:15:30.960 --> 00:15:33.960
I have no idea what I just unplugged.

00:15:33.960 --> 00:15:37.880
How's your edit going? I'm holding your server. What's the challenge?

00:15:37.880 --> 00:15:43.200
At first things were pretty okay in the lab server room here. We had our air conditioning working to keep things cool.

00:15:43.200 --> 00:15:46.200
We had our battery backup to keep the digits flowing

00:15:46.200 --> 00:15:50.360
during a short outage or a brownout. It's just that over the course of this run,

00:15:50.360 --> 00:15:54.160
we had multiple other power outages and none of them were small.

00:15:54.160 --> 00:15:57.720
So each time our calculation had to stop and restart.

00:15:57.720 --> 00:16:01.600
And the same goes for when the cooling failed multiple times.

00:16:01.600 --> 00:16:06.040
It's pretty mid now though. The AC is fixed, and that, with our sick water door

00:16:06.040 --> 00:16:10.480
that is definitely not going to leak, has the room at around 22, 23 degrees.

00:16:10.480 --> 00:16:14.440
And the cluster is still running. Fortunately, Y Cruncher makes checkpoints

00:16:14.440 --> 00:16:20.340
which allows resuming the calculation. But it does mean our record could have been done much faster.

00:16:20.340 --> 00:16:26.060
Like based on the log somewhere in the neighborhood of like 30, 40, 50 days faster.

00:16:26.060 --> 00:16:29.100
The 300 trillionth digit of pi is five.

00:16:29.100 --> 00:16:33.940
Really? Ha ha ha ha ha. It's done baby.

00:16:33.940 --> 00:16:38.580
Wow. It only took way longer than it should have.

00:16:38.580 --> 00:16:43.620
190 days. Speaking of the logs, Jake's got them here right now.

00:16:43.620 --> 00:16:48.420
100 gigabytes a second read and then you're writing for like 30, 40 gigabytes a second.

00:16:48.420 --> 00:16:52.660
So that's as it's, what? Pulling in data from the swap space

00:16:52.660 --> 00:16:56.700
which is our NVMe drives and bringing it into the three terabytes of RAM

00:16:56.700 --> 00:17:00.060
that are in the system. So the crunch, crunch, crunch, crunch, crunch, crunch, crunch

00:17:00.060 --> 00:17:03.820
and huck it back over there. Write some data over there, yeah. What I don't see in the logs here

00:17:03.820 --> 00:17:07.220
is how much power this consumed. How much did this cost?

00:17:07.220 --> 00:17:12.340
I actually haven't done the math on that. I think it roughly draws around 8,000 watts

00:17:12.380 --> 00:17:15.100
which means 24 hours a day for a year.

00:17:16.300 --> 00:17:19.100
Yeah, it was like 10 grand. That's Canadian.

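NOTE
The electricity math behind that number, sketched in Python; the $/kWh rate is an assumption chosen to land near the "like 10 grand" figure, not something stated in the video.
    # Energy-cost estimate. The rate is an assumed ~$0.14 CAD/kWh, not a quoted number.
    draw_kw = 8.0                                   # rough total draw mentioned above
    rate_cad_per_kwh = 0.14                         # assumed blended rate
    print(draw_kw * 24 * 365 * rate_cad_per_kwh)    # ~$9,800 CAD if it ran a full year
    print(draw_kw * 24 * 190 * rate_cad_per_kwh)    # ~$5,100 CAD over the 190-day run itself
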
00:17:20.140 --> 00:17:23.900
You know, that's, are you kidding me right now? No. Like just CPUs.

00:17:23.900 --> 00:17:29.940
We don't even have GPUs in this thing. It's like 1,500 Watts in SSDs alone.

00:17:29.940 --> 00:17:34.700
Okay, but hey, that means our record should be safe for a while then, right?

00:17:34.700 --> 00:17:37.820
Well, it's possible, maybe even probable

00:17:37.820 --> 00:17:41.580
that someone is already working on a run that would beat this record

00:17:42.020 --> 00:17:45.020
and they could probably even do it on a single machine. They totally could.

00:17:45.020 --> 00:17:49.260
But that's how it is with computing and they can never take that piece of paper away.

00:17:49.260 --> 00:17:52.740
I can, I'm taking this one home. And don't forget about the other pieces of paper.

00:17:52.740 --> 00:17:56.140
Okay, real talk though. In school, we're taught that two digits of pi

00:17:56.140 --> 00:18:01.020
is enough to approximate most calculations but obviously, depending what you're doing,

00:18:01.020 --> 00:18:06.020
you could need a few more. Is there, in your mind, any practical use

00:18:06.020 --> 00:18:10.020
for 300 trillion digits? No, I mean other than for this.

00:18:10.020 --> 00:18:14.260
But it was fun. It's about the journey, not the destination Linus.

00:18:14.260 --> 00:18:19.180
It's about doing something cool with the help of Kioxia, who builds high-quality, high-performance storage

00:18:19.180 --> 00:18:24.580
for the data center and who we'll have linked down below. It's about Weka and their crazy software.

00:18:24.580 --> 00:18:27.740
It's about Y Cruncher. It's about because we fucking could.

00:18:27.740 --> 00:18:30.900
Because we fucking can. Just like we should also shout out

00:18:30.900 --> 00:18:35.100
some of the other folks who helped us. Yeah, Josh and Bob again. Thank you so much from Weka.

00:18:35.100 --> 00:18:39.180
Gigabyte for sending us that server and I haven't made content about it in like four years.

00:18:39.180 --> 00:18:43.500
Just, just thank you. AMD, AMD sent the CPUs for the compute node

00:18:43.500 --> 00:18:48.180
like three years ago. Finally, thank you. I swore I was gonna make this video

00:18:48.180 --> 00:18:51.940
and it happened. It just took longer than I thought. Thank you, James, the writing manager

00:18:51.940 --> 00:18:56.580
for being patient with this project. And hey, if you guys wanna check out

00:18:56.580 --> 00:19:01.060
more Linus and Jake shenanigans, how about the high availability cheapo computers?

00:19:01.060 --> 00:19:04.420
That was fun. Cheapo computers? Yeah, I remember. Oh, that was cool.

00:19:04.420 --> 00:19:07.500
Yeah, that was super cool. I don't know if we didn't actually do the cheapo one. We did it, we just did the demo.

00:19:07.500 --> 00:19:10.020
Just for the intro. I know, but it was cool. Yeah, yeah, that was cool.

00:19:10.700 --> 00:19:14.380
Yeah, this was fun. I don't think I ever wanna do this again.

00:19:14.380 --> 00:19:15.220
And cut.
