WEBVTT

00:00:00.080 --> 00:00:07.279
back when we installed a petabyte worth of hard drives in our server closet we

00:00:04.400 --> 00:00:12.160
were sure that with that much storage we'd be good for a long time and in

00:00:10.000 --> 00:00:14.960
fairness i guess two years worth of RED footage

00:00:13.200 --> 00:00:20.560
is pretty good but it finally happened we are

00:00:18.080 --> 00:00:25.519
critically low on space i've got less than five percent available on our main

00:00:22.800 --> 00:00:31.279
editing server but with only 20 terabytes available on the vault there

00:00:27.680 --> 00:00:33.920
is nowhere to dump it to fortunately

00:00:31.279 --> 00:00:38.160
i've got a band-aid solution the very best kind of solution

00:00:35.600 --> 00:00:42.719
Seagate actually sent over these 15

00:00:39.360 --> 00:00:46.879
12 terabyte IronWolf Pro drives for a

00:00:42.719 --> 00:00:49.760
totally unrelated project that um well

00:00:46.879 --> 00:00:53.520
didn't actually go very well so instead of using them for that we're

00:00:51.520 --> 00:00:57.840
going to use them to add more capacity to the vault

00:00:55.680 --> 00:01:02.320
speaking of the vault i keep all my segues in a vault ridge wallet is the

00:01:00.320 --> 00:01:06.479
sleek way to keep wallet bulge down with its compact frame and rfid blocking

00:01:04.239 --> 00:01:11.560
inner plates use the offer code LTT september to save 10

00:01:08.240 --> 00:01:19.600
and get free worldwide shipping

00:01:19.600 --> 00:01:26.159
all right so while i wait for Anthony to come down i'm gonna run through how this

00:01:24.240 --> 00:01:31.920
whole thing is going to work with some really crude diagrams here our petabyte

00:01:28.960 --> 00:01:37.119
cluster uses a file system it's open source and it's called GlusterFS that's

00:01:34.720 --> 00:01:43.280
what allows these two independent servers here to present to the rest of

00:01:39.360 --> 00:01:45.759
the network as a single large share now

00:01:43.280 --> 00:01:49.119
we could increase our capacity by adding more server boxes

00:01:48.000 --> 00:01:53.520
but that's a project for another day

00:01:51.280 --> 00:01:58.960
literally that's a project for another day we've got a video coming with a full

00:01:56.000 --> 00:02:03.280
petabyte of storage in a single server rather than two so make sure you're

00:02:00.719 --> 00:02:09.119
subscribed so you don't miss that today we're going to take the empty 15 bays in

00:02:06.640 --> 00:02:15.599
server delta 2 down here and we're going to expand our storage with another

00:02:12.200 --> 00:02:18.640
180 terabytes of raw capacity we don't

00:02:15.599 --> 00:02:20.480
get to use all of this capacity though

00:02:18.640 --> 00:02:24.400
our GlusterFS implementation is geared towards raw capacity

00:02:22.400 --> 00:02:30.239
rather than redundancy so the only fail-safe built into our local network

00:02:26.480 --> 00:02:31.360
here is the RAID-Z2 vdevs in each of our

00:02:30.239 --> 00:02:37.280
machines what that means is that out of our 15

00:02:33.599 --> 00:02:40.400
drives only 13 of them count towards our

00:02:37.280 --> 00:02:43.360
capacity with the rest these two taken

00:02:40.400 --> 00:02:48.319
up by parity data that protects us from data loss in the event of a physical

00:02:45.360 --> 00:02:51.519
drive failure or a cable failure speaking of which
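
NOTE A minimal sketch of the capacity arithmetic described above, assuming decimal terabytes and ignoring ZFS metadata and filesystem overhead. The function name raidz2_usable_tb is hypothetical and not from the video.
```python
# Illustrative RAID-Z2 arithmetic; the function name is hypothetical.
def raidz2_usable_tb(drive_count: int, drive_tb: float, parity_drives: int = 2) -> float:
    """Return approximate usable capacity in TB, ignoring filesystem overhead."""
    if drive_count <= parity_drives:
        raise ValueError("need more drives than parity drives")
    # Only the non-parity drives contribute to usable capacity.
    return (drive_count - parity_drives) * drive_tb
raw_tb = 15 * 12  # 15 drives of 12 TB each, as stated in the video
usable_tb = raidz2_usable_tb(15, 12)
print(f"raw: {raw_tb} TB, usable before overhead: {usable_tb} TB")
```
With the video's numbers this works out to 180 TB raw and 156 TB before overhead.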

00:02:49.680 --> 00:02:56.560
how about we replace that uh bad cable in delta one oh that's gonna be quite a

00:02:53.840 --> 00:03:01.840
project uh guys uh the vault is going offline for

00:02:59.440 --> 00:03:06.319
probably about two hours tm just vault so Whonnock is up and these

00:03:04.159 --> 00:03:10.480
servers are so heavy so we're gonna have to empty all the drives out of them onto

00:03:09.040 --> 00:03:14.560
one of these carts then take the server out put it on the

00:03:12.560 --> 00:03:18.159
cart and wheel it over to the island where we can work on it i'm still super

00:03:16.319 --> 00:03:22.080
proud of this cabinet door holding open innovation we drilled a hole in the side

00:03:20.159 --> 00:03:26.000
of the keyboard tray and now we can get the servers out super easily you know

00:03:24.159 --> 00:03:29.920
you can just take the door off right no i let the door has a filter

00:03:28.800 --> 00:03:33.920
like that's the purpose of the door for us anyway

00:03:32.319 --> 00:03:37.680
yes i know i could take the door off i mean with the sides off the back's off

00:03:36.159 --> 00:03:41.360
clearly we figured that out thank you youtube comments

00:03:39.280 --> 00:03:46.640
it's quick to label a drive it's slow to label 60 drives and this actually has to

00:03:44.000 --> 00:03:48.959
be legible that's important why don't you actually work your way from the

00:03:47.840 --> 00:03:53.360
front and i'll work my way from the back

00:03:51.200 --> 00:03:58.080
okay and we'll do this like uh you know Lady and the Tramp eating the spaghetti

00:03:55.120 --> 00:04:01.599
style uh that's a strange mental image but okay speaking of

00:04:00.080 --> 00:04:05.680
terrible segues sure is cold in this server room good

00:04:03.200 --> 00:04:08.879
thing i'm wearing my LTT swacket it's a sweater it's a jacket lttstore.com

00:04:07.599 --> 00:04:14.879
don't worry about it i don't think most people feel bad for me struggling to

00:04:12.560 --> 00:04:19.440
stack all my hard drives that are very time consuming to stack

00:04:17.040 --> 00:04:23.919
why can't i stack all these hard drives if you could grab this end though and

00:04:21.759 --> 00:04:27.759
help me up onto the cart that would be swell

00:04:25.840 --> 00:04:32.240
oh it's hooked on my it's hooked on my fly okay i got it that's fine

00:04:30.560 --> 00:04:37.520
one thing we want to do as we're carting this over to the kitchen there is be

00:04:35.280 --> 00:04:40.320
really really gentle with the way that we're moving this

00:04:38.800 --> 00:04:45.360
this is about what

00:04:42.120 --> 00:04:47.440
350 terabytes of our company's valuable

00:04:45.360 --> 00:04:51.040
data on here right now and i remember Patrick from ServeTheHome telling me

00:04:49.040 --> 00:04:55.120
that yahoo had an incident where they moved their data center like across the

00:04:52.720 --> 00:04:59.440
parking lot rolling hard drives on carts not unlike this one and all the

00:04:57.120 --> 00:05:03.680
vibration killed like a significant portion of their drives so we could lose

00:05:01.120 --> 00:05:07.199
up to eight of them depending on oh seven because one of them is already

00:05:05.280 --> 00:05:10.080
degraded but but we don't want to do that so i guess this is the part of the

00:05:08.639 --> 00:05:15.280
video where we explain what happens when a cable fails in a Storinator now most

00:05:13.520 --> 00:05:18.800
bulk storage servers with like lots and lots of hard drives use what's called a

00:05:16.800 --> 00:05:25.440
backplane so they'll take fewer connections off of your

00:05:21.440 --> 00:05:27.360
SATA or your SAS adapter or your RAID

00:05:25.440 --> 00:05:31.120
card or whatever the case may be and then they will take those and they will

00:05:29.199 --> 00:05:36.000
split that bandwidth across multiple drives 45Drives takes a bit of a

00:05:33.360 --> 00:05:40.479
different approach so they wire every single drive up individually across less

00:05:39.039 --> 00:05:44.880
of a back plane and more of an underplane the advantage is you get the

00:05:42.560 --> 00:05:49.360
full bandwidth another advantage is that in the event that a connection fails

00:05:47.680 --> 00:05:55.039
you're not replacing an entire costly back plane but the disadvantage is that if a cable

00:05:53.759 --> 00:05:59.919
fails you are digging this entire apparatus

00:05:58.080 --> 00:06:05.520
out to replace one flaky cable so there were two drives

00:06:03.120 --> 00:06:09.280
that were dead uh one of them it was fine when we replaced it it was it came

00:06:07.280 --> 00:06:12.479
back up everything rebuilt and everything was rosy but then there was

00:06:10.960 --> 00:06:17.680
the other one we replaced it a couple times we actually replaced the controller card itself and tried

00:06:16.080 --> 00:06:20.960
different ports on different controllers that we knew worked

00:06:19.440 --> 00:06:25.120
but it still had the same issue and what's weird is sometimes it would kind

00:06:23.039 --> 00:06:28.319
of work and we could start rebuilding the data on it but like what kind of

00:06:26.639 --> 00:06:32.639
data speeds were we getting it would start out at like you know 300

00:06:30.639 --> 00:06:36.400
400 megabytes per second which is kind of low but fine but then it would go

00:06:34.720 --> 00:06:42.080
down to like 10. yeah and the eta was like a year

00:06:39.280 --> 00:06:42.080
yeah like

00:06:42.560 --> 00:06:49.840
come on by the way evidence that my dust filter works just

00:06:47.199 --> 00:06:53.360
great yeah it looks brand new there is one little bit here that i

00:06:51.199 --> 00:06:57.840
noticed but that's it okay so let's not hate on my filtered front cabinet door

00:06:55.840 --> 00:07:03.919
there okay okay so this guy needs to come out now this

00:07:01.599 --> 00:07:09.360
is a little tricky does that entire plane need to come out i hope not yeah

00:07:06.960 --> 00:07:12.639
cause like i'm looking at this and this needs to go up under there in order to

00:07:10.880 --> 00:07:18.240
get screwed in there and in order to do that this like straight sucks

00:07:16.479 --> 00:07:21.039
this is exactly why i haven't had time to do it until now

00:07:22.080 --> 00:07:27.360
how long did we tell the editors this was going to be down for two hours

00:07:26.400 --> 00:07:31.520
um one thing i did notice though is that

00:07:29.520 --> 00:07:35.919
with how tightly integrated it is into the bottom of the case i

00:07:34.080 --> 00:07:38.800
like i don't know how i can't really get it very well

00:07:37.360 --> 00:07:43.199
okay they're shooting TechLinked now so we're going to have to do ASMR server

00:07:40.400 --> 00:07:48.240
upgrade so i pulled these out and i can see where the cable goes so

00:07:45.840 --> 00:07:54.080
yes i will in fact have to pull out this and this i don't see anything obviously

00:07:51.199 --> 00:07:54.080
defective about it

00:07:54.879 --> 00:08:02.160
that's an angry episode of TechLinked someone must have removed some headphone

00:07:58.400 --> 00:08:02.160
jacks i hope it wasn't samsung

00:08:03.759 --> 00:08:11.440
okay so this is it we begin the funeral procession again and this way

00:08:08.720 --> 00:08:15.039
it's an open casket oh yeah should we close this server

00:08:13.919 --> 00:08:17.919
maybe like now is as good a time as any to do

00:08:16.960 --> 00:08:23.199
that uh if we're gonna do full YOLO and assume

00:08:21.039 --> 00:08:26.479
that we have everything right then yes otherwise no

00:08:24.400 --> 00:08:30.319
otherwise no let's go now let's put some hard drives in

00:08:28.319 --> 00:08:34.959
yeah having to keep track of everything sucks should have just got a jellyfish

00:08:33.200 --> 00:08:40.000
so that drive with the rubbed off label that i wasn't 100 sure about i got to

00:08:37.279 --> 00:08:44.640
the end and the only thing left is 131 which is clearly not that unless i have

00:08:41.839 --> 00:08:48.000
a wicked case of dyslexia and i put 131 over here so i think it's time for a

00:08:46.320 --> 00:08:51.600
sanity check here we go ladies and gentlemen

00:08:49.839 --> 00:08:55.600
okay uh Anthony you got the drives do you want me to turn it on first should

00:08:53.760 --> 00:08:59.839
we do it yeah it's not hot swap if it's not on all

00:08:58.240 --> 00:09:05.040
right we'll hot swap it we'll hot swap it i'm turning it on

00:09:02.720 --> 00:09:09.839
okay here we go uh Anthony you want to take the wheel here sure all right

00:09:11.200 --> 00:09:16.800
let's see zpool status okay so guys you

00:09:14.720 --> 00:09:24.080
can actually see here what was going on with one of our RAID-Z2s so each group of

00:09:20.080 --> 00:09:28.399
15 drives is a RAID-Z2 vdev so this

00:09:24.080 --> 00:09:30.399
RAID-Z2 raidz2-0 is online you can see

00:09:28.399 --> 00:09:35.440
the whole zpool is degraded though that's because raidz2-1 here drive

00:09:33.279 --> 00:09:38.640
117 the one that we just replaced the cable for

00:09:36.560 --> 00:09:42.800
is unavailable and then these are all the previous attempts at rebuilding it

00:09:40.640 --> 00:09:48.160
with different drives now we're going to try again but with a new cable so we

00:09:45.600 --> 00:09:52.720
fixed 117 but now we've got four drives offline oh balls that's like way down

00:09:51.200 --> 00:09:56.560
there you want to let me know if anything changed five six seven and

00:09:55.040 --> 00:10:01.839
eight disappeared they're unavailable so that's the wrong one then yep

00:09:58.640 --> 00:10:03.760
damn it okay so we're back

00:10:01.839 --> 00:10:09.200
and all the drives are here but they're not showing up with their um

00:10:06.880 --> 00:10:13.200
like their 45Drives Storinator friendly IDs here and also

00:10:11.200 --> 00:10:17.040
five of them are re-silvering that seems bad how do you re-silver five drives

00:10:15.279 --> 00:10:21.760
these are re-silvering because they got cut without being offline first uh in

00:10:19.440 --> 00:10:26.079
the meantime should we add the other 15 drives to delta 2

00:10:23.519 --> 00:10:29.519
what could go wrong what could go wrong so do we need to make a brick

00:10:29.920 --> 00:10:35.519
because that's Gluster's crap no we don't need to do that with Gluster it's

00:10:33.200 --> 00:10:37.920
all in under slash z pool pretty sure okay

00:10:36.560 --> 00:10:42.959
i don't see anything here that looks like just making a vdev

00:10:40.000 --> 00:10:47.760
yeah me neither oh okay so we just do zpool create zpool

00:10:45.360 --> 00:10:53.360
raidz2 and then the paths to the disks but if our disks don't show up

00:10:50.640 --> 00:10:58.160
okay hold on so let's do this and it is online

00:10:55.360 --> 00:11:01.519
so we might need to restart oh okay so maybe they're not hot

00:10:59.600 --> 00:11:05.600
swappable it should be but maybe not

00:11:04.079 --> 00:11:10.959
i don't know if they've configured the server to not find them yeah their special driver

00:11:08.800 --> 00:11:15.519
might actually not do that okay well let's see

00:11:12.959 --> 00:11:20.320
in the meantime we can check in on delta one and see if it's re-silvering a

00:11:17.360 --> 00:11:24.000
little faster now it is not so now what we've replaced literally

00:11:22.480 --> 00:11:28.560
everything the drive the cable the controller i

00:11:26.959 --> 00:11:32.959
mean do we want to just pop the drive out and pop it in one more

00:11:30.320 --> 00:11:35.200
time and see what happens we can try it okay

00:11:36.000 --> 00:11:43.200
brand new drive okay so delta 2 is rebooted now and

00:11:41.120 --> 00:11:48.959
we've got 1.1 okay so i guess

00:11:45.760 --> 00:11:50.399
throw them all in yeah and see if we get

00:11:48.959 --> 00:11:56.320
them all and if so we'll create the raid z2 then at least

00:11:53.200 --> 00:11:58.800
it may be degraded but it's bigger

00:11:56.320 --> 00:12:02.640
so all 15 drives for the expansion are in delta 2 now and i switched back over

00:12:01.200 --> 00:12:06.880
to delta one and i have good news

00:12:04.640 --> 00:12:10.000
our re-silvering is going at 1.91 gigabytes a second

00:12:08.560 --> 00:12:14.880
which is pretty sweet that means it should only take a few days
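
NOTE A rough sanity check of the "few days" estimate, assuming the scan has to cover roughly the 350 terabytes mentioned earlier for this server and that the quoted 1.91 gigabytes per second holds for the whole run. Both are assumptions, and real resilver rates vary; scan_days is a hypothetical helper.
```python
# Back-of-the-envelope scan time; names and the decimal-unit convention are assumptions.
def scan_days(data_tb: float, rate_gb_per_s: float) -> float:
    """Days needed to scan data_tb terabytes at rate_gb_per_s gigabytes per second."""
    if rate_gb_per_s <= 0:
        raise ValueError("rate must be positive")
    seconds = (data_tb * 1000) / rate_gb_per_s  # 1 TB = 1000 GB (decimal units)
    return seconds / 86_400  # 86,400 seconds per day
print(f"{scan_days(350, 1.91):.1f} days")
```
Under those assumptions the estimate lands at a little over two days.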

00:12:11.920 --> 00:12:17.519
so 117 it's not there is it it's there okay but

00:12:16.639 --> 00:12:22.800
but when i try to replace it new device has a different optimal sector size what the

00:12:21.120 --> 00:12:28.240
crap so i need to figure out what that is well hold on hold on that's not like a

00:12:26.000 --> 00:12:31.279
4k sector drive is it

00:12:29.600 --> 00:12:35.200
it might be because that would probably i don't i don't know this for a fact

00:12:33.680 --> 00:12:40.880
but i suspect mixing 4k sector drives with

00:12:37.760 --> 00:12:43.279
512-byte sector drives in some kind of an

00:12:40.880 --> 00:12:46.480
array is probably super terrible so do you want it to soft or it's offline it's
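
NOTE A small illustration of why the sector sizes clash, assuming the pool tracks sector size as a power-of-two exponent (the ashift property in ZFS). 512-byte and 4096-byte sectors map to different exponents. ashift_for is a hypothetical helper, not a ZFS API.
```python
# Illustrative only: maps a sector size to the ZFS ashift exponent (log base 2).
def ashift_for(sector_bytes: int) -> int:
    """Return log2(sector_bytes); raises if the size is not a power of two."""
    if sector_bytes <= 0 or sector_bytes & (sector_bytes - 1):
        raise ValueError("sector size must be a positive power of two")
    return sector_bytes.bit_length() - 1
for size in (512, 4096):
    print(f"{size}-byte sectors: ashift={ashift_for(size)}")
```
A replacement drive that reports a different value than the existing vdev is the kind of mismatch the error message in the video is complaining about.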

00:12:44.800 --> 00:12:50.639
already offline yet okay did we accidentally buy the wrong drives

00:12:48.000 --> 00:12:55.440
Anthony oh my god that couldn't be like the problem could

00:12:52.880 --> 00:12:58.560
it they're not are these advanced format yeah yeah

00:12:57.360 --> 00:13:03.600
wait you have got to be kidding me

00:13:01.519 --> 00:13:07.920
is that why these replace operations haven't been working possibly you know

00:13:06.240 --> 00:13:11.680
what we can check the original video unboxing the petabyte project it's a

00:13:09.760 --> 00:13:15.360
good thing our entire life exists on youtube at least mine does

00:13:13.519 --> 00:13:19.440
okay where's some b-roll of a hard drive here oh no they are advanced format so

00:13:17.760 --> 00:13:22.800
they're all there but they're not there

00:13:26.480 --> 00:13:34.880
there we go new RAID-Z2 raidz2-3

00:13:31.360 --> 00:13:37.040
includes 15 drives all online 117 is now

00:13:34.880 --> 00:13:41.440
back online re-silvering and it's doing it at almost 900 megabytes per second

00:13:39.839 --> 00:13:45.760
so we're good so

00:13:43.519 --> 00:13:49.760
uh i did notice curiously

00:13:47.200 --> 00:13:52.560
that the vault is still not any bigger right

00:13:50.959 --> 00:13:55.839
i believe the GlusterFS service either needs to be restarted or i might

00:13:54.320 --> 00:13:59.680
need to adjust the config real quick okay uh so let me just quickly look at

00:13:58.880 --> 00:14:05.519
this in delta wait is it one of these uh disc.conf is it this dot config here oh

00:14:03.760 --> 00:14:11.440
yeah there it is hey i helped

00:14:07.600 --> 00:14:11.440
did i help on a Linux thing yes or no

00:14:12.480 --> 00:14:18.000
i did usually i'm not nearly as helpful when

00:14:15.600 --> 00:14:23.360
it comes to server stuff okay so final update guys it was just a matter of

00:14:20.160 --> 00:14:26.160
getting volume four's brick integrated

00:14:23.360 --> 00:14:30.720
into gluster so you can see everything's here volume one two three one two three

00:14:29.199 --> 00:14:35.279
four four sure whatever doesn't matter no big deal

00:14:33.360 --> 00:14:38.399
so now we're gonna fire over to our other server this is Whonnock server it

00:14:36.800 --> 00:14:42.240
runs Windows whatever want to fight about it

00:14:40.240 --> 00:14:45.839
we've got our z drive which is actually even tighter for storage than it was

00:14:44.160 --> 00:14:52.399
before we have less than a terabyte of space left but that's no problem the

00:14:48.800 --> 00:14:54.959
vault 163 terabytes ready to freaking go

00:14:52.399 --> 00:15:01.120
and as for delta one it is re-silvering at 375 megabytes a second i can push

00:14:58.160 --> 00:15:05.600
this back in very ever so slowly and gently and we're done bud by the way if

00:15:03.839 --> 00:15:10.240
you guys liked this video we actually did a video rebuilding our water cooled

00:15:07.920 --> 00:15:13.279
render server down here uh you can check that out up there

00:15:11.680 --> 00:15:17.760
and if you're not into that you can check out our sponsor for today's video

00:15:15.519 --> 00:15:22.240
iFixit the iFixit essential electronics toolkit is compact so it can go anywhere

00:15:19.760 --> 00:15:25.360
you can and help you fix almost anything it includes their most popular precision

00:15:23.839 --> 00:15:29.279
bits and they're held in place with high density foam so you can throw it around

00:15:27.519 --> 00:15:34.160
without any of the bits falling out and of course it comes with iFixit's

00:15:31.199 --> 00:15:38.320
lifetime warranty it's just 24.99 at ifixit.com forward slash Linus so go

00:15:36.800 --> 00:15:43.199
check it out today that's it for this video i will see you guys next time when

00:15:40.959 --> 00:15:45.680
we're installing the single box petabyte project
