All of our data is GONE!
Linus Tech Tips
·2016-05-06
0:00
Well, hello YouTube
0:01
We have a bit of a problem today on this section of the "moving, but we've been moved in for months" moving vlog
0:09
Something is going on with the server
0:11
and we haven't been able to work on a project for more than like 30 minutes at a time before the server crashes and
0:17
Our computers all freeze and then we're all sad because we can't work on them. Most of the sh** we do is on the server
0:22
Well, I'm doing I'm actually doing some writing right now because I can do that
0:26
I'd love to be working on my next video. Anything that doesn't need the server we can do now
0:30
No, that's not much, but it always seems to go down
0:34
Yeah, at the worst when there's nothing else to do. No kidding, but you know YOLO you only
0:41
live once
0:45
See let's see the stress that Linus is going through right now
0:48
I'm only getting about 25 megs per second, 30 megs per second
0:53
It's probably gonna crash like while you're doing this
0:57
Yes
0:59
How do we fix servers? How do you fix it?
1:02
Tell Linus he will fix it
1:05
That's your go-to. Yeah
1:07
Whether you're running a small business or a
1:17
One billion dollar or even multi-billion dollar enterprise, Rackspace has your dedicated storage environments covered
1:24
Check out the link in the video description to learn more
1:27
All right
1:28
So here's the situation
1:29
Whonnock server over the course of the last couple of days has been spontaneously going offline
1:35
And turning off in spite of my best efforts to turn it back on and
1:41
Complete the backup that I'm in the middle of trying to do to our new unraid vault
1:47
While I was standing in front of it. I
1:50
observed
1:53
One of my raid controllers giving up the ghost
1:56
So as a reminder, this server is running three RAID 5s
2:00
Striped together in Windows, for a total of 24 SSDs
2:04
So if one of the RAID 5s drops out
2:07
Entirely, all the data is gone. Only about 10% of the backup that I was just in the middle of
2:14
remaking is
2:16
saved
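To illustrate why losing one of the three RAID 5s takes everything else with it, here is a minimal Python sketch of how a stripe spreads data across its members. The chunk size and the simple round-robin layout are assumptions made for illustration; the video does not state how the Windows stripe was actually configured.

```python
# Minimal sketch: a stripe (RAID 0) across three members, where each member
# stands in for one RAID 5 set. CHUNK_SIZE is an assumed, hypothetical value.
CHUNK_SIZE = 64 * 1024  # assumed stripe chunk size in bytes
MEMBERS = 3             # three RAID 5 sets striped together


def member_for_offset(offset: int) -> int:
    """Return which member holds the byte at this logical offset."""
    return (offset // CHUNK_SIZE) % MEMBERS


def file_is_intact(start: int, length: int, lost_member: int) -> bool:
    """A file survives only if none of its chunks live on the lost member."""
    first_chunk = start // CHUNK_SIZE
    last_chunk = (start + length - 1) // CHUNK_SIZE
    return all(
        chunk % MEMBERS != lost_member
        for chunk in range(first_chunk, last_chunk + 1)
    )


if __name__ == "__main__":
    # A 1 MiB file spans 16 chunks, so it touches every member.
    print(file_is_intact(0, 1024 * 1024, lost_member=1))  # False
    # Only a file that fits inside one chunk on a healthy member survives.
    print(file_is_intact(0, 4096, lost_member=1))          # True
```

Under these assumptions, any file larger than a couple of chunks has pieces on the missing member, which is why video footage on such a stripe would be unusable even though two of the three sets are healthy.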
2:18
so
2:19
It's time to
2:22
Investigate. All right. So here's Whonnock server
2:26
So far, troubleshooting steps that I've tried: you never want to rebuild something like this if there's a possibility
2:33
You're gonna have to go to a data recovery service because that can make it much much worse
2:38
so
2:39
In terms of troubleshooting steps, I tried transplanting it into another case so that I could use a different
2:47
SATA backplane. That didn't work. I tried a different power supply, a better power supply
2:54
That didn't work
2:55
None of that fixed it and I even went as far as to put in
2:58
Another LSI raid card that I had and try to import the array
3:02
But while it did detect my drives all of them as unconfigured good and a foreign array
3:08
It was not able to import it
3:11
So there is some good news there though while I'm getting firmware errors kicked out by this card right here when I try to boot
3:18
All the drives are detected by another card
3:21
So hopefully it didn't write too much garbled data to the array
3:29
Okay, so this is basically what I've been doing for the last
3:32
14 hours is sending emails contacting different
3:36
Data recovery services
3:39
There's some that are local
3:40
there's some that are not local, but I'm not ruling those ones out. In fact, right
3:46
now the most likely solution looks like WeRecoverData.com. They've got some
3:52
custom tools that they've created that will allow them to SSH into the system
3:58
and actually potentially recover the missing RAID 5 and then the
4:05
other two RAID 5s and assemble them all together and export them as one
4:11
gigantic pile of data to one of our servers, all without me actually having
4:17
to send the drives away, which would be pretty ideal. Now obviously an approach
4:22
like that would not work in the case of physical damage to the drives, but
4:26
because it is only a RAID controller issue, they're saying, hey, there's a shot
4:32
at this, so let's give it a try.
4:34
So I just
4:36
downloaded their remote recovery client thing here. So right now I'm waiting for
4:44
their custom Linux-based tool to load on a bootable external SSD so that they can
4:51
SSH into the server. I'm not having a good couple days. Right now I can't get
5:00
the utility to detect my USB drive. Let's call tech support. Oh bloody hell, I just
5:10
figured it out.
5:11
This is Linus calling, by the way. I just realized I have to run the WRD disk
5:16
image as administrator. You guys should put that in the documentation. It never
5:25
fails. This is why it's a good idea to call tech support, because the second you
5:29
call them, you'll solve the problem on your own.
5:32
Alright, so in theory we now have a bootable drive. Okay, so step one is
5:45
getting this unnecessary extraneous server out of the way. I'm grabbing a UPS to use.
5:53
Now, in a perfect world, I would have a motherboard or an HBA card, a host bus
6:03
adapter card, that could allow me to plug in all 24 of the drives for them to do
6:09
their data recovery magic on. But what we're gonna have to settle for is having
6:14
eight of them plugged into the motherboard at a time as each RAID 5 is
6:19
rebuilt. RAID cards do all kinds of funny, funny nonsense and they don't give other
6:24
software
6:25
direct access to the drives at a bit level in some cases. So we need to remove
6:30
all these drives and plug them in directly to the motherboard. Okay, now it
6:38
should boot to that, in theory. Let's just make sure all the drives are detected in
6:43
the BIOS and everything. Oh, that's a problem. I need to pull out the other
6:47
RAID controllers as well, otherwise they're gonna be looking for their
6:50
drives and I don't have them powered up right now. Okay. Oh, actually that doesn't
6:57
look very good.
6:59
PCIe port errors. What? So let's make sure all of our drives are even showing
7:14
up. Uh-oh. That is extremely worrying. Oh, that is extremely worrying. Oh, that's not
7:32
good. That means potentially three of the drives in a RAID 5 are dead, which means
7:38
this data is not coming back. None of it.
7:40
Not this particular RAID?
7:44
No, none of it.
7:46
None of the...
7:48
But I thought we have three RAID 5s.
7:49
None of it will be coming back if we lost three drives from one of the RAID 5s.
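The reason one dead drive is survivable but three are not comes down to how RAID 5 parity works. The following is a minimal sketch, assuming plain XOR parity over one stripe of eight blocks (seven data plus one parity); real controllers rotate parity between drives and work on much larger blocks, so treat the names and sizes here as illustrative only.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(stripe):
    """Rebuild the single missing block (marked None) in one RAID 5 stripe.

    `stripe` holds the data blocks plus the parity block for that stripe.
    Raises ValueError when more than one block is missing.
    """
    missing = [i for i, block in enumerate(stripe) if block is None]
    if len(missing) > 1:
        raise ValueError(f"{len(missing)} blocks missing: unrecoverable")
    if not missing:
        return list(stripe)
    survivors = [block for block in stripe if block is not None]
    repaired = list(stripe)
    # XOR of all surviving blocks equals the one block that is gone.
    repaired[missing[0]] = xor_blocks(survivors)
    return repaired


if __name__ == "__main__":
    data = [bytes([i] * 4) for i in range(1, 8)]  # seven data blocks
    stripe = data + [xor_blocks(data)]            # eighth block is parity

    one_dead = list(stripe)
    one_dead[2] = None
    print(rebuild(one_dead) == stripe)            # True

    three_dead = list(stripe)
    three_dead[0] = three_dead[1] = three_dead[2] = None
    try:
        rebuild(three_dead)
    except ValueError as error:
        print(error)
```

With a single parity block per stripe there is only one equation to solve, so only one unknown block can be recovered; a second or third missing drive leaves the set with nothing to reconstruct from.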
8:00
So let's try moving things around. I've never tested the SATA ports on this board.
8:06
It's possible three of them are dead, however extraordinarily unlikely that
8:11
would be. Five, six. Okay, we've got six now. Okay, that gives some hope. I'm just
8:21
moving SATA ports around. I was just gonna say right now I'm even having difficulty
8:26
getting this system to POST. So I'm at a whole new level of concern. Yeah, I'll let you
8:34
know. I'll send an email. Okay. So I'm at the point now where I don't want to put the
8:42
drives on this system. Let's go with this one. I didn't like all those PCI Express errors
8:48
that the bootable USB was kicking out. I don't like that some of these drives aren't showing
8:53
up. I'm starting to wonder if that's what corrupted the LSI card. This royal pain in
8:58
the ass.
8:59
A thousand things I can think of that I would rather be doing right now. Okay, I got seven
9:06
right now. Really? Yeah. Wow, maybe it was bad PCIe. Linus speaking. Well, I can tell
9:13
you what's happening so far is I put it on the system that I was running the RAID card
9:18
and the drives in and that baby started spitting out PCIe errors all over the place. So I got
9:26
everything pulled out of that system and I have plugged all the drives now into a different
9:32
test bench and I'm trying to see because I wasn't even getting all the drives showing
9:34
up. Two, three, four, five, six, seven, eight. Okay. Hey, you guys are on the phone for the
9:40
magic moment where all eight drives are detected by a new motherboard. So I'm beginning to
9:44
think there may be a motherboard issue that caused the firmware on that LSI controller
9:48
to corrupt in the first place. So I'm actually pressing enter and I am booting up to your
9:52
USB and it is loading correctly on this. I think that board is dead. Okay. Well, that's
10:01
I don't know if that makes it worse.
10:03
Probably does, doesn't it? You can tell me. I can take it. It says waiting up to 60 more
10:08
seconds for network configuration. Good timing with the follow-up call. If you called one
10:14
minute earlier, I'd have been cursing you for dialing my number so I'd have to answer my
10:17
phone because I was in the middle of thinking only five of my drives were being detected.
10:21
You guys know what you're doing, right? I had a local shop that I was going to physically
10:25
take it to and they basically, they were like, they kind of freaked out when I said I was
10:30
having someone use their remote tool. They were like, well, unless these guys really
10:33
know what they're doing.
10:33
That's pretty dangerous. And I was like, okay, well I'll, I'll check with them. I'll make
10:36
sure they, they know what they're doing. So, so you guys know what you're doing. So basically
10:41
they're explaining that they're going to virtually mount the volumes. So we're not doing any
10:46
writing to my disks. Um, so even in the event that we cannot recover it by this means, uh,
10:53
that still leaves other options open to us. Okay. They're in. Okay. Well, that's, that's
11:05
very wonderful to hear, but I'm trying not to get too excited right now. They sound pretty
11:09
confident. This is like the ultimate roller coaster right now. So to put this in the appropriate
11:15
context for the viewers right now, Whonnock server has several in-progress projects stored on
11:23
it that do not have a backup anywhere else, including a fully shot Channel Super Fun
11:27
that involved a thousand dollars in equipment rental, um, including, uh, lots of footage
11:33
from Linus Tech Tips videos. Like it is going to be a huge, huge, devastating loss for us.
11:39
To lose this. I mean, it's got a lot of our, our templates for editing, uh, editing videos
11:45
for, for scripting templates for employee reviews, like all kinds of stuff. Like this
11:50
is our main working server. No, the offsite backup server has not ever been built yet.
11:55
The vault was the backup server. If I ever have to like act in a, in a video where I
12:00
have to look forlorn, I'm going to think back to this moment. Okay. So I do need to
12:04
reboot it once in order to add that eight terabyte drive. So anyway, right. Okay. Thank
12:09
you very much, gentlemen.
12:09
I hope to speak to you, uh, with good news very soon, 12 SATA connectors. Ah, ah, ah.
12:18
Okay. So let's make sure everything is showing up: 1, 2, 3, 4, 5, 6, 7, 8 of those, our boot
12:26
device and our eight terabyte recovery media. We are ready to rock. Okay. So within about
12:34
10 minutes, they contacted me back. They said, okay, yeah, we're ready to have all the drives
12:38
in there. So that means we are going to have to do some janky ass.
12:44
Right now. So basically what you're looking at here is these drives are going to be powered
12:53
off this power supply with a jumper. These cards and all these drives are going to be
12:59
powered off of this rig. So now we can power all those drives without powering up this
13:03
motherboard or any of that nonsense.
13:05
Okay? So we have a grand total of 25 drives connected to this test bench. 16 of which
13:13
are running in bays
13:14
of this enclosure and eight of which are spread out here.
13:21
So there we go.
13:22
There's one of my LSI controllers.
13:24
There's its eight drives.
13:26
There's my other one.
13:28
Whoa, where are your virtual drives?
13:34
I think I remember a troubleshooting step
13:37
where I tried swapping all the drives to different bays
13:40
where I was trying to see if it was the backplane
13:42
or the cables.
13:43
Wow, panic moment there, but it's okay.
13:48
So now we should see two virtual drives here on the screen.
13:54
That's one set.
13:56
Show me the second set.
13:59
Oh, I thought we lost another RAID 5 there.
14:06
Okay, it's good.
14:09
Feeling good.
14:11
Reality TV ain't got nothing on IT work.
14:14
Show me all the devices.
14:17
Two RAID adapters, one eight terabyte drive,
14:19
one, two, three, four, five, six, seven,
14:21
eight Kingston drives.
14:23
Boom.
14:25
Okay.
14:26
I will let...
14:27
the data recovery specialists know.
14:30
All right.
14:31
So all that's left now then I think,
14:32
I got in touch with them and I asked if we can put
14:35
a 10 gigabit network card in
14:37
so we can accelerate the offloading of all of our files.
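For a rough sense of what the 10 gigabit card buys, here is a small back-of-the-envelope sketch in Python. The 8 TB figure is an assumption borrowed from the size of the recovery drive mentioned earlier, not a stated amount of data, and the calculation ignores protocol overhead and disk speed unless you lower the hypothetical `efficiency` parameter.

```python
def transfer_hours(terabytes: float, gigabits_per_second: float,
                   efficiency: float = 1.0) -> float:
    """Estimate hours needed to move `terabytes` over a link of the given speed.

    Uses decimal units (1 TB = 1e12 bytes, 1 Gb/s = 1e9 bits per second).
    `efficiency` is an assumed fraction of line rate actually achieved.
    """
    bits = terabytes * 1e12 * 8
    seconds = bits / (gigabits_per_second * 1e9 * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    for speed in (1, 10):
        hours = transfer_hours(8, speed)
        print(f"{speed} Gb/s: about {hours:.1f} hours for 8 TB")
    # 1 Gb/s: about 17.8 hours for 8 TB
    # 10 Gb/s: about 1.8 hours for 8 TB
```

In practice the source drives or the share itself may be the bottleneck, so the real number would likely sit somewhere between the two.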
14:41
So all we're waiting for now
14:42
is for them to reboot the machine
14:44
and set me up with a network share that I can access
14:47
and fingers crossed,
14:50
just start pulling off all the footage
14:52
and well, everything else.
14:54
All right.
14:55
So check this out here, around here.
14:56
So I've got the instructions
14:58
for how to access our data.
15:02
So I go into the Tritium share
15:08
and there it is.
15:09
Bunch of EULAs, a bunch of stuff.
15:13
We've got what looks like the file structure.
15:18
Let's test it.
15:20
How about one that I,
15:22
that we haven't finished releasing yet?
15:24
Like something from the, the melee battle here.
15:26
Let's just, let's watch it.
15:29
Hmm.
15:31
Okay. XA footage for sure.
15:36
Oh, okay.
15:48
Here we can copy one of these.
15:49
Let's copy one of these clips to my local machine
15:52
and see if that helps.
15:54
No.
15:55
Oh boy.
16:00
Let's have a look.
16:01
I wonder if everything's even here.
16:03
Like I don't even see the Linus Tech Tips folder.
16:05
One and two are missing.
16:09
Useful applications is empty.
16:12
That should be full of things.
16:14
Okay. Well, I guess we'd better get in touch with them.
16:20
So thanks for watching.
16:21
Subscribe, follow all that good stuff and see you next time.
16:26
I just got an email back from them.
16:28
Please stand by, addressing your emails with the engineer.
16:31
It's possible the RAID definitions were changed
16:34
from one reboot to the next
16:36
because of drive letter assignments in Linux.
16:39
It's 26, so it's a 26 drive limit.
16:41
We have, I think we have more than 26 devices in there.
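As a sketch of what "drive letter assignments in Linux" can look like with this many disks, the Python function below generates names following the conventional `sd` pattern, where the 27th disk becomes `sdaa` rather than running out of letters. This is an illustration of the general naming convention only; the `sd_name` helper is hypothetical, and the video does not say how the recovery company's tool actually handled device names.

```python
import string


def sd_name(index: int) -> str:
    """Return the conventional Linux SCSI disk name for a zero-based index.

    Index 0 is sda, 25 is sdz, 26 is sdaa, and so on.
    """
    if index < 0:
        raise ValueError("index must be non-negative")
    letters = []
    n = index
    while True:
        n, remainder = divmod(n, 26)
        letters.append(string.ascii_lowercase[remainder])
        if n == 0:
            break
        n -= 1  # there is no "zero" letter, so shift before the next digit
    return "sd" + "".join(reversed(letters))


if __name__ == "__main__":
    # 24 SSDs plus a boot device and a recovery drive pass the 26th name.
    for i in (0, 23, 25, 26, 27):
        print(i, sd_name(i))
```

These names are typically handed out in detection order, so a disk can come up under a different name after a reboot. That would be consistent with the email's explanation that the RAID definitions changed from one boot to the next, though the transcript does not confirm the exact cause.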
16:45
I got another email back.
16:46
Linus, we recreated the block devices.
16:49
Please go ahead and check the videos in context
16:50
that you were looking at again.
16:52
No way.
16:56
Oh, the suspense.
16:59
I'm not enjoying this.
17:01
Is nothing there?
17:02
I'll let them know the share's empty again.
17:05
This is just like the wildest roller coaster ever.
17:10
I don't even, yeah.
17:11
I don't even have anything else to say actually.
17:14
It's almost funny.
17:16
I hope to some of you it's funny.
17:19
Here's the email I got 17 minutes ago.
17:21
I was stuck on a call.
17:22
We got everything remounted.
17:24
So you can now go ahead and view the data.
17:25
Once more.
17:26
This is the moment.
17:30
Do we not have a two?
17:32
I don't think we ever had a two.
17:33
Oh, okay.
17:38
Okay, so let's go to that Channel Super Fun video
17:41
that we were looking at.
17:42
Yes, yes, Swagway jousting.
17:44
Okay, and it was the delivery.
17:51
I'm getting like tingling.
17:54
Okay, okay, you gotta check the source files though
17:56
because that was what we know was corrupted.
17:58
Okay, okay, okay, okay.
17:59
So untranscoded footage.
18:01
XA, they did it.
18:10
It's back.
18:13
My God.
18:15
It's back, here's another file.
18:16
It's back.
18:17
Do we have everything?
18:20
In theory, yes.
18:22
Really?
18:24
Holy .
18:27
Oh.
18:29
I think it's finally over.
18:31
Woo hoo.
18:34
That is fantastic.
18:37
Hey guys, guess what happened with the recovery?
18:42
We got everything back.
18:43
Oh, thank goodness.
18:45
Woo hoo.
18:46
Like it's just been such a stressful few weeks
18:49
and that was just such a huge, huge thing.
18:53
All of our pitches and proposals and stuff are on that server.
18:58
Okay, Whonnock's back.
19:02
Fully?
19:02
Really?
19:03
Everything is back as far as I can tell.
19:06
I have played back video files.
19:08
They are not corrupted.
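Playing files back is one check; another, when pulling footage off a recovery share, is comparing checksums of each copy against its source. Below is a minimal Python sketch of that idea. The helper names are hypothetical, and note the limitation: a matching hash only shows the copy is identical to what the share served, not that the recovered file itself is sound, so playback is still needed.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, block_size: int = 1024 * 1024) -> str:
    """Hash a file in fixed-size blocks so large clips never sit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def copies_match(original: Path, copy: Path) -> bool:
    """Return True when both files have identical contents."""
    return sha256_of(original) == sha256_of(copy)
```

A call such as `copies_match(Path("share/clip.mov"), Path("local/clip.mov"))` (both paths made up for the example) would flag a clip that was damaged in transit.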
19:11
Whonnock is officially back.
19:13
Swagway jousting footage for the melee battle should be fine.
19:17
And we'll have a better backup scheme in the future, I promise.
19:21
That is all.
19:22
Yay.
19:23
Yay.
19:24
Yay.
19:25
Video shot.
19:26
We should physically print all of our videos, every frame of every video on that paper.
19:30
I don't think so.
19:31
Did you hear the good news?
19:33
That Whonnock's back?
19:34
Whonnock is back.
19:35
Yay for hiring other people to do our jobs.
19:37
No, I don't think it's back, but it is.
19:41
It's back.
19:42
What are you doing?
19:43
I'm just letting everyone know that, Taran, it's back.
19:45
The server is back.
19:46
It's back.
19:47
The server is back.
19:48
Okay?
19:49
So it's not always that our sponsors tie so perfectly into our content, but today it happened.
20:03
Today's sponsor is Rackspace, the top tier managed cloud computing company.
20:07
They pride themselves on best in class service across all platforms.
20:11
They've got over 300,000 customers in 120 countries with 10 worldwide data centers.
20:17
They've got Red Hat, Cisco, Microsoft, VMware certifications, and all of that amounts to,
20:23
whether you're running a small business or a billion dollar enterprise, whatever your
20:27
needs are in terms of capacity storage or flash-based high performance storage, they
20:33
can take care of it without you going gray or bald because you did something wrong and
20:42
screwed up a backup and went and lost a RAID array or whatever else.
20:46
They've got dedicated storage to meet your performance, security, network capacity, and
20:52
compliance needs.
20:53
Everything from direct attached storage, so that gives you the flexibility and scalability
20:57
of just a simple, okay, let's attach some drives to this thing, it's redundant, cool,
21:03
off you go to the races, to SAN, so that's high availability and reliability, fully redundant
21:08
for business, to even NAS with support for demanding workloads like virtualization, file
21:13
sharing, and rich media.
21:15
In fact, their NAS stuff can scale up to 20 petabytes of capacity.
21:20
They've got public cloud and private cloud options.
21:23
As well, you can get your own server, a scalable private cloud in the data center, in your
21:28
data center, in their data center, and it's all supported by Rackspace and VMware.
21:33
They call their support fanatical because these people are available 24 hours a day,
21:38
seven days a week, 365 days a year, and they've got industry leading service level agreements,
21:44
both managed and intensive.
21:46
So if you need dedicated storage, they got you covered.
21:51
Go to rackspace.com.
21:52
de to learn more, and not go through what I did.
21:58
I just, yeah, I don't really have anything to say
22:04
Thanks for watching guys.
22:05
If you disliked this video, then hit the dislike button, I guess.
22:09
But come on.
22:10
This is about as real as it gets around here.
22:12
If you liked the video, hit that like button.
22:15
Get subscribed, maybe even consider checking out our Amazon affiliate, instructions for
22:20
how to use it.
22:21
Whenever you buy stuff there, instructions are up there.
22:22
Buying a
22:24
cool shirt like this one, or even giving us a monthly contribution through our
22:28
forum, which gives you a little contributor badge. Now that you're done
22:31
doing all that stuff, hey, maybe you want to check out that Channel Super Fun
22:34
video that we just discovered is back, and that'll definitely be worth your
22:39
while. We joust with each other on Swagways. It's pretty awesome. So see you
22:45
next time.