All of our data is GONE!

Linus Tech Tips · 2016-05-06 · 3,524 words · ~17 min read

Transcript

0:00 Well, hello YouTube
0:01 We have a bit of a problem today on this section of the moving, but we've been moved in for months moving vlog
0:09 Something is going on with the server
0:11 and we haven't been able to work on a project for more than like 30 minutes at a time before the server crashes and
0:17 Our computers all freeze and then we're all sad because we can't work on them. Most of the sh** we do is on the server
0:22 Well, I'm doing I'm actually doing some writing right now because I can do that
0:26 I'd love to be working on my next video. Anything that doesn't need the server we can do now
0:30 No, that's not much, but it always seems to go down
0:34 Yeah, at the worst when there's nothing else to do. No kidding, but you know YOLO you only
0:41 Liao once
0:45 See let's see the stress that Linus is going through right now
0:48 I'm only getting about 25 megabytes per second, 30 megabytes per second
0:53 It's probably gonna crash like while you're doing this
0:57 Yes
0:59 How do we fix servers? How do you fix it?
1:02 Tell Linus he will fix it
1:05 That's your go-to. Yeah
1:07 Whether you're running a small business or a
1:17 one billion dollar or even multi-billion dollar enterprise, Rackspace has your dedicated storage environments covered
1:24 Check out the link in the video description to learn more
1:27 All right
1:28 So here's the situation
1:29 Whonnock server over the course of the last couple of days has been spontaneously going offline
1:35 And turning off in spite of my best efforts to turn it back on and
1:41 Complete the backup that I'm in the middle of trying to do to our new unraid vault
1:47 While I was standing in front of it. I
1:50 observed
1:53 One of my raid controllers giving up the ghost
1:56 So as a reminder, this server is running three RAID 5s
2:00 striped together in Windows, for a total of 24 SSDs
2:04 So if one of the RAID 5s drops out
2:07 entirely, all the data is gone. Only about 10% of the backup that I was just in the middle of
2:14 remaking is
2:16 saved
2:18 so
2:19 It's time to
2:22 Investigate. All right. So here's Whonnock server
2:26 So far, troubleshooting steps that I've tried... you never want to rebuild something like this if there's a possibility
2:33 You're gonna have to go to a data recovery service because that can make it much much worse
2:38 so
2:39 In terms of troubleshooting steps, I tried transplanting it into another case so that I could use a different
2:47 SATA backplane. That didn't work. I tried a different power supply, a better power supply
2:54 That didn't work
2:55 None of that fixed it and I even went as far as to put in
2:58 Another LSI raid card that I had and try to import the array
3:02 But while it did detect my drives all of them as unconfigured good and a foreign array
3:08 It was not able to import it
3:11 So there is some good news there though while I'm getting firmware errors kicked out by this card right here when I try to boot
3:18 All the drives are detected by another card
3:21 So hopefully it didn't write too much garbled data to the array
3:29 Okay, so this is basically what I've been doing for the last
3:32 14 hours is sending emails contacting different
3:36 Data recovery services
3:39 There's some that are local
3:40 there's some that are not local, but I'm not ruling those ones out. In fact, right
3:46 now the most likely solution looks like WeRecoverData.com. They've got some
3:52 custom tools that they've created that will allow them to SSH into the system
3:58 and actually potentially recover the missing RAID 5 and then the
4:05 other two RAID 5s and assemble them all together and export them as one
4:11 gigantic pile of data to one of our servers, all without me actually having
4:17 to send the drives away, which would be pretty ideal. Now obviously an approach
4:22 like that would not work in the case of physical damage to the drives, but
4:26 because it is only a RAID controller issue, they're saying, hey, there's a shot
4:32 at this, so let's give it a try.
4:34 So I just
4:36 downloaded their remote recovery client thing here. So right now I'm waiting for
4:44 their custom Linux-based tool to load on a bootable external SSD so that they can
4:51 SSH into the server. I'm not having a good couple days. Right now I can't get
5:00 the utility to detect my USB drive. It's called tech support. Oh bloody hell, I just
5:10 figured it out.
5:11 This is Linus calling, by the way. I just realized I have to run the WRD disk
5:16 image as administrator. You guys should put that in the documentation. It never
5:25 fails. This is why it's a good idea to call tech support, because the second you
5:29 call them, you'll solve the problem on your own.
5:32 Alright, so in theory we now have a bootable drive. Okay, so step one is
5:45 getting this unnecessary extraneous server out of the way. I'm grabbing a UPS to use.
5:53 Now, in a perfect world, I would have a motherboard or an HBA card, a host bus
6:03 adapter card, that could allow me to plug in all 24 of the drives for them to do
6:09 their data recovery magic on. But what we're gonna have to settle for is having
6:14 eight of them plugged into the motherboard at a time as each RAID 5 is
6:19 rebuilt. RAID cards do all kinds of funny, funny nonsense and they don't give other
6:24 software
6:25 direct access to the drives at a bit level in some cases. So we need to remove
6:30 all these drives and plug them in directly to the motherboard. Okay, now it
6:38 should boot to that, in theory. Let's just make sure all the drives are detected in
6:43 the BIOS and everything. Oh, that's a problem. I need to pull out the other
6:47 RAID controllers as well, otherwise they're gonna be looking for their
6:50 drives and I don't have them powered up right now. Okay. Oh, actually that doesn't
6:57 look very good.
6:59 PCI-E port errors. What? So let's make sure all of our drives are even showing
7:14 up. Uh-oh. That is extremely worrying. Oh, that is extremely worrying. Oh, that's not
7:32 good. That means potentially three of the drives in a RAID 5 are dead, which means
7:38 this data is not coming back. None of it.
7:40 Not this particular RAID?
7:44 No, none of it.
7:46 None of the...
7:48 But I thought we have three RAID 5s.
7:49 None of it will be coming back if we lost three drives from one of the RAID 5s.
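A minimal sketch of the failure arithmetic being discussed here, assuming the layout described earlier in the video (three RAID 5 groups of eight drives each, striped together). The function names are hypothetical, and this models only the redundancy rule, not any real controller or recovery tool: RAID 5 stores one drive's worth of parity per group, so a group survives at most one failed drive, and a plain stripe across groups adds no redundancy of its own.

```python
from typing import Sequence


def raid5_group_survives(failed_drives: int) -> bool:
    """A RAID 5 group can rebuild from parity only if at most one drive is lost."""
    return failed_drives <= 1


def striped_array_survives(failed_per_group: Sequence[int]) -> bool:
    """A stripe across groups needs every group intact; losing one loses the volume."""
    return all(raid5_group_survives(failed) for failed in failed_per_group)


if __name__ == "__main__":
    # One dead drive in each of the three groups: every group can still rebuild.
    print(striped_array_survives([1, 1, 1]))
    # Three dead drives in a single group: that group is gone, so the stripe is gone.
    print(striped_array_survives([3, 0, 0]))
```

Under those assumptions, one dead drive in each group is survivable, but three dead drives in a single group take the whole striped volume with them, which matches the "none of it" answer above.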
8:00 So let's try moving things around. I've never tested the SATA ports on this board.
8:06 It's possible three of them are dead, however extraordinarily unlikely that
8:11 would be. Five, six. Okay, we've got six now. Okay, that gives some hope. I'm just
8:21 moving SATA ports around. I was just gonna say right now I'm even having difficulty
8:26 getting this system to POST. So I'm at a whole new level of concern. Yeah, I'll let you
8:34 know. I'll send an email. Okay. So I'm at the point now where I don't want to put the
8:42 drives on this system. Let's go with this one. I didn't like all those PCI Express errors
8:48 that the bootable USB was kicking out. I don't like that some of these drives aren't showing
8:53 up. I'm starting to wonder if that's what corrupted the LSI card. This royal pain in
8:58 the ass.
8:59 A thousand things I can think of that I would rather be doing right now. Okay, I got seven
9:06 right now. Really? Yeah. Wow, maybe it was bad PCIe. Linus speaking. Well, I can tell
9:13 you what's happening so far is I put it on the system that I was running the RAID card
9:18 and the drives in and that baby started spitting out PCIe errors all over the place. So I got
9:26 everything pulled out of that system and I have plugged all the drives now into a different
9:32 disk bench and I'm trying to see because I wasn't even getting all the drives showing
9:34 up. Two, three, four, five, six, seven, eight. Okay. Hey, you guys are on the phone for the
9:40 magic moment where all eight drives are detected by a new motherboard. So I'm beginning to
9:44 think there may be a motherboard issue that caused the firmware on that LSI controller
9:48 to corrupt in the first place. So I'm actually pressing enter and I am booting up to your
9:52 USB and it is loading correctly on this. I think that board is dead. Okay. Well, that's
10:01 I don't know if that makes it worse.
10:03 Probably does, doesn't it? You can tell me. I can take it. It says waiting up to 60 more
10:08 seconds for network configuration. Good timing with the follow-up call. If you'd called one
10:14 minute earlier, I'd have been cursing you for dialing my number so I have to answer my
10:17 phone, because I was in the middle of thinking only five of my drives were being detected.
10:21 You guys know what you're doing, right? I had a local shop that I was going to physically
10:25 take it to and they basically, they were like, they kind of freaked out when I said I was
10:30 having someone use their remote tool. They were like, well, unless these guys really
10:33 know what they're doing.
10:33 That's pretty dangerous. And I was like, okay, well I'll, I'll check with them. I'll make
10:36 sure they, they know what they're doing. So, so you guys know what you're doing. So basically
10:41 they're explaining that they're going to virtually mount the volumes. So we're not doing any
10:46 writing to my disks. Um, so even in the event that we cannot recover it by this means, uh,
10:53 that still leaves other options open to us. Okay. They're in. Okay. Well, that's, that's
11:05 very wonderful to hear, but I'm trying not to get too excited right now. They sound pretty
11:09 confident. This is like the ultimate roller coaster right now. So to put this in the appropriate
11:15 context for the viewers right now, Whonnock server has several in-progress projects stored on
11:23 it that do not have a backup anywhere else, including a fully shot Channel Super Fun
11:27 that involved a thousand dollars in equipment rental, um, including, uh, lots of footage
11:33 from Linus Tech Tips videos. Like it is going to be a huge, huge, devastating loss for us
11:39 to lose this. I mean, it's got a lot of our, our templates for editing, uh, editing videos
11:45 for, for scripting templates for employee reviews, like all kinds of stuff. Like this
11:50 is our main working server. No, the offsite backup server has not ever been built yet.
11:55 The vault was the backup server. If I ever have to like act in a, in a video where I
12:00 have to look forlorn, I'm going to think back to this moment. Okay. So I do need to
12:04 reboot it once in order to add that eight terabyte drive. So anyway, right. Okay. Thank
12:09 you very much, gentlemen.
12:09 I hope to speak to you, uh, with good news very soon, 12 SATA connectors. Ah, ah, ah.
12:18 Okay. So let's make sure everything is showing up. 1, 2, 3, 4, 5, 6, 7, 8 of those, our boot
12:26 device, and our eight terabyte recovery media. We are ready to rock. Okay. So within about
12:34 10 minutes, they contacted me back. They said, okay, yeah, we're ready to have all the drives
12:38 in there. So that means we are going to have to do some janky ass.
12:44 Right now. So basically what you're looking at here is these drives are going to be powered
12:53 off this power supply with a jumper. These cards and all these drives are going to be
12:59 powered off of this rig. So now we can power all those drives without powering up this
13:03 motherboard or any of that nonsense.
13:05 Okay? So we have a grand total of 25 drives connected to this test bench, 16 of which
13:13 are running in bays
13:14 of this enclosure and eight of which are spread out here.
13:21 So there we go.
13:22 There's one of my LSI controllers.
13:24 There's its eight drives.
13:26 There's my other one.
13:28 Whoa, where are your virtual drives?
13:34 I think I remember a troubleshooting step
13:37 where I tried swapping all the drives to different bays
13:40 where I was trying to see if it was the backplane
13:42 or the cables.
13:43 Wow, panic moment there, but it's okay.
13:48 So now we should see two virtual drives here on the screen.
13:54 That's one set.
13:56 Show me the second set.
13:59 Oh, I thought we lost another RAID 5 there.
14:06 Okay, it's good.
14:09 Feeling good.
14:11 Reality TV ain't got nothing on IT work.
14:14 Show me all the devices.
14:17 Two RAID adapters, one eight terabyte drive,
14:19 one, two, three, four, five, six, seven,
14:21 eight Kingston drives.
14:23 Boom.
14:25 Okay.
14:26 I will let...
14:27 the data recovery specialists know.
14:30 All right.
14:31 So all that's left now then I think,
14:32 I got in touch with them and I asked if we can put
14:35 a 10 gigabit network card in
14:37 so we can accelerate the offloading of all of our files.
14:41 So all we're waiting for now
14:42 is for them to reboot the machine
14:44 and set me up with a network share that I can access
14:47 and fingers crossed,
14:50 just start pulling off all the footage
14:52 and well, everything else.
14:54 All right.
14:55 So check this out here, around here.
14:56 So I've got the instructions
14:58 for how to access our data.
15:02 So I go into the Tritium share
15:08 and there it is.
15:09 Bunch of EULAs, a bunch of stuff.
15:13 We've got what looks like the file structure.
15:18 Let's test it.
15:20 How about one that I,
15:22 that we haven't finished releasing yet?
15:24 Like something from the, the melee battle here.
15:26 Let's just, let's watch it.
15:29 Hmm.
15:31 Okay. XA footage for sure.
15:36 Oh, okay.
15:48 Here we can copy one of these.
15:49 Let's copy one of these clips to my local machine
15:52 and see if that helps.
15:54 No.
15:55 Oh boy.
16:00 Let's have a look.
16:01 I wonder if everything's even here.
16:03 Like I don't even see the Linus Tech Tips folder.
16:05 One and two are missing.
16:09 Useful applications is empty.
16:12 That should be full of things.
16:14 Okay. Well, I guess we'd better get in touch with them.
16:20 So thanks for watching.
16:21 Subscribe, follow all that good stuff and see you next time.
16:26 I just got an email back from them.
16:28 Please stand by addressing your emails with the engineer.
16:31 It's possible the RAID definitions were changed
16:34 from one reboot to the next
16:36 because of drive letter assignments in Linux.
16:39 It's 26, so it's 26 drive limit.
16:41 We have, I think we have more than 26 devices in there.
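For context on the drive letter remark, here is a small sketch of the conventional Linux `sd` naming sequence, in which the 27th disk becomes `sdaa` rather than the names simply running out. The helper `sd_name` is hypothetical and written purely for illustration; it is not part of the recovery company's tooling, and the email quoted in the video does not say exactly how their software handled the names. Because these names are handed out in discovery order, they can also differ from one boot to the next.

```python
def sd_name(index: int) -> str:
    """Return the conventional Linux name for the disk at a zero-based index.

    The suffix counts a..z, then aa..zz, then aaa and so on.
    """
    if index < 0:
        raise ValueError("index must be non-negative")
    letters = ""
    n = index + 1  # shift to one-based so the sequence has no 'zero' digit
    while n > 0:
        n, remainder = divmod(n - 1, 26)
        letters = chr(ord("a") + remainder) + letters
    return "sd" + letters


if __name__ == "__main__":
    # Show what happens around the 26-device boundary mentioned above.
    for i in (0, 24, 25, 26, 27):
        print(i, sd_name(i))
```

Any tool that assumes a single letter after `sd` would plausibly mishandle devices past index 25, which is one way RAID definitions keyed on device names could end up pointing at the wrong disks.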
16:45 I got another email back.
16:46 Linus, we recreated the block devices.
16:49 Please go ahead and check the videos in context
16:50 that you were looking at again.
16:52 No way.
16:56 Oh, the suspense.
16:59 I'm not enjoying this.
17:01 Is nothing there?
17:02 I'll let them know the share's empty again.
17:05 This is just like the wildest roller coaster ever.
17:10 I don't even, yeah.
17:11 I don't even have anything else to say actually.
17:14 It's almost funny.
17:16 I hope to some of you it's funny.
17:19 Here's the email I got 17 minutes ago.
17:21 I was stuck on a call.
17:22 We got everything remounted.
17:24 So you can now go ahead and view the data.
17:25 Once more.
17:26 This is the moment.
17:30 Do we not have a two?
17:32 I don't think we ever had a two.
17:33 Oh, okay.
17:38 Okay, so let's go to that Channel Super Fun video
17:41 that we were looking at.
17:42 Yes, yes, Swagway jousting.
17:44 Okay, and it was the delivery.
17:51 I'm getting like tingling.
17:54 Okay, okay, you gotta check the source files though
17:56 because that was what we know was corrupted.
17:58 Okay, okay, okay, okay.
17:59 So untranscoded footage.
18:01 XA, they did it.
18:10 It's back.
18:13 My God.
18:15 It's back, here's another file.
18:16 It's back.
18:17 Do we have everything?
18:20 In theory, yes.
18:22 Really?
18:24 Holy .
18:27 Oh.
18:29 I think it's finally over.
18:31 Woo hoo.
18:34 That is fantastic.
18:37 Hey guys, guess what happened with the recovery?
18:42 We got everything back.
18:43 Oh, thank goodness.
18:45 Woo hoo.
18:46 Like it's just been such a stressful few weeks
18:49 and that was just such a huge, huge thing.
18:53 All of our pitches and proposals and stuff are on that server.
18:58 Okay, Whonnock's back.
19:02 Bully?
19:02 Really?
19:03 Everything is back as far as I can tell.
19:06 I have played back video files.
19:08 They are not corrupted.
19:11 Whonnock is officially back.
19:13 Swagway jousting footage for the melee battle should be fine.
19:17 And we'll have a better backup scheme in the future, I promise.
19:21 That is all.
19:22 Yay.
19:23 Yay.
19:24 Yay.
19:25 Video shot.
19:26 We should physically print all of our videos, every frame of every video on that paper.
19:30 I don't think so.
19:31 Did you hear the good news?
19:33 That Whonnock's back?
19:34 Whonnock is back.
19:35 Yay for hiring other people to do our jobs.
19:37 No, I don't think it's back, but it is.
19:41 It's back.
19:42 What are you doing?
19:43 I'm just letting everyone know that Taren, it's back.
19:45 The server is back.
19:46 It's back.
19:47 The server is back.
19:48 Okay?
19:49 So it's not always that our sponsors tie so perfectly into our content, but today it happened.
20:03 Today's sponsor is Rackspace, the top tier managed cloud computing company.
20:07 They pride themselves on best in class service across all platforms.
20:11 They've got over 300,000 customers in 120 countries with 10 worldwide data centers.
20:17 They've got Red Hat, Cisco, Microsoft, VMware certifications, and all of that amounts to,
20:23 whether you're running a small business or a billion dollar enterprise, whatever your
20:27 needs are in terms of capacity storage or flash-based high performance storage, they
20:33 can take care of it without you going gray or bald because you did something wrong and
20:42 screwed up a backup and went and lost a RAID array or whatever else.
20:46 They've got dedicated storage to meet your performance, security, network capacity, and
20:52 compliance needs.
20:53 Everything from direct attached storage, so that gives you the flexibility and scalability
20:57 of just a simple, okay, let's attach some drives to this thing, it's redundant, cool,
21:03 off you go to the races, to SAN, so that's high availability and reliability, fully redundant
21:08 for business, to even NAS with support for demanding workloads like virtualization, file
21:13 sharing, and rich media.
21:15 In fact, their NAS stuff can scale up to 20 petabytes of capacity.
21:20 They've got public cloud and private cloud options.
21:23 As well, you can get your own server, a scalable private cloud in the data center, in your
21:28 data center, in their data center, and it's all supported by Rackspace and VMware.
21:33 They call their support fanatical because these people are available 24 hours a day,
21:38 seven days a week, 365 days a year, and they've got industry leading service level agreements,
21:44 both managed and intensive.
21:46 So if you need dedicated storage, they got you covered.
21:51 Go to rackspace.com.
21:52 de to learn more, and not go through what I did.
21:58 I just, yeah, I don't really have anything to say
22:04 Thanks for watching guys.
22:05 If you disliked this video, then hit the dislike button, I guess.
22:09 But come on.
22:10 This is about as real as it gets around here.
22:12 If you liked the video, hit that like button.
22:15 Get subscribed, maybe even consider checking out our Amazon affiliate link. Instructions for
22:20 how to use it
22:21 whenever you buy stuff there are up there.
22:22 Or consider buying a
22:24 cool shirt like this one, or even giving us a monthly contribution through our
22:28 forum, which gives you a little contributor badge. Now that you're done
22:31 doing all that stuff, hey, maybe you want to check out that Channel Super Fun
22:34 video that we just discovered is back, and that'll definitely be worth your
22:39 while. We joust with each other on Swagways. It's pretty awesome. So see you
22:45 next time.