{"video_id":"fp_uxQn0MspyN","title":"I Built a PC that CAN’T Fail… and You Can Too! - Proxmox Compute Cluster (SPONSORED)","channel":"Linus Tech Tips","show":"Linus Tech Tips","published_at":"2024-08-15T18:09:00.038Z","duration_s":1128,"segments":[{"start_s":0.2,"end_s":7.0,"text":"Computer problems are a fact of life. And sometimes the fix is as simple as just turning it off","speaker":null,"is_sponsor":0},{"start_s":7.0,"end_s":11.28,"text":"and turning it back on again, but other times it's not.","speaker":null,"is_sponsor":0},{"start_s":11.28,"end_s":15.44,"text":"And when the system you're talking about is running an air traffic control system,","speaker":null,"is_sponsor":0},{"start_s":15.44,"end_s":19.12,"text":"controlling a bunch of ATMs, or say routing 911 calls,","speaker":null,"is_sponsor":0},{"start_s":19.12,"end_s":22.44,"text":"keeping them up and running can be a matter of life and death.","speaker":null,"is_sponsor":0},{"start_s":22.44,"end_s":25.84,"text":"Now, the stakes aren't nearly as high for us,","speaker":null,"is_sponsor":0},{"start_s":25.84,"end_s":30.36,"text":"but this server here runs multiple apps that we rely on every day,","speaker":null,"is_sponsor":0},{"start_s":30.36,"end_s":35.52,"text":"accelerates our game downloads with Steam caching, and it runs our DNS.","speaker":null,"is_sponsor":0},{"start_s":35.52,"end_s":39.76,"text":"If that service goes down, it breaks literally everyone in the company's internet,","speaker":null,"is_sponsor":0},{"start_s":39.76,"end_s":44.48,"text":"which my boss informs me, isn't great. So how do we make it more reliable?","speaker":null,"is_sponsor":0},{"start_s":44.48,"end_s":48.48,"text":"It's already a server. 
We build more servers.","speaker":null,"is_sponsor":0},{"start_s":48.48,"end_s":52.44,"text":"And what's really cool about this is everything we're about to show you,","speaker":null,"is_sponsor":0},{"start_s":52.44,"end_s":58.0,"text":"courtesy of Intel, who sponsored this video and sent over their new Emerald Rapids Xeon CPUs,","speaker":null,"is_sponsor":0},{"start_s":58.0,"end_s":62.76,"text":"can be done on nearly any computer, even your dad's old Dell.","speaker":null,"is_sponsor":0},{"start_s":62.76,"end_s":66.32,"text":"That is, as long as you have more than one. So if one leaves for cigarettes,","speaker":null,"is_sponsor":0},{"start_s":66.32,"end_s":67.56,"text":"we can still play catch.","speaker":null,"is_sponsor":0},{"start_s":69.64,"end_s":72.88,"text":"More than one Dell, not more than one dad.","speaker":null,"is_sponsor":0},{"start_s":72.88,"end_s":76.32,"text":"Oh. Well, anyways, I'm done. Do you want to check this out?","speaker":null,"is_sponsor":0},{"start_s":76.32,"end_s":81.4,"text":"Yeah, let's have a look. You got your lovely cat picture, your crab rave on that computer.","speaker":null,"is_sponsor":0},{"start_s":81.4,"end_s":84.56,"text":"Watch this. Like I can, yeah, I can interact with this.","speaker":null,"is_sponsor":0},{"start_s":84.56,"end_s":87.18,"text":"Let's just give it a second. Okay. It's going.","speaker":null,"is_sponsor":0},{"start_s":88.14,"end_s":91.86,"text":"Now it's on this computer and like no bamboozle. Here, look, watch.","speaker":null,"is_sponsor":0},{"start_s":91.86,"end_s":95.78,"text":"Whoa, buddy. Watch, watch, watch. Boom, unplugged.","speaker":null,"is_sponsor":0},{"start_s":95.78,"end_s":99.5,"text":"I can just completely interact with this as I normally would.","speaker":null,"is_sponsor":0},{"start_s":99.5,"end_s":103.46,"text":"So what's going on here? 
What you guys just saw was the programs,","speaker":null,"is_sponsor":0},{"start_s":103.46,"end_s":106.5,"text":"the lovely drawing, the entire operating system,","speaker":null,"is_sponsor":0},{"start_s":106.5,"end_s":110.98,"text":"just teleporting from the computer over here to the one over here.","speaker":null,"is_sponsor":0},{"start_s":110.98,"end_s":115.7,"text":"No trickery. This is possible thanks to the magic of virtualization.","speaker":null,"is_sponsor":0},{"start_s":115.7,"end_s":121.3,"text":"We've talked about it before, but if you're not familiar, virtualization allows you to slice up a single machine","speaker":null,"is_sponsor":0},{"start_s":121.3,"end_s":124.86,"text":"into multiple less powerful virtual machines.","speaker":null,"is_sponsor":0},{"start_s":124.86,"end_s":130.1,"text":"And this setup leverages that technology to allow us to move these virtual machines","speaker":null,"is_sponsor":0},{"start_s":130.1,"end_s":133.28,"text":"between multiple physical computers.","speaker":null,"is_sponsor":0},{"start_s":133.28,"end_s":137.38,"text":"That way if one breaks, another one can immediately take its place.","speaker":null,"is_sponsor":0},{"start_s":137.38,"end_s":140.6,"text":"And the best part is that, well, this all sounds super fancy.","speaker":null,"is_sponsor":0},{"start_s":140.6,"end_s":144.06,"text":"All the software we're using is both open source and free.","speaker":null,"is_sponsor":0},{"start_s":144.06,"end_s":148.3,"text":"And we're going to show you guys how the setup works in a little bit. First, I want to take a look at the servers","speaker":null,"is_sponsor":0},{"start_s":148.3,"end_s":155.22,"text":"we're going to be using for our setup. Gigabyte sent over four of their R163 SG2 AAC1 servers.","speaker":null,"is_sponsor":0},{"start_s":155.74,"end_s":159.24,"text":"These are bare bones. 
So we're going to have to add a few of our own parts,","speaker":null,"is_sponsor":0},{"start_s":159.24,"end_s":163.58,"text":"but we should be able to build this in what? Like five minutes?","speaker":null,"is_sponsor":0},{"start_s":163.58,"end_s":167.1,"text":"I'd like to see you try. This guy, we're going to add some of our own parts,","speaker":null,"is_sponsor":0},{"start_s":167.1,"end_s":170.38,"text":"starting with a pair of Patriot 480 gig SATA SSDs","speaker":null,"is_sponsor":0},{"start_s":170.38,"end_s":175.54,"text":"that will function as a mirrored boot drive. This kind of per-machine redundancy","speaker":null,"is_sponsor":0},{"start_s":175.54,"end_s":180.7,"text":"isn't, strictly speaking, necessary because we could lose an entire machine","speaker":null,"is_sponsor":0},{"start_s":180.7,"end_s":184.48,"text":"in our configuration without having any issues. But having them in pairs","speaker":null,"is_sponsor":0},{"start_s":184.48,"end_s":190.04,"text":"makes our lives easier in the future potentially. Since if one of them fails, we can just replace it","speaker":null,"is_sponsor":0},{"start_s":190.04,"end_s":193.26,"text":"and then rebuild it from the other one. Then on the other side of the machine,","speaker":null,"is_sponsor":0},{"start_s":193.26,"end_s":197.18,"text":"we're installing two of these Kioxia CD6 7TB drives","speaker":null,"is_sponsor":0},{"start_s":197.18,"end_s":202.26,"text":"for fast bulk storage. That leaves us six more SATA bays to do nothing with","speaker":null,"is_sponsor":0},{"start_s":202.26,"end_s":205.18,"text":"and two more NVMe bays for potential future expansion.","speaker":null,"is_sponsor":0},{"start_s":207.02,"end_s":213.06,"text":"Moving back, let's get our CPU installed. 
We're using a Xeon Platinum 8562Y+ in each node.","speaker":null,"is_sponsor":0},{"start_s":213.06,"end_s":217.14,"text":"These were graciously provided by Intel, and with 32 cores, 64 threads","speaker":null,"is_sponsor":0},{"start_s":217.14,"end_s":221.42,"text":"and 4.1 gigahertz max turbo clock speeds, these are going to give us a ton of compute","speaker":null,"is_sponsor":0},{"start_s":221.42,"end_s":226.1,"text":"to share between our virtual machines, all at a modest 300 watt TDP.","speaker":null,"is_sponsor":0},{"start_s":226.1,"end_s":231.34,"text":"We're going to have it and the rest of the parts linked in the video description. Now, I've never installed in this socket before,","speaker":null,"is_sponsor":0},{"start_s":231.34,"end_s":236.78,"text":"so good luck, me. Step one is to install the carrier on the CPU","speaker":null,"is_sponsor":0},{"start_s":236.78,"end_s":242.26,"text":"and you can tell which one of the three you're supposed to use by the little marking right there on the CPU IHS.","speaker":null,"is_sponsor":0},{"start_s":242.26,"end_s":247.78,"text":"Line up our little golden triangle with our gigantic gargantuan hole in the whole thing triangle.","speaker":null,"is_sponsor":0},{"start_s":247.78,"end_s":252.12,"text":"Oh, this is adorable. It's got a cute little arm so you can break the thermal paste seal with the cooler","speaker":null,"is_sponsor":0},{"start_s":252.12,"end_s":255.7,"text":"so you can get the cooler and the CPU separated more easily. Love to see it.","speaker":null,"is_sponsor":0},{"start_s":255.7,"end_s":260.62,"text":"Speaking of thermal paste, we're going to be using a Honeywell PTM7950 pad,","speaker":null,"is_sponsor":0},{"start_s":260.66,"end_s":265.62,"text":"available at lttstore.com. 
This stuff is absolutely perfect for a server install","speaker":null,"is_sponsor":0},{"start_s":265.62,"end_s":271.52,"text":"because it lasts not forever, but for a very, very long time without maintenance.","speaker":null,"is_sponsor":0},{"start_s":271.52,"end_s":277.26,"text":"Now, you might think, okay, go ahead, put it onto the CPU socket, you'd be wrong.","speaker":null,"is_sponsor":0},{"start_s":277.26,"end_s":280.64,"text":"Instead, I'm going to install it onto the cooler.","speaker":null,"is_sponsor":0},{"start_s":280.64,"end_s":286.7,"text":"I'm going to know how to do that in a sec. So arrow and arrow.","speaker":null,"is_sponsor":0},{"start_s":286.7,"end_s":291.7,"text":"So maybe, ah, ah, ah, hey, there we go.","speaker":null,"is_sponsor":0},{"start_s":294.18,"end_s":295.84,"text":"Damn, look at that vapor chamber.","speaker":null,"is_sponsor":0},{"start_s":297.5,"end_s":302.06,"text":"Love me a vapor chamber. Okay, we're going to make sure all these are clicked into place.","speaker":null,"is_sponsor":0},{"start_s":302.06,"end_s":306.46,"text":"Look for our little arrow here. Line that up with the arrow on the socket","speaker":null,"is_sponsor":0},{"start_s":307.5,"end_s":311.46,"text":"and make sure that the locks are in their unlocked position","speaker":null,"is_sponsor":0},{"start_s":311.46,"end_s":315.14,"text":"then you should be able to just... That's it's locked.","speaker":null,"is_sponsor":0},{"start_s":315.14,"end_s":318.34,"text":"Oh, that's it. 
Okay, next comes something you don't see me do very often","speaker":null,"is_sponsor":0},{"start_s":318.34,"end_s":321.5,"text":"and that is use a screwdriver other than the LTT screwdriver.","speaker":null,"is_sponsor":0},{"start_s":321.5,"end_s":324.62,"text":"And that's because these need to be torqued to a specific value.","speaker":null,"is_sponsor":0},{"start_s":324.62,"end_s":326.7,"text":"That is 6.9 inch pounds.","speaker":null,"is_sponsor":0},{"start_s":329.82,"end_s":334.5,"text":"Nice. It's so cool to think that if I was doing this, you know, performing maintenance on the server,","speaker":null,"is_sponsor":0},{"start_s":334.5,"end_s":338.26,"text":"upgrading a bad RAM stick, our entire operation could be chugging along","speaker":null,"is_sponsor":0},{"start_s":338.26,"end_s":343.62,"text":"as if nothing happened. Speaking of RAM, we've gone with four 96 gig Micron","speaker":null,"is_sponsor":0},{"start_s":343.66,"end_s":347.22,"text":"5,600 megatransfer per second registered ECC DIMMs.","speaker":null,"is_sponsor":0},{"start_s":347.22,"end_s":351.1,"text":"That's a somewhat unconventional choice because especially in a server,","speaker":null,"is_sponsor":0},{"start_s":351.1,"end_s":355.22,"text":"giving up half of the memory channels means that we will be giving up some performance,","speaker":null,"is_sponsor":0},{"start_s":355.22,"end_s":358.66,"text":"but we don't really need all of the performance for now","speaker":null,"is_sponsor":0},{"start_s":358.66,"end_s":363.5,"text":"and 384 gigs is a ton of capacity for our needs at the moment.","speaker":null,"is_sponsor":0},{"start_s":363.5,"end_s":369.94,"text":"And of course, if anything changes, we can always add more without any downtime to our services.","speaker":null,"is_sponsor":0},{"start_s":369.94,"end_s":373.58,"text":"The only thing that's really important here then is making sure that we install our sticks","speaker":null,"is_sponsor":0},{"start_s":373.58,"end_s":378.9,"text":"in the 
correct slots, which is not always super intuitive, so make sure to consult the manual.","speaker":null,"is_sponsor":0},{"start_s":378.9,"end_s":382.34,"text":"We don't need a GPU for now, though we could add one in the future.","speaker":null,"is_sponsor":0},{"start_s":382.34,"end_s":386.3,"text":"So that means all that's really left is these NVIDIA ConnectX 6 cards.","speaker":null,"is_sponsor":0},{"start_s":386.3,"end_s":389.34,"text":"Now, 100 gig networking might seem a bit overkill,","speaker":null,"is_sponsor":0},{"start_s":389.34,"end_s":395.02,"text":"but because our setup uses high speed drives in four servers and we want to be able","speaker":null,"is_sponsor":0},{"start_s":395.02,"end_s":399.34,"text":"to withstand two server failures, anytime we're writing data,","speaker":null,"is_sponsor":0},{"start_s":399.34,"end_s":402.58,"text":"it has to be simultaneously written to the drives","speaker":null,"is_sponsor":0},{"start_s":402.58,"end_s":408.06,"text":"on at least three machines. That ensures we have three up-to-date copies","speaker":null,"is_sponsor":0},{"start_s":408.06,"end_s":411.66,"text":"in the event of an unexpected failure. Now, if you were doing this at home,","speaker":null,"is_sponsor":0},{"start_s":411.66,"end_s":416.46,"text":"you obviously wouldn't want to spend this kind of money, but the good news is that you can do this","speaker":null,"is_sponsor":0},{"start_s":416.46,"end_s":421.18,"text":"with as few as two machines. 
And if you're not trying to run a high speed","speaker":null,"is_sponsor":0},{"start_s":421.18,"end_s":427.74,"text":"caching server for 100 people, 10 or 25 gig cards are available for a fraction of the price","speaker":null,"is_sponsor":0},{"start_s":427.74,"end_s":432.46,"text":"and you can connect them directly to each other without an expensive switch.","speaker":null,"is_sponsor":0},{"start_s":432.46,"end_s":435.7,"text":"I mean, even one gig could work for light applications","speaker":null,"is_sponsor":0},{"start_s":435.7,"end_s":439.3,"text":"like ensuring that your home automation system never goes down.","speaker":null,"is_sponsor":0},{"start_s":439.3,"end_s":443.3,"text":"Enough chitchat though. Let's get on with the demo and show you what happens","speaker":null,"is_sponsor":0},{"start_s":443.3,"end_s":447.14,"text":"if one of these things goes to heaven in a live environment.","speaker":null,"is_sponsor":0},{"start_s":447.14,"end_s":451.46,"text":"But not before we get them in the rack and set up, specifically here in the lab server room,","speaker":null,"is_sponsor":0},{"start_s":451.46,"end_s":456.02,"text":"because if you didn't notice earlier, the studio server room is kind of running out of space,","speaker":null,"is_sponsor":0},{"start_s":456.02,"end_s":460.1,"text":"at least until these machines are up and running and we can take the machine they're replacing out.","speaker":null,"is_sponsor":0},{"start_s":460.1,"end_s":464.46,"text":"Let's go grab the servers. Unfortunately, the rest of the machines are now magically built off of camera","speaker":null,"is_sponsor":0},{"start_s":464.46,"end_s":467.9,"text":"and we can just slide them in. What the hell is going on?","speaker":null,"is_sponsor":0},{"start_s":467.9,"end_s":472.26,"text":"Oh, there we go. Beautiful. 
These Gigabyte chassis come with nice tool-less rails.","speaker":null,"is_sponsor":0},{"start_s":472.26,"end_s":476.5,"text":"So installing these in our nice ginormous Hammond rack","speaker":null,"is_sponsor":0},{"start_s":476.5,"end_s":480.38,"text":"should be pretty easy. Yeah, look at that. Ooh, it's getting close.","speaker":null,"is_sponsor":0},{"start_s":480.38,"end_s":484.62,"text":"I can taste it. We just need networking. Like we mentioned before, a hundred gig,","speaker":null,"is_sponsor":0},{"start_s":484.62,"end_s":487.78,"text":"but what we didn't mention before is that each is getting two of them,","speaker":null,"is_sponsor":0},{"start_s":487.78,"end_s":490.9,"text":"specifically one to each of the network switches","speaker":null,"is_sponsor":0},{"start_s":490.9,"end_s":495.7,"text":"in the rack, that way if one of those switches has a problem, the servers will stay up","speaker":null,"is_sponsor":0},{"start_s":495.7,"end_s":500.26,"text":"and we even get an added bonus because it's some fancy Dell magic called VLT.","speaker":null,"is_sponsor":0},{"start_s":500.26,"end_s":505.02,"text":"We get the throughput of both of these cables. So 200 gig to each server.","speaker":null,"is_sponsor":0},{"start_s":505.02,"end_s":509.22,"text":"Pretty sick. All that's left then is power","speaker":null,"is_sponsor":0},{"start_s":509.22,"end_s":513.02,"text":"and like any other good server, IPMI, which is a management interface","speaker":null,"is_sponsor":0},{"start_s":513.02,"end_s":517.78,"text":"and allows us to control the machines. Even if they're not working, they have like a hardware problem.","speaker":null,"is_sponsor":0},{"start_s":517.78,"end_s":522.1,"text":"We can still access them. We can turn them on, turn them off. 
It's kind of magic.","speaker":null,"is_sponsor":0},{"start_s":522.1,"end_s":525.14,"text":"If you have a server that doesn't have IPMI, I don't know.","speaker":null,"is_sponsor":0},{"start_s":525.14,"end_s":530.82,"text":"I don't even know if that's a server really. There are two main elements to making this setup work:","speaker":null,"is_sponsor":0},{"start_s":530.82,"end_s":533.86,"text":"clustering the hypervisor, which controls our virtual machines,","speaker":null,"is_sponsor":0},{"start_s":533.86,"end_s":537.86,"text":"and clustering the storage, which you can skip if you have existing network storage","speaker":null,"is_sponsor":0},{"start_s":537.86,"end_s":540.94,"text":"you wanna use instead. If you're not interested in how to set this up,","speaker":null,"is_sponsor":0},{"start_s":540.94,"end_s":544.34,"text":"you can skip ahead to here to see what it's like when it's up and running.","speaker":null,"is_sponsor":0},{"start_s":544.34,"end_s":548.22,"text":"This isn't gonna be a perfect step-by-step guide, but with the documentation you can find","speaker":null,"is_sponsor":0},{"start_s":548.22,"end_s":551.98,"text":"down in the description, you should be able to replicate this setup pretty easily.","speaker":null,"is_sponsor":0},{"start_s":551.98,"end_s":555.5,"text":"Starting with networking, we added both of our 100 gig ports to a bond,","speaker":null,"is_sponsor":0},{"start_s":555.5,"end_s":559.42,"text":"created a bridge, and then added a VLAN for each of three different networks.","speaker":null,"is_sponsor":0},{"start_s":559.42,"end_s":562.62,"text":"One for our VMs to use, one for cluster communication,","speaker":null,"is_sponsor":0},{"start_s":562.62,"end_s":567.34,"text":"and one for the storage. 
They can all technically run on the same network,","speaker":null,"is_sponsor":0},{"start_s":567.34,"end_s":571.54,"text":"but the cluster needs low latency and the storage ideally uses jumbo frames.","speaker":null,"is_sponsor":0},{"start_s":571.54,"end_s":577.02,"text":"So splitting it up like this is best practice. You'll also need to add each node's cluster network IP address","speaker":null,"is_sponsor":0},{"start_s":577.02,"end_s":580.32,"text":"to the hosts file on each node. With the networking up and running,","speaker":null,"is_sponsor":0},{"start_s":580.32,"end_s":584.04,"text":"enable the no-subscription repo and disable the enterprise repo.","speaker":null,"is_sponsor":0},{"start_s":584.04,"end_s":587.44,"text":"The no-subscription repo is not recommended by the Proxmox team for production.","speaker":null,"is_sponsor":0},{"start_s":587.44,"end_s":590.6,"text":"They want you to pay for the enterprise repo, which is a bit more stable,","speaker":null,"is_sponsor":0},{"start_s":590.6,"end_s":595.0,"text":"but the free one is totally fine for a home setup. Run any pending updates before proceeding,","speaker":null,"is_sponsor":0},{"start_s":595.0,"end_s":599.46,"text":"then make sure you have a reliable and ideally local time server configured","speaker":null,"is_sponsor":0},{"start_s":599.46,"end_s":603.56,"text":"on each of your individual servers, as the clustering software wants the time","speaker":null,"is_sponsor":0},{"start_s":603.56,"end_s":606.76,"text":"very closely in sync to stay happy. With that out of the way,","speaker":null,"is_sponsor":0},{"start_s":606.76,"end_s":611.64,"text":"we can set up our cluster, which handles syncing the configuration and management of any virtual machines","speaker":null,"is_sponsor":0},{"start_s":611.64,"end_s":615.92,"text":"between our physical machines, and it also orchestrates migrating","speaker":null,"is_sponsor":0},{"start_s":615.92,"end_s":621.7,"text":"or restoring them when a machine goes down. 
Creating the cluster just takes actually a few clicks,","speaker":null,"is_sponsor":0},{"start_s":621.7,"end_s":625.1,"text":"but you might want to consider the size of your setup before you continue.","speaker":null,"is_sponsor":0},{"start_s":625.1,"end_s":629.56,"text":"That's because in order to make sure everything stays in sync in case of an issue with a machine,","speaker":null,"is_sponsor":0},{"start_s":629.56,"end_s":635.1,"text":"you need the majority of servers online and available to be able to say, hey, I see that one's offline,","speaker":null,"is_sponsor":0},{"start_s":635.1,"end_s":638.84,"text":"but we're still good. They call this quorum.","speaker":null,"is_sponsor":0},{"start_s":638.84,"end_s":642.68,"text":"If you have an even number of machines, let's say four, like we do,","speaker":null,"is_sponsor":0},{"start_s":642.68,"end_s":646.82,"text":"and each server gets the default single say or vote,","speaker":null,"is_sponsor":0},{"start_s":646.82,"end_s":652.96,"text":"the minimum possible majority is then three servers. So that means we can only withstand one going down,","speaker":null,"is_sponsor":0},{"start_s":652.96,"end_s":656.08,"text":"which is the same amount of redundancy you'd get if you had three machines,","speaker":null,"is_sponsor":0},{"start_s":656.08,"end_s":659.92,"text":"because you can only lose one to have two. 
If you only have two computers,","speaker":null,"is_sponsor":0},{"start_s":659.92,"end_s":662.92,"text":"then you only ever have a majority when both are online,","speaker":null,"is_sponsor":0},{"start_s":662.92,"end_s":665.92,"text":"which obviously doesn't work; that's not safe.","speaker":null,"is_sponsor":0},{"start_s":665.96,"end_s":669.08,"text":"But you can skirt around this by adding a third machine,","speaker":null,"is_sponsor":0},{"start_s":669.08,"end_s":672.34,"text":"like say a Raspberry Pi to be a tiebreaker,","speaker":null,"is_sponsor":0},{"start_s":672.34,"end_s":676.12,"text":"but that's kind of beyond the scope of this video. Once you're ready, select the cluster network","speaker":null,"is_sponsor":0},{"start_s":676.12,"end_s":679.68,"text":"in the creation menu, and then join the other machines to the cluster.","speaker":null,"is_sponsor":0},{"start_s":679.68,"end_s":683.3,"text":"Once they're in, you should be able to see them in the web GUI of any of the machines.","speaker":null,"is_sponsor":0},{"start_s":683.3,"end_s":688.84,"text":"Now on to clustering our storage. By default, Proxmox is very heavily integrated with Ceph,","speaker":null,"is_sponsor":0},{"start_s":688.84,"end_s":692.8,"text":"an open source distributed storage system that's pretty easy to set up and maintain.","speaker":null,"is_sponsor":0},{"start_s":692.8,"end_s":697.8,"text":"With that in mind, newbies should start with Ceph, and you can follow the great tutorial on their wiki,","speaker":null,"is_sponsor":0},{"start_s":697.8,"end_s":703.6,"text":"but it isn't the most performant in a small cluster like this. 
So we're gonna be using something called LINSTOR with DRBD,","speaker":null,"is_sponsor":0},{"start_s":703.6,"end_s":708.0,"text":"or Distributed Replicated Block Device, another open source storage system.","speaker":null,"is_sponsor":0},{"start_s":708.0,"end_s":713.2,"text":"It requires a bit more manual configuration, but they do have a purpose-built tutorial for Proxmox","speaker":null,"is_sponsor":0},{"start_s":713.2,"end_s":716.88,"text":"and host the files for free with an optional paid enterprise version","speaker":null,"is_sponsor":0},{"start_s":716.88,"end_s":722.08,"text":"that operates on a similar model to Proxmox itself. Unlike Ceph, it doesn't handle its own storage devices,","speaker":null,"is_sponsor":0},{"start_s":722.08,"end_s":725.64,"text":"so we mirrored our two Kioxia SSDs with ZFS first,","speaker":null,"is_sponsor":0},{"start_s":725.64,"end_s":728.92,"text":"and then pointed LINSTOR to that. Once it's installed and configured,","speaker":null,"is_sponsor":0},{"start_s":728.92,"end_s":732.88,"text":"then you can add the clustered storage to Proxmox, create a virtual machine with that storage,","speaker":null,"is_sponsor":0},{"start_s":732.88,"end_s":737.24,"text":"and it'll automatically be replicated in real time to the number of other nodes you specify.","speaker":null,"is_sponsor":0},{"start_s":737.24,"end_s":741.48,"text":"And if you happen to migrate a VM to a server that doesn't have a copy on it,","speaker":null,"is_sponsor":0},{"start_s":741.48,"end_s":747.52,"text":"it'll automatically stream the data over the network from one of those nodes in what they call diskless mode.","speaker":null,"is_sponsor":0},{"start_s":747.52,"end_s":750.28,"text":"But let's just try it. Hey.","speaker":null,"is_sponsor":0},{"start_s":751.52,"end_s":755.04,"text":"Pretty nice, right? Looking good. It's like even cable-managed.","speaker":null,"is_sponsor":0},{"start_s":755.04,"end_s":758.56,"text":"I know, right? So 200 gig on each of them? 
Nice.","speaker":null,"is_sponsor":0},{"start_s":758.56,"end_s":763.0,"text":"Who are you people, and what have you done with our infrateam? I made one small adjustment just for you.","speaker":null,"is_sponsor":0},{"start_s":763.0,"end_s":766.32,"text":"Look at the drives. They're in the same spot. No, they're not.","speaker":null,"is_sponsor":0},{"start_s":766.32,"end_s":770.08,"text":"The top one's different. I hate you so much. Why would you do that?","speaker":null,"is_sponsor":0},{"start_s":770.08,"end_s":773.4,"text":"But more importantly, does it work? Yeah, obviously.","speaker":null,"is_sponsor":0},{"start_s":773.4,"end_s":776.88,"text":"Okay, well, here's your Windows desktop. Obviously, he says. Well, what?","speaker":null,"is_sponsor":0},{"start_s":776.88,"end_s":780.64,"text":"Editor, a super cut of things not working here, please.","speaker":null,"is_sponsor":0},{"start_s":780.64,"end_s":783.96,"text":"Jake, we have a leak. Oh, God. One failure.","speaker":null,"is_sponsor":0},{"start_s":783.96,"end_s":787.44,"text":"You just downgraded my Wi-Fi. Four drives aren't working?","speaker":null,"is_sponsor":0},{"start_s":787.44,"end_s":791.32,"text":"Did you actually break it? Anyways, you see our Windows, right?","speaker":null,"is_sponsor":0},{"start_s":791.32,"end_s":794.8,"text":"Yeah. Our Windows is running right now on number four,","speaker":null,"is_sponsor":0},{"start_s":794.8,"end_s":801.56,"text":"which is the bottom server. Yes. Now, obviously, remoting into the machine over Wi-Fi.","speaker":null,"is_sponsor":0},{"start_s":801.56,"end_s":805.84,"text":"Okay, the video playback's a little bit choppy. That's not gonna affect the type of workload","speaker":null,"is_sponsor":0},{"start_s":805.84,"end_s":811.68,"text":"you would normally be running on something like this, like a DNS server, or like are we finally","speaker":null,"is_sponsor":0},{"start_s":811.68,"end_s":814.72,"text":"doing Active Directory? 
We will, not today.","speaker":null,"is_sponsor":0},{"start_s":814.72,"end_s":819.12,"text":"Not today, but we can now. But this is the kind of setup that you want","speaker":null,"is_sponsor":0},{"start_s":819.12,"end_s":822.56,"text":"for something like AD. Live playing the video, let's migrate to number one,","speaker":null,"is_sponsor":0},{"start_s":822.56,"end_s":826.36,"text":"which is the top one. The process will be a little bit faster,","speaker":null,"is_sponsor":0},{"start_s":826.36,"end_s":830.58,"text":"but basically what it's doing is copying the memory,","speaker":null,"is_sponsor":0},{"start_s":830.58,"end_s":834.6,"text":"like the RAM, what's actually in memory. And then once it's done most of it,","speaker":null,"is_sponsor":0},{"start_s":834.6,"end_s":840.32,"text":"it pauses the operating system for a split second, copies the last tiny little bit, and boom.","speaker":null,"is_sponsor":0},{"start_s":840.32,"end_s":843.4,"text":"That is so cool. You're exactly where you were before","speaker":null,"is_sponsor":0},{"start_s":843.4,"end_s":846.92,"text":"because the storage is already there. Right.","speaker":null,"is_sponsor":0},{"start_s":846.92,"end_s":850.4,"text":"So in terms of actual downtime, like interruption to that experience.","speaker":null,"is_sponsor":0},{"start_s":850.4,"end_s":853.48,"text":"17 seconds. No, 270 milliseconds.","speaker":null,"is_sponsor":0},{"start_s":853.48,"end_s":856.84,"text":"Oh, I thought you were pointing at the other thing. 
No, 17 seconds is that whole process.","speaker":null,"is_sponsor":0},{"start_s":856.84,"end_s":860.2,"text":"Oh yeah, yeah, well that's kind of downtime, I guess.","speaker":null,"is_sponsor":0},{"start_s":860.2,"end_s":863.64,"text":"No, because if there was somebody using this like as a virtual desktop, for instance.","speaker":null,"is_sponsor":0},{"start_s":864.64,"end_s":867.72,"text":"They would see like a quarter of a second blink,","speaker":null,"is_sponsor":0},{"start_s":867.72,"end_s":873.68,"text":"and otherwise like nothing changed. I wanted to show a more realistic to us demo.","speaker":null,"is_sponsor":0},{"start_s":873.68,"end_s":876.84,"text":"Sure. Come hither. Here's a Plex server.","speaker":null,"is_sponsor":0},{"start_s":876.84,"end_s":880.08,"text":"We've got some videos on it, and this is on server number one.","speaker":null,"is_sponsor":0},{"start_s":880.08,"end_s":885.76,"text":"Okay. Let's play a video. Now we go and move our Plex server to a different machine.","speaker":null,"is_sponsor":0},{"start_s":885.76,"end_s":889.64,"text":"So it's copying the RAM at 2.5 gigabytes a second.","speaker":null,"is_sponsor":0},{"start_s":889.64,"end_s":893.4,"text":"So that's like 2.8 gigabytes a second, that's pretty good. We haven't done any actual,","speaker":null,"is_sponsor":0},{"start_s":893.4,"end_s":897.76,"text":"oh, it's already done. And no interruption, because video playback,","speaker":null,"is_sponsor":0},{"start_s":897.76,"end_s":902.84,"text":"like many other applications, uses buffers to hide small interruptions in the service.","speaker":null,"is_sponsor":0},{"start_s":902.84,"end_s":906.84,"text":"In this case, downloading the video in small chunks a little bit at a time.","speaker":null,"is_sponsor":0},{"start_s":906.84,"end_s":913.8,"text":"Yeah, roughly 10 second chunks it looks like here, which is plenty to cover that 146 milliseconds of downtime.","speaker":null,"is_sponsor":0},{"start_s":913.8,"end_s":916.84,"text":"Wow. 
You want to try a Steam download with LanCache?","speaker":null,"is_sponsor":0},{"start_s":916.84,"end_s":921.12,"text":"I mean, we should? Yeah, why not? Yep, we're CPU bottlenecked for sure.","speaker":null,"is_sponsor":0},{"start_s":921.16,"end_s":925.36,"text":"Using 80 to 90% of a 24-core Threadripper.","speaker":null,"is_sponsor":0},{"start_s":925.36,"end_s":928.48,"text":"But I realized I made a little bit of an oopsie here. Like you can see the CPU usage,","speaker":null,"is_sponsor":0},{"start_s":928.48,"end_s":932.72,"text":"we're using 4% of our eight CPUs that I assigned to this Steam cache.","speaker":null,"is_sponsor":0},{"start_s":932.72,"end_s":935.8,"text":"We can see our network traffic's going up. Sick.","speaker":null,"is_sponsor":0},{"start_s":935.8,"end_s":938.92,"text":"Except I made this as a container, not a VM.","speaker":null,"is_sponsor":0},{"start_s":938.92,"end_s":942.84,"text":"And the thing with containers, they're great. They're a little bit lighter weight, better performance,","speaker":null,"is_sponsor":0},{"start_s":942.84,"end_s":946.8,"text":"but they run within the kernel of the main system.","speaker":null,"is_sponsor":0},{"start_s":946.8,"end_s":951.72,"text":"It'll shut down that container and then just reboot on the other machine. Right, which means it's fine,","speaker":null,"is_sponsor":0},{"start_s":951.72,"end_s":958.2,"text":"but there'll be a longer downtime delay. But way less than, hey, is that thing working?","speaker":null,"is_sponsor":0},{"start_s":958.2,"end_s":961.84,"text":"Oh, I think the internet's not working. Somebody should go look at that. Yeah.","speaker":null,"is_sponsor":0},{"start_s":961.84,"end_s":965.44,"text":"Trying to figure out what's going on, fixing the machine, getting the machine back going. So cool.","speaker":null,"is_sponsor":0},{"start_s":965.44,"end_s":970.12,"text":"You're talking about the matter of a couple minutes maybe. Yeah. 
Now for the most impressive demo yet,","speaker":null,"is_sponsor":0},{"start_s":970.12,"end_s":974.0,"text":"the unexpected migration. Which one am I yanking?","speaker":null,"is_sponsor":0},{"start_s":974.0,"end_s":978.08,"text":"Okay, so number one has three VMs on it. They're all in the high availability.","speaker":null,"is_sponsor":0},{"start_s":978.08,"end_s":981.28,"text":"Jake's chain. Ah. What?","speaker":null,"is_sponsor":0},{"start_s":981.28,"end_s":984.32,"text":"It means teasing. Oh, I get it. Okay, sorry.","speaker":null,"is_sponsor":0},{"start_s":984.32,"end_s":988.2,"text":"Which one? I wasn't even listening to you. Number one. Number one.","speaker":null,"is_sponsor":0},{"start_s":988.2,"end_s":993.08,"text":"And we'll see how fast it does. We're looking at server one from server two.","speaker":null,"is_sponsor":0},{"start_s":993.08,"end_s":994.92,"text":"So go for it. Ah.","speaker":null,"is_sponsor":0},{"start_s":1000.64,"end_s":1005.4,"text":"From my understanding, this process takes a minute or two.","speaker":null,"is_sponsor":0},{"start_s":1005.4,"end_s":1008.6,"text":"Okay. Let's go, it's already detected the node is offline.","speaker":null,"is_sponsor":0},{"start_s":1008.6,"end_s":1014.4,"text":"Sure is. If you're doing scheduled maintenance, you can actually go and just shut off a machine.","speaker":null,"is_sponsor":0},{"start_s":1014.4,"end_s":1018.4,"text":"And then it will just be like, oh crap, I need to move all those things before I shut off.","speaker":null,"is_sponsor":0},{"start_s":1018.4,"end_s":1022.12,"text":"Which is a little bit nicer. In this case, it has to be like, sure,","speaker":null,"is_sponsor":0},{"start_s":1022.12,"end_s":1027.08,"text":"the server is down. So all three of those are yelling at what was this.","speaker":null,"is_sponsor":0},{"start_s":1027.08,"end_s":1031.12,"text":"Say, hello? What happened? Are you alive? 
What's going on?","speaker":null,"is_sponsor":0},{"start_s":1031.12,"end_s":1034.68,"text":"I can hear them. Hello? What happened? Are you alive?","speaker":null,"is_sponsor":0},{"start_s":1034.68,"end_s":1041.36,"text":"What's going on? Oh, it did something. So in theory, it should distribute them evenly","speaker":null,"is_sponsor":0},{"start_s":1041.36,"end_s":1044.6,"text":"because that's the option that's set right now. Right.","speaker":null,"is_sponsor":0},{"start_s":1044.6,"end_s":1049.76,"text":"In terms of its workload, you mean? Yeah. There is also a mode that does like resource checking.","speaker":null,"is_sponsor":0},{"start_s":1049.76,"end_s":1054.44,"text":"Sure. But right now it's going, how many VMs are in each one and just like filling that number so it's even.","speaker":null,"is_sponsor":0},{"start_s":1054.44,"end_s":1058.08,"text":"That is so cool. Okay, so what service was running on that one?","speaker":null,"is_sponsor":0},{"start_s":1058.08,"end_s":1062.4,"text":"Was that the steam cache? So we should go download a game. You could go do Plex right now too.","speaker":null,"is_sponsor":0},{"start_s":1062.4,"end_s":1070.24,"text":"Let's go do it. Let's go do it. Come on, let's go. And no movie magic, but also magic, virtualization magic.","speaker":null,"is_sponsor":0},{"start_s":1071.44,"end_s":1075.16,"text":"This is flipping awesome. And it's going to be an absolute game changer","speaker":null,"is_sponsor":0},{"start_s":1075.16,"end_s":1078.52,"text":"for the way that we manage our infrastructure. 
And like I said at the beginning,","speaker":null,"is_sponsor":0},{"start_s":1078.52,"end_s":1084.34,"text":"I think the coolest thing about it is that this type of architecture doesn't even have to run","speaker":null,"is_sponsor":0},{"start_s":1084.34,"end_s":1088.68,"text":"on the kind of Emerald Rapids latest server technology","speaker":null,"is_sponsor":0},{"start_s":1088.68,"end_s":1091.88,"text":"that Intel and Gigabyte and Micron and NVIDIA","speaker":null,"is_sponsor":0},{"start_s":1091.88,"end_s":1095.96,"text":"all sent over here. So the takeaway for you guys is whether it's for work","speaker":null,"is_sponsor":0},{"start_s":1095.96,"end_s":1099.92,"text":"or whether it's just for your home automation or your Plex server at home,","speaker":null,"is_sponsor":0},{"start_s":1099.92,"end_s":1105.2,"text":"something like this is absolutely attainable with potentially very little financial outlay.","speaker":null,"is_sponsor":0},{"start_s":1105.2,"end_s":1108.44,"text":"Go buy some used like eighth gen Intel core processors.","speaker":null,"is_sponsor":0},{"start_s":1108.44,"end_s":1112.52,"text":"Those are pretty cheap. Some cheap DDR4 and you're off to the races.","speaker":null,"is_sponsor":0},{"start_s":1112.52,"end_s":1116.12,"text":"Or if you're doing this more properly for your business, check out Intel Emerald Rapids","speaker":null,"is_sponsor":0},{"start_s":1116.12,"end_s":1120.52,"text":"and their whole line of Xeon and GPU products down below.","speaker":null,"is_sponsor":0},{"start_s":1121.6,"end_s":1124.76,"text":"Where were you pointing? Down below, that's the description.","speaker":null,"is_sponsor":0},{"start_s":1126.48,"end_s":1128.48,"text":"Get your mind out of the description.","speaker":null,"is_sponsor":0}],"full_text":"Computer problems are a fact of life. And sometimes the fix is as simple as just turning it off and turning it back on again, but other times it's not. 
And when the system you're talking about is running an air traffic control system, controlling a bunch of ATMs, or say routing 911 calls, keeping them up and running can be a matter of life and death. Now, the stakes aren't nearly as high for us, but this server here runs multiple apps that we rely on every day, accelerates our game downloads with Steam caching, and it runs our DNS. If that service goes down, it breaks literally everyone in the company's internet, which my boss informs me, isn't great. So how do we make it more reliable? It's already a server. We build more servers. And what's really cool about this is everything we're about to show you, courtesy of Intel, who sponsored this video and sent over their new Emerald Rapids Xeon CPUs can be done on nearly any computer, even your dad's old Dell. That is, as long as you have more than one. So if one leaves for cigarettes, we can still play catch. More than one Dell, not more than one dad. Oh. Well, anyways, I'm done. Do you want to check this out? Yeah, let's have a look. You got your lovely cat picture, your crab rave on that computer. Watch this. Like I can, yeah, I can interact with this. Let's just give it a second. Okay. It's going. Now it's on this computer and like no bamboozle. Here, look, watch. Whoa, buddy. Watch, watch, watch. Boom, unplugged. I can just completely interact with this as I normally would. So what's going on here? What you guys just saw was the programs, the lovely drawing, the entire operating system, just teleporting from the computer over here to the one over here. No trickery. This is possible thanks to the magic of virtualization. We've talked about it before, but if you're not familiar, virtualization allows you to slice up a single machine into multiple less powerful virtual machines. And this setup leverages that technology to allow us to move these virtual machines between multiple physical computers. That way if one breaks, another one can immediately take its place. 
And the best part is that, while this all sounds super fancy, all the software we're using is both open source and free. And we're going to show you guys how the setup works in a little bit. First, I want to take a look at the servers we're going to be using for our setup. Gigabyte sent over four of their R163 SG2 AAC1 servers. These are bare bones. So we're going to have to add a few of our own parts, but we should be able to build this in what? Like five minutes? I'd like to see you try. This guy, we're going to add some of our own parts, starting with a pair of Patriot 480 gig SATA SSDs that will function as a mirrored boot drive. This kind of per machine redundancy isn't strictly speaking necessary because we could lose an entire machine in our configuration without having any issues. But having them in pairs makes our lives easier in the future potentially. Since if one of them fails, we can just replace it and then rebuild it from the other one. Then on the other side of the machine, we're installing two of these Kioxia CD6 7TB drives for fast bulk storage. That leaves us six more SATA bays to do nothing with and two more NVMe bays for potential future expansion. Moving back, let's get our CPU installed. We're using a Xeon Platinum 8562Y Plus in each node. These were graciously provided by Intel and with 32 cores, 64 threads and 4.1 gigahertz max turbo clock speeds. These are going to give us a ton of compute to share between our virtual machines, all at a modest 300 watt TDP. We're going to have it and the rest of the parts linked in the video description. Now, I've never installed in this socket before, so good luck me. Step one is to install the carrier on the CPU and you can tell which one of the three you're supposed to use by the little marking right there on the CPU IHS. Line up our little golden triangle with our gigantic gargantuan hole in the whole thing triangle. Oh, this is adorable. 
It's got a cute little arm so you can break the thermal paste seal with the cooler so you can get the cooler and the CPU separated more easily. Love to see it. Speaking of thermal paste, we're going to be using a Honeywell PTM7950 pad, available at lttstore.com. This stuff is absolutely perfect for a server install because it lasts not forever, but for a very, very long time without maintenance. Now, you might think, okay, go ahead, put it onto the CPU socket, you'd be wrong. Instead, I'm going to install it onto the cooler. I'm going to know how to do that in a sec. So arrow and arrow. So maybe, ah, ah, ah, hey, there we go. Damn, look at that vapor chamber. Love me a vapor chamber. Okay, we're going to make sure all these are clicked into place. Look for our little arrow here. Line that up with the arrow on the socket and make sure that the locks are in their unlocked position then you should be able to just... That's it's locked. Oh, that's it. Okay, next comes something you don't see me do very often and that is use a screwdriver other than the LTT screwdriver. And that's because these need to be torqued to a specific value. That is 6.9 inch pounds. Nice. It's so cool to think that if I was doing this, you know, performing maintenance on the server, upgrading a bad RAM stick, our entire operation could be chugging along as if nothing happened. Speaking of RAM, we've gone with four 96 gig Micron, 5,600 megatransfer per second registered ECC DIMMs. That's a somewhat unconventional choice because especially in a server, giving up half of the memory channels means that we will be giving up some performance, but we don't really need all of the performance for now and 384 gigs is a ton of capacity for our needs at the moment. And of course, if anything changes, we can always add more without any downtime to our services. 
The only thing that's really important here then is making sure that we install our sticks in the correct slots, which is not always super intuitive, so make sure to consult the manual. We don't need a GPU for now, though we could add one in the future. So that means all that's really left is these NVIDIA ConnectX 6 cards. Now, 100 gig networking might seem a bit overkill, but because our setup uses high speed drives in four servers and we want to be able to withstand two server failures, anytime we're writing data, it has to be simultaneously written to the drives on at least three machines. That ensures we have three up-to-date copies in the event of an unexpected failure. Now, if you were doing this at home, you obviously wouldn't want to spend this kind of money, but the good news is that you can do this with as few as two machines. And if you're not trying to run a high speed caching server for 100 people, 10 or 25 gig cards are available for a fraction of the price and you can connect them directly to each other without an expensive switch. I mean, even one gig could work for light applications like ensuring that your home automation system never goes down. Enough chitchat though. Let's get on with the demo and show you what happens if one of these things goes to heaven in a live environment. But not before we get them in the rack and set up, specifically here in the lab server room, because if you didn't notice earlier, the studio server room is kind of running out of space, at least until these machines are up and running and we can take the machine they're replacing out. Let's go grab the servers. Unfortunately, the rest of the machines are now magically built off of camera and we can just slide them in. What the hell is going on? Oh, there we go. Beautiful. These Gigabyte chassis come with nice tool-less rails. So installing these in our nice ginormous Hammond rack should be pretty easy. Yeah, look at that. Ooh, it's getting close. I can taste it. 
We just need networking. Like we mentioned before, a hundred gig, but what we didn't mention before is that each is getting two of them, specifically one to each of the network switches in the rack, that way if one of those switches has a problem, the servers will stay up and we even get an added bonus because of some fancy Dell magic called VLT. We get the throughput of both of these cables. So 200 gig to each server. Pretty sick. All that's left then is power and like any other good server, IPMI, which is a management interface and allows us to control the machines. Even if they're not working, they have like a hardware problem. We can still access them. We can turn them on, turn them off. It's kind of magic. If you have a server that doesn't have IPMI, I don't know. I don't even know if that's a server really. There are two main elements to making this setup work. Clustering the hypervisor, which controls our virtual machines and clustering the storage, which you can skip if you have existing network storage you wanna use instead. If you're not interested in how to set this up, you can skip ahead to here to see what it's like when it's up and running. This isn't gonna be a perfect step-by-step guide, but with the documentation you could find down in the description, you should be able to replicate this setup pretty easily. Starting with networking, we added both of our 100 gig ports to a bond, created a bridge, and then added a VLAN for three different networks. One for our VMs to use, one for cluster communication, and one for the storage. They can all technically run on the same network, but the cluster needs low latency and the storage ideally uses jumbo frames. So splitting it up like this is best practice. You'll also need to add each node's cluster network IP address to the hosts file on each node. With the networking up and running, enable the no-subscription repo and disable the enterprise repo. It's not recommended by the Proxmox team for production. 
They want you to pay for the enterprise repo, which is a bit more stable, but the free one is totally fine for a home setup. Run any pending updates before proceeding, then make sure you have a reliable and ideally local time server configured on each of your individual servers as the clustering software wants the time very closely in sync to stay happy. With that out of the way, we can set up our cluster, which handles syncing the configuration and management of any virtual machines between our physical machines, and it also orchestrates migrating or restoring them when a machine goes down. Creating the cluster just takes actually a few clicks, but you might want to consider the size of your setup before you continue. That's because in order to make sure everything stays in sync in case of an issue with a machine, you need the majority of servers online and available to be able to say, hey, I see that one's offline, but we're still good. They call this quorum. If you have an even number of machines, let's say four, like we do, and each server gets the default single say or vote, the minimum possible majority is then three servers. So that means we can only withstand one going down, which is the same amount of redundancy you'd get if you had three machines, because you can only lose one to have two. If you only have two computers, then you only ever have a majority when both are online, which obviously doesn't work, that's not safe. But you can skirt around this by adding a third machine, like say a Raspberry Pi to be a tiebreaker, but that's kind of beyond the scope of this video. Once you're ready, select the cluster network in the creation menu, and then join the other machines to the cluster. Once they're in, you should be able to see them in the web GUI of any of the machines. Now on to clustering our storage. By default, Proxmox is very heavily integrated with Ceph, an open source distributed storage system that's pretty easy to set up and maintain. 
With that in mind, newbies should start with Ceph, and you can follow the great tutorial on their wiki, but it isn't the most performant in a small cluster like this. So we're gonna be using something called LINSTOR with DRBD, or Distributed Replicated Block Devices, another open source storage system. It requires a bit more manual configuration, but they do have a purpose-built tutorial for Proxmox and host the files for free with an optional paid enterprise version that operates on a similar model as Proxmox itself. Unlike Ceph, it doesn't handle its own storage devices, so we mirrored our two Kioxia SSDs with ZFS first, and then pointed LINSTOR to that. Once it's installed and configured, then you can add the clustered storage to Proxmox, create a virtual machine with that storage, and it'll automatically be replicated in real time to the number of other nodes you specify. And if you happen to migrate a VM to a server that doesn't have a copy on it, it'll automatically stream the data over the network from one of those nodes in what they call diskless mode. But let's just try it. Hey. Pretty nice, right? Looking good. It's like even cable-managed. I know, right? So 200 gig on each of them? Nice. Who are you people, and what have you done with our infra team? I made one small adjustment just for you. Look at the drives. They're in the same spot. No, they're not. The top one's different. I hate you so much. Why would you do that? But more importantly, does it work? Yeah, obviously. Okay, well, here's your Windows desktop. Obviously, he says. Well, what? Editor, a super cut of things not working here, please. Jake, we have a leak. Oh, God. One failure. You just downgraded my Wi-Fi. Four drives aren't working? Did you actually break it? Anyways, you see our Windows, right? Yeah. Our Windows is running right now on number four, which is the bottom server. Yes. Now, obviously, remoting into the machine over Wi-Fi. Okay, the video playback's a little bit choppy. 
That's not gonna affect the type of workload you would normally be running on something like this, like a DNS server, or like are we finally doing Active Directory? We will, not today. Not today, but we can now. But this is the kind of setup that you want for something like AD. Live playing the video, let's migrate to number one, which is the top one. The process will be a little bit faster, but basically what it's doing is copying the memory, like the RAM, what's actually in memory. And then once it's done most of it, it pauses the operating system for a split second, copies the last tiny little bit, and boom. That is so cool. You're exactly where you were before because the storage is already there. Right. So in terms of actual downtime, like interruption to that experience. 17 seconds. No, 270 milliseconds. Oh, I thought you were pointing at the other thing. No, 17 seconds is that whole process. Oh yeah, yeah, well that's kind of downtime, I guess. No, because if there was somebody using this like as a virtual desktop, for instance. They would see like a quarter of a second blink, and otherwise like nothing changed. I wanted to show a more realistic to us demo. Sure. Come hither. Here's a Plex server. We've got some videos on it, and this is on server number one. Okay. Let's play a video. Now we go and move our Plex server to a different machine. So it's copying the RAM at 2.5 gigabytes a second. So that's like 2.8 gigabytes a second, that's pretty good. We haven't done any actual, oh, it's already done. And no interruption, because video playback, like many other applications, uses buffers to hide small interruptions in the service. In this case, downloading the video in small chunks a little bit at a time. Yeah, roughly 10 second chunks it looks like here, which is plenty to cover that 146 milliseconds of downtime. Wow. You want to try Steam download with LanCache? I mean, we should? Yeah, why not? Yep, we're CPU bottlenecked for sure. 
Using 80 to 90% of a 24-core Threadripper. But I realized I made a little bit of an oopsie here. Like you can see the CPU usage, we're using 4% of our eight CPUs that I assigned to this Steam cache. We can see our network traffic's going up. Sick. Except I made this as a container, not a VM. And the thing with containers, they're great. They're a little bit lighter weight, better performance, but they run within the kernel of the main system. It'll shut down that container and then just reboot on the other machine. Right, which means it's fine, but there'll be a longer downtime delay. But way less than, hey, is that thing working? Oh, I think the internet's not working. Somebody should go look at that. Yeah. Trying to figure out what's going on, fixing the machine, getting the machine back going. So cool. You're talking about the matter of a couple minutes maybe. Yeah. Now for the most impressive demo yet, the unexpected migration. Which one am I yanking? Okay, so number one has three VMs on it. They're all in the high availability. Jake's chain. Ah. What? It means teasing. Oh, I get it. Okay, sorry. Which one? I wasn't even listening to you. Number one. Number one. And we'll see how fast it does. We're looking at server one from server two. So go for it. Ah. From my understanding, this process takes a minute or two. Okay. Let's go, it's already detected the node is offline. Sure is. If you're doing scheduled maintenance, you can actually go and just shut off a machine. And then it will just be like, oh crap, I need to move all those things before I shut off. Which is a little bit nicer. In this case, it has to be like, sure, the server is down. So all three of those are yelling at what was this. Say, hello? What happened? Are you alive? What's going on? I can hear them. Hello? What happened? Are you alive? What's going on? Oh, it did something. So in theory, it should distribute them evenly because that's the option that's set right now. Right. 
In terms of its workload, you mean? Yeah. There is also a mode that does like resource checking. Sure. But right now it's going, how many VMs are in each one and just like filling that number so it's even. That is so cool. Okay, so what service was running on that one? Was that the steam cache? So we should go download a game. You could go do Plex right now too. Let's go do it. Let's go do it. Come on, let's go. And no movie magic, but also magic, virtualization magic. This is flipping awesome. And it's going to be an absolute game changer for the way that we manage our infrastructure. And like I said at the beginning, I think the coolest thing about it is that this type of architecture doesn't even have to run on the kind of Emerald Rapids latest server technology that Intel and Gigabyte and Micron and NVIDIA all sent over here. So the takeaway for you guys is whether it's for work or whether it's just for your home automation or your Plex server at home, something like this is absolutely attainable with potentially very little financial outlay. Go buy some used like eighth gen Intel core processors. Those are pretty cheap. Some cheap DDR4 and you're off to the races. Or if you're doing this more properly for your business, check out Intel Emerald Rapids and their whole line of Xeon and GPU products down below. Where were you pointing? Down below, that's the description. Get your mind out of the description."}