Do Dual CPU Sockets Matter in 2018?

Linus Tech Tips · 2019-05-06 · 1,284 words · ~6 min read

Transcript

0:00 Xeon W,

0:05 Xeon Scalable.

0:08 You know, they're both called Xeon,

0:12 but these things are really different.

0:16 One of them is basically a Core i9 with

0:20 ECC memory support and the other one is

0:24 a server CPU that I fangirled all over

0:29 because I love super high-end expensive

0:32 tech toys. Now, in the past, you needed multiple

0:38 CPUs in one multi-socketed motherboard

0:43 in order to handle intensive multi-threaded workloads. But is that

0:48 still the case today? Do you still need

0:52 two of these given that a single Xeon

0:55 Platinum 8180 is 28 cores and 56 threads

1:02 on a single chip? Well, I don't know.

1:06 What is the purpose today of a dual

1:09 socket machine like this one, and how much have single high core count CPUs

1:16 eroded the market that they used to enjoy? Let's find out, shall we?

1:31 All right, there's a lot more room down here and we are going to need it for

1:35 this honking, not to mention heavy test bench. On this

1:40 test bench, you will find the ASUS C621E

1:44 Sage. This is a dual socket motherboard

1:48 rocking two LGA 3647 sockets for Intel's

1:54 Xeon Scalable lineup of CPUs. And setups

1:59 like this have actually been around as far back as the 486 in 1989

2:06 with the resulting secondhand hardware giving enthusiasts the ability to get

2:11 multiple physical cores in their homes

2:14 over the years, with the peak being somewhere in the mid-2000s or so. But

2:21 that was then and this is now. Now you

2:25 can get multiple processing cores in a

2:28 single chip. So

2:32 to see how far things have come, what we're going to do is pit this machine

2:38 against the fastest single CPU that

2:41 we've tested to date. We're going to try to keep the number of variables to a

2:46 minimum in order to gauge the impact that these extra CPU cores will have on

2:51 our setup. Though, it should be noted that there aren't many options when it

2:56 comes to aftermarket LGA 3647

2:59 coolers because most of the folks

3:02 selling these kinds of systems would figure out their own solution. So, that

3:07 means that our dual socket workstation will run a little bit toasty, but we

3:11 didn't observe any thermal throttling, so it shouldn't affect our performance.

3:16 Let's start off then with good old-fashioned

3:19 Cinebench. I mean, we've seen this run before, but it's always fun to see

3:25 it finish that quickly.

3:31 So, in a surprise to no one, the dual

3:34 socket machine is faster. But

3:38 considering its 56

3:41 processing cores, not all of our workloads scale in the way that we might

3:48 expect. 7-Zip, for example, shows a

3:51 smaller-than-expected gain over our Core i9 Extreme Edition, and y-cruncher

3:57 even finds itself losing ground. ASUS

4:00 RealBench demonstrates this, though with that said, the encoding benchmark

4:05 ekes out a lead over our Core i9 7980XE.

4:08 And then Blender. Well, here we actually get a victory for our dual CPU system

4:14 again, showing this platform's potential

4:17 for expanding render farms.
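
[Editor's sketch] One common way to reason about why 56 cores do not deliver a proportional gain is Amdahl's law, which the video does not name. The snippet below is a minimal illustration under stated assumptions: the parallel fractions are hypothetical and not measured from any benchmark in the video, and the model ignores clock speed, memory, and cross-socket effects. The 18-core figure is the core count of the Core i9 7980XE.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup over a single core when only part of the work runs in parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical parallel fractions, chosen only for illustration.
    for p in (0.90, 0.99):
        single_cpu = amdahl_speedup(p, 18)   # Core i9 7980XE: 18 cores
        dual_socket = amdahl_speedup(p, 56)  # two 28-core Xeon Platinum 8180s
        print(
            f"parallel fraction {p:.2f}: 18 cores -> {single_cpu:.1f}x, "
            f"56 cores -> {dual_socket:.1f}x, "
            f"relative gain {dual_socket / single_cpu:.2f}x"
        )
```

Under this model, a workload needs to be almost entirely parallel before tripling the core count comes close to tripling throughput, which is consistent with Blender benefiting more than 7-Zip here.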

4:21 But what is really going on here? Well,

4:27 something you guys have to realize is that there is more to a dual socket

4:33 configuration than just more cores. Do

4:37 you remember when AMD managed a 3%

4:40 improvement in IPC with second gen Ryzen

4:44 just by improving cache latency? So on

4:47 this motherboard, we've got two separate CPUs with two separate sets of cache and

4:56 memory. See, these six banks go to this

4:59 one and these six banks are wired into this one. And that means a lot of

5:04 latency for compute tasks that require

5:07 the same data sets. This latency is a

5:11 necessary evil in the design of multiprocessor systems because of the

5:15 need for non-uniform memory access, or

5:18 NUMA for short, which allows these two

5:22 processors to efficiently share resources or as efficiently as they can.
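
[Editor's sketch] The cost of reaching across to the other socket's memory can be pictured as a weighted average of local and remote access times. This is a toy model, not a measurement: the latency figures and the `remote_fraction` values are hypothetical placeholders, and the function name is made up for illustration.

```python
def average_latency_ns(local_ns: float, remote_ns: float, remote_fraction: float) -> float:
    """Average memory latency when a share of accesses goes to the other socket's memory."""
    if not 0.0 <= remote_fraction <= 1.0:
        raise ValueError("remote_fraction must be between 0 and 1")
    return (1.0 - remote_fraction) * local_ns + remote_fraction * remote_ns


# Hypothetical latencies, for illustration only (not measured on this system).
LOCAL_NS = 80.0    # access to memory attached to the CPU's own socket
REMOTE_NS = 140.0  # access to memory attached to the other socket

if __name__ == "__main__":
    scenarios = (
        ("each CPU works on its own data", 0.0),
        ("both CPUs share one data set", 0.5),
    )
    for label, remote_fraction in scenarios:
        latency = average_latency_ns(LOCAL_NS, REMOTE_NS, remote_fraction)
        print(f"{label}: about {latency:.0f} ns per access")
```

The more of a workload's data lives on the other socket, the closer the average drifts toward the remote figure, which is the penalty described for tasks that need the same data sets.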

5:27 So the short version of this is that it works by transparently allocating

5:31 devices and memory to each CPU which means they can more easily avoid

5:36 interrupting each other while accessing those resources. This in turn reduces

5:41 the amount of waiting around that they have to do for those resources to become

5:45 available. So that's what we're seeing during our testing, like in y-cruncher,

5:51 for example, where both CPUs are working

5:54 on the same data but it's not really the

5:57 intended use case for this kind of thing. What if we could use different

6:02 data sets? Then we should be able to find this kind of setup's true calling.

6:08 And how better to do that than to effectively turn this system into two

6:14 independent computing machines using virtualization? So let's fire up Unraid,

6:20 which uses Red Hat KVM as a hypervisor

6:23 to see what kind of results we get splitting these resources into multiple

6:29 independent machines. Immediately, we

6:33 see worse results from our VMs than our

6:36 original 56-core testing.
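
[Editor's sketch] Splitting the machine into two independent VMs amounts to giving each VM the cores of one socket so that its memory traffic stays local. The video does not show the actual hypervisor configuration, so the following is only an illustration: the function name is hypothetical, and it assumes core IDs are numbered contiguously socket by socket, which real systems do not guarantee.

```python
from typing import List


def split_cores_by_socket(sockets: int, cores_per_socket: int) -> List[List[int]]:
    """Return one list of core IDs per socket, assuming contiguous per-socket numbering."""
    if sockets < 1 or cores_per_socket < 1:
        raise ValueError("sockets and cores_per_socket must be positive")
    return [
        list(range(s * cores_per_socket, (s + 1) * cores_per_socket))
        for s in range(sockets)
    ]


if __name__ == "__main__":
    # Two Xeon Platinum 8180 CPUs with 28 cores each, one VM per socket.
    for vm_index, cores in enumerate(split_cores_by_socket(2, 28)):
        print(f"VM {vm_index}: cores {cores[0]}-{cores[-1]} ({len(cores)} cores)")
```

On a real host the core-to-socket mapping should be read from the operating system or hypervisor rather than assumed.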

6:40 But look closer at how much lower it is.

6:45 It's not a whole lot. In every test,

6:49 it's basically the same story here. And we are still way out ahead of the Core

6:55 i9 Extreme Edition, particularly when it comes to Blender. Now if we consider the

7:01 fact that we are getting simultaneous

7:04 work done that gives us a good look at

7:07 what an optimized workload might look like. I mean, or heck, virtualization

7:13 itself is a legitimate task too. I mean,

7:16 this thing could be so many gamers in

7:20 one PC. But I digress. I mean, nobody is going

7:24 to buy something like this for their personal rig anytime soon, given the

7:29 $10,000 per CPU price tag, which puts it

7:35 squarely in the territory of big, "the

7:39 size of the check doesn't matter" business. What they care about is

7:43 density. The more processing power a

7:47 single computer can manage, the more processing power that can be physically

7:52 fit into a building. And this is perhaps most important for data centers and

7:58 render farms in particular. The less those guys have to spend on setting up

8:03 the electricity and cooling management

8:07 for a data center versus the amount of

8:10 performance they can get, the better.

8:14 So, will multiple sockets make a comeback in

8:18 the prosumer space? Outside of, you know, oil and gas

8:23 exploration where there are still workloads that can benefit from this

8:28 kind of thing, the chances look pretty slim if you ask me. But I don't

8:35 necessarily think that it's Intel's intent to sell these chips in the

8:39 prosumer space. And for that matter, even

8:42 in the enterprise space, I don't think they move a ton of them. For me, I look

8:48 at a product like this as more of like a future-crafting exercise where it is

8:54 available today, but it's more of a representation of what actually might be

8:59 attainable a generation or two from now,

9:02 just like the 22-core processors that we were playing around with a couple of

9:07 years ago. Nowadays, those are much more

9:10 affordable and businesses are using them to power your cloud computing services.

9:17 So, thanks for watching, guys. If this video sucked, you know what to do. But

9:20 if it was awesome, get subscribed, hit that like button, or check out the link

9:24 to where to buy the stuff we featured in the video description. Also, linked down

9:27 there is our merch store, which has cool shirts like this one, and our community

9:31 forum, which you should totally join. Now I'm off to finally put this to use

9:37 for the reason that I obtained it.

9:41 Many video editors, one CPU.