1
00:00:00,800 --> 00:00:04,760
Xeon W,

2
00:00:05,120 --> 00:00:12,320
Xeon Scalable.

3
00:00:08,240 --> 00:00:16,240
You know, they're both called Xeon,

4
00:00:12,320 --> 00:00:20,640
but these things are really different.

5
00:00:16,240 --> 00:00:24,720
One of them is basically a Core i9 with

6
00:00:20,640 --> 00:00:29,039
ECC memory support and the other one is

7
00:00:24,720 --> 00:00:32,079
a server CPU that I fangirled all over

8
00:00:29,039 --> 00:00:34,480
because I love super high-end expensive

9
00:00:32,079 --> 00:00:43,360
tech toys. Now, in the past, you needed multiple

10
00:00:38,480 --> 00:00:45,520
CPUs in one multi-socketed motherboard

11
00:00:43,360 --> 00:00:52,480
in order to handle intensive multi-threaded workloads. But is that

12
00:00:48,960 --> 00:00:55,920
still the case today? Do you still need

13
00:00:52,480 --> 00:01:02,960
two of these given that a single Xeon

14
00:00:55,920 --> 00:01:06,400
Platinum 8180 is 28 cores and 56 threads

15
00:01:02,960 --> 00:01:09,840
on a single chip? Well, I don't know.

16
00:01:06,400 --> 00:01:12,640
What is the purpose today of a dual

17
00:01:09,840 --> 00:01:18,320
socket machine like this one, and how much have single high-core-count CPUs

18
00:01:16,000 --> 00:01:31,119
eroded the market that they used to enjoy? Let's find out, shall we?

19
00:01:31,119 --> 00:01:37,200
All right, there's a lot more room down here and we are going to need it for

20
00:01:35,280 --> 00:01:44,880
this honking, not to mention heavy, test bench. On this

21
00:01:40,479 --> 00:01:48,880
test bench, you will find the ASUS C621E

22
00:01:44,880 --> 00:01:54,960
Sage. This is a dual socket motherboard

23
00:01:48,880 --> 00:01:59,439
rocking two LGA 3647 sockets for Intel's

24
00:01:54,960 --> 00:02:02,159
Xeon Scalable lineup of CPUs. And setups

25
00:01:59,439 --> 00:02:09,200
like this have actually been around as far back as the 486 in 1989

26
00:02:06,880 --> 00:02:14,400
with the resulting secondhand hardware giving enthusiasts the ability to get

27
00:02:11,360 --> 00:02:16,640
multiple physical cores in their homes

28
00:02:14,400 --> 00:02:25,520
over the years, with the peak being somewhere in the mid-2000s or so. But

29
00:02:21,680 --> 00:02:28,879
that was then and this is now. Now you

30
00:02:25,520 --> 00:02:32,400
can get multiple processing cores in a

31
00:02:28,879 --> 00:02:35,200
single chip. So

32
00:02:32,400 --> 00:02:41,760
to see how far things have come, what we're going to do is pit this machine

33
00:02:38,400 --> 00:02:44,239
against the fastest single CPU that

34
00:02:41,760 --> 00:02:48,879
we've tested to date. We're going to try to keep the number of variables to a

35
00:02:46,400 --> 00:02:54,160
minimum in order to gauge the impact that these extra CPU cores will have on

36
00:02:51,599 --> 00:02:59,680
our setup. Though, it should be noted that there aren't many options when it

37
00:02:56,000 --> 00:03:02,720
comes to aftermarket LGA 3647

38
00:02:59,680 --> 00:03:05,360
coolers because most of the folks

39
00:03:02,720 --> 00:03:09,760
selling these kinds of systems would figure out their own solution. So, that

40
00:03:07,120 --> 00:03:13,599
means that our dual socket workstation will run a little bit toasty, but we

41
00:03:11,840 --> 00:03:18,080
didn't observe any thermal throttling, so it shouldn't affect our performance.

42
00:03:16,159 --> 00:03:22,640
Let's start off then with good old-fashioned

43
00:03:19,840 --> 00:03:30,360
Cinebench. I mean, we've seen this run before, but it's always fun to see

44
00:03:25,519 --> 00:03:30,360
it finish that quickly.

45
00:03:31,120 --> 00:03:38,239
So, in a surprise to no one, the dual

46
00:03:34,319 --> 00:03:41,760
socket machine is faster. But

47
00:03:38,239 --> 00:03:44,720
considering its 56

48
00:03:41,760 --> 00:03:51,599
processing cores, not all of our workloads scale in the way that we might

49
00:03:48,159 --> 00:03:53,599
expect. 7-Zip, for example, shows a

50
00:03:51,599 --> 00:04:00,959
smaller-than-expected gain over our Core i9 Extreme Edition, and y-cruncher

51
00:03:57,200 --> 00:04:03,599
even finds itself losing ground. ASUS

52
00:04:00,959 --> 00:04:08,879
RealBench demonstrates this, though with that said, the encoding benchmark

53
00:04:05,519 --> 00:04:11,439
ekes out a lead over our Core i9 7980XE.

54
00:04:08,879 --> 00:04:17,359
And then Blender. Well, here we actually get a victory for our dual CPU system,

55
00:04:14,239 --> 00:04:21,120
again showing this platform's potential

56
00:04:17,359 --> 00:04:23,199
for expanding render farms.

57
00:04:21,120 --> 00:04:29,759
But what is really going on here? Well,

58
00:04:27,600 --> 00:04:37,440
something you guys have to realize is that there is more to a dual socket

59
00:04:33,199 --> 00:04:40,960
configuration than just more cores. Do

60
00:04:37,440 --> 00:04:44,560
you remember when AMD managed a 3%

61
00:04:40,960 --> 00:04:47,919
improvement in IPC with second-gen Ryzen

62
00:04:44,560 --> 00:04:50,880
just by improving cache latency? So on

63
00:04:47,919 --> 00:04:59,280
this motherboard, we've got two separate CPUs with two separate sets of cache and

64
00:04:56,160 --> 00:05:02,160
memory. See, these six banks go to this

65
00:04:59,280 --> 00:05:07,919
one and these six banks are wired into this one. And that means a lot of

66
00:05:04,880 --> 00:05:11,280
latency for compute tasks that require

67
00:05:07,919 --> 00:05:13,039
the same data sets. This latency is a

68
00:05:11,280 --> 00:05:18,880
necessary evil in the design of multiprocessor systems because of the

69
00:05:15,520 --> 00:05:22,240
need for non-uniform memory access, or

70
00:05:18,880 --> 00:05:24,000
NUMA for short, which allows these two

71
00:05:22,240 --> 00:05:29,520
processors to efficiently share resources, or as efficiently as they can.

72
00:05:27,440 --> 00:05:34,560
So the short version of this is that it works by transparently allocating

73
00:05:31,759 --> 00:05:38,880
devices and memory to each CPU, which means they can more easily avoid

74
00:05:36,479 --> 00:05:43,600
interrupting each other while accessing those resources. This in turn reduces

75
00:05:41,600 --> 00:05:48,639
the amount of waiting around that they have to do for those resources to become

76
00:05:45,840 --> 00:05:54,639
available. So that's what we're seeing during our testing, like in y-cruncher,

77
00:05:51,360 --> 00:05:57,919
for example, where both CPUs are working

78
00:05:54,639 --> 00:05:59,520
on the same data, but it's not really the

79
00:05:57,919 --> 00:06:05,759
intended use case for this kind of thing. What if we could use different

80
00:06:02,960 --> 00:06:10,800
data sets? Then we should be able to find this kind of setup's true calling.

81
00:06:08,800 --> 00:06:17,280
And how better to do that than to effectively turn this system into two

82
00:06:14,319 --> 00:06:23,600
independent computing machines using virtualization? So let's fire up Unraid,

83
00:06:20,240 --> 00:06:26,160
which uses Red Hat KVM as a hypervisor

84
00:06:23,600 --> 00:06:33,280
to see what kind of results we get splitting these resources into multiple

85
00:06:29,759 --> 00:06:36,880
independent machines. Immediately, we

86
00:06:33,280 --> 00:06:40,080
see worse results from our VMs than our

87
00:06:36,880 --> 00:06:45,120
original 56-core testing.

88
00:06:40,080 --> 00:06:49,120
But look closer at how much lower it is.

89
00:06:45,120 --> 00:06:52,000
It's not a whole lot. In every test,

90
00:06:49,120 --> 00:06:57,759
it's basically the same story here. And we are still way out ahead of the Core

91
00:06:55,039 --> 00:07:04,479
i9 Extreme Edition, particularly when it comes to Blender. Now if we consider the

92
00:07:01,360 --> 00:07:07,520
fact that we are getting simultaneous

93
00:07:04,479 --> 00:07:10,240
work done that gives us a good look at

94
00:07:07,520 --> 00:07:16,240
what an optimized workload might look like. I mean, or heck, virtualization

95
00:07:13,039 --> 00:07:20,080
itself is a legitimate task too. I mean

96
00:07:16,240 --> 00:07:22,240
this thing could be so many gamers in

97
00:07:20,080 --> 00:07:26,960
one PC. But I digress. I mean, nobody is going

98
00:07:24,720 --> 00:07:31,360
to buy something like this for their personal rig anytime soon, given the

99
00:07:29,680 --> 00:07:39,280
$10,000 per CPU price tag, which puts it

100
00:07:35,280 --> 00:07:41,039
squarely in the territory of big, "the

101
00:07:39,280 --> 00:07:47,520
size of the check doesn't matter" business. What they care about is

102
00:07:43,840 --> 00:07:50,080
density. The more processing power a

103
00:07:47,520 --> 00:07:55,840
single computer can manage, the more processing power that can be physically

104
00:07:52,880 --> 00:08:01,280
fit into a building. And this is perhaps most important for data centers and

105
00:07:58,800 --> 00:08:07,199
render farms in particular. The less those guys have to spend on setting up

106
00:08:03,919 --> 00:08:10,240
the electricity and cooling management

107
00:08:07,199 --> 00:08:14,479
for a data center versus the amount of

108
00:08:10,240 --> 00:08:16,160
performance they can get, the better.

109
00:08:14,479 --> 00:08:20,800
So, will multiple sockets make a comeback in

110
00:08:18,400 --> 00:08:26,319
the prosumer space? Outside of, you know, oil and gas

111
00:08:23,440 --> 00:08:31,039
exploration where there are still workloads that can benefit from this

112
00:08:28,240 --> 00:08:36,959
kind of thing, the chances look pretty slim if you ask me. But I don't

113
00:08:35,039 --> 00:08:42,880
necessarily think that it's Intel's intent to sell these chips in the

114
00:08:39,279 --> 00:08:45,519
prosumer space. And for that matter, even

115
00:08:42,880 --> 00:08:51,279
in the enterprise space, I don't think they move a ton of them. For me, I look

116
00:08:48,800 --> 00:08:57,200
at a product like this as more of like a future-crafting exercise where it is

117
00:08:54,480 --> 00:09:02,880
available today, but it's more of a representation of what actually might be

118
00:08:59,839 --> 00:09:05,440
attainable a generation or two from now,

119
00:09:02,880 --> 00:09:10,800
just like the 22-core processors that we were playing around with a couple of

120
00:09:07,040 --> 00:09:13,519
years ago. Nowadays, those are much more

121
00:09:10,800 --> 00:09:18,640
affordable and businesses are using them to power your cloud computing services.

122
00:09:17,120 --> 00:09:22,640
So, thanks for watching, guys. If this video sucked, you know what to do. But

123
00:09:20,640 --> 00:09:26,000
if it was awesome, get subscribed, hit that like button, or check out the link

124
00:09:24,320 --> 00:09:29,120
to where to buy the stuff we featured in the video description. Also, linked down

125
00:09:27,839 --> 00:09:33,760
there is our merch store, which has cool shirts like this one, and our community

126
00:09:31,040 --> 00:09:41,360
forum, which you should totally join. Now I'm off to finally put this to use

127
00:09:37,440 --> 00:09:46,760
for the reason that I obtained it.

128
00:09:41,360 --> 00:09:46,760
Many video editors, one CPU.
