1
00:00:00,000 --> 00:00:05,200
If you're watching this right now, you're probably using at least one ARM CPU to do it.

2
00:00:05,200 --> 00:00:10,000
Or, well, not an ARM CPU, because ARM doesn't actually make CPUs. Or do they?

3
00:00:10,800 --> 00:00:16,960
That's the big news that they sponsored us down here to their ARM Everywhere event to announce.

4
00:00:16,960 --> 00:00:25,760
Behind me, and in my hand, is the ARM AGI CPU, built for performance, scale, and, as always,

5
00:00:26,720 --> 00:00:33,120
up to 136 ARM Neoverse V3 cores with two megabytes of level 2 cache each,

6
00:00:33,120 --> 00:00:41,120
built on TSMC's 3nm process node, and it can run at up to 3.6GHz, which, right out of the gate,

7
00:00:41,120 --> 00:00:47,760
raises some questions, doesn't it? Just 3.6GHz? Can it, like, dynamically boost a single core

8
00:00:47,760 --> 00:00:55,120
way higher or something? No. And, according to ARM, that's actually a key feature, not a bug.

9
00:00:55,120 --> 00:01:00,720
By eschewing SMT multithreading and the highly variable power consumption that's associated

10
00:01:00,720 --> 00:01:06,720
with constantly fluctuating clock speeds, not to mention designing a 12-channel DDR5 memory controller

11
00:01:06,720 --> 00:01:12,800
that can feed every individual core with a consistent 6GB per second of bandwidth,

12
00:01:12,800 --> 00:01:20,480
ARM is ensuring that every core in this CPU will perform its best at all times and keep power consumption

13
00:01:20,560 --> 00:01:26,160
more consistent, which will allow data centers to design to how much power their racks will

14
00:01:26,160 --> 00:01:32,240
consistently consume rather than having to build in a buffer for how much they might consume at peak.

15
00:01:33,120 --> 00:01:38,640
And that's huge, considering that cooling and especially power are just about the hottest

16
00:01:38,640 --> 00:01:44,320
commodities in a world that is rapidly scaling data center infrastructure. Each AGI CPU has

17
00:01:44,320 --> 00:01:51,040
96 lanes of PCI Express Gen 6 with support for CXL 3.0 for deploying massive shared memory pools

18
00:01:51,040 --> 00:01:57,040
over PCIe, and ARM showed off node designs with their hardware partners that deployed up to two

19
00:01:57,040 --> 00:02:05,680
of these CPUs on a single motherboard. Super cool, but not exactly world-changing yet. To see the

20
00:02:05,680 --> 00:02:10,160
vision that led ARM to spend the last few years bringing this to life, you gotta zoom out and look

21
00:02:10,160 --> 00:02:18,480
beyond the individual node to the rack level. This rack contains 30 2-node 1P servers, so for those

22
00:02:18,480 --> 00:02:26,400
keeping count at home, that's 8,160 CPU cores. Okay, still not that big of a deal. I mean,

23
00:02:26,400 --> 00:02:33,360
dense CPU racks are already a thing. Well, here comes the big reveal. This sick error message

24
00:02:33,360 --> 00:02:39,760
hoodie is now available from LTTstore.com. JK, okay, I mean, it is, but that's not the big reveal.

25
00:02:39,760 --> 00:02:46,480
The big reveal is that everything that I just told you fits in a standard OCP 36 kilowatt air

26
00:02:46,480 --> 00:02:55,040
cooled rack. Each AGI CPU draws just 300 watts, a significant reduction compared to flagship

27
00:02:55,040 --> 00:03:02,160
x86 CPUs. So when you throw liquid cooling at them, the numbers get frankly kind of ridiculous.

28
00:03:02,160 --> 00:03:10,640
In an OCP 200-kilowatt rack, ARM figures they can pack 42 8-node 1P systems for a grand total

29
00:03:10,640 --> 00:03:20,320
of 45,696 cores and over a petabyte of RAM, all while consuming only about half of that total

30
00:03:20,320 --> 00:03:25,840
available power budget. They are pegging the bottom line performance per watt in the neighborhood of

31
00:03:25,840 --> 00:03:32,160
double compared to x86. And this is largely thanks to carrying less legacy cruft, but also

32
00:03:32,160 --> 00:03:37,120
thanks to architectural choices like using fewer chiplets to keep memory latency down,

33
00:03:37,120 --> 00:03:41,760
along with ARM's traditional strength in instructions per clock, and taking just a no

34
00:03:41,760 --> 00:03:48,560
silicon wasted approach to their design. With the cost and scarcity of power, that's a number that

35
00:03:48,560 --> 00:03:55,760
is going to perk up a lot of ears. But why though? Everybody knows that CPUs aren't good at AI,

36
00:03:55,840 --> 00:04:02,720
compared to GPUs or application-specific neural processors. So what's with the branding?

37
00:04:03,520 --> 00:04:09,280
ARM met that question head on. While GPUs and neural accelerators get all the attention,

38
00:04:09,280 --> 00:04:15,120
CPUs are still chugging along in the background, coordinating tasks, with ARM estimating that

39
00:04:15,120 --> 00:04:22,880
a typical deployment today is going to have about 30 million cores per gigawatt. But here's the thing,

40
00:04:22,880 --> 00:04:29,920
that's with humans handling most of the token requests. AI agents push requests much faster and

41
00:04:31,120 --> 00:04:36,480
don't sleep, meaning that your expensive AI accelerators can end up sitting around because

42
00:04:36,480 --> 00:04:43,280
the CPU coordinators can't keep up with all of those requests. So ARM figures that that 30 million

43
00:04:43,280 --> 00:04:49,360
cores per gigawatt number could go up a lot in the head node next to the accelerator rack,

44
00:04:49,360 --> 00:04:56,720
as high as about four times as many. But here's the thing, when these are doing all the actual AI

45
00:04:56,720 --> 00:05:03,760
work, nobody's going to want to spend more power budget on all of those CPUs. Well, that's where

46
00:05:03,760 --> 00:05:09,920
ARM comes in with their famously power efficient designs. Let's go to Nick from the lab to see

47
00:05:09,920 --> 00:05:14,560
this thing in action. Many of the demos were focused on the ease of porting software to ARM

48
00:05:14,560 --> 00:05:19,200
and the support they're building for developers, which makes a lot of sense, but isn't very visual,

49
00:05:19,200 --> 00:05:25,280
so let's check out this one instead, where they're encoding a 1080p video from H.264 to H.265

50
00:05:25,280 --> 00:05:29,760
while running computer vision at the same time on the same CPU. Let's go take a look at the man

51
00:05:29,760 --> 00:05:36,640
behind the curtain. That's not a video recording. ARM actually had the stones to do it live,

52
00:05:36,640 --> 00:05:44,640
bringing an actual server running the actual hardware here to the show floor. But, awkward.

53
00:05:45,440 --> 00:05:50,720
Doesn't all of this put ARM kind of in direct competition with their own customers, you know,

54
00:05:50,720 --> 00:05:55,200
the ones who license their IP and their compute subsystems, the guys who got them where they are

55
00:05:55,200 --> 00:06:03,360
today? Well, on paper, yes, absolutely. But from ARM's perspective, this is actually something

56
00:06:03,360 --> 00:06:09,360
that many of their customers were asking for. Expanding on that, ARM laid out how their roadmap

57
00:06:09,360 --> 00:06:15,040
and their policies account for how all three of their business models are going to go forward,

58
00:06:15,040 --> 00:06:21,120
and they're positioning this as a choice between IP licensing, compute subsystems licensing,

59
00:06:21,120 --> 00:06:27,600
and physical CPUs, or hey, why not some combination of all three? They'll gladly take your money

60
00:06:27,600 --> 00:06:31,040
any way you want to give it to them. Contact your local sales representative.

61
00:06:33,520 --> 00:06:39,280
Will is here. He can be reached afterwards. And it seems like that's the plan

62
00:06:39,280 --> 00:06:44,880
for the long haul. In a move that I don't think I've ever seen before, ARM stood up on stage and

63
00:06:44,880 --> 00:06:51,440
said the quiet part out loud. This is just a safe first attempt. The best is yet to come with our

64
00:06:51,440 --> 00:06:57,200
second CPU due next year. Like, obviously, given the timelines of silicon development,

65
00:06:57,200 --> 00:07:01,760
but you almost never hear that from a company who probably wants you to buy the hardware they have

66
00:07:01,760 --> 00:07:08,240
today from partners like, you know, for example, Supermicro. Pretty wild. If you guys enjoyed

67
00:07:08,240 --> 00:07:13,120
this video, you might enjoy the one that we did at CES, also in partnership with ARM, highlighting

68
00:07:13,120 --> 00:07:17,360
some of the unexpected places that you can find ARM technology.