WEBVTT

00:00:00.000 --> 00:00:07.640
I've always had a problem comparing Apple Silicon to PC parts because gaming wasn't

00:00:07.640 --> 00:00:15.240
ready. I've always wanted to know how Apple Silicon compares to off-the-shelf PC GPUs and CPUs.

00:00:15.240 --> 00:00:19.360
And that's not something we've been able to test because Macs and PCs now use completely

00:00:19.360 --> 00:00:25.480
different architectures, and there were no games that ran natively on both.

00:00:25.480 --> 00:00:33.640
But that's changed. Check this out. I'm playing Baldur's Gate 3, one of the hottest video games right now, on a Mac.

00:00:33.640 --> 00:00:40.360
And it's running natively on Apple Silicon. There's now a growing list of games that support both Apple's graphics hardware and

00:00:40.360 --> 00:00:44.200
Metal graphics APIs without any sort of translation layer.

00:00:44.200 --> 00:00:48.760
So with the help of LMG Labs, we're going to finally compare Apple chips with PC graphics

00:00:48.760 --> 00:00:54.680
cards and see what GPUs from NVIDIA and AMD compete with Apple's M-series.

00:00:54.680 --> 00:01:00.160
Though not in this video. That's part two because in part one, we need to build the test benches that the GPUs will

00:01:00.160 --> 00:01:05.400
be on. And those test benches have CPUs, so we're going to figure out what CPUs are the closest

00:01:05.400 --> 00:01:13.840
to Apple Silicon. And I need to ask Labs if they'll help me.

00:01:13.840 --> 00:01:16.840
What did I think? Oh my god. Well, it's going to be interesting.

00:01:16.840 --> 00:01:22.800
It's going to be a learning experience. We haven't tested Macs really all that much, so this is going to be the first time Labs

00:01:22.800 --> 00:01:27.040
really tackles Macs. Nicholas, hey.

00:01:27.040 --> 00:01:30.520
This is Nicholas Harris. He's an LTT Labs software developer.

00:01:30.520 --> 00:01:36.880
Part of his job has been developing and automating the tests for PCs and other parts here.

00:01:36.880 --> 00:01:42.960
The first step is figuring out the tests. And so Nicholas worked with test technician John Deuren to figure out how to measure these

00:01:42.960 --> 00:01:51.320
two completely different computers. Our primary goal was to find tests that can natively work on either system to avoid the

00:01:51.320 --> 00:01:55.840
Rosetta layer, because that's another variable that we want to isolate for.

00:01:55.840 --> 00:02:00.280
But we're also isolating for the CPU, and that also limits us, because we do have tests that

00:02:00.280 --> 00:02:05.440
test the whole system, but we're not looking to test the memory and the SSD and the graphics

00:02:05.440 --> 00:02:12.000
card yet. But our current test suite for MarkBench is very Windows-focused, so there wasn't really

00:02:12.000 --> 00:02:18.520
anything we could just reuse from that. There is a test framework out there called Phoronix, which has been out there for

00:02:18.520 --> 00:02:26.200
a long time. So we tried to find some that did compression, stuff that did maybe some encoding, things

00:02:26.200 --> 00:02:30.000
that were just pure computational. Okay, so I have to confess something.

00:02:30.000 --> 00:02:35.160
I had this idea, fantasy really, that Labs would test a bunch of Macs and then compare

00:02:35.160 --> 00:02:39.960
it to a matrix of CPU data that would show us what desktop and laptop CPUs perfectly

00:02:39.960 --> 00:02:46.480
match their M-series counterparts. But we have to use new tests, so we're starting from scratch.

00:02:46.480 --> 00:02:49.720
While we would have loved to investigate every CPU, it is wholly unreasonable to get

00:02:49.720 --> 00:02:56.360
Labs to test them all. For instance, here in Logistics, there are about 150 different CPUs available to test,

00:02:56.360 --> 00:03:00.120
and that doesn't even include the laptop CPUs and all the shapes and sizes they're

00:03:00.120 --> 00:03:06.440
cooled in, either. So we have to make tradeoffs, and this is about graphics cards, and so that's why we're

00:03:06.440 --> 00:03:14.200
only including desktop components. If you're after gaming performance, the question then becomes like, okay, cool, we have three

00:03:14.200 --> 00:03:19.840
contenders, right? Apple, Intel, and AMD. Intel changes their socket all the time.

00:03:19.840 --> 00:03:24.760
But AMD, you can go back three generations on AM4 on the same platform.

00:03:24.760 --> 00:03:28.760
And so that's why we're going to be sticking with AM4 chips in our conclusion, though we

00:03:28.760 --> 00:03:32.280
did run the tests on a few Intel chips earlier in this project.

00:03:32.280 --> 00:03:36.240
Alright, so let's go through the tests and the results.

00:03:36.240 --> 00:03:40.160
The first we did is, of course, Cinebench. It's widely used in the tech media space

00:03:40.160 --> 00:03:43.960
and they just came out with an update for it, though we did R23.

00:03:43.960 --> 00:03:47.400
It includes both a multi-core and single-core score.

00:03:47.400 --> 00:03:53.200
We chose Cinebench because it kind of chooses itself as it's the prolific go-to processor

00:03:53.200 --> 00:03:59.400
benchmark, and it's really good that it supports Apple natively as well as Windows.

00:03:59.400 --> 00:04:03.200
Looking at the single-core results, you can see how the newest chips rise to the top,

00:04:03.200 --> 00:04:08.200
but once all the cores get involved, you can see just how the 24 and the M2 Ultra push

00:04:08.200 --> 00:04:14.080
it to the top. Along with single-core tests, Labs did a FLAC encode test where they encoded a bunch of

00:04:14.080 --> 00:04:17.600
copies of a Nine Inch Nails song from WAV to FLAC.

00:04:17.600 --> 00:04:21.920
We actually struggled to find single-core tests because most tests are all about loading

00:04:21.920 --> 00:04:26.380
the CPU and trying to see, you know, how fast you can compute this thing.

00:04:26.380 --> 00:04:31.640
In this test, and another you'll see, Apple Silicon is so far ahead of the other chips,

00:04:31.640 --> 00:04:37.920
and they're all grouped together. That's why we're going to be weighting this test less when we figure out our matches.

00:04:37.920 --> 00:04:43.160
The last single-core tests are the XZ and LZ4 compression tests, with both compressing

00:04:43.160 --> 00:04:49.640
an Ubuntu image. We actually tried like four different compression algorithms, or compression tests, but not

00:04:49.640 --> 00:04:55.380
all of them worked. Sometimes they worked on one, but not the other, even though they're advertised as cross-platform.

00:04:55.380 --> 00:05:00.960
So we did find that XZ and LZ4, we were able to compile for both natively.

00:05:00.960 --> 00:05:06.080
LZ4 single-core shakes out slightly differently from Cinebench, with most of the Ryzen 5000s

00:05:06.080 --> 00:05:13.040
closer to the M3 generation. But it appears that with the XZ compression test, Ryzen has a bit more strength than it

00:05:13.040 --> 00:05:18.360
does in Cinebench. What was it like to do all the testing?

00:05:18.360 --> 00:05:25.280
Illuminating. Testing's pretty straightforward. Once you identify the tests and you come up with your test suite, the execution is just

00:05:25.280 --> 00:05:31.200
while you do your testing. And now as results come in, it's like putting your puzzle together, right?

00:05:31.200 --> 00:05:34.640
As the pieces slowly fit in more, you get more of the picture.

00:05:34.760 --> 00:05:38.480
However, the difference is you're completing a puzzle where you don't know what

00:05:38.480 --> 00:05:43.920
the end picture is. So it's interesting that way to kind of see the story reveal itself to you.

00:05:43.920 --> 00:05:48.360
Alright, so how about the puzzle pieces that tested multiple CPU cores?

00:05:48.360 --> 00:05:53.120
Blender is a popular 3D modeling program, and in it, Labs rendered the barbershop scene.

00:05:53.120 --> 00:05:56.240
We might need to find a new scene as it can render fairly quickly.

00:05:56.240 --> 00:06:00.240
It's a popular scene used to benchmark rendering performance in Blender.

00:06:00.240 --> 00:06:04.960
In Blender, the mid-range AMDs provide a transition between the M2 and M3s.

00:06:04.960 --> 00:06:08.720
The 5600X and 5600G are surprisingly weaker here.

00:06:08.720 --> 00:06:11.920
Another render type test that I'd never heard of is C-Ray.

00:06:11.920 --> 00:06:15.720
It's a simple ray tracer that outputs this '90s-looking image.

00:06:15.720 --> 00:06:23.080
C-Ray gives us another render type of test, but mostly we chose it because it works on

00:06:23.080 --> 00:06:27.040
both. That's definitely got for some of these.

00:06:27.760 --> 00:06:32.320
It's a very simple, efficient load to multi-core.

00:06:32.320 --> 00:06:37.480
AMD is relatively weaker in this test with the 3600 sitting between the M1s and their

00:06:37.480 --> 00:06:44.560
different cooling. The next test Labs did is LibRaw, which tests how well CPUs handle raw photographs.

00:06:44.560 --> 00:06:49.680
LibRaw is also nice in that it has a built-in post-processing benchmark, which we run 30

00:06:49.680 --> 00:06:54.480
times on the test image that comes with Phoronix, and then it spits out like a megapixels per

00:06:54.480 --> 00:06:59.080
second. LibRaw is the other test we're going to have to weight less because of Apple Silicon's apparent

00:06:59.080 --> 00:07:04.760
supremacy. It does feel like the Macs are especially tailored to calculate audio and visual codecs.

00:07:04.760 --> 00:07:08.440
And lastly, if you're into numbers, there's primesieve.

00:07:08.440 --> 00:07:12.920
We chose it because y-cruncher doesn't work natively on Mac, because we do favor y-cruncher.

00:07:12.920 --> 00:07:19.160
It's a very popular benchmarking one, but we found primesieve.

00:07:19.160 --> 00:07:23.200
It calculates prime numbers up to a certain length.

00:07:23.200 --> 00:07:29.440
So we consider it our stand-in for y-cruncher in that it's something computational,

00:07:29.440 --> 00:07:32.880
generating numbers over a long period.

00:07:32.880 --> 00:07:38.040
It's multi-core as well. We learned a lot by doing this, and that is that this is hard.

00:07:38.040 --> 00:07:42.720
For one, picking AM4 means that we've got an array of chips that don't quite fit with

00:07:42.720 --> 00:07:47.720
single-core performance, as that's where Apple Silicon shines, and these are old.

00:07:47.720 --> 00:07:52.080
But then, with AM5, there aren't any low-end chips to compare with the lower-end Mac chips

00:07:52.080 --> 00:07:57.280
either. Single-core is more important for gaming, so we weighted it higher, but we weighted FLAC and

00:07:57.280 --> 00:08:01.800
LibRaw less because they favor Apple Silicon egregiously in a way that's not related to

00:08:01.800 --> 00:08:09.080
gaming. Alright, our picks. These choices for CPUs are still only a best guess, because we don't know what the bottleneck is

00:08:09.080 --> 00:08:13.320
CPU-wise once the GPUs are installed and running games.

00:08:13.320 --> 00:08:18.480
So these are not exact matches. I really was in Fantasyland thinking this was possible.

00:08:18.480 --> 00:08:25.080
But we've learned a lot. We're going to use the AMD 5800X3D as a control and to match the M2 Ultra, because

00:08:25.080 --> 00:08:30.280
its 3D cache really helps in gaming, and the M2 Ultra screamed well ahead in every test

00:08:30.280 --> 00:08:36.840
we threw at it. The M2 Pro and Max, as well as the M3 Pro chips, will be matched against the 5800X.

00:08:36.840 --> 00:08:42.960
The 5700X we're pitting alongside the basic M3, and the basic M2 and M1 chips are matched

00:08:42.960 --> 00:08:47.200
against a 5600G. I'm feeling as well as I could.

00:08:47.920 --> 00:09:02.240
And that, I guess, is the biggest lesson on this journey.

00:09:02.240 --> 00:09:08.680
And it's that we're always learning. But now that we have our CPUs figured out, the next step will be to test the GPUs.

00:09:08.680 --> 00:09:12.960
Have NVIDIA and AMD met their match? We're going to have to see where they line up.

00:09:12.960 --> 00:09:21.080
But things are looking good in their own little way. Personally, I wasn't expecting the Mac, the Apples, to be as strong as they were.

00:09:21.080 --> 00:09:25.360
I knew they were super efficient, so this is really my first experience.

00:09:25.360 --> 00:09:31.800
Now during this project, I started daily driving the 15-inch MacBook Air, and I loved it.

00:09:31.800 --> 00:09:37.640
The fact that I could close it, neglect it for three days, and it still had power.

00:09:37.640 --> 00:09:40.800
I mean, I ended up buying one. What?

00:09:40.800 --> 00:09:44.240
You bought a Mac from this project? I mean, I did immediately sticker bomb it.

00:09:44.240 --> 00:09:48.840
How dare you? Thanks for testing this Mac Address, Labs.

00:09:48.840 --> 00:09:53.520
If you want to check out another video we did, check out the iPad tier list video.

00:09:53.520 --> 00:09:57.680
And I'm curious in the comments below, who of you are like Nicholas and bought a Mac

00:09:57.680 --> 00:09:58.400
for gaming?
