The Max chips (M1, M2, and the 12-P-core M3) have 400 GB/s; the 10-P-core M3 Max is cut to 300 GB/s. The M2 Pro had 200, the M3 Pro has 150. Among the base chips, the M1 had 66.67, the M2 and M3 have 100, and the M4 has 120 GB/s.
Looking at these numbers, M4's 120 GB/s is probably what Apple considers the necessary minimum for Apple Intelligence. The M3 Pro barely has more.
What you need depends on what you do. I dug into this because I do astrophysical simulations on very long timescales: I repeat the same simple calculations on a large set of numbers a huge number of times, which means the CPU spends most of its time waiting for RAM. For me, RAM speed is king. For you, it could be completely irrelevant. For most people it's not really relevant, which is why they get away with pushing people who need it to more expensive products (or a PC).
I saw some CEO guy say in an interview that in 5 years AI will do 90% of your job so you can chill. Dude, when an AI can do 51% of my job, they've already fired my ass. I will be chilling alright.
PBS Newshour did a story on that. To me it seems that the most practical solution is the route Apple appears to be taking: use small language models on device for common simple tasks, and large language models off device for more complex tasks. This route keeps most of your data private and is more efficient, energy- and complexity-wise. At least, those are my uneducated thoughts on the subject. You don't need to consult a team of PhD math professors to total the cost of the groceries you are buying. Your kid can do that for you.

I'm not completely sold on the whole AI thing. First of all, it's super power hungry, especially for the half-baked tripe it's kicking out right now. Has anyone considered the carbon footprint of all these AI-generated responses on Google I didn't ask for? The processing is using a LOT of electricity for the sake of making bad term papers and cartoonish cat pictures.
I don’t see how this is going to be “carbon neutral” going forward.
It's a bit unclear if you're talking about the NE, the GPU, or the CPU. The bandwidth impact on the CPU p-cores between 200 GiB/s and 150 is minor, and likely zero between 400 and 300.
A little of both. My work is confined to the CPU because that's what other members of the group can work with. But the thread is about AI, and how Apple deliberately started to cut back their cheaper hardware's performance years before AI became hyped up.

As you can see, aggregate used bandwidth quickly drops off after more than two cores, and never exceeds 250 GiB/s. Maybe that has been pushed close to 300 GiB/s, but I doubt it.
Now, if you're talking about code running on the GPU or the NE, that's a different story.
Before AI takes over humans, corporations and businesses will format our brains to follow specific processes and a standardized lifestyle across the globe! For example, if you want to get from A to B, you will need to follow a specific path; it will be tutored into your smart devices, and you will happily accept it. Like that, in every walk of chaotic life, these corporations, in the name of standardizing the process, will prescribe a lifestyle for all human beings so that we behave in a predictable manner under the process. This is already accomplished in some way or another in many fields. It has its own pluses and minuses, but it largely helps this AI and ML revolution centre around these standard processes. If you try to do something out of the box, you may be termed a psycho, or mentally unfit, or some such terminology. Corporations want predictable behaviour so that they can control everyone at the touch of a button!

Don't worry/look forward to that.
Yes, they might fire you, but no, AI won't magically be able to perform your tasks. Not in 5 years, probably not in 50 years.
Tim, Steve and Jonny? Sort of a quasi-MAGI thing going on there (obligatory NGE reference included).

We could have an entire thread just to debate whose engrams to use 🤭🤓
how Apple deliberately started to cut back their cheaper hardware's performance
Yes, Intel has their own version of this and AMD is already using TSMC’s SoIC technology.

Isn’t this what Intel is doing with their “tile” design? It also lets them fabricate the less important portions of the chip on cheaper nodes like 6nm.
First of all, geekbench is a terrible benchmark, that intentionally scales badly. Second, you've picked an M3 Max, with 4 more P-cores and 20 GB more RAM.

I dunno about that. Overall, each M revision has improved performance.
If we take M2 Pro vs. M3 Pro, for example: p-cores are down from 8 to 6, e-cores up from 4 to 6, memory bandwidth is down from 200 GiB/s to 150. Yet, in every single workload, the M3 Pro wins by between 7% and 47%.
Because it is.

So I don't know why a lot of people seem to have the impression that the M3 Pro is somehow a downgrade.
If this is true, then the M3 Max is badly designed, because it lacks E-cores compared to the M3 Pro...

Plus, the M2 Pro and especially the M1 Pro were starved for e-cores, giving them poorer battery life. The M3 Pro mostly rectifies that.
While it also decreases the P-core count in the base model to only three. The last time I remember anyone selling a 3-core CPU was in 2010. No, I don't buy that they gave it 6 e-cores. It's the same thing Intel does: spam E-cores so the benchmarks look good, while holding back on the cores people actually use for their tasks. No wonder these are called "cinebench cores".

And M4 makes this moot anyway. It increases the clock, the instructions per clock, and the e-core count. I guess we'll see how they configure the higher ends, but I'm not too concerned.
Care to share? It went over my head, lol!

It took me a few minutes to get that... lol.
Doesn’t matter, the M3 Pro is often faster than the M1 Max. The bandwidth reduction everyone whines about doesn’t really seem to be an issue at all.

First of all, geekbench is a terrible benchmark, that intentionally scales badly. Second, you've picked an M3 Max, with 4 more P-cores and 20 GB more RAM.
Because it is.
If this is true, then the M3 Max is badly designed, because it lacks E-cores compared to the M3 Pro...
While it also decreases the P-core count in the base model to only three. The last time I remember anyone selling a 3-core CPU was in 2010. No, I don't buy that they gave it 6 e-cores. It's the same thing Intel does: spam E-cores so the benchmarks look good, while holding back on the cores people actually use for their tasks. No wonder these are called "cinebench cores".
Star Trek TOS S2E24

Care to share? It went over my head, lol!
I was thinking that too, BUT, if the purpose is AI tasks and those tasks occur in bursts instead of sustained load, then it probably doesn't matter. Apple will charge their premium for a ranch-style layout that can sustain/cool? That would actually help prevent "consumer" hardware from being used in a server environment and protect their enterprise profits (if Apple goes for it).

Wouldn't this make cooling more difficult? Since the chip would have less surface area to transfer heat.
This article only uses the CPU. The memory bandwidth is shared between the CPU, GPU and NPU.

Doesn’t matter, the M3 Pro is often faster than the M1 Max. The bandwidth reduction everyone whines about doesn’t really seem to be an issue at all.
Go to EclecticLight.co and read the articles there; this is just one of many:
Evaluating M3 Pro CPU cores: 4 Vector processing in NEON
Differences in vector processing performance between the M1 Max and M3 Pro, and in their use of power. Their frequency control is more complex. (eclecticlight.co)
You are correct, but this is likely SoIC-P, which is new and Apple will likely be the first consumer silicon to use it.

Yes, Intel has their own version of this and AMD is already using TSMC’s SoIC technology.
You are correct, but this is likely SoIC-P, which is new and Apple will likely be the first consumer silicon to use it.
SoIC-X has been around for a while, and AMD and others are using it for high performance. I think a good way to understand the difference (as I understand it) is in the names, which follow the same protocol as N5P, N4P, N3P, N2P versus N5X, N4X, N3X, N2X.
Apple won’t use SoIC-X for the same reason they haven’t used the X process nodes. They’re not efficient. They will use SoIC-P, for the same reasons they have used all of the P process nodes. They’re the height of efficiency.
SoIC-P was announced in May, there will likely be more details about it in September. See Anandtech for a good summary.
M5 is a good guess, because there’s no sign of it in what we currently know about M4, but it’s not completely out of the question that we could see it in the M4 Ultra.
First of all, geekbench is a terrible benchmark,
that intentionally scales badly.
Second, you've picked an M3 Max, with 4 more P-cores and 20 GB more RAM.
Because it is.
While it also decreases the P-core count in the base model to only three. The last time I remember anyone selling a 3-core CPU was in 2010.
No, I don't buy that they gave it 6 e-cores. It's the same thing Intel does: spam E-cores so the benchmarks look good, while holding back on the cores people actually use for their tasks. No wonder these are called "cinebench cores".
This article only uses the CPU. The memory bandwidth is shared between the CPU, GPU and NPU.
I'll say one last time: this thread is about AI performance, which utilises the GPU too. A mid-level PC has around 100 GB/s for the CPU and 300+ for the GPU. Yes, they have to copy between system RAM and VRAM, which is a drawback compared to Apple Silicon, but that mostly affects startup time. Because of unified memory, adding VRAM in a Mac was ridiculously cheap compared to standalone GPUs, even with these abhorrent upgrade prices. Cutting the bandwidth is just about milking more money from people who'd use their Mac for AI: without increasing the RAM upgrade prices further, they've simply forced them to skip the Pro and go for the Max.

Yes, that's the point. The high memory bandwidth is mostly useful for the GPU cores, not the CPU cores. The decreased memory bandwidth and p-core count have a negligible impact on CPU performance.
QED. Keep Geekbench to threads talking about browsing experience, and forget it when talking about specific use cases which scale well with more resources. What Geekbench does is kinda pointless, because an M1 is perfectly good for what they try to measure.

Some people keep claiming this, without substantiating it.
Yes, that's by design. Very little real-world stuff scales well.
And how is that an acceptable generational difference?

My bad. I've updated the post. There are indeed some benchmarks where the M3 Pro is slower than the M2 Pro, but only by 4.6%. Overall, it is faster.
"Only in multi-core" is a strange way to start a sentence in 2024.

But it's not. Only in multi-core does the M2 Pro even come close to the M3 Pro (which makes sense, since the M2 Pro has 33% more p-cores), and even then, I haven't found an overall benchmark where it wins. Cinebench R23 and 2024, Geekbench 5.5 and 6.2, Blender: all of them show the M3 Pro at least slightly ahead.
A very specific use case, hmm. Like trying to use a Pro CPU for Pro stuff?

You'd have to have a very specific use case, and one that's heavily parallelized, for your notion that the M2 Pro is faster to apply.
I'm not optimistic that the 4 core variant will cost the same as the 4 core M3.

But it's not a 3-core CPU.
And that's only for the binned base model. The regular M4 has four p-cores.
That's a valid reason to have 2-4 e-cores. Maybe even 6. But not to have 16 like Intel does in certain models. Those are mostly there for the benchmarks.

That's not why Apple and Intel ship a lot of e-cores. Power efficiency is. A ton of background tasks don't really need high performance. Very little of what people do on their computers actually needs it, and usually only in bursts.
Cutting the bandwith is just about milking more money from people who'd use their mac for ai
QED. Keep Geekbench to threads talking about browsing experience, and forget it when talking about specific use cases which scale well with more resources. What Geekbench does is kinda pointless, because an M1 is perfectly good for what they try to measure.
And how is that an acceptable generational difference?
"Only in multi-core" is a strange way to start a sentence in 2024.
A very specific use case, hmm. Like trying to use a Pro CPU for Pro stuff?
It has a huge impact. Higher memory bandwidth and speeds are one of the main reasons GPUs are used for scientific computing and AI calculations. The entry-level NVIDIA desktop card has 272 GB/s memory bandwidth, and everyone is complaining about how low it is and how it holds back the card.

I really doubt it.
I would, however, be amenable to the argument that not offering a 48 GiB (say) option for the M3 Pro is in part about driving people to go for the Max, even when they don't need some of the Max's other features. But the memory bandwidth? I think that's a stretch. It just doesn't have that much impact.
Don't compare it to Intel or Qualcomm. Compare it to the M2->M3 change. I won't even say compare the Maxes, because they got more P-cores. Or compare to AMD. And I'd take a machine with M2 Pro over M3 Pro in a heartbeat. The benchmark probably most relevant to me is the PassMark Physics Test, and the M2 Pro beats the M3 Pro by 60% in that test.

Are you asking "how is 17% / 7% / -4% overall change compared to 9 months before acceptable"? Intel and Qualcomm would love to have such a change in less than a year.
And that's before we get to M4. Apple's overall iteration pace with their ARM designs seems… fine to me?
Because those workloads can run on a toaster. It's totally pointless to benchmark them, then run around happy that your new CPU has faster cores, and then not feel a damn difference while actually doing that stuff. That's why I'm totally baffled at what Geekbench is trying to do with this new direction.

Yeah man. Why focus on workloads that, oh, I dunno, most people are affected by.
Yeah, software dev is lightweight on the CPU, contrary to what most people believe. In 2016 I worked at a company where we got i7s - with HDDs. The IT was totally clueless.

Even if you heavily do AI (which, my condolences), most of the time, your machine will be doing a ton of single-core stuff, whether that's background processing, or starting Internet fights in a web browser.
I dunno about you, but I do software development. Only during actual builds does some parallelism really come into play, and even then, a lot of it simply doesn't scale to more than a handful of cores for extended periods of time.