There were a number of exciting announcements from Apple at WWDC 2024, from macOS Sequoia to Apple Intelligence. However, a subtle addition to Xcode 16 — the development environment for Apple platforms, like iOS and macOS — is a feature called Predictive Code Completion. Unfortunately, if you bought into Apple’s claim that 8GB of unified memory was enough for base-model Apple silicon Macs, you won’t be able to use it. There’s a memory requirement for Predictive Code Completion in Xcode 16, and it’s the closest thing we’ll get from Apple to an admission that 8GB of memory isn’t really enough for a new Mac in 2024.
Yes, there are massive advantages. It’s basically what makes unified memory possible on modern Macs. Especially with all the interest in AI nowadays, you really don’t want a machine with a discrete GPU/VRAM, a discrete NPU, etc.
Take for example a modern high-end PC with an RTX 4090. Those only have 24GB VRAM and that VRAM is only accessible through the (relatively slow) PCIe bus. AI models can get really big, and 24GB can be too little for the bigger models. You can spec an M2 Ultra with 192GB RAM and almost all of it is accessible by the GPU directly. Even better, the GPU can access that without any need for copying data back and forth over the PCIe bus, so literally 0 overhead.
The advantages of this multiply when you have more dedicated silicon. For example: if you have an NPU, that can use the same memory pool and access the same shared data as the CPU and GPU with no overhead. The M series also have dedicated video encoder/decoder hardware, which again can access the unified memory with zero overhead.
For example: you could have an application that replaces the background on a video using AI. It takes a video, decompresses it using the video decoder , the decompressed video frames are immediately available to all other components. The GPU can then be used to pre-process the frames, the NPU can use the processed frames as input to some AI model and generate a new frame and the video encoder can immediately access that result and compress it into a new video file.
The overhead of just copying data for such an operation on a system with non-unified memory would be huge. That’s why I think that the AI revolution is going to be one of the driving factors in killing systems with non-unified memory architectures, at least for end-user devices.