
DeepSeek’s AI breakthrough may actually benefit Apple Intelligence – BGR


Sanctions prevented DeepSeek from buying the NVIDIA GPUs it needed to train AI models as powerful as OpenAI’s ChatGPT o1 reasoning model. Unable to purchase the AI hardware it needed, the Chinese startup devised a different method to train the DeepSeek R1 reasoning model, sending shockwaves around the world.

DeepSeek R1 training costs 3% to 5% of what training ChatGPT o1 costs. DeepSeek’s models are also cheaper to operate, further reducing access costs. On top of that, you can install DeepSeek on your computer and run it locally, as the company open-sourced the model. At least the model weights are open, that is; the training data set and training code remain secret.

These developments tanked the market, with the likes of NVIDIA being the most impacted. Suddenly, investors realized that AI companies like OpenAI would not necessarily need to amass more compute power to develop better versions of AI.

But there is one stock that outperformed the market, and that’s Apple. It might seem like a surprising development considering how far behind Apple Intelligence looks to be right now compared to the likes of ChatGPT o1, Operator, Gemini, and DeepSeek R1.

However, Apple has a unique approach to AI, and DeepSeek’s innovations might help it deliver the AI future it wants to offer iPhone users. And I’m not suggesting Apple will incorporate DeepSeek as an alternative to ChatGPT in Apple Intelligence. Instead, Apple might learn from DeepSeek’s innovations and copy them.

While the market was in freefall on Monday, I said the worries about NVIDIA GPU hardware suddenly becoming obsolete are misplaced. Yes, DeepSeek might have come up with a more efficient way to train AI to be as smart and capable as ChatGPT. But that doesn’t mean you don’t need access to fast, reliable AI hardware.

The fact that DeepSeek registrations are temporarily limited, presumably due to a cyberattack, tells me that another explanation is possible. DeepSeek’s infrastructure might be too limited to accommodate demand. Blaming it all on a cyberattack sounds much better than admitting that AI needs tons of computing power to get off the ground.

That’s all speculation, but time will soon settle the question. Either the cyberattacks will be repelled and registrations will resume, or we’ll witness prolonged limitations indicative of other issues.

DeepSeek iPhone app. Image source: App Store

I also said on Monday that China surpassing US AI firms is temporary. The innovations that DeepSeek introduced will be replicated across the industry. They probably already have been. What happens if an entity like OpenAI or Google adopts AI training similar to DeepSeek? We’ll see even faster innovation.

Again, it’s speculation. But everybody copies everybody in tech.

So how does this benefit Apple Intelligence on iPhone? Let’s start with the basics.

Remember that Apple is the only tech giant to have announced a massive AI project with privacy at the core. Apple Intelligence is supposed to run mostly on-device. When that’s impossible, Apple Intelligence will move information to Apple’s servers in what Apple calls Private Cloud Compute.

Apple’s iOS 18.4 update will deliver the big Siri upgrade we saw at WWDC last year. Siri will be able to analyze more user data stored on-device to offer iPhone users an even better assistant. The problem with this Siri is that it’s not a chatbot. Apple doesn’t have a ChatGPT alternative, so it built ChatGPT access into Apple Intelligence. A Siri chatbot is likely coming with iOS 19 next year.

Whenever Apple is ready to offer chatbots similar to ChatGPT o1 and DeepSeek R1, it’ll have to find ways to run them on iPhones. That’s where the DeepSeek tech might come in handy, specifically the distillation process. Ben Thompson explained it all in a DeepSeek FAQ. It refers to using a bleeding-edge AI model (or models) to train smaller models:

Distillation is a means of extracting understanding from another model; you can send inputs to the teacher model and record the outputs, and use that to train the student model. This is how you get models like GPT-4 Turbo from GPT-4. Distillation is easier for a company to do on its own models, because they have full access, but you can still do distillation in a somewhat more unwieldy way via API, or even, if you get creative, via chat clients.

Distillation obviously violates the terms of service of various models, but the only way to stop it is to actually cut off access, via IP banning, rate limiting, etc. It’s assumed to be widespread in terms of model training, and is why there are an ever-increasing number of models converging on GPT-4o quality. This doesn’t mean that we know for a fact that DeepSeek distilled 4o or Claude, but frankly, it would be odd if they didn’t.

Apple could use this tech to train specialized Apple Intelligence models that run on iPhones. Think of a “Siri mini” AI model that only handles conversational interactions via text and voice. A different mini model might handle other specific tasks to ensure they’re performed on-device.
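To make the distillation idea concrete, here’s a deliberately tiny sketch in Python. The “teacher” is just a stand-in function you can only query for outputs (the way you’d query a frontier model’s API), and the “student” is a much smaller model fitted to the teacher’s answers rather than to any original training data. The specific functions and numbers are illustrative assumptions, not anything Apple or DeepSeek actually uses:

```python
import numpy as np

rng = np.random.default_rng(0)

# "Teacher": a richer model we can only query for outputs.
# Here a smooth nonlinear function stands in for a frontier model's API.
def teacher(x):
    return np.tanh(3 * x)

# Step 1: send inputs to the teacher and record its outputs.
inputs = rng.uniform(-1, 1, size=200)
teacher_outputs = teacher(inputs)

# Step 2: fit a much smaller "student" (a cubic, i.e. just 4 parameters)
# to the teacher's recorded outputs -- not to any ground-truth labels.
student_params = np.polyfit(inputs, teacher_outputs, deg=3)

def student(x):
    return np.polyval(student_params, x)

# The student now imitates the teacher, including on inputs it never saw.
test_x = np.array([-0.5, 0.0, 0.5])
print(np.max(np.abs(student(test_x) - teacher(test_x))))
```

Real distillation works the same way at vastly larger scale: record a big model’s outputs (or output distributions) on many prompts, then train a compact model to reproduce them, which is exactly what would let a phone-sized model inherit behavior from a server-sized one.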

iPhone 16 Pro. Image source: Jonathan S. Geller

This will make AI inference, the process of receiving a user command and providing an answer, cheaper, faster, and more private on iPhone than on other devices. Thompson identified the big winners in the wake of the DeepSeek R1 research, and Apple is one of them:

Apple is also a big winner. Dramatically decreased memory requirements for inference make edge inference much more viable, and Apple has the best hardware for exactly that. Apple Silicon uses unified memory, which means that the CPU, GPU, and NPU (neural processing unit) have access to a shared pool of memory; this means that Apple’s high-end hardware actually has the best consumer chip for inference (Nvidia gaming GPUs max out at 32GB of VRAM, while Apple’s chips go up to 192 GB of RAM).
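The memory arithmetic behind that point is simple enough to sketch: the weights of a model need roughly (parameter count × bits per parameter ÷ 8) bytes of RAM for inference, so lower-precision (quantized) weights shrink the footprint proportionally. The model sizes below are illustrative assumptions, not the specs of any actual Apple or DeepSeek model:

```python
# Rough inference-memory footprint for model weights alone:
# parameters x bits-per-parameter / 8 bytes. Ignores activations and KV cache.
def model_memory_gb(params_billions: float, bits_per_param: int) -> float:
    bytes_total = params_billions * 1e9 * bits_per_param / 8
    return bytes_total / 1e9  # decimal gigabytes

# Hypothetical model sizes at common precisions.
for params in (3, 7, 70):
    for bits in (16, 8, 4):
        print(f"{params}B @ {bits}-bit = {model_memory_gb(params, bits):.1f} GB")
```

A 7-billion-parameter model needs about 14 GB at 16-bit precision but only about 3.5 GB quantized to 4 bits, which is why decreased memory requirements matter so much for phone-class hardware, and why Apple Silicon’s large unified memory pool is an advantage at the high end.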

There’s also the fact that DeepSeek did what we’ve known Apple to do for years: optimize software to run on more limited hardware. The iPhone never matched Android in terms of specs, though it led the market with its high-end A-series chips. Apple optimized the iOS experience to run on more limited amounts of RAM while delivering a fast mobile experience that didn’t impact battery life.

DeepSeek achieved something similar in AI. It used software optimizations to train a ChatGPT o1 rival using less capable AI hardware than OpenAI has. Everyone will be interested in replicating that, especially companies with access to the latest NVIDIA hardware.

Apple is likely paying attention to all of these developments, and we might see results in the near future. I am speculating, of course, but who in their right mind can ignore DeepSeek’s AI innovations right now? Especially if AI is at the core of all the products you make.

Finally, I’ll also point out that DeepSeek made news for topping the App Store this week, turning the iPhone into the go-to device for sampling new AI innovations, even those that aren’t tied to Apple Intelligence. Also, unlike Apple Intelligence, DeepSeek works on your current iPhone, just like the ChatGPT standalone app.
