Apple's Latest AI Models Lag Behind Competitors in Performance
Apple's new AI models powering features across iOS and macOS fall short against competitors. Human evaluators rated Apple's on-device model as comparable to similarly sized models from Google and Alibaba, while its server model trailed OpenAI's GPT-4o in text quality and Meta's Llama 4 Scout in image analysis. Despite improvements in efficiency and multilingual support, Apple's AI efforts continue to trail industry leaders.
Apple recently unveiled updates to the artificial intelligence models that power its suite of Apple Intelligence features across iOS, macOS, and other platforms. However, according to Apple's own benchmark tests, these new models do not outperform older AI models from competitors such as OpenAI, Google, and Alibaba.
Apple’s “Apple On-Device” model, which runs offline on devices like the iPhone and contains roughly 3 billion parameters, was rated by human testers as producing text quality comparable to, but not better than, that of similarly sized models from Google and Alibaba.
Meanwhile, Apple’s more powerful “Apple Server” model, designed to run in Apple’s data centers, was rated behind OpenAI’s GPT-4o, a model that has been available for over a year. In other words, Apple’s server-side AI trails not just the current state of the art in language generation but a system released more than a year earlier.
In image analysis tasks, human evaluators preferred Meta’s Llama 4 Scout model over Apple Server. This is notable because Llama 4 Scout itself performs worse than leading AI models from Google, Anthropic, and OpenAI, suggesting Apple’s image understanding capabilities are even further behind.
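Results like these typically come from pairwise human-preference testing: graders see two anonymized responses to the same prompt, pick the better one or call a tie, and the judgments are aggregated into win, tie, and loss rates. Below is a minimal sketch of that aggregation step in Swift; the judgment labels and sample data are illustrative, not Apple's actual evaluation harness.

```swift
import Foundation

// One grader judgment for a single prompt: which model's response won, or a tie.
enum Judgment { case winA, winB, tie }

// Aggregate pairwise judgments into win/tie/loss rates for model A against model B.
func preferenceRates(_ judgments: [Judgment]) -> (win: Double, tie: Double, loss: Double) {
    guard !judgments.isEmpty else { return (0, 0, 0) }
    let n = Double(judgments.count)
    let wins = Double(judgments.filter { $0 == .winA }.count)
    let ties = Double(judgments.filter { $0 == .tie }.count)
    return (wins / n, ties / n, (n - wins - ties) / n)
}

// Hypothetical judgments: "Apple On-Device" (A) versus a similarly sized rival (B).
let sample: [Judgment] = [.winA, .tie, .winB, .tie, .winA, .winB, .tie]
let rates = preferenceRates(sample)
print(String(format: "win %.0f%%, tie %.0f%%, loss %.0f%%",
             rates.win * 100, rates.tie * 100, rates.loss * 100))
```

A high tie rate against same-size rivals is how a model ends up rated "comparable but not better," which is exactly the pattern Apple's own report describes.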
These benchmark results reinforce reports that Apple’s AI research division has struggled to keep pace with competitors in the fast-moving AI landscape. Despite years of development, Apple’s AI features have often underwhelmed, and a highly anticipated upgrade to Siri has been delayed indefinitely.
Additionally, some customers have filed lawsuits accusing Apple of marketing AI capabilities that have not yet been delivered, highlighting the gap between expectations and reality.
Apple On-Device supports features like text summarization and analysis, and is now accessible to third-party developers through Apple’s Foundation Models framework. Both Apple On-Device and Apple Server models have improved tool-use efficiency and support approximately 15 languages, thanks to an expanded training dataset that includes diverse data types such as images, PDFs, documents, infographics, tables, and charts.
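For third-party developers, that access goes through a Swift API. Below is a minimal sketch of prompting the on-device model via the Foundation Models framework, assuming the LanguageModelSession API Apple previewed at WWDC 2025; the prompt wording and function name are illustrative.

```swift
import FoundationModels

// A sketch of on-device summarization, assuming the FoundationModels API
// Apple previewed at WWDC 2025 (LanguageModelSession, respond(to:)).
func summarize(_ text: String) async throws -> String {
    // A session keeps context across turns with the on-device model.
    let session = LanguageModelSession()
    let response = try await session.respond(to: "Summarize in two sentences: \(text)")
    return response.content
}
```

Because the model runs locally, a call like this works offline and costs nothing server-side, which is why the roughly 3-billion-parameter budget of the on-device model matters so much.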
Despite these enhancements, the performance gap with leading AI models remains significant, underscoring the challenges Apple faces in the competitive AI arena.