How Apple's On-Device and Server Foundation Models Work

Siri icon in a data center



Apple announced new AI language models at WWDC. These models run both locally on Apple devices and on Apple's own Apple Silicon-powered AI servers.

Artificial intelligence (AI) depends on language models, which are trained on large amounts of data so that they can produce results in response to prompts (questions).

Using language models, computers can be trained in particular subject areas, allowing them to act as domain experts in those fields.

AI alignment refers to the process of designing and implementing AI systems so that they pursue human goals, values, and desired outcomes. In other words, alignment is intended to keep AI on task so that it does not become dangerous by deviating from its original purpose.

At WWDC 2024, Apple announced Apple Intelligence, the company's own AI platform, which provides both on-device and server-based AI. With the new models in Apple Intelligence, Apple's AI is intended to become more targeted, faster, and more accurate.

Foundation language models

Apple calls its general-purpose generative AI models foundation models. These models are Large Language Models (LLMs) that use up to 3 billion parameters and are designed for the basic generative AI tasks most users are likely to want.

Apple Foundation models.

Apple calls these two models AFM-on-device and AFM-on-server, respectively.

Apple has also built other general models into Apple Intelligence. These models can run both on the device and on Apple's servers.

Apple offers a fairly detailed forty-seven-page white paper explaining how the foundation language models work. From a technical standpoint, Apple's foundation models use a baseline of established AI techniques, including:

  • Transformer architecture
  • A shared input/output embedding matrix
  • Pre-normalization
  • Query/key normalization
  • Grouped-query attention
  • SwiGLU activation
  • RoPE positional embeddings
  • Fine-tuning
  • Human feedback and adjustments
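To make two of the listed techniques concrete, here is a minimal NumPy sketch of a SwiGLU feed-forward block and of RoPE (rotary positional embeddings). This is an illustrative toy, not Apple's implementation; the dimensions and the "rotate-half" RoPE layout are assumptions chosen for clarity.

```python
import numpy as np

def silu(z):
    """SiLU (swish) activation used inside SwiGLU."""
    return z / (1.0 + np.exp(-z))

def swiglu_ffn(x, W, V, W2):
    """SwiGLU feed-forward block: (SiLU(x W) elementwise-times (x V)) W2."""
    return (silu(x @ W) * (x @ V)) @ W2

def rope(x, pos):
    """Rotary positional embedding ("rotate-half" layout) for a 1-D
    vector of even length: each dimension pair is rotated by an angle
    that depends on the token position and a per-pair frequency."""
    half = x.shape[-1] // 2
    freqs = 10000.0 ** (-np.arange(half) / half)
    angles = pos * freqs
    x1, x2 = x[:half], x[half:]
    return np.concatenate([x1 * np.cos(angles) - x2 * np.sin(angles),
                           x1 * np.sin(angles) + x2 * np.cos(angles)])

# Toy dimensions: model width 8, hidden width 16
rng = np.random.default_rng(0)
x = rng.normal(size=(1, 8))
W = rng.normal(size=(8, 16))
V = rng.normal(size=(8, 16))
W2 = rng.normal(size=(16, 8))
out = swiglu_ffn(x, W, V, W2)   # shape (1, 8)

v = rng.normal(size=8)
v_rot = rope(v, pos=5)          # same length, rotated per position
```

Because RoPE only rotates dimension pairs, it preserves the vector's norm, and at position 0 the rotation is the identity.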
Training Apple Foundation models.

Apple Intelligence is also trained on web data gathered by Apple's automated web crawler, Applebot. Sites can tell Applebot not to use their content by opting out in their robots.txt files.
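A robots.txt opt-out is just a few lines at the root of the site. A hypothetical example (check Apple's Applebot documentation for the exact user-agent tokens it currently honors):

```
# Block Apple's crawler from this entire site
User-agent: Applebot
Disallow: /
```

Rules are matched per user-agent, so content can remain open to other crawlers while Apple's is excluded.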

For coding tasks, Apple Intelligence also learns from open-source software hosted on GitHub, condensing what it learns and automatically removing duplicates from the training data.
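Exact deduplication of a training corpus is commonly done by hashing each document's contents and keeping only the first copy of each hash. The sketch below illustrates that general idea; it is not Apple's actual pipeline, and the `dedupe_files` helper and sample corpus are hypothetical.

```python
import hashlib

def dedupe_files(files):
    """Keep the first occurrence of each byte-identical document.

    `files` is a list of (name, text) pairs; returns the pairs whose
    content hash has not been seen before.
    """
    seen = set()
    unique = []
    for name, text in files:
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append((name, text))
    return unique

corpus = [
    ("a.py", "print('hi')"),
    ("b.py", "print('hi')"),   # exact duplicate of a.py
    ("c.py", "print('bye')"),
]
deduped = dedupe_files(corpus)
print([name for name, _ in deduped])  # ['a.py', 'c.py']
```

Real pipelines usually go further with near-duplicate detection (e.g. MinHash over token shingles), since slightly edited copies of the same file are common on GitHub.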

Apple's white paper describes in detail how the models work and the training methods used; some of the more advanced mathematics is provided at the end.

Private Cloud Compute

Apple Private Cloud Compute (PCC) is Apple's own server-based AI service. It uses all of the models described above and also provides access to additional models for more demanding tasks.

In a blog post describing PCC, Apple lists several goals for the service, including speed, accuracy, privacy, and site reliability.

PCC also uses the same Secure Enclave and Secure Boot as Apple consumer devices to ensure that the operating system and data cannot be tampered with.

Like many other AI services from technology companies, PCC executes AI tasks remotely, offering faster performance than on-device processing alone.

Apple summarizes its foundation models as follows:

“Our models are built with the goal of helping users perform everyday activities on their Apple products, and are developed responsibly at every stage and guided by Apple’s core values. We look forward to sharing more information about our broader family of generative models, including language, diffusion, and coding models, soon.”

See also our articles "iOS 17.6 and above will arrive after the release of the Apple Intelligence beta" and "Apple admits to using Google Tensor hardware to train Apple Intelligence."

Apple Intelligence promises to give iOS and Mac users faster, optimized AI both on device and in the cloud. We'll have to wait and see how it plays out with the upcoming release of iOS 18 and the next iteration of macOS.