Apple Foundation Models Framework

Apple's Foundation Models framework, introduced at WWDC 2025 and expanded significantly at WWDC 2026, is the native Swift API for integrating language models into Apple apps. It provides a unified LanguageModel protocol that covers three tiers of inference:

On-device: Apple's own ~3B dense and 20B sparse models, running locally on Apple Silicon. Zero API cost, fully offline.
Private Cloud Compute (PCC): Apple's cloud-hosted AFM 3 Cloud and AFM 3 Cloud Pro models, stateless and verifiable. Free for Small Business Program developers under 2M first-time downloads.
Third-party providers: Anthropic Claude and Google Gemini, available through the same Swift API via provider-specific packages.

The framework is in beta as part of the iOS 27, macOS 27, visionOS 27, and watchOS 27 betas released at WWDC 2026. It requires Apple Intelligence-capable hardware (Apple Silicon).

Note:

Apple has announced plans to open-source the framework in late summer 2026. For now, it ships as part of the OS SDKs and requires the Xcode 27 beta.

Architecture Overview

The framework's key innovation is a three-tier model hierarchy in which each tier has different privacy and cost characteristics:

Model Tiers

On-device

AFM 3 Core (3B dense) and AFM 3 Core Advanced (20B sparse, 1–4B active params via IFPruning). Best for classification, summarization, short generation. Runs offline. ~30 tok/s on iPhone 15 Pro.

Cost: Free (runs offline)

Private Cloud Compute (PCC)

AFM 3 Cloud (general text, image understanding) and AFM 3 Cloud Pro (agentic tool use, deep reasoning on NVIDIA GPUs in Google Cloud). Stateless, verifiable compute enclaves.

Cost: Free under 2M first-time downloads; pricing above 2M is undisclosed

Third-party providers

Claude (via ClaudeForFoundationModels Swift package), Gemini (via Google package). Full frontier model capabilities. Data leaves Apple's privacy boundary.

Cost: Provider API pricing

Installation

Requirements

Requirement	Version
iOS	27+
macOS	27+
visionOS	27+
watchOS	27+
Xcode	27 (beta)
Hardware	Apple Silicon (Apple Intelligence-capable)

Swift Package Manager

The Foundation Models framework ships with the SDK, so on-device and PCC models need no package dependency. For Claude integration, add the Anthropic package:

// Package.swift or Xcode project
// Foundation Models comes with the SDK — no SPM URL needed
// For Claude support, add:
dependencies: [
    .package(url: "https://github.com/anthropics/ClaudeForFoundationModels.git",
             from: "0.1.0")
]

In Xcode: File > Add Package Dependencies… with the repository URL https://github.com/anthropics/ClaudeForFoundationModels.git.

Import Statements

// Foundation Models — always required
import FoundationModels

// Claude provider — only if using Claude as a third-party model
import ClaudeForFoundationModels

Authentication

Authentication depends on which model tier you're using.

On-device models — No authentication required

On-device inference runs entirely on the user's device. No API key, no network call, no per-token cost.

let session = LanguageModelSession(
    instructions: "You are a helpful assistant."
)
// No API key needed — runs locally

Private Cloud Compute — Optional Small Business Program

If your app qualifies for the App Store Small Business Program (fewer than 2 million total first-time downloads), PCC inference is free. No API key is required for this tier either.

Claude API — API key for development, proxied for production

// Development: API key from Claude Console
let developmentModel = ClaudeLanguageModel(
    name: .sonnet4_6,
    auth: .apiKey(ProcessInfo.processInfo.environment["ANTHROPIC_API_KEY"] ?? "")
)

// Production: route through your own backend
// (named differently so both declarations can share a scope)
let productionModel = ClaudeLanguageModel(
    name: .opus4_8,
    auth: .proxied(headers: ["X-App-Token": "your-app-token"]),
    baseURL: "https://api.yourapp.com/claude"
)

Note:

Never bundle API keys in shipping binaries. Use the .proxied authentication pattern for production deployments, or prompt the user to provide their own key at runtime.
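The development snippet in this section falls back to an empty string when the environment variable is missing, which only fails later at request time. A minimal sketch of failing early instead is shown here; `APIKeyError` and `loadAPIKey` are hypothetical helper names, not part of either package.

```swift
// Hypothetical error type for this sketch.
enum APIKeyError: Error {
    case missing(name: String)
}

/// Returns the key stored under `name`, or throws if it is absent or empty.
func loadAPIKey(from environment: [String: String],
                name: String = "ANTHROPIC_API_KEY") throws -> String {
    guard let key = environment[name], !key.isEmpty else {
        throw APIKeyError.missing(name: name)
    }
    return key
}

// Simulated environments; an app would pass ProcessInfo.processInfo.environment.
let configured = try? loadAPIKey(from: ["ANTHROPIC_API_KEY": "sk-placeholder"])
let unconfigured = try? loadAPIKey(from: [:])
print(configured ?? "no key", unconfigured ?? "no key")
// prints "sk-placeholder no key"
```

When the key is missing, the app can prompt the user for one or disable the Claude-backed feature rather than sending requests with an empty credential.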

Quick Start

Minimal working example using the on-device model with structured output:

import FoundationModels

// 1. Define your output schema (structured output needs a @Generable type)
@Generable
struct TripIdea {
    let title: String
    let summary: String
    let estimatedDays: Int
}

// 2. Create a session
let session = LanguageModelSession(
    instructions: "Suggest a short trip the user can take this weekend."
)

// 3. Generate structured output
let response = try await session.respond(
    to: "I'm in Lisbon and love food and walking.",
    generating: TripIdea.self
)

// The typed value is carried in response.content
let idea = response.content
print(idea.title, idea.estimatedDays)
// Example output (varies by model): Sintra Day Trip 2

That's it. No API key, no network configuration, no model selection — the framework picks the best available model for the task.

Core API Reference

The central abstraction is LanguageModelSession. Think of it as a stateful conversation session that can use any model tier.

Key APIs

LanguageModelSession

Stateful session with instructions, context, and tool definitions. Manages multi-turn state across model tiers.

Example: session.respond(to:)

respond(to:)

Single-turn text generation. Returns the full response. Supports optional structured output via `generating:` parameter.

Example: try await session.respond(to: "prompt")

streamResponse(to:)

Streaming text generation. Returns an async sequence of cumulative snapshots. Use for chat interfaces and progressive rendering.

Example: for try await partial in stream { ... }

ClaudeLanguageModel

Claude provider implementation of LanguageModel protocol. Supports Sonnet 4.6 and Opus 4.8, with baseURL, timeout, and server-side tool configuration.

Example: ClaudeLanguageModel(name: .sonnet4_6, auth: ...)

Prompt {}

Builder for prompts that include multimodal inputs (text + images). Available for on-device models in iOS 27.

Example: Prompt { "Text"; imageAsset }

Streaming

let stream = session.streamResponse(
    to: "Summarize today's top science stories."
)
for try await partial in stream {
    print(partial.content)
}

streamResponse(to:) returns cumulative snapshots — each yield includes the full text generated so far, not just the delta.
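Because each snapshot repeats everything generated so far, a UI that appends text (a terminal, a log view) needs only the newly added part. The sketch below shows one way to derive that delta in plain Swift; the `delta(from:to:)` helper and the sample snapshots are illustrative stand-ins for successive `partial.content` values, not framework API.

```swift
/// Returns the text added since the previous snapshot.
/// If the new snapshot does not extend the old one, the whole snapshot is returned.
func delta(from previous: String, to current: String) -> String {
    guard current.hasPrefix(previous) else { return current }
    return String(current.dropFirst(previous.count))
}

// Simulated cumulative snapshots, standing in for partial.content values.
let snapshots = ["Sci", "Science news", "Science news: a new exoplanet."]

var previous = ""
var deltas: [String] = []
for snapshot in snapshots {
    deltas.append(delta(from: previous, to: snapshot))
    previous = snapshot
}
print(deltas)
// prints ["Sci", "ence news", ": a new exoplanet."]
```

For views that simply re-render the full text, such as a SwiftUI `Text`, no diffing is needed: assigning the latest snapshot is enough.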

Structured Output with @Generable

The framework uses Swift's macro system for type-safe structured output:

@Generable
struct ReceiptSummary {
    var merchant: String
    var total: Double
    var category: String
}

// Multimodal input — on-device models support image understanding
let summary = try await session.respond(
    to: Prompt {
        "Summarize this receipt and categorize the spend."
        receiptImage
    },
    generating: ReceiptSummary.self
)

Note:

@Generable works with both on-device and third-party models. The framework handles schema serialization and response parsing transparently — you get typed Swift values back, not raw JSON.
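For contrast, here is roughly what the manual path looks like when a model returns raw JSON text and the app decodes it itself. This is a sketch using Foundation's `JSONDecoder`; the `rawReply` string and `parseReceipt` helper are made up for illustration and are not part of the framework.

```swift
import Foundation

// A Codable twin of the ReceiptSummary type, used only for manual decoding.
struct ReceiptSummary: Codable, Equatable {
    var merchant: String
    var total: Double
    var category: String
}

/// Decodes a JSON string into a ReceiptSummary, or returns nil if it is not valid JSON
/// of the expected shape.
func parseReceipt(_ json: String) -> ReceiptSummary? {
    guard let data = json.data(using: .utf8) else { return nil }
    return try? JSONDecoder().decode(ReceiptSummary.self, from: data)
}

// Hypothetical raw replies: one well-formed, one with chatty text instead of JSON.
let rawReply = #"{"merchant": "Cafe Lisboa", "total": 18.5, "category": "Dining"}"#
let parsed = parseReceipt(rawReply)
let malformed = parseReceipt("Sure! Here is the summary you asked for.")

print(parsed?.merchant ?? "decode failed", malformed == nil)
// prints "Cafe Lisboa true"
```

The second call shows the failure mode that schema-guided generation is meant to remove: a reply that is not JSON at all has to be detected and handled by the app.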

Claude-Specific Integration

When using Claude as a third-party provider, the ClaudeForFoundationModels package provides additional controls:

let model = ClaudeLanguageModel(
    name: .sonnet4_6,
    auth: .apiKey(apiKey),
    baseURL: "https://api.anthropic.com",        // default
    timeout: 30,                                   // seconds
    fixedEffort: .xhigh,                          // pin extended thinking level
    serverTools: [.webSearch(maxUses: 5), .codeExecution]
)

let session = LanguageModelSession(model: model)

Available Claude models:

Identifier	Model
`.sonnet4_6`	claude-sonnet-4-6
`.opus4_8`	claude-opus-4-8

Supported effort levels: .low, .high, .xhigh, .max

Server-Side Tools (Claude Only)

Claude supports server-side tool execution — tools that run on Anthropic's infrastructure rather than in your app:

.webSearch(maxUses: 5) — Claude searches the web for up-to-date information
.codeExecution — Claude writes and executes code in a sandboxed environment

Privacy Architecture

Apple's privacy model differs fundamentally from that of other providers. Each tier has an explicit privacy boundary:

Tier	Privacy	Data Flow
On-device	Maximum	Data never leaves device
Private Cloud Compute	PCC-equivalent	Stateless enclaves, verifiable transparency, no data retention
Third-party (Claude, Gemini)	Provider-dependent	Data leaves Apple's boundary — governed by the provider's terms

Note:

This is the key decision point. If you choose a third-party model like Claude, user data reaches Anthropic's API. Your app's privacy policy and the user's consent must account for this — Apple's on-device and PCC guarantees do not extend to third-party providers. The LanguageModelSession abstraction makes it easy to switch tiers, but the privacy difference between on-device and third-party is invisible at the API layer and easy to overlook.
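One way to keep the tier from being an invisible configuration detail is to model it in your own code and gate the third-party tier on consent. The sketch below is plain Swift; `PrivacyTier`, `resolveTier`, and the disclosure strings are hypothetical names chosen for this example, and falling back to PCC is just one possible policy.

```swift
/// The three tiers described in this guide.
enum PrivacyTier {
    case onDevice, privateCloudCompute, thirdParty

    /// Text an app could show next to an AI-powered feature.
    var disclosure: String {
        switch self {
        case .onDevice: return "Processed on this device"
        case .privateCloudCompute: return "Processed in Apple's Private Cloud Compute"
        case .thirdParty: return "Sent to a third-party provider"
        }
    }
}

/// Downgrades a third-party request to PCC unless the user has opted in.
func resolveTier(requested: PrivacyTier, hasThirdPartyConsent: Bool) -> PrivacyTier {
    if requested == .thirdParty && !hasThirdPartyConsent {
        return .privateCloudCompute
    }
    return requested
}

let tier = resolveTier(requested: .thirdParty, hasThirdPartyConsent: false)
print(tier.disclosure)
// prints "Processed in Apple's Private Cloud Compute"
```

Routing every model choice through a single function like this gives you one place to audit and one value to surface in the UI.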

Tool Calling

Tools can be defined client-side (within the app) or server-side (on the provider's infrastructure).

Client-Side Tools

Pass an array of tools to LanguageModelSession. The model can invoke them and receive results:

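// Tool protocol conformance: define what the model can invoke
struct WeatherTool: Tool {
    let name = "getWeather"
    let description = "Returns the current weather for a city."

    // Arguments the model fills in when it calls the tool
    @Generable
    struct Arguments {
        var city: String
    }

    func call(arguments: Arguments) async throws -> String {
        // Your weather API logic
        return "72°F and sunny in \(arguments.city)"
    }
}

let session = LanguageModelSession(tools: [WeatherTool()])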

Server-Side Tools (ClaudeLanguageModel)

Configure server-side tools when creating the ClaudeLanguageModel:

let model = ClaudeLanguageModel(
    name: .sonnet4_6,
    auth: .apiKey(apiKey),
    serverTools: [.webSearch(maxUses: 5), .codeExecution]
)

Multi-Agent Workflows

WWDC 2026 introduced Dynamic Profiles — the ability to swap instructions, tools, and model preferences within a single session. This enables coordinator-style multi-agent patterns:

let session = LanguageModelSession()

// Planning phase
session.applyProfile(
    "planner",
    instructions: "Break down the user's request into subtasks.",
    tools: []
)

// Execution phase — grant broader tool access
session.applyProfile(
    "executor",
    instructions: "Execute each subtask using available tools.",
    tools: [CalendarTool(), WeatherTool(), MapsTool()]
)

// Review phase — switch to stronger model for validation
session.applyProfile(
    "reviewer",
    model: ClaudeLanguageModel(name: .opus4_8, auth: .apiKey(apiKey)),
    instructions: "Verify the executor's results for correctness."
)

The framework handles context and key-value caching between profile swaps, so each phase inherits the conversation state.

Pricing Comparison

Pricing is where Apple Foundation Models diverges most sharply from other platforms.

Tier	Cost	When to Use
On-device (AFM 3 Core / Core Advanced)	Free	All local tasks — summarization, classification, short generation, offline features
Private Cloud Compute (AFM 3 Cloud / Cloud Pro)	Free (Small Business, <2M downloads)	Harder reasoning, larger context — no additional API cost for qualifying apps
Third-party (Claude, Gemini)	Provider API pricing	Frontier capabilities — complex reasoning, agentic tool use, specialized domains

Bottom line: If your app qualifies for the Small Business Program, you have access to Apple's own cloud models at no additional API cost. For qualifying apps, the only scenario where you pay per token is using third-party providers like Claude for capabilities Apple's models can't match (extended reasoning, complex tool orchestration).
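To budget for the third-party tier, a back-of-the-envelope estimate is usually enough. The sketch below multiplies token volumes by per-million-token rates; the rates are placeholders chosen for easy arithmetic, not any provider's actual prices, and `TokenRates` and `estimatedCost` are hypothetical names.

```swift
/// Placeholder per-million-token rates. Real values come from the provider's price list.
struct TokenRates {
    let inputPerMillion: Double
    let outputPerMillion: Double
}

/// Estimated spend for the given token volumes, in the same currency as the rates.
func estimatedCost(inputTokens: Int, outputTokens: Int, rates: TokenRates) -> Double {
    let input = Double(inputTokens) / 1_000_000 * rates.inputPerMillion
    let output = Double(outputTokens) / 1_000_000 * rates.outputPerMillion
    return input + output
}

let placeholderRates = TokenRates(inputPerMillion: 2.0, outputPerMillion: 8.0)

// Assumed workload: 10,000 requests per month, each about 500 input and 250 output tokens.
let monthly = estimatedCost(inputTokens: 10_000 * 500,
                            outputTokens: 10_000 * 250,
                            rates: placeholderRates)
print(monthly)
// prints 30.0
```

The same workload on the on-device tier costs nothing per token, which is why it pays to route only the requests that need frontier capabilities to a third-party model.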

Limitations

The Foundation Models framework is not a general-purpose API client. Apple documents these limitations explicitly:

Limitation	Notes
Custom prompt caching	Applied automatically — no manual control
Stop sequences	Not supported at the framework level
Batch processing	Not available — use the Messages API directly for batch
Files API	Not exposed through Foundation Models
Token counting	Not exposed — no `countTokens()` equivalent
Beta headers	Not supported — provider-specific features behind beta headers require direct API access
Messages API	The framework is not a Messages API client — for full API access, use the provider's SDK directly

Note:

If you need granular control over prompts, caching, or batch processing, the Foundation Models framework is the wrong abstraction. Use the provider's native SDK for those use cases, and reserve Foundation Models for on-device inference or cases where you want the native Swift LanguageModelSession abstraction.
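Some gaps can be papered over on the client. As an example, the table lists stop sequences as unsupported; the sketch below trims a finished (or streamed) reply at the first stop marker in plain Swift. The `truncate(_:atFirstOf:)` helper is hypothetical, and this is cosmetic only: the model still generates, and a third-party provider still bills, the text after the marker.

```swift
import Foundation

/// Cuts `text` at the earliest occurrence of any of the given stop sequences.
/// Returns the text unchanged if none of them appears.
func truncate(_ text: String, atFirstOf stopSequences: [String]) -> String {
    let cutPoints = stopSequences.compactMap { text.range(of: $0)?.lowerBound }
    guard let earliest = cutPoints.min() else { return text }
    return String(text[..<earliest])
}

// Hypothetical reply containing a marker the app treats as a stop sequence.
let reply = "Step 1: pack.\nStep 2: go.\n###\nInternal notes"
let visible = truncate(reply, atFirstOf: ["###", "END"])
print(visible)
```

If you need the provider to actually stop generating at the marker, that remains a job for the provider's own SDK.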

Comparison: Apple Foundation Models vs. Direct API Access

Aspect	Foundation Models	Direct Provider API
On-device inference	Yes — native, free, offline	No — requires Core ML model import
Privacy tiers	Explicit (on-device → PCC → third-party)	Single tier
Structured output	`@Generable` macro — typed Swift values	JSON schema / constrained decoding
Streaming	`streamResponse(to:)`	SDK-specific
Tool calling	Protocol-based + server-side	SDK-specific
Multi-agent	Dynamic Profiles (WWDC 26)	Manual implementation
Dependencies	Swift SDK, no external deps (for on-device/PCC)	Requires provider SDK + network
Control	High-level, opinionated	Full control over every parameter

Pitfalls

1. Privacy is tier-dependent and invisible at the API layer. The LanguageModelSession API looks identical whether the model runs on-device, in PCC, or on Anthropic's servers. Your app could be sending user data to a third-party API without any code change — only a model configuration change. Audit your model selection logic and surface the privacy tier to users.

2. Third-party models require separate API keys and billing. If you switch from on-device to Claude via ClaudeLanguageModel, your users don't automatically get a Claude API key. You either need to provide one (and pay for it) or prompt the user. The free tiers only apply to Apple's own models.

3. Regional restrictions. The Foundation Models framework and Apple's on-device models are unavailable in the EU and mainland China at launch because of regulatory hurdles. If your app targets these regions, you need fallback logic.

4. Beta stability. The framework ships as part of OS 27 betas. Expect breaking API changes between beta releases. Pin your Xcode version and test against each beta.

5. ClaudeForFoundationModels vs. direct Messages API. The ClaudeForFoundationModels Swift package is a subset of the full Anthropic API. You lose access to prompt caching controls, stop sequences, batch processing, files, beta headers, and token counting. If you need these, use anthropic-sdk-swift directly.

6. Xcode 27 requirement. The Foundation Models framework only ships with Xcode 27, so you cannot use it with older Xcode versions. If your team is still on Xcode 16 or 26, you must upgrade before you can even compile against the framework.
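Pitfall 3 calls for fallback logic when Apple's models are unavailable. A minimal sketch of that decision in plain Swift follows; `ModelTier` and `firstUsableTier` are hypothetical names, and how you determine whether a tier is usable (region, hardware, configured API key) is left to the caller.

```swift
/// The three tiers described in this guide.
enum ModelTier {
    case onDevice, privateCloudCompute, thirdParty
}

/// Picks the first tier in `preference` that the caller reports as usable.
/// Returns nil when nothing is usable, so the app can hide the feature.
func firstUsableTier(in preference: [ModelTier],
                     isUsable: (ModelTier) -> Bool) -> ModelTier? {
    preference.first(where: isUsable)
}

// Assumed situation: Apple's models are unavailable in the user's region,
// but the app has a third-party provider configured.
let unavailable: Set<ModelTier> = [.onDevice, .privateCloudCompute]
let chosen = firstUsableTier(in: [.onDevice, .privateCloudCompute, .thirdParty],
                             isUsable: { !unavailable.contains($0) })
print(chosen == .thirdParty)
// prints "true"
```

Falling through to a third-party tier crosses the privacy boundary described in pitfall 1, so a fallback like this should be paired with user consent rather than applied silently.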

Key Takeaway

Apple Foundation Models is the first framework that gives iOS/macOS developers a single, native Swift API for on-device LLM inference, Apple's Private Cloud Compute, and third-party frontier models like Claude. The on-device tier is the standout: free inference at ~30 tok/s with data that never leaves the device. Use Foundation Models for on-device and PCC tasks, and drop to direct provider SDKs when you need granular API control or features the framework doesn't expose (caching, batch, files). The privacy cliff between Apple models and third-party providers is the one thing every developer should understand before adopting this framework.
