Monday, June 15, 2026
Apple Foundation Models Framework — Setup Guide

Apple Foundation Models Framework
Apple's Foundation Models framework, introduced at WWDC 2025 and expanded significantly at WWDC 2026, is the native Swift API for integrating language models into Apple apps. It provides a unified LanguageModel protocol that covers three tiers of inference:
- On-device — Apple's own ~3B and 20B sparse models, running locally on Apple Silicon. Zero API cost, fully offline.
- Private Cloud Compute (PCC) — Apple's cloud-hosted AFM 3 Cloud and AFM 3 Cloud Pro models, stateless and verifiable. Free for Small Business Program developers under 2M first-time downloads.
- Third-party providers — Anthropic Claude and Google Gemini, available through the same Swift API via provider-specific packages.
As of WWDC 2026, the framework is in beta alongside iOS 27, macOS 27, visionOS 27, and watchOS 27. It requires Apple Intelligence-capable hardware (Apple Silicon).
Note:
Apple announced intentions to open source the framework in late summer 2026. For now, it ships as part of the OS SDKs and requires Xcode 27 beta.
Architecture Overview
The framework's key innovation is its three-tier model hierarchy, in which each tier has different privacy and cost characteristics:
Model Tiers
| Tier | Cost |
|---|---|
| On-device | Free, offline |
| Private Cloud Compute | Free ≤2M downloads / undisclosed >2M |
| Third-party providers | Provider API pricing |
Installation
Requirements
| Requirement | Version |
|---|---|
| iOS | 27+ |
| macOS | 27+ |
| visionOS | 27+ |
| watchOS | 27+ |
| Xcode | 27 (beta) |
| Hardware | Apple Silicon (Apple Intelligence-capable) |
Swift Package Manager
Add the Foundation Models framework — it ships with the SDK, so no external URL is needed for on-device and PCC models. For Claude integration, add the Anthropic package:
// Package.swift or Xcode project
// Foundation Models comes with the SDK — no SPM URL needed
// For Claude support, add:
dependencies: [
    .package(url: "https://github.com/anthropics/ClaudeForFoundationModels.git",
             from: "0.1.0")
]
In Xcode: File > Add Package Dependencies… with the repository URL https://github.com/anthropics/ClaudeForFoundationModels.git.
Import Statements
// Foundation Models — always required
import FoundationModels
// Claude provider — only if using Claude as a third-party model
import ClaudeForFoundationModels
Authentication
Authentication depends on which model tier you're using.
On-device models — No authentication required
On-device inference runs entirely on the user's device. No API key, no network call, no per-token cost.
let session = LanguageModelSession(
    instructions: "You are a helpful assistant."
)
// No API key needed — runs locally
Private Cloud Compute — Optional Small Business Program
If your app qualifies for the App Store Small Business Program (fewer than 2 million total first-time downloads), PCC inference is free. No API key is required for this tier either.
Claude API — API key for development, proxied for production
// Development: API key from the Claude Console, read from the environment
let developmentModel = ClaudeLanguageModel(
    name: .sonnet4_6,
    auth: .apiKey(ProcessInfo.processInfo.environment["ANTHROPIC_API_KEY"] ?? "")
)
// Production: route requests through your own backend
// (distinct names so both declarations can coexist in one scope)
let productionModel = ClaudeLanguageModel(
    name: .opus4_8,
    auth: .proxied(headers: ["X-App-Token": "your-app-token"]),
    baseURL: "https://api.yourapp.com/claude"
)
Note:
Never bundle API keys in shipping binaries. Use the .proxied authentication pattern for production deployments, or prompt the user to provide their own key at runtime.
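The development snippet above falls back to an empty string when ANTHROPIC_API_KEY is unset, so a missing key only shows up later as an authentication failure. One way to fail early is sketched below using only Foundation. The names APIKeyError and loadAPIKey(named:from:) are hypothetical helpers for illustration and are not part of FoundationModels or ClaudeForFoundationModels.

```swift
import Foundation

// Hypothetical helper type; not part of any Apple or Anthropic package.
enum APIKeyError: Error, Equatable {
    case missing(variable: String)
}

/// Returns the trimmed value of an environment variable,
/// or throws if the variable is unset or blank.
func loadAPIKey(
    named variable: String,
    from environment: [String: String] = ProcessInfo.processInfo.environment
) throws -> String {
    let value = (environment[variable] ?? "")
        .trimmingCharacters(in: .whitespacesAndNewlines)
    guard !value.isEmpty else {
        throw APIKeyError.missing(variable: variable)
    }
    return value
}
```

In a development build the returned string could be passed to .apiKey(...). Taking the environment as a parameter keeps the helper testable without touching the real process environment.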
Quick Start
Minimal working example using the on-device model with structured output:
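import FoundationModels
// 1. Define your output schema. Guided generation needs a @Generable type
//    (see "Structured Output with @Generable"); Codable alone is not enough.
@Generable
struct TripIdea {
    let title: String
    let summary: String
    let estimatedDays: Int
}
// 2. Create a session
let session = LanguageModelSession(
    instructions: "Suggest a short trip the user can take this weekend."
)
// 3. Generate structured output
let response = try await session.respond(
    to: "I'm in Lisbon and love food and walking.",
    generating: TripIdea.self
)
// The typed value is on response.content (as in the iOS 26 SDK; the OS 27 betas may differ)
let idea = response.content
print(idea.title, idea.estimatedDays)
// Example output (model-dependent): Sintra Day Trip 2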
That's it. No API key, no network configuration, no model selection — the framework picks the best available model for the task.
Core API Reference
The central abstraction is LanguageModelSession. Think of it as a stateful conversation session that can use any model tier.
Key APIs
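| Task | API |
|---|---|
| Text generation | session.respond(to:) |
| Awaiting a response | try await session.respond(to: "prompt") |
| Consuming a stream | for try await partial in stream |
| Third-party model | ClaudeLanguageModel(name: .sonnet4_6, auth: ...) |
| Multimodal prompt | Prompt { "Text" imageAsset } |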
Streaming
let stream = session.streamResponse(
    to: "Summarize today's top science stories."
)
for try await partial in stream {
    print(partial.content)
}
streamResponse(to:) returns cumulative snapshots — each yield includes the full text generated so far, not just the delta.
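Because each yield is a cumulative snapshot, a UI that appends text needs to derive the new portion itself. The sketch below shows one way to do that in plain Swift on an array of snapshot strings. The function name deltas(fromSnapshots:) is a hypothetical helper, not a framework API, and it assumes snapshots normally extend the previous one.

```swift
/// Converts cumulative snapshots ("Hel", "Hello", ...) into incremental deltas.
/// Hypothetical helper for illustration; not part of FoundationModels.
func deltas(fromSnapshots snapshots: [String]) -> [String] {
    var previous = ""
    var result: [String] = []
    for snapshot in snapshots {
        if snapshot.starts(with: previous) {
            // Normal case: the snapshot extends the previous text,
            // so the delta is whatever follows the shared prefix.
            result.append(String(snapshot.dropFirst(previous.count)))
        } else {
            // The snapshot rewrote earlier text, so emit it in full
            // and let the caller replace what it has shown so far.
            result.append(snapshot)
        }
        previous = snapshot
    }
    return result
}
```

For the snapshots "Hel", "Hello", "Hello, world" this yields "Hel", "lo", ", world". If you simply assign each snapshot to a text view, no delta computation is needed at all.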
Structured Output with @Generable
The framework uses Swift's macro system for type-safe structured output:
@Generable
struct ReceiptSummary {
    var merchant: String
    var total: Double
    var category: String
}
// Multimodal input — on-device models support image understanding
let summary = try await session.respond(
    to: Prompt {
        "Summarize this receipt and categorize the spend."
        receiptImage
    },
    generating: ReceiptSummary.self
)
Note:
@Generable works with both on-device and third-party models. The framework handles schema serialization and response parsing transparently — you get typed Swift values back, not raw JSON.
Claude-Specific Integration
When using Claude as a third-party provider, the ClaudeForFoundationModels package provides additional controls:
let model = ClaudeLanguageModel(
    name: .sonnet4_6,
    auth: .apiKey(apiKey),
    baseURL: "https://api.anthropic.com", // default
    timeout: 30, // seconds
    fixedEffort: .xhigh, // pin extended thinking level
    serverTools: [.webSearch(maxUses: 5), .codeExecution]
)
let session = LanguageModelSession(model: model)
Available Claude models:
| Identifier | Model |
|---|---|
| .sonnet4_6 | claude-sonnet-4-6 |
| .opus4_8 | claude-opus-4-8 |
Supported effort levels: .low, .high, .xhigh, .max
Server-Side Tools (Claude Only)
Claude supports server-side tool execution — tools that run on Anthropic's infrastructure rather than in your app:
- .webSearch(maxUses: 5): Claude searches the web for up-to-date information
- .codeExecution: Claude writes and executes code in a sandboxed environment
Privacy Architecture
Apple's privacy model differs fundamentally from that of every other provider. The framework makes the privacy boundary explicit:
| Tier | Privacy | Data Flow |
|---|---|---|
| On-device | Maximum | Data never leaves device |
| Private Cloud Compute | PCC-equivalent | Stateless enclaves, verifiable transparency, no data retention |
| Third-party (Claude, Gemini) | Provider-dependent | Data leaves Apple's boundary — governed by the provider's terms |
Note:
This is the key decision point. If you choose a third-party model like Claude, user data reaches Anthropic's API. Your app's privacy policy and the user's consent must account for this — Apple's on-device and PCC guarantees do not extend to third-party providers. The LanguageModelSession abstraction makes it easy to switch tiers, but the privacy difference between on-device and third-party is invisible at the API layer and easy to overlook.
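Since the privacy difference is invisible at the API layer, one option is to model the tier explicitly in app code and gate third-party requests on consent. The sketch below is plain Swift. PrivacyTier and mayUse(_:userConsentedToThirdParty:) are hypothetical app-level names, not framework types, and the consent flag is assumed to come from your own settings or onboarding flow.

```swift
/// Hypothetical app-level model of the three tiers described in the table above.
enum PrivacyTier {
    case onDevice
    case privateCloudCompute
    case thirdParty

    /// Whether user data leaves Apple's privacy boundary on this tier.
    var leavesAppleBoundary: Bool {
        self == .thirdParty
    }
}

/// Returns true only if a request may proceed on the given tier.
/// Third-party tiers require explicit user consent; Apple tiers do not.
func mayUse(_ tier: PrivacyTier, userConsentedToThirdParty: Bool) -> Bool {
    !tier.leavesAppleBoundary || userConsentedToThirdParty
}
```

Calling a check like this before constructing a third-party model keeps the decision in one auditable place rather than scattered across model configuration.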
Tool Calling
Tools can be defined client-side (within the app) or server-side (on the provider's infrastructure).
Client-Side Tools
Pass an array of tools to LanguageModelSession. The model can invoke them and receive results:
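// Tool protocol conformance: define what the model can invoke.
// Follows the Tool shape from the iOS 26 SDK; the OS 27 betas may differ.
struct WeatherTool: Tool {
    let name = "getWeather"
    let description = "Returns the current weather for a city."
    @Generable
    struct Arguments {
        var city: String
    }
    func call(arguments: Arguments) async throws -> String {
        // Your weather API logic
        return "72°F and sunny in \(arguments.city)"
    }
}
let session = LanguageModelSession(tools: [WeatherTool()])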
Server-Side Tools (ClaudeForFoundationModels)
Configure server-side tools when creating the ClaudeLanguageModel:
let model = ClaudeLanguageModel(
    name: .sonnet4_6,
    auth: .apiKey(apiKey),
    serverTools: [.webSearch(maxUses: 5), .codeExecution]
)
Multi-Agent Workflows
WWDC 2026 introduced Dynamic Profiles — the ability to swap instructions, tools, and model preferences within a single session. This enables coordinator-style multi-agent patterns:
let session = LanguageModelSession()
// Planning phase
session.applyProfile(
    "planner",
    instructions: "Break down the user's request into subtasks.",
    tools: []
)
// Execution phase: grant broader tool access
session.applyProfile(
    "executor",
    instructions: "Execute each subtask using available tools.",
    tools: [CalendarTool(), WeatherTool(), MapsTool()]
)
// Review phase: switch to a stronger model for validation
session.applyProfile(
    "reviewer",
    model: ClaudeLanguageModel(name: .opus4_8, auth: .apiKey(apiKey)),
    instructions: "Verify the executor's results for correctness."
)
The framework handles context and key-value caching between profile swaps, so each phase inherits the conversation state.
Pricing Comparison
Pricing is where Apple Foundation Models diverges most sharply from other platforms.
| Tier | Cost | When to Use |
|---|---|---|
| On-device (AFM 3 Core / Core Advanced) | Free | All local tasks — summarization, classification, short generation, offline features |
| Private Cloud Compute (AFM 3 Cloud / Cloud Pro) | Free (Small Business, <2M downloads) | Harder reasoning, larger context — no additional API cost for qualifying apps |
| Third-party (Claude, Gemini) | Provider API pricing | Frontier capabilities — complex reasoning, agentic tool use, specialized domains |
Bottom line: If your app qualifies for the Small Business Program, you have access to Apple's own cloud models at no additional API cost. For qualifying apps, the only scenario where you pay per token is using third-party providers like Claude for capabilities Apple's models can't match (extended reasoning, complex tool orchestration).
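The table can be read as a small decision function. The sketch below encodes it in plain Swift with hypothetical names (InferenceTier, CostModel, costModel(for:firstTimeDownloads:)). It assumes the threshold is strictly fewer than 2 million first-time downloads, and it reports PCC cost above that threshold as undisclosed rather than guessing a price.

```swift
// Hypothetical app-level types for illustration; not framework APIs.
enum InferenceTier {
    case onDevice, privateCloudCompute, thirdParty
}

enum CostModel: Equatable {
    case free, undisclosed, providerPricing
}

// Assumption: eligibility means strictly fewer than 2 million downloads.
let smallBusinessDownloadLimit = 2_000_000

/// Maps a tier and an app's first-time download count to its cost model.
func costModel(for tier: InferenceTier, firstTimeDownloads: Int) -> CostModel {
    switch tier {
    case .onDevice:
        return .free
    case .privateCloudCompute:
        return firstTimeDownloads < smallBusinessDownloadLimit ? .free : .undisclosed
    case .thirdParty:
        return .providerPricing
    }
}
```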
Limitations
The Foundation Models framework is not a general-purpose API client. Apple documents these limitations explicitly:
| Limitation | Notes |
|---|---|
| Custom prompt caching | Applied automatically — no manual control |
| Stop sequences | Not supported at the framework level |
| Batch processing | Not available — use the Messages API directly for batch |
| Files API | Not exposed through Foundation Models |
| Token counting | Not exposed — no countTokens() equivalent |
| Beta headers | Not supported — provider-specific features behind beta headers require direct API access |
| Messages API | The framework is not a Messages API client — for full API access, use the provider's SDK directly |
Note:
If you need granular control over prompts, caching, or batch processing, the Foundation Models framework is the wrong abstraction. Use the provider's native SDK for those use cases, and reach for Foundation Models only when you need on-device inference or the native Swift LanguageModelSession abstraction.
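As a rough illustration of that rule, the limitations table can be expressed as an option set: if a feature needs any capability the framework does not expose, route it to the provider SDK. All names below (UnexposedFeatures, Integration, recommendedIntegration(requiring:)) are hypothetical and exist only for this sketch.

```swift
/// Capabilities from the limitations table that Foundation Models does not expose.
/// Hypothetical type for illustration only.
struct UnexposedFeatures: OptionSet {
    let rawValue: Int
    static let manualPromptCaching = UnexposedFeatures(rawValue: 1 << 0)
    static let stopSequences = UnexposedFeatures(rawValue: 1 << 1)
    static let batchProcessing = UnexposedFeatures(rawValue: 1 << 2)
    static let filesAPI = UnexposedFeatures(rawValue: 1 << 3)
    static let tokenCounting = UnexposedFeatures(rawValue: 1 << 4)
    static let betaHeaders = UnexposedFeatures(rawValue: 1 << 5)
}

enum Integration: Equatable {
    case foundationModels, directProviderSDK
}

/// Recommends the direct provider SDK as soon as any unexposed feature is required.
func recommendedIntegration(requiring features: UnexposedFeatures) -> Integration {
    features.isEmpty ? .foundationModels : .directProviderSDK
}
```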
Comparison: Apple Foundation Models vs. Direct API Access
| Aspect | Foundation Models | Direct Provider API |
|---|---|---|
| On-device inference | Yes — native, free, offline | No — requires Core ML model import |
| Privacy tiers | Explicit (on-device → PCC → third-party) | Single tier |
| Structured output | @Generable macro — typed Swift values | JSON schema / constrained decoding |
| Streaming | streamResponse(to:) | SDK-specific |
| Tool calling | Protocol-based + server-side | SDK-specific |
| Multi-agent | Dynamic Profiles (WWDC 26) | Manual implementation |
| Framework context | Swift SDK — no external deps (for on-device/PCC) | Requires provider SDK + network |
| Control | High-level, opinionated | Full control over every parameter |
Pitfalls
1. Privacy is tier-dependent and invisible at the API layer.
The LanguageModelSession API looks identical whether the model runs on-device, in PCC, or on Anthropic's servers. Your app could be sending user data to a third-party API without any code change — only a model configuration change. Audit your model selection logic and surface the privacy tier to users.
2. Third-party models require separate API keys and billing.
If you switch from on-device to Claude via ClaudeLanguageModel, your users don't automatically get a Claude API key. You either need to provide one (and pay for it) or prompt the user. The free tiers only apply to Apple's own models.
3. Regional restrictions.
Apple Foundation Models and the on-device models are unavailable in the EU and mainland China at launch due to regulatory hurdles. If your app targets these regions, you need fallback logic.
4. Beta stability.
The framework ships as part of OS 27 betas. Expect breaking API changes between beta releases. Pin your Xcode version and test against each beta.
5. ClaudeForFoundationModels vs. direct Messages API.
The ClaudeForFoundationModels Swift package is a subset of the full Anthropic API. You lose access to prompt caching controls, stop sequences, batch processing, files, beta headers, and token counting. If you need these, use anthropic-sdk-swift directly.
6. Xcode 27 requirement.
The Foundation Models framework only ships with Xcode 27, so you cannot use it with older Xcode versions. If your team is still on Xcode 16 or 26, you must upgrade before you can compile against the framework.
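Pitfall 3 calls for fallback logic when a tier is unavailable. A minimal sketch of one approach follows: keep an ordered preference list and pick the first tier that is currently usable. ModelTier and firstAvailableTier(preferred:available:) are hypothetical names, and how the available set is determined (region, hardware, user settings) is left to the app and is outside this sketch.

```swift
/// Hypothetical app-level tier identifier; not a framework type.
enum ModelTier: Hashable {
    case onDevice, privateCloudCompute, thirdParty
}

/// Returns the first preferred tier that is currently available,
/// or nil if none is, in which case the feature should be disabled.
func firstAvailableTier(
    preferred: [ModelTier],
    available: Set<ModelTier>
) -> ModelTier? {
    preferred.first { available.contains($0) }
}
```

Falling back to a third-party tier changes the privacy tier as well as the cost, so a fallback of this kind should go through the same consent checks as any other third-party request.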
Key Takeaway
Apple Foundation Models is the first framework that gives iOS/macOS developers a single, native Swift API for on-device LLM inference, Apple's Private Cloud Compute, and third-party frontier models like Claude. The on-device tier is transformative: free inference at ~30 tok/s with full privacy is unlike anything else in the market. Use Foundation Models for on-device and PCC tasks, and drop to direct provider SDKs when you need granular API control or features the framework doesn't expose (caching, batch, files). The privacy cliff between Apple models and third-party providers is the one thing every developer should understand before adopting this framework.