Apple Intelligence APIs - Building Private, On-Device AI Features

Remember when adding AI to your app meant shipping API keys, managing rate limits, and writing privacy policies longer than your actual code? Well, Apple just changed the game. The Foundation Models framework lets you tap into the same on-device LLM that powers Apple Intelligence - no cloud required, no data leaving the device. Let's see what this means for us developers.

Getting Started

First things first - we need to check if Apple Intelligence is available. Not every device supports it, and users need to have it enabled. Here's how we handle that:

import SwiftUI
import FoundationModels

struct GenerativeView: View {
    private var model = SystemLanguageModel.default
    
    var body: some View {
        switch model.availability {
        case .available:
            // We're good to go! 🚀
            ContentView()
        case .unavailable(.deviceNotEligible):
            // Older device - show fallback UI
            AlternativeView()
        case .unavailable(.appleIntelligenceNotEnabled):
            // User needs to flip the switch in Settings
            EnableIntelligenceView()
        case .unavailable(.modelNotReady):
            // Still downloading or warming up
            LoadingView()
        case .unavailable(let other):
            // Something else is up
            UnavailableView(reason: other)
        }
    }
}

Notice something missing? No API keys, no network checks, and no third-party server receiving your users' prompts. It's refreshingly simple.

Let's Generate Some Text

Once we know the model is available, it's time to create a session and start generating. The API is super straightforward:

// Simple one-shot generation
let session = LanguageModelSession()
let response = try await session.respond(to: "Write a haiku about coffee")
print(response.content) // the generated text lives in `content`

// But here's where it gets interesting - conversations!
let conversationSession = LanguageModelSession()
let first = try await conversationSession.respond(to: "Tell me about SwiftUI")
let followUp = try await conversationSession.respond(to: "How does it compare to UIKit?")
// The session remembers what you talked about 🧠

Pretty powerful, huh? The session maintains context between calls, so you can build actual conversations.

Making It Your Own with Instructions

Here's where things get really interesting. You can give the model instructions that shape how it responds throughout the entire session. Think of it as setting the personality for your AI assistant:

let instructions = """
    You are a helpful recipe assistant for a meal planning app.
    - Suggest recipes based on available ingredients
    - Keep responses concise (under 100 words)
    - Focus on quick, healthy meals
    - If asked about non-food topics, politely redirect
    """

let session = LanguageModelSession(instructions: instructions)

// Now watch the magic happen
let response = try await session.respond(
    to: "I have chicken, broccoli, and rice. What can I make?"
)
// Returns a focused recipe suggestion, not a dissertation on poultry 🍗

The model is trained to favor instructions over prompts, so in practice it sticks to them closely, keeping responses on-topic and appropriately sized.

Let's Build Something Real

Enough theory - let's look at some practical examples you can actually ship.

Smart Note Summarization

struct NotesSummaryView: View {
    @State private var summary = ""
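    let noteContent: String

    var body: some View {
        // Kick off generation when the view appears
        Text(summary)
            .task { await generateSummary() }
    }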
    
    func generateSummary() async {
        let session = LanguageModelSession(
            instructions: "Summarize notes concisely, highlighting key points and action items"
        )
        
        let prompt = """
            Summarize this note in 3-5 bullet points:
            
            \(noteContent)
            """
        
        do {
            // respond(to:) returns a response wrapper; the text is in `content`
            summary = try await session.respond(to: prompt).content
        } catch {
            // Generation failed - show a short error message instead
            summary = "Could not generate summary"
        }
    }
}

This is perfect for those meeting notes that go on forever. Your users will thank you.
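
One caveat for notes that really do go on forever: the on-device model has a limited context window, so a very long note may not fit in a single prompt. One possible workaround, sketched here as an assumption rather than anything taken from Apple's docs, is to split the note into paragraph-aligned chunks, summarize each chunk, and then summarize the summaries. The helper name `chunkNote` and the character budget are hypothetical.

```swift
import Foundation

/// Hypothetical helper: groups paragraphs into chunks of at most `maxCharacters`.
/// A single paragraph longer than the budget is kept whole in its own chunk.
func chunkNote(_ note: String, maxCharacters: Int) -> [String] {
    let paragraphs = note
        .components(separatedBy: "\n\n")
        .map { $0.trimmingCharacters(in: .whitespacesAndNewlines) }
        .filter { !$0.isEmpty }

    var chunks: [String] = []
    var current = ""
    for paragraph in paragraphs {
        // Start a new chunk when adding this paragraph would exceed the budget.
        if !current.isEmpty && current.count + paragraph.count + 2 > maxCharacters {
            chunks.append(current)
            current = ""
        }
        current += current.isEmpty ? paragraph : "\n\n" + paragraph
    }
    if !current.isEmpty { chunks.append(current) }
    return chunks
}
```

Each chunk could then be passed to `generateSummary`-style code in turn. The right budget depends on the model's context size, so treat the number as something to tune.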

Intelligent Search Suggestions

Remember Apple's own example from the docs? Let's implement it:

func generateSearchSuggestions(for query: String) async -> [String] {
    let instructions = """
        Suggest five related topics. Keep them concise (three to seven words) 
        and make sure they build naturally from the person's topic.
        """
    
    let session = LanguageModelSession(instructions: instructions)
    
    do {
        let response = try await session.respond(to: query)
        // Parse the response text into an array
        // (parseSearchSuggestions is a helper you supply yourself)
        return parseSearchSuggestions(from: response.content)
    } catch {
        // Fallback to empty suggestions
        return []
    }
}

// Usage: generateSearchSuggestions(for: "Making homemade bread")
// Returns: ["Sourdough starter basics", "No-knead bread recipes", ...]
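
The snippet above calls `parseSearchSuggestions(from:)` without defining it. Here is a minimal sketch of what such a helper might look like, assuming it receives the reply text (for example `response.content`) and that the model puts one suggestion per line, possibly with a bullet or number in front. Both the name and that output format are assumptions, so adjust the parsing to whatever your prompts actually produce.

```swift
import Foundation

/// Hypothetical helper: turns a free-form model reply into a list of suggestions.
/// Assumes one suggestion per line, optionally prefixed with a bullet or number.
func parseSearchSuggestions(from text: String) -> [String] {
    text
        .split(whereSeparator: \.isNewline)
        .map { line -> String in
            var item = line.trimmingCharacters(in: .whitespaces)
            // Drop a leading list marker such as "- ", "* ", "• ", "1. " or "2) ".
            if let marker = item.range(of: #"^([-*•]|\d+[.)])\s+"#,
                                       options: .regularExpression) {
                item.removeSubrange(marker)
            }
            return item.trimmingCharacters(in: .whitespaces)
        }
        .filter { !$0.isEmpty }
}
```

Blank lines are skipped and hyphens inside a suggestion (like "No-knead") are left alone, because the pattern only matches a marker at the start of a line.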

Blog Post Outliner

And here's one I'm personally excited about - generating content outlines:

struct BlogPostGenerator {
    // WritingStyle is assumed to be a String-backed enum defined elsewhere in your app
    func generateOutline(topic: String, style: WritingStyle) async throws -> String {
        let instructions = """
            You are a content strategist. Create detailed blog post outlines.
            Style: \(style.rawValue)
            Include: introduction, 3-5 main points, and conclusion
            """
        
        let session = LanguageModelSession(instructions: instructions)
        
        let prompt = "Create an outline for a blog post about: \(topic)"
        
        return try await session.respond(
            to: prompt,
            options: GenerationOptions(temperature: 0.7)
        ).content
    }
}

Notice that temperature parameter? Let's talk about that...

Tuning Your Generations

The GenerationOptions struct lets you control how creative or focused the model should be:

// Want something creative? Crank up the temperature!
let creativeOptions = GenerationOptions(temperature: 2.0)
let story = try await session.respond(
    to: "Write me a story about a sentient espresso machine",
    options: creativeOptions
)

// Need facts and precision? Keep it cool
let preciseOptions = GenerationOptions(temperature: 0.5)
let summary = try await session.respond(
    to: "Extract all dates and deadlines from this email",
    options: preciseOptions
)

I've found 0.7-0.8 works great for most tasks, but experiment and see what works for your use case.
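
If you're wondering what temperature actually does, here's a small standalone sketch of the general idea: the model's raw scores for candidate tokens are divided by the temperature before being turned into probabilities. This is a generic illustration of temperature sampling with made-up scores, not Apple's implementation, and the `softmax` function is my own helper rather than part of the framework.

```swift
import Foundation

/// Converts raw scores (logits) into probabilities after dividing by `temperature`.
/// Generic illustration of temperature sampling, not Apple's implementation.
func softmax(_ logits: [Double], temperature: Double) -> [Double] {
    precondition(temperature > 0, "temperature must be positive")
    let scaled = logits.map { $0 / temperature }
    // Subtract the maximum before exponentiating for numerical stability.
    let maxValue = scaled.max() ?? 0
    let exps = scaled.map { exp($0 - maxValue) }
    let total = exps.reduce(0, +)
    return exps.map { $0 / total }
}

let logits = [2.0, 1.0, 0.0]  // made-up scores for three candidate tokens
let focused = softmax(logits, temperature: 0.5)
let creative = softmax(logits, temperature: 2.0)
// Lower temperature concentrates probability on the top candidate;
// higher temperature spreads it across the alternatives.
print(focused, creative)
```

With the lower temperature the top candidate gets most of the probability, which is why low values feel predictable and high values feel more adventurous.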

The Privacy Story

Let's address the elephant in the room - privacy. This is where Apple's approach really shines:

Everything stays on device - The model runs on the Neural Engine
Works offline - Airplane mode? No problem
Session isolation - Each session is sandboxed
User control - Only works when Apple Intelligence is enabled

// Here's a fun experiment - try this with airplane mode on
func demonstratePrivacy() async throws {
    let session = LanguageModelSession()
    
    // This works without any network connection
    let response = try await session.respond(
        to: "Explain quantum computing simply"
    )
    
    // Your users' data never leaves their device
    // No analytics, no cloud logs, no nothing
}

This is huge for apps dealing with sensitive data - journals, health apps, financial tools. You get AI capabilities without the privacy headaches.

A Word on Safety

Of course, with great power comes great responsibility. Here's how to build features that are helpful, not harmful:

let safeSession = LanguageModelSession(
    instructions: """
        You are a helpful assistant for a journaling app.
        - Be supportive and encouraging
        - Never provide medical or legal advice
        - If someone seems distressed, suggest talking to someone they trust
        - Keep responses appropriate for all ages
        """
)

The model respects these boundaries really well. I've tested it with edge cases, and it's surprisingly good at staying in its lane.

What's Coming Next

The documentation covers more than we've used here, and hints at features still to come:

Guided Generation - Generate structured Swift types directly (no more parsing!)
Tool Calling - Let the model use your app's functions
Custom Attributes - Define your own model behaviors
Multimodal Support - Because text is just the beginning
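
Guided generation is the natural next step for the search suggestions example, since it replaces string parsing with a typed result. A minimal sketch, assuming the `@Generable` and `@Guide` macros and the `respond(to:generating:)` method as described in Apple's documentation, plus a deployment target of iOS 26. The `TopicSuggestions` type and `suggestTopics` function are hypothetical names of my own.

```swift
import FoundationModels

// Hypothetical output type; the name and field are illustrative assumptions.
@Generable
struct TopicSuggestions {
    @Guide(description: "Related topics, three to seven words each", .count(5))
    var topics: [String]
}

// Hypothetical wrapper around respond(to:generating:).
func suggestTopics(for query: String) async throws -> [String] {
    let session = LanguageModelSession(
        instructions: "Suggest related topics that build naturally from the person's topic."
    )
    // The response's content is a TopicSuggestions value, so no string parsing is needed.
    let response = try await session.respond(
        to: query,
        generating: TopicSuggestions.self
    )
    return response.content.topics
}
```

This only runs on a device where the model is available, so pair it with the availability check from the start of the post.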

Wrapping Up

The Foundation Models framework changes everything. We finally have powerful AI that respects privacy, works offline, and integrates seamlessly with our apps. No more choosing between features and privacy. No more API rate limits ruining your launch day.

Go experiment! Build something cool. The framework is available now in iOS 26, and I can't wait to see what people create with it.

If you fancy sharing what you build or have questions about the framework, I'm @SwiftyAlex on Twitter.