feat(ManifoldFoundation): wire AFM 3 image-input API (WWDC 2026) #1710

@roryford

Context

WWDC 2026 announced that AFM 3 (Apple's third-generation on-device model) is 'natively multimodal'. The developer docs confirm this: 'Multimodal prompts let you pass images alongside text so your app can reason about visual content, and Vision framework tools like OCR and barcode readers are available for your model to call directly, all on-device.'

FoundationBackend currently advertises supportsVision: false because the pre-WWDC FoundationModels SDK (Xcode 26.4, module 1.4.34) had no image-input surface. This issue tracks wiring the new API.

Ref: #20 (closed — FoundationBackend was the last unchecked checkbox)

Investigation required first (Phase 0)

A post-WWDC Xcode beta may have shipped with the image API. Before implementing, probe the installed SDK:

  • Check for an image / attachment case on Transcript.Segment
  • Check for a Data- or CGImage-accepting Prompt initializer
  • Check for a PromptRepresentable conformance on image types
  • Check PromptBuilder for new content cases

If the SDK doesn't yet expose image input, document the gap and revisit when the SDK ships.
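
As a rough aid for the checklist above, here is a minimal triage sketch. It assumes you have dumped the module's interface (for example, the FoundationModels `.swiftinterface` text from the installed SDK) into a string. It does not call the SDK, and the keyword list is a guess for narrowing manual review, not a list of confirmed symbol names. `imageKeywords` and `imageRelatedLines(in:)` are hypothetical helpers.

```swift
import Foundation

/// Keywords that might indicate an image-input surface.
/// ASSUMPTION: these are triage guesses, not confirmed SDK symbol names.
let imageKeywords = ["CGImage", "CIImage", "image", "attachment"]

/// Returns (line number, trimmed line) pairs from an interface dump that
/// mention any keyword, matched case-insensitively. Line numbers are 1-based.
func imageRelatedLines(in interfaceText: String) -> [(number: Int, text: String)] {
    var hits: [(number: Int, text: String)] = []
    let lines = interfaceText.components(separatedBy: "\n")
    for (index, line) in lines.enumerated() {
        let lowered = line.lowercased()
        let matches = imageKeywords.contains { keyword in
            lowered.contains(keyword.lowercased())
        }
        if matches {
            let trimmed = line.trimmingCharacters(in: .whitespaces)
            hits.append((number: index + 1, text: trimmed))
        }
    }
    return hits
}
```

Each hit still needs a manual look to decide which checklist item it belongs to (Transcript.Segment, Prompt, PromptRepresentable, or PromptBuilder). An empty result is a hint, not proof, that the SDK has no image surface yet.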

Implementation (once SDK confirmed)

  • Flip supportsVision: true in FoundationBackend.capabilities
  • Map MessagePart.image(data:mimeType:) → SDK image type in generate() (follow ClaudeBackend / OpenAIBackend pattern)
  • Handle Private Cloud Compute routing (surface hints if available; otherwise document behavior)
  • Remove the 'no image API' note from the FoundationBackend doc comment; describe the wiring
  • Integration tests: image-bearing session test, gated with FoundationBackend.probeIsReady()
  • Docs: update quickstart with multimodal example
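
The mapping step could look roughly like the sketch below. It is self-contained and does not import FoundationModels: `SDKSegment` is a hypothetical stand-in for whatever image type the SDK ends up exposing, the `.text` case on `MessagePart` is assumed, and the MIME allow-list is illustrative. It does not reproduce the ClaudeBackend / OpenAIBackend code, only the general shape of gating on the capability flag and converting part by part.

```swift
import Foundation

// Mirrors the MessagePart.image(data:mimeType:) case named in this issue.
// ASSUMPTION: the .text case is included for illustration.
enum MessagePart {
    case text(String)
    case image(data: Data, mimeType: String)
}

// HYPOTHETICAL stand-in for the SDK's prompt content type.
// Replace with the real type once Phase 0 confirms the API.
enum SDKSegment: Equatable {
    case text(String)
    case image(Data)
}

enum VisionMappingError: Error, Equatable {
    case visionUnsupported
    case unsupportedMimeType(String)
}

// ASSUMPTION: illustrative allow-list, not taken from SDK documentation.
let acceptedImageMimeTypes: Set<String> = ["image/png", "image/jpeg", "image/heic"]

/// Converts message parts to SDK segments, preserving order.
/// Throws if an image appears while vision is disabled or the MIME type
/// is outside the allow-list.
func mapParts(_ parts: [MessagePart], supportsVision: Bool) throws -> [SDKSegment] {
    try parts.map { part -> SDKSegment in
        switch part {
        case .text(let string):
            return .text(string)
        case .image(let data, let mimeType):
            guard supportsVision else {
                throw VisionMappingError.visionUnsupported
            }
            guard acceptedImageMimeTypes.contains(mimeType.lowercased()) else {
                throw VisionMappingError.unsupportedMimeType(mimeType)
            }
            return .image(data)
        }
    }
}
```

Throwing on an image when `supportsVision` is false, rather than silently dropping the part, keeps the capability flag and the actual behavior of generate() from drifting apart. Whether that is the right failure mode here is a design choice for the implementer.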

Cross-cutting

  • UI: PhotoAttachmentButton / VisionInputButton already ship — no UI changes needed once vision is wired
  • Open question: is image input processed on-device or routed to Private Cloud Compute? The answer affects the on-device positioning claim.
