feat: gemini image analyzer (auto-detect role/vibe/treatment on upload) #29

Merged
cuio merged 1 commit into main from feat/storyline-gemini-image-analysis
Apr 30, 2026

cuio (Owner) commented Apr 30, 2026

Summary

Closes Milestone D from #25. Every image upload now triggers a fire-and-forget Gemini Flash analysis that tags the manifest entry with role, mood vibe, a recommended image-scene suggestedTreatment, and a 1-10 retentionStrengthAtAttachment estimate. The visual director consumes those fields as soft priors, so the very first render has intentional visual logic instead of defaulting to editorial-bleed across every scene.

  • packages/core/src/images/analyzer.ts: Gemini Flash + inlineData (base64 webp) for sub-2s latency; pure normalizeAnalysis defends against out-of-range scores, unknown enums, and missing fields.
  • packages/core/src/images/manifest.ts: ImageEntry gains vibe, suggestedTreatment, retentionStrengthAtAttachment, analysisStatus, analysisError, analyzedAt, analysisRationale. All optional, so old manifests load unchanged.
  • packages/core/src/studio-api/routes/images.ts: Upload patches analysisStatus: pending synchronously, fires analysis off-cycle, and writes the result back. New POST /images/:id/analyze re-runs on demand.
  • packages/core/src/script/visualDirector.ts: Catalog labels new fields as (analyzer prior) so the LLM treats them as advisory; new constraint #10 plus retention-strength guidance for hook scenes.
  • packages/studio/src/components/sidebar/ImagesTab.tsx: Row chip shows analyzing… / R7 (color-coded retention) / analysis failed. Editor gains an Analyzer panel with vibe / treatment prior / score / rationale, plus a re-analyze button that optimistically flips state.
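
For reviewers who want a feel for what a "pure normalizeAnalysis" can look like, here is a minimal sketch. It is not the code in analyzer.ts. The role and treatment lists, the 200-character clip length, and the neutral default score of 5 are all assumptions made for illustration; only the field names and the 1-10 range come from this description.

```typescript
// Hypothetical enum values: the PR does not list the real ones
// ("editorial-bleed" is the only treatment it names).
const ROLES = ["hero", "supporting", "background"] as const;
const TREATMENTS = ["editorial-bleed", "full-frame", "inset"] as const;

type Role = (typeof ROLES)[number];
type Treatment = (typeof TREATMENTS)[number];

interface NormalizedAnalysis {
  role: Role;
  vibe: string;
  suggestedTreatment: Treatment;
  retentionStrengthAtAttachment: number; // integer in 1..10
  analysisRationale: string;
}

const MAX_CHARS = 200; // assumed clip length

// Clamp any numeric-looking input into 1..10; fall back to an assumed neutral 5.
function clampScore(value: unknown): number {
  const n =
    typeof value === "number" ? value :
    typeof value === "string" ? parseFloat(value) : NaN;
  if (!Number.isFinite(n)) return 5;
  return Math.min(10, Math.max(1, Math.round(n)));
}

// Accept only known enum members (case-insensitive); otherwise use the fallback.
function coerceEnum<T extends string>(value: unknown, allowed: readonly T[], fallback: T): T {
  const s = typeof value === "string" ? value.trim().toLowerCase() : "";
  return (allowed as readonly string[]).includes(s) ? (s as T) : fallback;
}

// Trim and clip free text; non-strings become the empty string.
function clip(value: unknown, max: number = MAX_CHARS): string {
  return typeof value === "string" ? value.trim().slice(0, max) : "";
}

// Pure function: no I/O, so it can be unit-tested without calling the model.
function normalizeAnalysis(raw: unknown): NormalizedAnalysis {
  const r = (raw !== null && typeof raw === "object" ? raw : {}) as Record<string, unknown>;
  return {
    role: coerceEnum(r.role, ROLES, "supporting"),
    vibe: clip(r.vibe),
    suggestedTreatment: coerceEnum(r.suggestedTreatment, TREATMENTS, "editorial-bleed"),
    retentionStrengthAtAttachment: clampScore(r.retentionStrengthAtAttachment),
    analysisRationale: clip(r.analysisRationale),
  };
}
```

Keeping the normalizer free of network and disk access is what makes tests for clamping, coercion, and clipping cheap to write.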

Cost

Logged under script.images.analyze with kind: "gemini". ~$0.001 per image at gemini-2.5-flash rates.

Tests

15 new analyzer tests (clamping, role coercion, treatment normalization, char-clipping, default-treatment helper). Full suite green: 745 core + 281 studio passing.

Failure modes

  • GEMINI_API_KEY missing → entry flips to analysisStatus: "failed" with a "configure it in Settings" reason. Upload still succeeds.
  • Network error / model decline → same fail-soft path; user-typed role / description / tags are preserved and never overwritten.
  • File missing on disk (raced delete) → analyzer returns the failed entry rather than throwing.
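
The fail-soft behaviour above can be sketched roughly as follows. This is an illustration under stated assumptions, not the route code: the in-memory `manifest` map, the `patch`, `runAnalysis`, and `handleUpload` helpers, the `Analyzer` callback type, and the "done" status value are all hypothetical. Only "pending", "failed", and the field names appear in this PR.

```typescript
type AnalysisStatus = "pending" | "done" | "failed"; // "done" is an assumed name

interface Entry {
  id: string;
  role?: string;        // user-typed fields must survive analysis
  description?: string;
  vibe?: string;
  analysisStatus?: AnalysisStatus;
  analysisError?: string;
  analyzedAt?: string;
}

// Hypothetical stand-in for the Gemini call; rejects on network/model errors.
type Analyzer = (entry: Entry) => Promise<Partial<Entry>>;

// Hypothetical in-memory manifest, standing in for the on-disk one.
const manifest = new Map<string, Entry>();

const gone = (id: string): Entry => ({
  id,
  analysisStatus: "failed",
  analysisError: "image no longer exists",
});

// Merge fields into a stored entry; never throws if the entry was deleted.
function patch(id: string, fields: Partial<Entry>): Entry {
  const current = manifest.get(id);
  if (!current) return { ...fields, id };
  const next: Entry = { ...current, ...fields, id };
  manifest.set(id, next);
  return next;
}

// Always resolves with an entry; every failure becomes analysisStatus "failed".
async function runAnalysis(id: string, analyze: Analyzer, apiKey?: string): Promise<Entry> {
  const entry = manifest.get(id);
  if (!entry) return gone(id); // raced delete
  if (!apiKey) {
    return patch(id, {
      analysisStatus: "failed",
      analysisError: "GEMINI_API_KEY is missing; configure it in Settings",
    });
  }
  try {
    const result = await analyze(entry);
    const latest = manifest.get(id); // re-read: the user may have edited meanwhile
    if (!latest) return gone(id);
    return patch(id, {
      ...result,
      role: latest.role ?? result.role, // user value wins over the analyzer's
      description: latest.description ?? result.description,
      analysisStatus: "done",
      analysisError: undefined,
      analyzedAt: new Date().toISOString(),
    });
  } catch (err) {
    return patch(id, {
      analysisStatus: "failed",
      analysisError: err instanceof Error ? err.message : String(err),
    });
  }
}

// Upload path: mark pending synchronously, start analysis without awaiting it.
function handleUpload(entry: Entry, analyze: Analyzer, apiKey?: string): Entry {
  const pending: Entry = { ...entry, analysisStatus: "pending" };
  manifest.set(entry.id, pending);
  void runAnalysis(entry.id, analyze, apiKey); // fire-and-forget; never rejects
  return pending; // the upload response does not wait for the model
}
```

Because `runAnalysis` catches every error and always resolves, the un-awaited call in `handleUpload` cannot surface as an unhandled rejection, which is what lets the upload succeed regardless of the analysis outcome.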

Test plan

  • Upload a fresh image to a project that has GEMINI_API_KEY set → row chip shows analyzing… for 1-2s then flips to R<n> color-coded.
  • Inspect the entry → analyzer panel shows vibe + treatment prior + rationale.
  • Re-render the project → visual director's catalog now includes the prior; rationales reference it.
  • Click re-analyze on an existing image → chip flips to analyzing… then refreshes with new values.
  • Remove GEMINI_API_KEY and upload a new image → chip shows analysis failed with the reason on hover; upload still succeeds.
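
The chip states exercised by this test plan could be derived by a small pure helper along these lines. This is a sketch, not the ImagesTab.tsx code: the `chipFor` name, the `tone` values, the "done" status, and the colour thresholds (8 and above high, 4 to 7 mid, below 4 low) are assumptions.

```typescript
type AnalysisStatus = "pending" | "done" | "failed"; // "done" is an assumed name

interface ChipInput {
  analysisStatus?: AnalysisStatus;
  retentionStrengthAtAttachment?: number;
  analysisError?: string;
}

interface Chip {
  label: string;
  tone: "neutral" | "low" | "mid" | "high" | "error";
  title?: string; // hover text
}

// Map an entry's analysis fields to the row chip; null means "render no chip"
// (for example, entries from manifests written before the analyzer existed).
function chipFor(entry: ChipInput): Chip | null {
  switch (entry.analysisStatus) {
    case "pending":
      return { label: "analyzing…", tone: "neutral" };
    case "failed":
      return { label: "analysis failed", tone: "error", title: entry.analysisError };
    case "done": {
      const score = entry.retentionStrengthAtAttachment;
      if (score === undefined) return null;
      // Assumed colour thresholds; the PR only says the chip is colour-coded.
      const tone = score >= 8 ? "high" : score >= 4 ? "mid" : "low";
      return { label: `R${score}`, tone };
    }
    default:
      return null;
  }
}
```

With these assumptions, an entry scored 7 yields the label `R7` with the `mid` tone, and a failed entry carries its reason in `title` for the hover state.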

🤖 Generated with Claude Code

Closes Milestone D from PR #25. Every image upload now fires a fire-and-forget
Gemini Flash analysis that tags the manifest entry with role, mood vibe, a
recommended image-scene treatment, and a 1-10 retention-strength estimate.
The visual director consumes those fields as soft priors so the very first
render has intentional visual logic instead of defaulting to editorial-bleed
across every scene.

Backend
- packages/core/src/images/analyzer.ts: analyzeImage() uses Gemini Flash
  inlineData (base64 webp) for sub-2s latency; normalizeAnalysis defends
  against out-of-range scores, unknown enums, and missing fields.
- ImageEntry gains vibe / suggestedTreatment / retentionStrengthAtAttachment
  / analysisStatus / analysisError / analyzedAt / analysisRationale, all
  optional so old manifests continue to load.
- Upload route patches the entry to "pending" synchronously, then runs
  analysis off-cycle and writes the result back. POST /images/:id/analyze
  re-runs on demand.
- Visual director catalog labels new fields as "(analyzer prior)" so the LLM
  knows they are advisory and can override based on script context.
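
As a rough illustration of the "(analyzer prior)" labelling, a catalog line might be assembled like this. It is a sketch only: the `catalogLine` helper and the line layout are hypothetical, and the real prompt format in visualDirector.ts may differ.

```typescript
interface CatalogEntry {
  id: string;
  role?: string;
  description?: string;
  vibe?: string;
  suggestedTreatment?: string;
  retentionStrengthAtAttachment?: number;
}

// Build one catalog line for the LLM prompt. Analyzer-derived fields carry an
// explicit "(analyzer prior)" tag so the model can tell them from user input.
function catalogLine(e: CatalogEntry): string {
  const parts: string[] = [`id=${e.id}`];
  if (e.role) parts.push(`role=${e.role}`);
  if (e.description) parts.push(`description=${e.description}`);
  if (e.vibe) parts.push(`vibe (analyzer prior)=${e.vibe}`);
  if (e.suggestedTreatment) {
    parts.push(`treatment (analyzer prior)=${e.suggestedTreatment}`);
  }
  if (e.retentionStrengthAtAttachment !== undefined) {
    parts.push(`retention (analyzer prior)=${e.retentionStrengthAtAttachment}/10`);
  }
  return `- ${parts.join("; ")}`;
}
```

Entries from older manifests have none of the optional fields, so under this sketch they render as before, with no prior tags at all.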

Frontend
- ImagesTab row chip surfaces analysis state: "analyzing…", "R7" (color-coded
  retention chip), or "analysis failed" with the reason on hover.
- Editor gains an Analyzer panel showing vibe / treatment prior / retention
  score / rationale, plus a re-analyze button that optimistically flips the
  chip to pending while the request is in flight.

Cost is logged under script.images.analyze with kind: gemini, ~$0.001 per
image at gemini-2.5-flash rates. User-typed role / description / tags are
preserved and never overwritten by the analyzer.

15 new analyzer tests; full suite stays green (760 core + 281 studio).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>