Last updated: 2026-05-26
This document is the shared cross-repo source of truth for aligning auto-assist, Sophia, and auto-ingest into one offline, Tailscale-first swarm compute architecture.
It records the design decisions that have now been made and should guide the next implementation pass.
Repos covered:
scottjoyner/auto-assistscottjoyner/Sophiascottjoyner/auto-ingest
AssistX is the task-state authority.
auto-ingest and Sophia may run jobs, maintain local outboxes, keep operational SQLite/cache state, and publish events, but they should not become independent long-term task authorities.
The unified model is:
Sophia = voice/auth edge
AssistX = orchestration, task state, policy, command center, delegation
Neo4j main DB = unified historical/personal memory about Scott
auto-ingest = periodic historical memory ingestion and data enrichment
LM Studio / Hermes / opencode nodes = delegated execution workers
NAS/filesystem = binary artifact source of truth
Tailscale = default private network fabric
The existing data is split historically:
- The original / legacy Neo4j
neo4jdatabase contains the long-running historical graph nodes from auto-ingest and prior memory work. - Sophia initially wrote quick transcription / voice-related graph records into the
memorydatabase. - AssistX introduced the
assistxdatabase later as orchestration needs became clearer.
Use the legacy/main neo4j database as the unified Scott memory graph.
Use the assistx database as the orchestration/control-plane graph.
Treat Sophia's memory database as a transitional or staging database until its voice/capture/auth data is either migrated, copied, or dual-written into the unified Scott memory graph.
This database should contain durable knowledge about Scott and his world:
- historical transcripts
- auto-ingest outputs
- dashcam/bodycam/audio-derived knowledge
- user preferences and opinions
- normalized entities
- speaker and voice identity history
- source media references
- vector embeddings for searchable memory
- facts, notes, observations, and long-term semantic memory
This database should contain live and historical control-plane state:
- tasks
- dispatches
- agent runs
- tool calls
- approvals
- policy decisions
- swarm node registry
- model endpoint registry
- worker heartbeats
- execution artifacts as task outputs
- delegation trace state
This database exists because Sophia started before AssistX orchestration was fully defined. Next steps should decide whether to:
- migrate Sophia
memoryrecords intoneo4j, - keep it as a voice staging database, or
- dual-write new Sophia memory events into both
memoryand unifiedneo4juntil migration is complete.
The preferred direction is to make long-term Scott memory searchable from the main neo4j database.
Sophia voice/capture nodes do not yet have vector embeddings. That should be fixed as part of memory unification.
Add embeddings for:
- Sophia transcriptions
- voice captures
- voice-auth decisions with transcript summaries
- auto-ingest transcripts
- cleaned dashcam/bodycam utterances
- long-form summaries
- extracted opinions/preferences
- memory facts
- source-backed observations
Many dashcam transcripts are not meaningful Scott memory. They may contain:
- songs playing in the car
- radio/audio bleed
- non-Scott voices
- background navigation prompts
- road noise artifacts
- low-confidence STT hallucinations
Before using dashcam transcripts as semantic memory, auto-ingest should classify transcript segments by likely value:
scott_speechpassenger_speechmusic_or_medianavigation_or_alertambient_noiseunknown_low_confidence
Only higher-confidence memory-bearing segments should be promoted into long-term semantic memory without review. Lower-confidence or media-like segments can remain searchable as source records but should not become asserted facts about Scott.
auto-ingest is a secondary historical-memory ingestion system, not the real-time control path.
It should be periodically reviewed and run when new historical data has accumulated. Expected cadence is weekly or monthly depending on dashcam/audio/bodycam usage, not necessarily daily.
- ingest historical audio/video/transcript/metadata
- normalize source records
- extract useful context about Scott
- generate transcript and media artifacts
- classify memory-bearing content
- enrich the unified Neo4j memory graph
- provide search context for AssistX and Sophia
- preserve provenance back to source files
The real-time path should be:
voice/web command -> Sophia -> AssistX -> task/delegation -> execution -> Sophia/AssistX response
auto-ingest should run as a background enrichment system:
weekly/monthly historical run -> normalize data -> enrich unified Neo4j memory -> improve future context retrieval
After each auto-ingest batch, AssistX should be able to review:
- new source files detected
- new transcript volume
- new candidate memories
- low-confidence segments
- likely music/media segments
- new entities
- new opinions/preferences extracted
- failed or skipped files
- proposed memory promotions
The result should be a human-reviewable memory update summary.
All swarm nodes should join the Tailscale network.
Tailscale should be the default private connectivity layer for:
- AssistX API
- Sophia voice sidecar
- Neo4j access
- LM Studio / OpenAI-compatible endpoints
- Hermes agents
- opencode-capable machines
- auto-ingest workers
- NAS-adjacent services where applicable
LAN IPs can be preferred when clearly faster and reachable, but Tailscale names/IPs should be the default stable addressing layer.
Pi-hole can be updated with stable local names as needed.
Suggested aliases:
assistx.local -> AssistX control plane
neo4j.local -> Neo4j host
sophia.local -> Sophia voice sidecar
models.local -> primary model endpoint
x1-370.local -> x1-370
mini-pc-22.local -> mini-pc-22
deathstar.local -> deathstar-XPS-8920
If MagicDNS is sufficient, Pi-hole aliases can be optional. If multiple services move between hosts, Pi-hole aliases become useful stable service names.
Paperclip is primarily for:
- delegation
- agent management
- human/project-management visibility
- mapping AssistX tasks into an external agent-management view
Paperclip should not be the core source of task truth. AssistX remains task-state authority.
Preferred model:
AssistX Task = authoritative execution state
Paperclip Issue = optional mirror / delegation wrapper / human coordination object
If Paperclip is offline, AssistX should still be able to run local/offline tasks.
Redis is not the long-term strategic decision yet.
Near-term:
- keep current Redis usage where it already works.
- do not let Redis become the source of task truth.
- use local outboxes for disconnected worker reliability.
- keep AssistX task state authoritative.
Future direction:
- evaluate adding FalkorDB as a graph/cache layer on top of or adjacent to Neo4j.
- use FalkorDB for fast cache/working-set graph operations if useful.
- keep Neo4j as the durable memory/provenance store unless a later migration decision changes that.
Primary high-power machine and likely first production control-plane / knowledge host.
Expected responsibilities:
- central AssistX runtime candidate
- local knowledge access
- high-throughput orchestration
- LM Studio/OpenAI-compatible endpoint hosting
- access to the most complete local info
- heavy jobs when appropriate
Former primary machine and legacy auto-ingest host.
Expected responsibilities:
- continue selected legacy auto-ingest jobs until migrated
- run Hermes-side delegation agents
- run Qwen-class local model endpoints where practical
- provide continuity for existing data paths and historical jobs
Still early-stage. May migrate to Ubuntu-only.
Expected responsibilities:
- future always-on service host candidate
- lightweight AssistX/Sophia/worker sidecars
- monitoring, Pi-hole-adjacent utilities, or control-plane support
- model endpoint if resources allow
Powerful GPU machines with limited memory.
Expected responsibilities:
- fast token-generation delegation agents
- burst heavy lifting when memory footprint allows
- Qwen-class 9B model work
- first-choice delegation workers for fast local model responses
Lowest-power machines should handle drafts and low-risk, low-compute tasks where possible.
Examples:
- draft generation
- notes cleanup
- small summarization jobs
- status reporting
- lightweight file classification
- non-urgent text transformations
Any node may expose:
- OpenAI-compatible model endpoint
- Hermes agent
- opencode CLI
- auto-ingest worker
- voice sidecar
- file processing service
AssistX should delegate based on registered capability, availability, and benchmark history rather than hard-coded host assumptions.
Sophia should separate:
- speaker identity/authentication,
- action authorization,
- response voice selection.
Supported speaker states:
authenticated_scottscott_voice_unverifiedregistered_user_authenticatedregistered_user_unverifiedunknown_speakeradmin_voice_override
Unknown speakers should be able to register themselves as users.
Registration should not grant high-impact permissions automatically. It should create a pending or limited user profile until Scott approves expanded permissions.
Sophia should respond using Scott's voice for non-Scott voices as well.
Response voice selection should be independent from action authorization.
Default:
voice_response_policy:
tts_voice_when_scott_authenticated: scott_clone
tts_voice_when_scott_unverified: scott_clone
tts_voice_when_registered_user: scott_clone
tts_voice_when_unknown: scott_cloneScott's voice may fail authentication when his voice changes, such as first thing in the morning.
Add an override mechanism named admin-voice for now.
Policy:
- If voice auth does not recognize Scott but the override password/passphrase is supplied, classify as
admin_voice_override. - Log the override event.
- Allow the session to proceed under Scott-equivalent authority or a slightly restricted Scott authority, depending on final policy.
- Store enough metadata to audit why the override was accepted.
Scott-authenticated voice sessions may auto-approve low-risk actions without a popup.
All other voices require Scott approval for actions.
Suggested policy table:
| Speaker/Auth State | Ask Questions | Register User | Submit Notes | Low-Risk Actions | High-Risk Actions |
|---|---|---|---|---|---|
authenticated_scott |
yes | yes | yes | auto-approve | approval or explicit confirmation |
admin_voice_override |
yes | yes | yes | auto-approve or fast-confirm | approval or explicit confirmation |
registered_user_authenticated |
yes | n/a | yes | requires Scott approval | requires Scott approval |
registered_user_unverified |
limited | n/a | yes | requires Scott approval | requires Scott approval |
unknown_speaker |
limited | yes | limited | requires Scott approval | requires Scott approval |
Potential low-risk actions:
- create a note
- draft text
- summarize local context
- search memory
- list tasks
- create a draft task
- queue a non-destructive ingest scan
- ask a model endpoint for analysis
High-risk actions still requiring approval or explicit confirmation:
- deleting files
- modifying important records
- sending external messages
- changing network/system configuration
- running shell commands with destructive potential
- moving/renaming large data sets
- exposing data outside the local network
- changing auth/security policy
The first end-to-end demo should focus on the real-time AssistX path, not the historical auto-ingest batch path.
Target demo:
1. User speaks to Sophia.
2. Sophia performs VAD/STT/speaker auth.
3. Sophia sends quick input to AssistX.
4. AssistX classifies intent and creates authoritative task state.
5. AssistX delegates execution to a suitable local node.
6. The worker/Hermes/opencode/model endpoint completes the task.
7. AssistX records AgentRun/ToolCall/Artifact/Memory updates.
8. Sophia speaks the response using Scott clone voice.
This should run fully offline assuming LAN/Tailscale, local models, local Neo4j, and local storage are reachable.
The second path is periodic memory enrichment.
Target flow:
1. auto-ingest runs weekly/monthly or on demand.
2. New dashcam/audio/bodycam/source files are scanned.
3. Transcripts and metadata are normalized.
4. Music/noise/media segments are classified and downgraded.
5. Memory-bearing segments are summarized/extracted.
6. Candidate memories are reviewed or auto-promoted based on confidence.
7. Unified Neo4j `neo4j` memory graph is updated.
8. Embeddings are generated for searchable memory.
9. AssistX retrieval improves for future real-time tasks.
This makes auto-ingest the historical graph-of-knowledge builder about Scott, not the live voice/action execution loop.
The NAS should ideally be available to other nodes in the system.
NAS/filesystem remains source of truth for:
- raw audio
- dashcam video
- bodycam video
- birdcam clips
- extracted training clips
- generated transcripts
- model artifacts
- reports
- evidence-grade/protected files
Neo4j stores metadata and relationships:
- artifact ID
- source path
- storage root
- optional host/container path mappings
- sha256 when practical
- retention class
- source task/event
- derived transcript/summary/entity relationships
Because paths differ between hosts and containers, store multiple path forms when useful:
artifact_id: string
storage_root: nas1 | s-drive | local-ssd | external-drive
relative_path: path/under/root/file.ext
host_path: /media/scott/NAS1/...
container_path: /nas/...
tailscale_host_hint: x1-370
sha256: optional
retention_class: ephemeral | keep | protected | evidenceDo not rely on one absolute path being valid from every node.
AssistX should select delegation targets by:
- required capability,
- node availability,
- model endpoint status,
- power/cost profile,
- benchmark results,
- task risk,
- data locality.
Preferred routing examples:
- low-power machines: drafts, notes, light summaries
- demo/demo-1: fast model delegation and occasional GPU-heavy work
- deathstar-XPS-8920: legacy auto-ingest continuity and Hermes-side model work
- x1-370: central knowledge/control, heavy orchestration, high-memory jobs
- mini-pc-22: future always-on services once stabilized
Create these shared contract docs next:
docs/swarm_contracts/database_unification.md
docs/swarm_contracts/task_authority.md
docs/swarm_contracts/voice_auth_policy.md
docs/swarm_contracts/artifact_paths.md
docs/swarm_contracts/node_registry.md
docs/swarm_contracts/model_endpoint_registry.md
docs/swarm_contracts/auto_ingest_memory_enrichment.md
docs/swarm_contracts/event_envelope.md
Minimum schemas to implement:
SwarmNodeServiceEndpointCapabilityModelEndpointStorageRootHealthCheckVoiceAuthDecisionRegisteredSpeakerArtifactRefMemoryCandidateIngestBatchReview
- Commit this
UNIFICATION.mdto all three repos. - Convert this into contract docs under
docs/swarm_contracts/. - Update prior integration plans to point here as the current source of truth.
- Add authoritative task-state definitions.
- Add swarm node registry.
- Add model endpoint registry.
- Add event ingestion endpoint.
- Add voice quick-input endpoint or formalize existing endpoint.
- Add
auth_stateandadmin-voiceoverride flow. - Add unknown speaker registration path.
- Emit quick-input events to AssistX.
- Speak AssistX responses with Scott clone voice.
- Register Hermes/opencode/model endpoints as node capabilities.
- Route low-power draft tasks to lower-power nodes.
- Route fast generation tasks to demo/demo-1.
- Keep x1-370 as high-context/high-control node.
- Add ingest batch review summaries.
- Classify dashcam transcript segments.
- Generate embeddings for useful memory nodes.
- Push/migrate Sophia and auto-ingest memory into unified
neo4jdatabase.
- Tailscale-only routing.
- Local model endpoint.
- Sophia voice input.
- AssistX task creation/delegation.
- Local execution.
- Sophia spoken response.
- Audit trail in Neo4j.
- Should
memorybe fully migrated intoneo4j, or retained as a voice staging DB with sync jobs? - Should
admin-voicebe a spoken passphrase, typed PIN, device-bound token, or combination? - Which exact actions count as low-risk for Scott auto-approval?
- Which node should host the first canonical AssistX control-plane service?
- Which local DNS names should be added to Pi-hole first?
- Should FalkorDB be deployed soon as a working-set cache experiment, or deferred until after the first voice demo?
- Which embedding model should become canonical for unified memory?
- What is the first acceptance test for unknown speaker registration?
The system should become an offline, Scott-centered, graph-memory operating layer where:
- Scott can speak naturally to Sophia.
- Sophia can authenticate or safely handle unknown speakers.
- AssistX owns task state and policy.
- local nodes execute delegated work.
- auto-ingest periodically enriches Scott's historical memory.
- Neo4j unifies long-term context.
- everything critical can operate over Tailscale/LAN without public cloud dependency.