stuinfla/learner-rv

Learn-RV

Videos and content become an instant expert on your Cognitum Seed.

You have a Cognitum One Seed. You want it to be a genius on something that matters to you.

Pick any topic — cooking, investing, a medical condition, a sport, a language. Learn-RV finds the best videos and content on the internet, downloads everything, reads every word, and turns it into a searchable expert that lives on your Seed. Then ask it anything, in plain language, and it answers with citations back to the exact moment in the exact video.

No cloud account. No subscription. No ongoing fees. Your knowledge, on your device, working offline.

Learn-RV overview

Overview diagram (text version for accessibility)
Talk to Claude            Use the CLI              Use MCP Server
"Build me a KB on         learn ingest <url>        learn serve <topic>
 French cooking"          learn ask <topic> "q"     → Claude gains
"Watch this video"        learn chat <topic>          kb_query
"What did it say?"        learn apply <topic> "t"     kb_synthesize
        ↓                         ↓                         ↓
                      learn binary
                            ↓
                    <topic>.rvf
               (one file, on your device)

Your Instant Expert in Four Steps

Quickstart steps

Steps (text version)
1. Download  →  2. Pick a topic  →  3. Build your KB  →  4. Ask anything
   learn doctor     learn study "X"    learn ingest <url>   learn ask <topic> "?"
# 1. Install — pick your platform
# M-series Mac:
curl -L https://github.com/stuinfla/learner-rv/releases/latest/download/learn-aarch64-apple-darwin.tar.gz \
  | tar xz -C /tmp && /tmp/learn-aarch64-apple-darwin/install.sh
# Linux x86_64:
# curl -L https://github.com/stuinfla/learner-rv/releases/latest/download/learn-x86_64-unknown-linux-gnu.tar.gz \
#   | tar xz -C /tmp && /tmp/learn-x86_64-unknown-linux-gnu/install.sh
# Windows: download learn-x86_64-pc-windows-msvc.zip from GitHub Releases

# 2. Check everything is ready
learn doctor

# 3. Pick a topic and let Learn-RV find the best videos
learn study "sous vide cooking techniques"
# → Shows a shortlist of recommended videos, confirm to ingest

# 4. Ask your new expert anything
learn ask sous-vide "What temperature for a medium-rare steak?"
# → "54°C for 1–4 hours gives perfect medium-rare edge-to-edge [Sous Vide Everything @ 3:12]"

# 5. Chat with your expert (multi-turn, remembers the conversation)
learn chat sous-vide

Your knowledge base lives at ~/Docs/KB/sous-vide.rvf — one file you own completely.


What You Get

Capability matrix

Capabilities (text version)
Own your data   │  Cited answers  │  Self-learning
No cloud. One   │  Every answer   │  The KB gets
.rvf file you   │  points to the  │  smarter the
control fully.  │  exact moment.  │  more you use it.

On-device       │  RuVector-      │  Scales with
Everything runs │  native         │  you
on your machine.│  .rvf works     │  From one video
Audio never     │  with the whole │  to thousands,
leaves.         │  RuVector stack.│  same commands.
You get                                                    Because of how it works
─────────────────────────────────────────────────────────  ───────────────────────────────────────────────────
Add videos anytime without corrupting the KB               Append-only RVF segments
Millisecond search across thousands of video chunks        HNSW index native to the file
Every answer traces to the exact video moment              Witness chain per chunk, cryptographically anchored
Move the whole KB to another machine — nothing to migrate  Single .rvf file = single unit
Works on Cognitum One Seed without conversion              RVF is the Seed's native vector format

For Cognitum One Seed Owners

Your Seed is where all your knowledge lives. Build a knowledge base on your computer, and it lands on the Seed automatically — no cloud, no subscription, no conversion. Just your hardware.

Cognitum One Seed workflow

Seed workflow (text version for accessibility)
YOUR COMPUTER                    AUTO-PUSH             COGNITUM ONE SEED
─────────────────────────────    ────────────────────  ─────────────────────────────
learn ingest <video URLs>                              Seed RVF Store
learn study "your topic"       → every ingest pushes  native vector format
learn ask / chat / apply          automatically   →   zero conversion needed

~/Docs/KB/<topic>.rvf                                 114-tool MCP proxy
one file · fully portable                             any MCP-capable agent
                                                       Ed25519 witness chain
                                                       cryptographic provenance

One-time setup — bind your Seed and forget it:

learn config set seed.address 192.168.1.42    # your Seed's IP (or mDNS name)
learn config set seed.auto_push true          # push automatically after every ingest
learn doctor                                  # confirm Seed is reachable

After this, every learn ingest and learn study automatically pushes to your Seed. You never need to remember to push.

Manual push (if you prefer explicit control):

learn push knife-sharpening                   # push on demand, auto-discovers Seed
learn push knife-sharpening --seed 192.168.1.42  # explicit address

Full workflow example:

# Build the expert
learn study "Japanese knife sharpening"       # finds + ingests best videos
learn ask knife-sharpening "What angle for a 210mm gyuto?"

# If auto-push is enabled, the Seed already has it.
# If not, push manually:
learn push knife-sharpening

# Now any AI agent connected to the Seed can query it

Why it fits the Seed:

  • RVF is the Seed's native vector store — no conversion, no export, no migration
  • learn serve <topic> aligns with the Seed's 114-tool MCP proxy
  • Every ingest writes an Ed25519 witness chain, matching the Seed's custody model
  • The Rust binary is compatible with the cognitum-one SDK
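The witness-chain idea above can be pictured as records that each commit to their predecessor, so editing any chunk invalidates every link after it. The sketch below is conceptual only: it substitutes the standard library's non-cryptographic hasher for the Ed25519 signatures the real chain uses, purely so it runs without a crypto crate.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// One chain link: a digest over (previous digest, chunk). The real system
/// signs with Ed25519; DefaultHasher is a non-cryptographic stand-in.
fn link(prev: u64, chunk: &str) -> u64 {
    let mut h = DefaultHasher::new();
    prev.hash(&mut h);
    chunk.hash(&mut h);
    h.finish()
}

/// Chain every chunk to its predecessor. Altering any chunk changes its own
/// digest and every digest after it, which is what makes provenance
/// tamper-evident.
fn witness_chain(chunks: &[&str]) -> Vec<u64> {
    let mut out = Vec::with_capacity(chunks.len());
    let mut prev = 0u64;
    for c in chunks {
        prev = link(prev, c);
        out.push(prev);
    }
    out
}

fn main() {
    let a = witness_chain(&["chunk-1", "chunk-2", "chunk-3"]);
    let b = witness_chain(&["chunk-1", "tampered", "chunk-3"]);
    // first link matches; every link from the edit onward differs
    println!("{}", a[0] == b[0] && a[1] != b[1] && a[2] != b[2]);
}
```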

Three Ways to Use It

Whether you prefer talking to Claude, typing commands, or wiring it into an AI workflow — it all leads to the same place: your knowledge, cited, on your device.

Three ways to use Learn-RV

Three modes (text version for accessibility)
Claude Skill              CLI                        MCP Server
──────────────────        ─────────────────────────  ──────────────────────────────
Talk naturally:           learn ingest <url>          learn serve <topic>
"Build me a KB on         learn ask <topic> "q"
 French cooking"          learn chat <topic>          Claude gains:
"Watch this video"        learn apply <topic> "t"       · kb_query
"What did it say?"                                     · kb_synthesize
                          learn status / list           · kb_list_videos
Claude picks the          learn cloud / map
right command and         22 subcommands total        Grounded multi-step
runs it for you.                                      workflows — every answer
No syntax needed.                                     anchored to a video moment.

🤖 As a Claude Code skill (just talk to Claude)

Learn-RV installs as a global Claude Code skill. In any Claude session, just describe what you want:

"Build me a knowledge base on Japanese knife sharpening."
"Watch this video and remember it: https://youtu.be/QZMljuD10sU"
"What did the speaker say about sharpening angle?"
"Apply what we learned in knife-sharpening to draft a sharpening routine for my 3 knives."

Claude reads the skill, picks the right learn subcommand, runs it, and returns a cited answer. No syntax to remember.

💻 As a CLI (direct control)

# Build a knowledge base
learn ingest "https://youtu.be/QZMljuD10sU" --topic claude-skills
learn ingest "https://youtube.com/playlist?list=PLxxx" --topic my-playlist
learn import ~/Downloads/lectures/ --topic university-physics   # local files

# Ask / apply / chat
learn ask   french-cooking "what is lamination and why does it matter?"
learn apply french-cooking "give me a croissant recipe with weights in grams"
learn chat  french-cooking                       # multi-turn dialog, session-persistent

# Inspect and visualize
learn status french-cooking                      # chunk count, coherence score
learn cloud  french-cooking                      # → SVG word cloud of key concepts
learn map                                        # → PCA galaxy of all your topics

# Push to your Cognitum Seed
learn push french-cooking

🔌 As an MCP server (Claude drives the KB end-to-end)

// ~/.claude/mcp.json
{
  "mcpServers": {
    "learn-rv": {
      "command": "learn",
      "args": ["serve", "your-topic-name"]
    }
  }
}

Claude Code gains three tools: kb_query, kb_synthesize, kb_list_videos. Now you can say "using my french-cooking topic, walk me through making croissants — write the schedule to disk, adjust if I tell you my kitchen is 68°F" and Claude calls the KB at each step, grounding every instruction in a specific video moment.


📦 All 22 commands

Discovery + ingestion

learn study — Strategic: describe what you want to learn. Learn-RV discovers a curriculum, ranks candidates, shows a shortlist, ingests on confirmation.

learn study "How to make laminated pastry"
learn study "ETF arbitrage strategies" --depth deep
learn study "RAG architectures 2026" --auto

learn ingest — Tactical: paste a URL, playlist, channel, or search query.

learn ingest "https://youtube.com/playlist?list=PLxxx"
learn ingest "https://youtu.be/abc" --topic indexed-arbitrage

learn import — Bulk ingest a local directory of files (PDF, MP4, MP3, TXT, MD).

learn import ~/Downloads/lectures/ --topic university-physics
learn import ~/Documents/recipes/ --topic french-cooking

Consumption

learn ask — Cited answer grounded in the KB.
learn apply — Uses the KB as prior to produce a grounded artifact (recipe, plan, code).
learn chat — Multi-turn dialog with session persistence.
learn quiz — Generates quiz questions from the KB to test your knowledge.

learn ask   french-cooking "what is the Maillard reaction?"
learn apply french-cooking "give me a laminated dough schedule for 20 croissants"
learn chat  french-cooking                                    # → interactive REPL
learn chat  french-cooking --resume <session-id>             # → resume a prior session
learn quiz  french-cooking                                    # → 5 questions with answers

Sessions persist at ~/Docs/KB/_chat/<topic>/<id>.jsonl.
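JSONL means one self-contained JSON object per line, so a turn can be appended without rewriting the file. The sketch below shows the shape of such a log; the field names ("role", "text") are assumptions for illustration, since this README does not document the actual session schema.

```rust
/// Append one chat turn to an in-memory JSONL buffer. Field names are
/// illustrative assumptions, not the real learn-chat schema.
fn append_turn(log: &mut String, role: &str, text: &str) {
    log.push_str(&format!("{{\"role\":\"{}\",\"text\":\"{}\"}}\n", role, text));
}

fn main() {
    let mut log = String::new();
    append_turn(&mut log, "user", "What temperature for medium-rare?");
    append_turn(&mut log, "assistant", "54C for 1 to 4 hours");
    // two turns, two lines: appending never disturbs earlier turns
    println!("{} lines", log.lines().count());
}
```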

Inspection + visualization

learn status   french-cooking   # chunk count, file size, coherence KPI
learn list     french-cooking   # videos in the topic
learn who-said french-cooking "Julia Child"          # which videos mention a name
learn timeline french-cooking "beurrage"             # chronological mentions
learn compare  french-cooking sourdough              # cross-topic concept overlap
learn cloud    french-cooking                        # SVG word cloud of top concepts
learn map                                            # PCA galaxy of all your topics
learn summarize french-cooking                       # key takeaways across the topic

Distribution + maintenance

learn push    french-cooking   # push KB to Cognitum One Seed on local network
learn serve   french-cooking   # start MCP server for Claude Code integration
learn watch   french-cooking   # monitor a channel for new videos, auto-ingest
learn eval    french-cooking   # run golden Q&A regression against the KB
learn forget  french-cooking <video_id>    # remove one video from the KB
learn compact french-cooking               # defragment the RVF file
learn doctor                               # check deps, models, env, release version

🏗️ How it works

Ingest pipeline

Pipeline (text version for accessibility)
Source URL / path
      ↓
  ACQUIRE (yt-dlp) — captions-first; audio-only fallback
      ↓
  SMART FRAME DECISION
  pHash variance → skip talking heads, extract visual demos
  Sonnet vision captions frames when useful
      ↓
  TRANSCRIBE — VTT captions (instant) or Whisper.cpp on-device
      ↓
  CHUNK — sentence-aware, ~300 tokens, 50-token overlap
      ↓
  EMBED — BGE-large-en-v1.5 (1024-dim, ONNX, on-device)
      ↓
  INDEX — RvfStore append-only HNSW + Ed25519 witness chain per chunk
      ↓
  AUTO-SUMMARY — 3–5 key takeaways via Sonnet
      ↓
  ~/Docs/KB/<topic>.rvf
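The CHUNK step above can be sketched as a sliding window with overlap. This is an illustrative reconstruction, not the actual learn-chunk code: tokens are approximated by whitespace-split words, and the real implementation additionally respects sentence boundaries.

```rust
/// Illustrative sliding-window chunker: up to `window` tokens per chunk,
/// with `overlap` tokens shared between neighbouring chunks. Tokens are
/// approximated by words here; the real learn-chunk crate is sentence-aware.
fn chunk_words(words: &[&str], window: usize, overlap: usize) -> Vec<String> {
    assert!(overlap < window, "overlap must be smaller than the window");
    let step = window - overlap;
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < words.len() {
        let end = (start + window).min(words.len());
        chunks.push(words[start..end].join(" "));
        if end == words.len() {
            break;
        }
        start += step;
    }
    chunks
}

fn main() {
    let words: Vec<&str> = (0..700).map(|_| "tok").collect();
    // windows 0..300, 250..550, 500..700: three chunks, 50 words shared
    let chunks = chunk_words(&words, 300, 50);
    println!("{} chunks", chunks.len());
}
```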

Query path

Query path (text version for accessibility)
User question
      ↓
  EXPAND — HyDE hypothetical answer as second query vector
      ↓
  HYBRID RETRIEVE — dense (BGE) + BM25, RRF fusion → top 50
      ↓
  RERANK — cross-encoder (BGE-base) → top 10
      ↓
  MMR + SOURCE-CAP — diversity λ=0.7, ≤3 chunks per video
      ↓
  SYNTHESIZE — cited prompt, abstain if signal weak, AIMDS scan in/out
      ↓
  Answer with [Title @ MM:SS](url&t=Xs) citations
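The RRF fusion in the HYBRID RETRIEVE step can be sketched as follows: each ranked list contributes 1/(k + rank) per document, and documents are re-sorted by the summed score. The constant k = 60 is the common default from the RRF literature; the README does not state the value Learn-RV actually uses, so treat it as an assumption.

```rust
use std::collections::HashMap;

/// Reciprocal Rank Fusion over two ranked lists of document ids. Each list
/// contributes 1/(k + rank) per document (ranks are 1-based); k = 60 is an
/// assumed default, not a value documented for Learn-RV.
fn rrf_fuse(dense: &[&str], bm25: &[&str], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for list in [dense, bm25] {
        for (rank, id) in list.iter().enumerate() {
            *scores.entry((*id).to_string()).or_insert(0.0) += 1.0 / (k + (rank + 1) as f64);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    fused
}

fn main() {
    // "b" sits near the top of both lists, so it wins the fused ranking
    let fused = rrf_fuse(&["a", "b", "c"], &["b", "c", "a"], 60.0);
    println!("{:?}", fused);
}
```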

Architecture: 17 crates, one binary

Architecture diagram

Layer      Crate                                    Responsibility
─────────  ───────────────────────────────────────  ────────────────────────────────────
CLI        learn-cli                                22 subcommands, routing, orientation
Ingestion  learn-acquire, learn-asr, learn-frames,  Full pipeline from URL to .rvf
           learn-chunk, learn-embed, learn-index,
           learn-graph
Retrieval  learn-retrieve                           Hybrid BM25+dense, rerank, MMR
Synthesis  learn-synth                              Cited answers, in-tree AIMDS scanner
Chat       learn-chat                               Multi-turn REPL, JSONL sessions
MCP        learn-serve                              JSON-RPC 2.0 server for Claude Code
Contracts  learn-core                               Shared types, errors, topic slug

Storage model

Storage layout (text version)
~/Docs/KB/
├── french-cooking.rvf          ← chunks · embeddings · HNSW · witness chain
├── indexed-arbitrage.rvf
├── french-cooking.summary.md   ← auto-generated key takeaways
├── _graph/
│   └── french-cooking.graphdb  ← claims, entities, relations
├── _meta/
│   └── french-cooking.json     ← per-video state (slug → progress)
└── _chat/
    └── french-cooking/         ← session JSONL files

Per-topic isolation is total. Drop a topic by deleting one file. Move the whole thing to another machine and it just works.

Self-learning

Learning flywheel

  • BGE-large-en-v1.5 (1024-dim) — best-in-class English sentence embedder, on-device ONNX
  • HNSW via RvfStore — logarithmic search, native to the file format
  • SONA per-topic adapters — LoRA fine-tuning per topic; the embedder specializes with use
  • In-tree AIMDS — 12 inbound + 8 outbound regex patterns; scans every query and every answer

📂 One-time setup

Easy path (M-series Mac or Linux x86_64 — no Rust required):

# M-series Mac
curl -L https://github.com/stuinfla/learner-rv/releases/latest/download/learn-aarch64-apple-darwin.tar.gz \
  | tar xz -C /tmp && /tmp/learn-aarch64-apple-darwin/install.sh

# Linux x86_64
curl -L https://github.com/stuinfla/learner-rv/releases/latest/download/learn-x86_64-unknown-linux-gnu.tar.gz \
  | tar xz -C /tmp && /tmp/learn-x86_64-unknown-linux-gnu/install.sh

install.sh symlinks the binary to ~/.cargo/bin/learn and drops the Claude Code skill into ~/.claude/skills/learn-rv/.

Build from source (any platform, Rust toolchain required):

git clone https://github.com/stuinfla/learner-rv.git
cd learner-rv
git clone https://github.com/ruvnet/RuVector.git ../RuVector
cargo install --path crates/learn-cli
mkdir -p ~/.claude/skills/learn-rv
cp .claude/skills/learn-rv/SKILL.md ~/.claude/skills/learn-rv/SKILL.md

Runtime dependencies:

brew install yt-dlp ffmpeg   # macOS
# apt install yt-dlp ffmpeg  # Debian/Ubuntu

Whisper and BGE-large models auto-fetch into ~/.cache/learn-rs/models/ on first use (learn doctor shows status).

Environment setup:

Copy .env.example to .env and fill in your Anthropic API key (required for learn ask, learn apply, and learn chat):

cp .env.example .env
# edit .env and add: ANTHROPIC_API_KEY=sk-ant-...

⚙️ Configuration
Variable              Purpose                                                      Default
────────────────────  ───────────────────────────────────────────────────────────  ────────────────────────
ANTHROPIC_API_KEY     Required for learn ask / learn apply / learn chat synthesis  unset
LEARN_SYNTH_LOCAL     1 → use local RuVLLM instead of Anthropic; fully on-device   0
LEARN_AIMDS_REQUIRED  1 → fail closed on any Blocked AIMDS verdict                 0
LEARN_KB_ROOT         Where .rvf files live                                        ~/Docs/KB
LEARN_MODEL_CACHE     Where Whisper + BGE models cache                             ~/.cache/learn-rs/models
RUST_LOG              Tracing filter (info, debug, learn_synth=trace)              warn

Sovereignty defaults: Every byte of audio, every transcript, every embedding, and every index stays on the machine. The only outbound call is learn ask/learn apply to Anthropic — swap for local RuVLLM with LEARN_SYNTH_LOCAL=1.

🖥️ Platform support
Platform                                 Binary?            Notes
───────────────────────────────────────  ─────────────────  ───────────────────────────────────────────
M-series Mac (aarch64-apple-darwin)      ✅ v0.2.5          Primary, fully supported
Linux x86_64 (x86_64-unknown-linux-gnu)  ✅ v0.2.5          Captions-only (no local Whisper on Linux)
Windows (x86_64-pc-windows-msvc)         ✅ v0.2.6          No on-device ASR (whisper-rs is Apple-only)
Intel Mac (x86_64-apple-darwin)          Build from source  macOS-13 runner deprecated by GitHub
Linux ARM64                              Build from source  cross-Docker can't reach RuVector path-deps

⚠️ Honest caveats

Current state: v0.2.5 (2026-05-05)

  • Linux ARM64 + Intel Mac binaries are not published. Build from source. Reasons: cross Docker cannot reach the ../ruvector sibling path-dep; macOS-13 runner deprecated.
  • Coherence KPI uses Fiedler eigenvalue × NN-cosine density — a useful relative health signal, not a research-grade IIT Φ.
  • AIMDS guardrails are in-tree regex patterns (12 inbound, 8 outbound). Synchronous, zero-subprocess, intentionally lightweight.
  • SONA self-learning works but the feedback signal that updates the LoRA adapter requires explicit record_feedback API calls — not yet wired into a passive thumbs-up/down on learn ask.
  • Smart frame decision runs pHash variance; low-variance (talking-head) videos skip frame extraction automatically to save API budget.
  • Windows builds and runs but omits on-device speech recognition (learn-asr is Apple-only due to whisper-rs metal feature).
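The pHash-variance gate in the caveat above can be sketched as a mean Hamming distance between consecutive frame hashes: near-identical hashes mean a talking head, so frame extraction is skipped. The 4-bit threshold below is an illustrative assumption, not the value Learn-RV uses.

```rust
/// Mean Hamming distance between consecutive 64-bit perceptual hashes.
/// Talking-head footage yields near-identical hashes (low mean); a visual
/// demo yields a high mean.
fn mean_hamming(hashes: &[u64]) -> f64 {
    if hashes.len() < 2 {
        return 0.0;
    }
    let total: u32 = hashes.windows(2).map(|w| (w[0] ^ w[1]).count_ones()).sum();
    total as f64 / (hashes.len() - 1) as f64
}

/// The 4-bit threshold is an illustrative assumption.
fn should_extract_frames(hashes: &[u64]) -> bool {
    mean_hamming(hashes) >= 4.0
}

fn main() {
    let talking_head = [0xFFu64, 0xFF, 0xFE]; // frames barely change
    let visual_demo = [0x0u64, 0xFFFF, 0x0];  // frames change a lot
    println!(
        "talking head: {}, demo: {}",
        should_extract_frames(&talking_head),
        should_extract_frames(&visual_demo)
    );
}
```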

🧪 Testing
cargo fmt --check
cargo clippy --workspace --all-targets -- -D warnings
cargo test --workspace
cargo build --release --workspace

CI requires all four green before merge. 311+ unit and integration tests.

📜 License + contributing

Dual-licensed under MIT or Apache-2.0 at your option.

Contributions welcome. Open an issue before sending a PR larger than ~50 lines so we can align on approach. CI gate must be green.


Built with RuVector · MIT/Apache-2.0 · Releases

About

Learn-RV — Become a genius using Cognitum One Seed or RuVector. Pure-Rust CLI that turns videos into queryable knowledge bases stored in RuVector RVF format. BGE-large embeddings, hybrid HNSW+BM25 retrieval, cited Anthropic synthesis, witness chains, autonomous curriculum discovery.
