diff --git a/.claude/commands/process-bookmarks.md b/.claude/commands/process-bookmarks.md
index f04faa7..fd4a3f1 100644
--- a/.claude/commands/process-bookmarks.md
+++ b/.claude/commands/process-bookmarks.md
@@ -59,11 +59,11 @@ TodoWrite({ todos: [
 **CRITICAL for parallel processing:** Spawn ALL subagents in ONE message, each writing to a batch file:
 ```javascript
 // Send ONE message with multiple Task calls - they run in parallel
-// Use model="haiku" for cost-efficient parallel processing (~50% cost savings)
+// Use model="sonnet" for high-quality summarization and insight extraction
 // Each subagent writes to .state/batch-N.md, NOT to bookmarks.md!
-Task(subagent_type="general-purpose", model="haiku", prompt="Process batch 0: write to .state/batch-0.md: {json for bookmarks 0-4}")
-Task(subagent_type="general-purpose", model="haiku", prompt="Process batch 1: write to .state/batch-1.md: {json for bookmarks 5-9}")
-Task(subagent_type="general-purpose", model="haiku", prompt="Process batch 2: write to .state/batch-2.md: {json for bookmarks 10-14}")
+Task(subagent_type="general-purpose", model="sonnet", prompt="Process batch 0: write to .state/batch-0.md: {json for bookmarks 0-4}")
+Task(subagent_type="general-purpose", model="sonnet", prompt="Process batch 1: write to .state/batch-1.md: {json for bookmarks 5-9}")
+Task(subagent_type="general-purpose", model="sonnet", prompt="Process batch 2: write to .state/batch-2.md: {json for bookmarks 10-14}")
 // ... all batches in the SAME message
 ```
 
@@ -107,8 +107,8 @@ Each bookmark includes:
 - `id`, `author`, `authorName`, `text`, `tweetUrl`, `date`
 - `tags[]` - folder tags from bookmark folders (e.g., `["ai-tools"]`)
 - `links[]` - each with `original`, `expanded`, `type`, and `content`
-  - `type`: "github", "article", "video", "tweet", "media", "image"
-  - `content`: extracted text, headline, author (for articles/github)
+  - `type`: "github", "article", "video", "podcast", "tweet", "media", "image"
+  - `content`: extracted text, headline, author (for articles/github); for video/podcast: `{ source: 'yt-dlp-captions'|'whisper', transcriptFile: '.state/transcripts/{id}.txt', chars: N }`
 - `isReply`, `replyContext` - parent tweet info if this is a reply
 - `isQuote`, `quoteContext` - quoted tweet info if this is a quote tweet
 
@@ -160,13 +160,15 @@ For each bookmark (or batch):
 
 #### a. Determine the best title/summary
 
-Don't use generic titles like "Article" or "Tweet". Based on the content:
+Don't use generic titles like "Article" or "Tweet". The title appears after `## @author - ` and must be descriptive enough to scan in a list. Based on the content:
 - GitHub repos: Use the repo name and brief description
 - Articles: Use the article headline or key insight
-- Videos: Note for transcript, use tweet context
-- Quote tweets: Capture the key insight being highlighted
+- Videos/Podcasts with transcript: Use the key insight or topic from the transcript content
+- Videos/Podcasts without transcript: Note for transcript, use tweet context
+- Quote tweets: Capture the key insight being highlighted, NOT the reaction (e.g., "JC_builds: Local Calorie Estimation Model Beating GPT-4o" not "Retweet: Are you guys starting to catch on?")
 - Reply threads: Include parent context in the summary
 - Plain tweets: Use the key point being made
+- Image/video-only with no context: Use `@{author} - [Media] {best guess from author's bio/context}` — never just "Video post" or the raw t.co URL
 
 #### b. Categorize using the categories config
 
@@ -175,7 +177,7 @@ Match each bookmark's links against category patterns (check `match` arrays). Us
 **For each action type:**
 - `file`: Create a separate file in the category's folder using its template
 - `capture`: Just add to bookmarks.md (no separate file)
-- `transcribe`: Add to bookmarks.md with a "Needs transcript" flag, optionally create placeholder in folder
+- `transcribe`: If the link's `content` has a `transcriptFile` path, read the first 20,000 characters of that file (enough for summarization — don't read the whole file for long transcripts). Create a full knowledge file with structured summary, key takeaways, and insights, with `status: transcribed` and `transcript_source` in frontmatter. If `content` is null or has no `transcriptFile`, fall back to placeholder behavior with `status: needs_transcript`
 
 **Special handling:**
 - Quote tweets: Include quoted tweet context in entry
@@ -185,6 +187,8 @@ Match each bookmark's links against category patterns (check `match` arrays). Us
 
 **Use the Edit tool** to insert entries into the `archiveFile` (expand `~` to home directory). NEVER use Write - it will destroy existing entries.
 
+**DEDUPLICATION CHECK (CRITICAL):** Before inserting ANY entry, search bookmarks.md for the tweet URL (`x.com/{author}/status/{id}`). If it already exists, SKIP it — do not create a duplicate. Log: `Skipping duplicate: @{author} {id}`.
+
 **CRITICAL ordering rules for bookmarks.md:**
 
 The file must be in **descending chronological order** (newest dates at TOP, oldest at BOTTOM).
@@ -216,9 +220,27 @@ The file must be in **descending chronological order** (newest dates at TOP, old
 - **Link:** {expanded_url}
 - **Tags:** [[tag1]] [[tag2]] (if bookmark has tags from folders)
 - **Filed:** [{filename}](./knowledge/tools/{slug}.md) (if filed)
-- **What:** {1-2 sentence description of what this actually is}
+- **What:** {1-2 sentence synthesis — see What Field Rules below}
 ```
 
+### What Field Rules (CRITICAL — apply to every entry)
+
+The **What** field is the most important metadata. It must be a **synthesis**, not a label or echo.
+
+**Minimum bar:** 80+ characters, written as a complete sentence describing what the bookmark contains and why it matters. Descriptions under 80 characters are almost always lazy echoes or category labels — rewrite them until they actually say something.
+
+**NEVER write:**
+- Category labels: "Tool or resource share", "Commentary/perspective", "Claude Code insights/comparison"
+- Echo of the tweet text: If the tweet says "karpathy really is the fucking goat", the What must explain *what about Karpathy* — don't parrot the reaction
+- Generic placeholders: "Video content post", "Social media image bookmark", "Chart showing interesting data trend"
+- Raw URLs: Never put a t.co or any URL as the What
+
+**For quote tweets / replies:** The What MUST synthesize BOTH the author's commentary AND the quoted/parent content. The quoted content often contains the actual substance — a reaction tweet like "Are you guys starting to catch on?" is worthless without explaining what the quoted tweet actually describes.
+
+**For thin-content tweets (image-only, short with no link):** Write what you CAN infer from the author, text, and any visible context, prefixed with `THIN:` so downstream tools can flag these for manual review. Example: `THIN: @calebporzio reacting to an unspecified Chrome feature — tweet is image/video only, no text context available.`
+
+**For failed link expansion (raw t.co links):** Write `LINK_FAILED: Could not expand link from @{author} — original t.co URL did not resolve.`
+
 **Tags format:** Use wiki-link style `[[TagName]]` for each tag. Only include the **Tags:** line if the bookmark has tags in its `tags` array (from folder configuration). Example: `- **Tags:** [[AI]] [[Coding]]`
 
 **For quote tweets, include the quoted content:**
@@ -261,6 +283,12 @@ const remaining = pending.bookmarks.filter(b => !processedIds.has(b.id));
 pending.bookmarks = remaining;
 pending.count = remaining.length;
 fs.writeFileSync(pendingPath, JSON.stringify(pending, null, 2));
+
+// Clean up transcript files for processed bookmarks
+for (const id of processedIds) {
+  const transcriptFile = path.join(path.dirname(pendingPath), 'transcripts', `${id}.txt`);
+  if (fs.existsSync(transcriptFile)) fs.unlinkSync(transcriptFile);
+}
 ```
 
 ### 4. Commit and Push Changes
@@ -354,6 +382,42 @@ via: "Twitter bookmark from @{author}"
 
 ### Podcast Entry (`./knowledge/podcasts/{slug}.md`)
 
+**When transcript content is available** (`content.source` is `'yt-dlp-captions'` or `'whisper'`):
+
+```yaml
+---
+title: "{episode_title}"
+type: podcast
+date_added: {YYYY-MM-DD}
+source: "{podcast_url}"
+show: "{show_name}"
+tags: [{relevant_tags}, {folder_tags}]
+via: "Twitter bookmark from @{author}"
+status: transcribed
+transcript_source: "{yt-dlp-captions or whisper}"
+---
+
+{Summary of the episode's key points synthesized from the transcript}
+
+## Episode Info
+
+- **Show:** {show_name}
+- **Episode:** {episode_title}
+- **Why bookmarked:** {context from tweet}
+
+## Key Takeaways
+
+- Point 1 (from transcript)
+- Point 2 (from transcript)
+
+## Links
+
+- [Episode]({podcast_url})
+- [Original Tweet]({tweet_url})
+```
+
+**When no transcript available** (`content` is null):
+
 ```yaml
 ---
 title: "{episode_title}"
@@ -386,6 +450,42 @@ status: needs_transcript
 
 ### Video Entry (`./knowledge/videos/{slug}.md`)
 
+**When transcript content is available** (`content.source` is `'yt-dlp-captions'` or `'whisper'`):
+
+```yaml
+---
+title: "{video_title}"
+type: video
+date_added: {YYYY-MM-DD}
+source: "{video_url}"
+channel: "{channel_name}"
+tags: [{relevant_tags}, {folder_tags}]
+via: "Twitter bookmark from @{author}"
+status: transcribed
+transcript_source: "{yt-dlp-captions or whisper}"
+---
+
+{Summary of the video's key points synthesized from the transcript}
+
+## Video Info
+
+- **Channel:** {channel_name}
+- **Title:** {video_title}
+- **Why bookmarked:** {context from tweet}
+
+## Key Takeaways
+
+- Point 1 (from transcript)
+- Point 2 (from transcript)
+
+## Links
+
+- [Video]({video_url})
+- [Original Tweet]({tweet_url})
+```
+
+**When no transcript available** (`content` is null):
+
 ```yaml
 ---
 title: "{video_title}"
@@ -427,10 +527,10 @@ status: needs_transcript
 Spawn multiple Task subagents in ONE message. Each writes to a separate temp file:
 
 ```
-Task 1: model="haiku", "Process batch 0" → writes to .state/batch-0.md
-Task 2: model="haiku", "Process batch 1" → writes to .state/batch-1.md
-Task 3: model="haiku", "Process batch 2" → writes to .state/batch-2.md
-Task 4: model="haiku", "Process batch 3" → writes to .state/batch-3.md
+Task 1: model="sonnet", "Process batch 0" → writes to .state/batch-0.md
+Task 2: model="sonnet", "Process batch 1" → writes to .state/batch-1.md
+Task 3: model="sonnet", "Process batch 2" → writes to .state/batch-2.md
+Task 4: model="sonnet", "Process batch 3" → writes to .state/batch-3.md
 ```
 
 **Subagent prompt template:**
@@ -448,9 +548,19 @@ DATE: {bookmark.date}
 
 - **Tweet:** {url}
 - **Tags:** [[tag1]] [[tag2]] (if tags exist)
-- **What:** {description}
-
-Also create knowledge files (./knowledge/tools/*.md, ./knowledge/articles/*.md) as needed.
+- **What:** {1-2 sentence synthesis, 80+ chars minimum}
+
+**What field rules (MUST follow):**
+- Write a SYNTHESIS, not a category label or echo of the tweet
+- NEVER write generic labels like "Tool or resource share" or "Commentary/perspective"
+- NEVER just echo the tweet text as the What — explain what the content IS and why it matters
+- For quote tweets: synthesize BOTH the commentary AND the quoted content (the quote often has the substance)
+- For image/video-only tweets with no text context: prefix with "THIN:" and describe what you can infer
+- For failed link expansion (raw t.co URLs): write "LINK_FAILED: Could not expand link from @{author}"
+- Minimum 80 characters. Under 80 means you're echoing or categorizing, not synthesizing. Rewrite.
+
+For video/podcast bookmarks with a transcriptFile in content, read the FIRST 20,000 characters of that file for summarization.
+Also create knowledge files (./knowledge/tools/*.md, ./knowledge/articles/*.md, ./knowledge/videos/*.md, ./knowledge/podcasts/*.md) as needed.
 DO NOT touch bookmarks.md - only write to .state/batch-{N}.md
 ```
 
@@ -468,6 +578,7 @@ After ALL subagents complete:
 **Merge logic for bookmarks.md:**
 - File is descending order (newest dates at top)
 - For each entry from batch files (processed in order):
+  - **DEDUP CHECK FIRST:** Search bookmarks.md for the tweet URL. If already present, skip.
   - Find or create the date section at correct position
   - Insert entry at TOP of that date section
 - Since batches are oldest-first, entries end up in correct order
diff --git a/README.md b/README.md
index 7f3517a..12a090e 100644
--- a/README.md
+++ b/README.md
@@ -154,10 +154,11 @@ Categories define how different bookmark types are handled. Smaug comes with sen
 | **github** | github.com | file | `./knowledge/tools/` |
 | **article** | medium.com, substack.com, dev.to, blogs | file | `./knowledge/articles/` |
 | **x-article** | x.com/i/article/* | file | `./knowledge/articles/` |
+| **podcast** | podcasts.apple.com, spotify.com/episode, overcast.fm | transcribe | `./knowledge/podcasts/` |
+| **youtube** | youtube.com, youtu.be | transcribe | `./knowledge/videos/` |
+| **video** | vimeo.com, loom.com | transcribe | `./knowledge/videos/` |
 | **tweet** | (fallback) | capture | bookmarks.md only |
 
-🔜 _Note: Transcription is flagged but not yet automated. PRs welcome!_
-
 ### X/Twitter Long-Form Articles
 
 X articles (`x.com/i/article/*`) are Twitter's native long-form content format. Smaug extracts the full article text using bird CLI:
@@ -177,11 +178,56 @@ Example X article bookmark:
 - **What:** Deep dive into patterns from scaling CrewAI to billions of agent executions.
 ```
 
+### Video & Podcast Transcription
+
+Smaug automatically extracts transcripts from YouTube videos, podcasts, and other video content using a tiered strategy:
+
+1. **Captions (fast)**: yt-dlp downloads existing subtitles — no audio processing needed
+2. **Whisper (fallback)**: If no captions exist, yt-dlp extracts audio and Whisper transcribes it locally
+3. **Placeholder**: If neither tool is installed, bookmarks are flagged with `status: needs_transcript`
+
+Both tools are **optional** — install what you need:
+
+```bash
+# For caption extraction (recommended — handles most YouTube videos)
+pip install yt-dlp
+# or: brew install yt-dlp
+
+# For audio transcription when captions aren't available
+pip install openai-whisper
+
+# Required for Whisper (Tier 2) — ffmpeg handles audio extraction and decoding
+# Not needed if you only use caption extraction (Tier 1)
+sudo apt-get install ffmpeg   # Debian/Ubuntu
+# or: brew install ffmpeg     # macOS
+```
+
+**Supported platforms:** YouTube (excellent), Vimeo (good), SoundCloud (audio only), direct video URLs. Spotify and Apple Podcasts are not supported by yt-dlp — these produce placeholder files.
+
+**Configuration** (all optional, in `smaug.config.json`):
+
+```json
+{
+  "ytdlpPath": "/custom/path/to/yt-dlp",
+  "whisperPath": "/custom/path/to/whisper",
+  "whisperModel": "small.en",
+  "transcribeTimeouts": {
+    "subtitle": 30000,
+    "audio": 300000,
+    "whisper": 600000
+  }
+}
+```
+
+**How transcripts are stored:** Full transcripts are written to `.state/transcripts/{tweet_id}.txt` — they never bloat the pending JSON, even for 6-hour podcasts. During processing, Claude reads only the first ~20K characters needed for summarization.
+
+Environment variables: `YTDLP_PATH`, `WHISPER_PATH`, `WHISPER_MODEL`.
+
 ### Actions
 
 - **file**: Create a separate markdown file with rich metadata
 - **capture**: Add to bookmarks.md only (no separate file)
-- **transcribe**: Flag for future transcription *(auto-transcription coming soon! PRs welcome)*
+- **transcribe**: Extract transcript via yt-dlp captions or Whisper, then create a rich knowledge file. Falls back to a placeholder if tools are not installed
 
 ### Custom Categories
 
diff --git a/src/config.js b/src/config.js
index 1e6aac4..4ceeacb 100644
--- a/src/config.js
+++ b/src/config.js
@@ -34,6 +34,23 @@ const DEFAULT_CONFIG = {
   // Path to bird CLI (if not in PATH)
   birdPath: null,
 
+  // Path to yt-dlp CLI (if not in PATH)
+  ytdlpPath: null,
+
+  // Path to whisper CLI (if not in PATH)
+  whisperPath: null,
+
+  // Whisper model to use (default: small.en)
+  whisperModel: 'small.en',
+
+  // Transcription timeouts (ms)
+  transcribeTimeouts: {
+    subtitle: 30000,    // 30s for subtitle download
+    audio: 300000,      // 5min for audio extraction
+    whisper: 600000     // 10min for whisper transcription
+  },
+
+
   // Twitter credentials (can also use AUTH_TOKEN and CT0 env vars)
   twitter: {
     authToken: null,
@@ -221,6 +238,15 @@ export function loadConfig(configPath) {
   if (process.env.BIRD_PATH) {
     config.birdPath = process.env.BIRD_PATH;
   }
+  if (process.env.YTDLP_PATH) {
+    config.ytdlpPath = process.env.YTDLP_PATH;
+  }
+  if (process.env.WHISPER_PATH) {
+    config.whisperPath = process.env.WHISPER_PATH;
+  }
+  if (process.env.WHISPER_MODEL) {
+    config.whisperModel = process.env.WHISPER_MODEL;
+  }
   if (process.env.SOURCE) {
     config.source = process.env.SOURCE;
   }
@@ -268,6 +294,8 @@ export function loadConfig(configPath) {
   config.pendingFile = expandTilde(config.pendingFile);
   config.stateFile = expandTilde(config.stateFile);
   config.birdPath = expandTilde(config.birdPath);
+  config.ytdlpPath = expandTilde(config.ytdlpPath);
+  config.whisperPath = expandTilde(config.whisperPath);
   config.projectRoot = expandTilde(config.projectRoot);
 
   // Expand ~ in category folders
diff --git a/src/index.js b/src/index.js
index 6e67e08..49ae3d5 100644
--- a/src/index.js
+++ b/src/index.js
@@ -13,6 +13,11 @@ export {
   fetchGitHubContent,
   fetchArticleContent,
   fetchXArticleContent,
+  fetchTranscriptContent,
+  parseJson3Transcript,
+  parseVttTranscript,
+  findYtDlp,
+  findWhisper,
   isPaywalled,
   loadState,
   saveState
diff --git a/src/processor.js b/src/processor.js
index 2c1bfbc..db6a157 100644
--- a/src/processor.js
+++ b/src/processor.js
@@ -541,6 +541,244 @@ export async function fetchArticleContent(url) {
   }
 }
 
+// ---- Video/Podcast Transcript Extraction ----
+
+/**
+ * Locate yt-dlp binary. Returns path or null if not installed.
+ * Accepts options for dependency injection (testability).
+ */
+export function findYtDlp(options = {}) {
+  const execSyncFn = options.execSyncFn || execSync;
+  const configPath = options.configPath || null;
+
+  if (configPath) {
+    try {
+      if (fs.existsSync(configPath)) return configPath;
+    } catch {}
+  }
+
+  try {
+    return execSyncFn('which yt-dlp', { encoding: 'utf8', timeout: 5000, stdio: 'pipe' }).trim();
+  } catch {
+    return null;
+  }
+}
+
+/**
+ * Locate whisper binary. Returns path or null if not installed.
+ * Accepts options for dependency injection (testability).
+ */
+export function findWhisper(options = {}) {
+  const execSyncFn = options.execSyncFn || execSync;
+  const configPath = options.configPath || null;
+
+  if (configPath) {
+    try {
+      if (fs.existsSync(configPath)) return configPath;
+    } catch {}
+  }
+
+  try {
+    return execSyncFn('which whisper', { encoding: 'utf8', timeout: 5000, stdio: 'pipe' }).trim();
+  } catch {
+    return null;
+  }
+}
+
+/**
+ * Parse YouTube json3 subtitle format into plain text.
+ * json3 contains events with segs[].utf8 text segments.
+ */
+export function parseJson3Transcript(content) {
+  try {
+    const data = JSON.parse(content);
+    const events = data.events || [];
+    const lines = [];
+    let lastLine = '';
+
+    for (const event of events) {
+      if (!event.segs) continue;
+      const text = event.segs.map(s => s.utf8 || '').join('').trim();
+      if (text && text !== lastLine) {
+        lines.push(text);
+        lastLine = text;
+      }
+    }
+
+    return lines.join('\n');
+  } catch {
+    return '';
+  }
+}
+
+/**
+ * Parse VTT (WebVTT) subtitle format into plain text.
+ * Strips headers, timestamps, tags, and deduplicates adjacent lines.
+ */
+export function parseVttTranscript(content) {
+  if (!content || typeof content !== 'string') return '';
+
+  const lines = content.split('\n');
+  const textLines = [];
+  let lastLine = '';
+
+  for (const raw of lines) {
+    const line = raw.trim();
+
+    // Skip WEBVTT header, NOTE blocks, and empty lines
+    if (!line || line === 'WEBVTT' || line.startsWith('NOTE') || line.startsWith('Kind:') || line.startsWith('Language:')) continue;
+
+    // Skip timestamp lines (00:00:00.000 --> 00:00:05.000)
+    if (/^\d{2}:\d{2}/.test(line) && line.includes('-->')) continue;
+
+    // Skip cue identifiers (numeric lines before timestamps)
+    if (/^\d+$/.test(line)) continue;
+
+    // Strip VTT formatting tags: <c>, </c>, <00:00:01.234>, <c.colorCCCCCC>, etc.
+    let cleaned = line
+      .replace(/<\/?c[^>]*>/g, '')           // color/class tags
+      .replace(/<\d{2}:\d{2}[\d:.]*>/g, '')  // inline timestamps
+      .replace(/<\/?[a-z][^>]*>/g, '')        // other HTML-like tags
+      .trim();
+
+    // Decode common HTML entities
+    cleaned = cleaned
+      .replace(/&amp;/g, '&')
+      .replace(/&lt;/g, '<')
+      .replace(/&gt;/g, '>')
+      .replace(/&quot;/g, '"')
+      .replace(/&#39;/g, "'")
+      .replace(/&apos;/g, "'")
+      .replace(/&nbsp;/g, ' ');
+
+    if (cleaned && cleaned !== lastLine) {
+      textLines.push(cleaned);
+      lastLine = cleaned;
+    }
+  }
+
+  return textLines.join('\n');
+}
+
+/**
+ * Fetch transcript for a video/podcast URL using tiered strategy:
+ *   Tier 1: yt-dlp captions (subtitle-only, no download)
+ *   Tier 2: yt-dlp audio extraction + Whisper transcription
+ *   Tier 3: return null (placeholder behavior)
+ *
+ * @param {string} url - The video/podcast URL
+ * @param {object} config - App config (timeouts, whisperModel)
+ * @param {object} toolPaths - { ytdlp: string|null, whisper: string|null }
+ * @returns {Promise<{text: string, source: string}|null>}
+ */
+export async function fetchTranscriptContent(url, config, toolPaths) {
+  const { ytdlp, whisper } = toolPaths || {};
+  const timeouts = config.transcribeTimeouts || {};
+  const subtitleTimeout = timeouts.subtitle || 30000;
+  const audioTimeout = timeouts.audio || 300000;
+  const whisperTimeout = timeouts.whisper || 600000;
+  const whisperModel = config.whisperModel || 'small.en';
+
+  if (!ytdlp) {
+    console.log('  yt-dlp not available — install yt-dlp for video/podcast transcripts');
+    return null;
+  }
+
+  const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'smaug-transcript-'));
+
+  try {
+    // ---- Tier 1: Subtitle extraction ----
+    const isYouTube = /youtube\.com|youtu\.be/.test(url);
+    const subFormat = isYouTube ? 'json3' : 'vtt/best';
+    const outputTemplate = path.join(tmpDir, '%(id)s.%(ext)s');
+
+    try {
+      execSync(
+        `"${ytdlp}" --write-subs --write-auto-subs --sub-langs "en.*,-live_chat" --sub-format "${subFormat}" --skip-download --no-warnings -o "${outputTemplate}" "${url}"`,
+        { timeout: subtitleTimeout, stdio: 'pipe', encoding: 'utf8', shell: true }
+      );
+    } catch {
+      // yt-dlp may exit non-zero even when subtitle files were written
+      // (e.g., 429 on one variant while others succeeded). Check files below.
+    }
+
+    // Check for subtitle output files (yt-dlp exits non-zero even with no subs)
+    const files = fs.readdirSync(tmpDir);
+    const subFile = files.find(f => f.endsWith('.json3')) ||
+                    files.find(f => f.endsWith('.vtt')) ||
+                    files.find(f => f.endsWith('.srt'));
+
+    if (subFile) {
+      const subContent = fs.readFileSync(path.join(tmpDir, subFile), 'utf8');
+      let text;
+
+      if (subFile.endsWith('.json3')) {
+        text = parseJson3Transcript(subContent);
+      } else {
+        text = parseVttTranscript(subContent);
+      }
+
+      if (text && text.length > 0) {
+        console.log(`  Transcript extracted via captions (${text.length} chars)`);
+        return { text, source: 'yt-dlp-captions' };
+      }
+    }
+
+    // ---- Tier 2: Audio extraction + Whisper ----
+    if (!whisper) {
+      console.log('  No captions available — install whisper for audio transcription');
+      return null;
+    }
+
+    console.log('  No captions found, extracting audio for Whisper...');
+    const audioFile = path.join(tmpDir, 'audio.mp3');
+
+    try {
+      execSync(
+        `"${ytdlp}" -x --audio-format mp3 --audio-quality 5 -o "${audioFile}" "${url}"`,
+        { timeout: audioTimeout, stdio: 'pipe', encoding: 'utf8', shell: true }
+      );
+    } catch (err) {
+      console.log(`  Audio extraction failed: ${err.message?.split('\n')[0]}`);
+      return null;
+    }
+
+    if (!fs.existsSync(audioFile)) {
+      console.log('  Audio extraction produced no output');
+      return null;
+    }
+
+    console.log(`  Transcribing with Whisper (model: ${whisperModel})...`);
+    try {
+      execSync(
+        `"${whisper}" "${audioFile}" --model ${whisperModel} --output_format txt --output_dir "${tmpDir}"`,
+        { timeout: whisperTimeout, stdio: 'pipe', encoding: 'utf8', shell: true }
+      );
+    } catch (err) {
+      console.log(`  Whisper transcription failed: ${err.message?.split('\n')[0]}`);
+      return null;
+    }
+
+    // Whisper outputs audio.txt
+    const txtFiles = fs.readdirSync(tmpDir).filter(f => f.endsWith('.txt'));
+    if (txtFiles.length > 0) {
+      const text = fs.readFileSync(path.join(tmpDir, txtFiles[0]), 'utf8').trim();
+      if (text.length > 0) {
+        console.log(`  Transcript extracted via Whisper (${text.length} chars)`);
+        return { text, source: 'whisper' };
+      }
+    }
+
+    console.log('  Whisper produced no output');
+    return null;
+  } finally {
+    // Clean up temp directory
+    try {
+      fs.rmSync(tmpDir, { recursive: true, force: true });
+    } catch {}
+  }
+}
+
 export async function fetchContent(url, type, config) {
   // Use GitHub API for GitHub URLs
   if (type === 'github') {
@@ -640,6 +878,13 @@ export async function fetchAndPrepareBookmarks(options = {}) {
     return { bookmarks: [], count: 0 };
   }
 
+  // Detect transcription tools once per run
+  const ytdlpPath = findYtDlp({ configPath: config.ytdlpPath });
+  const whisperPath = findWhisper({ configPath: config.whisperPath });
+  const toolPaths = { ytdlp: ytdlpPath, whisper: whisperPath };
+  if (ytdlpPath) console.log(`  yt-dlp found: ${ytdlpPath}`);
+  if (whisperPath) console.log(`  whisper found: ${whisperPath}`);
+
   console.log(`Preparing ${toProcess.length} tweets...`);
 
   const prepared = [];
@@ -738,6 +983,14 @@ export async function fetchAndPrepareBookmarks(options = {}) {
           }
         } else if (expanded.match(/\.(jpg|jpeg|png|gif|webp)$/i)) {
           type = 'image';
+        } else if (
+          expanded.includes('podcasts.apple.com') ||
+          expanded.includes('spotify.com/episode') ||
+          expanded.includes('overcast.fm') ||
+          expanded.includes('pocketcasts.com') ||
+          expanded.includes('castro.fm')
+        ) {
+          type = 'podcast';
         } else {
           type = 'article';
         }
@@ -773,6 +1026,30 @@ export async function fetchAndPrepareBookmarks(options = {}) {
           }
         }
 
+        // Fetch transcript for video and podcast links
+        if (type === 'video' || type === 'podcast') {
+          try {
+            const transcriptResult = await fetchTranscriptContent(expanded, config, toolPaths);
+            if (transcriptResult && transcriptResult.text) {
+              // Write full transcript to separate file to keep pending JSON small
+              const transcriptsDir = path.join(path.dirname(config.pendingFile), 'transcripts');
+              if (!fs.existsSync(transcriptsDir)) {
+                fs.mkdirSync(transcriptsDir, { recursive: true });
+              }
+              const transcriptFile = path.join(transcriptsDir, `${bookmark.id}.txt`);
+              fs.writeFileSync(transcriptFile, transcriptResult.text);
+              content = {
+                source: transcriptResult.source,
+                transcriptFile,
+                chars: transcriptResult.text.length
+              };
+              console.log(`  Transcript: ${content.chars} chars (${content.source}) → ${transcriptFile}`);
+            }
+          } catch (error) {
+            console.log(`  Transcript extraction failed: ${error.message}`);
+          }
+        }
+
         links.push({
           original: link,
           expanded,
diff --git a/test/config.test.js b/test/config.test.js
index 8ba97fa..8f0e5ea 100644
--- a/test/config.test.js
+++ b/test/config.test.js
@@ -63,4 +63,37 @@ describe('loadConfig', () => {
     // Default paths don't use ~, but the function should work
     assert.ok(!config.archiveFile.includes('~'));
   });
+
+  test('transcription config keys have correct defaults', () => {
+    const config = loadConfig('./nonexistent.json');
+    assert.strictEqual(config.ytdlpPath, null);
+    assert.strictEqual(config.whisperPath, null);
+    assert.strictEqual(config.whisperModel, 'small.en');
+    assert.ok(config.transcribeTimeouts);
+    assert.strictEqual(config.transcribeTimeouts.subtitle, 30000);
+    assert.strictEqual(config.transcribeTimeouts.audio, 300000);
+    assert.strictEqual(config.transcribeTimeouts.whisper, 600000);
+  });
+
+  test('env var overrides work for transcription config', () => {
+    const origYtdlp = process.env.YTDLP_PATH;
+    const origWhisper = process.env.WHISPER_PATH;
+    const origModel = process.env.WHISPER_MODEL;
+    try {
+      process.env.YTDLP_PATH = '/custom/yt-dlp';
+      process.env.WHISPER_PATH = '/custom/whisper';
+      process.env.WHISPER_MODEL = 'tiny.en';
+      const config = loadConfig('./nonexistent.json');
+      assert.strictEqual(config.ytdlpPath, '/custom/yt-dlp');
+      assert.strictEqual(config.whisperPath, '/custom/whisper');
+      assert.strictEqual(config.whisperModel, 'tiny.en');
+    } finally {
+      if (origYtdlp === undefined) delete process.env.YTDLP_PATH;
+      else process.env.YTDLP_PATH = origYtdlp;
+      if (origWhisper === undefined) delete process.env.WHISPER_PATH;
+      else process.env.WHISPER_PATH = origWhisper;
+      if (origModel === undefined) delete process.env.WHISPER_MODEL;
+      else process.env.WHISPER_MODEL = origModel;
+    }
+  });
 });
diff --git a/test/fixtures/sample-subtitles.json3 b/test/fixtures/sample-subtitles.json3
new file mode 100644
index 0000000..49e577d
--- /dev/null
+++ b/test/fixtures/sample-subtitles.json3
@@ -0,0 +1,25 @@
+{
+  "events": [
+    {
+      "segs": [{"utf8": "Hello and welcome to this video."}]
+    },
+    {
+      "segs": [{"utf8": "Today we'll be talking about"}]
+    },
+    {
+      "segs": [{"utf8": "Today we'll be talking about"}]
+    },
+    {
+      "segs": [{"utf8": " artificial intelligence."}]
+    },
+    {
+      "segs": [{"utf8": "Let's dive right in."}]
+    },
+    {
+      "tStartMs": 5000
+    },
+    {
+      "segs": [{"utf8": "First, let's define what AI means."}]
+    }
+  ]
+}
diff --git a/test/fixtures/sample-subtitles.vtt b/test/fixtures/sample-subtitles.vtt
new file mode 100644
index 0000000..b7ccc0f
--- /dev/null
+++ b/test/fixtures/sample-subtitles.vtt
@@ -0,0 +1,21 @@
+WEBVTT
+Kind: captions
+Language: en
+
+00:00:00.000 --> 00:00:02.500
+Hello and welcome to this video.
+
+00:00:02.500 --> 00:00:05.000
+<c.colorCCCCCC>Today</c> we&apos;ll be talking about
+
+00:00:05.000 --> 00:00:07.500
+<00:00:05.500>artificial intelligence.
+
+00:00:07.500 --> 00:00:10.000
+Today we'll be talking about
+
+00:00:10.000 --> 00:00:12.500
+Let&#39;s dive right in.
+
+00:00:12.500 --> 00:00:15.000
+First, let&amp;s define what AI means.
diff --git a/test/processor.test.js b/test/processor.test.js
index c8ffce8..130f160 100644
--- a/test/processor.test.js
+++ b/test/processor.test.js
@@ -4,7 +4,16 @@ import fs from 'fs';
 import path from 'path';
 import { execSync } from 'child_process';
 import { fileURLToPath } from 'url';
-import { isPaywalled, stripQuerystring, fetchXArticleContent } from '../src/processor.js';
+import {
+  isPaywalled,
+  stripQuerystring,
+  fetchXArticleContent,
+  findYtDlp,
+  findWhisper,
+  parseJson3Transcript,
+  parseVttTranscript,
+  fetchTranscriptContent
+} from '../src/processor.js';
 
 const __dirname = path.dirname(fileURLToPath(import.meta.url));
 
@@ -579,6 +588,194 @@ describe('fetchBookmarks count truncation', () => {
   });
 });
 
+// Check if yt-dlp is available for integration tests
+function hasYtDlp() {
+  try {
+    execSync('which yt-dlp', { encoding: 'utf8', timeout: 5000, stdio: 'pipe' });
+    return true;
+  } catch {
+    return false;
+  }
+}
+
+function hasWhisper() {
+  try {
+    execSync('which whisper', { encoding: 'utf8', timeout: 5000, stdio: 'pipe' });
+    return true;
+  } catch {
+    return false;
+  }
+}
+
+const YT_DLP_AVAILABLE = hasYtDlp();
+const WHISPER_AVAILABLE = hasWhisper();
+
+describe('findYtDlp', () => {
+  test('returns null when tool is not installed (mocked)', () => {
+    const result = findYtDlp({
+      execSyncFn: () => { throw new Error('not found'); },
+      configPath: null
+    });
+    assert.strictEqual(result, null);
+  });
+
+  test('returns configured path when it exists', () => {
+    const result = findYtDlp({
+      configPath: '/usr/bin/env'
+    });
+    assert.strictEqual(result, '/usr/bin/env');
+  });
+
+  test('returns null for non-existent configured path', () => {
+    const result = findYtDlp({
+      execSyncFn: () => { throw new Error('not found'); },
+      configPath: '/nonexistent/path/yt-dlp'
+    });
+    assert.strictEqual(result, null);
+  });
+});
+
+describe('findWhisper', () => {
+  test('returns null when tool is not installed (mocked)', () => {
+    const result = findWhisper({
+      execSyncFn: () => { throw new Error('not found'); },
+      configPath: null
+    });
+    assert.strictEqual(result, null);
+  });
+
+  test('returns configured path when it exists', () => {
+    const result = findWhisper({
+      configPath: '/usr/bin/env'
+    });
+    assert.strictEqual(result, '/usr/bin/env');
+  });
+});
+
+describe('parseJson3Transcript', () => {
+  test('produces clean text from fixture', () => {
+    const content = fs.readFileSync(path.join(__dirname, 'fixtures/sample-subtitles.json3'), 'utf8');
+    const result = parseJson3Transcript(content);
+    assert.ok(result.includes('Hello and welcome to this video.'));
+    assert.ok(result.includes('artificial intelligence.'));
+    assert.ok(result.includes("Let's dive right in."));
+    assert.ok(result.includes("First, let's define what AI means."));
+  });
+
+  test('deduplicates adjacent identical lines', () => {
+    const content = fs.readFileSync(path.join(__dirname, 'fixtures/sample-subtitles.json3'), 'utf8');
+    const result = parseJson3Transcript(content);
+    const lines = result.split('\n');
+    const talkingAbout = lines.filter(l => l === "Today we'll be talking about");
+    assert.strictEqual(talkingAbout.length, 1, 'duplicate adjacent lines should be removed');
+  });
+
+  test('skips events without segs', () => {
+    const content = fs.readFileSync(path.join(__dirname, 'fixtures/sample-subtitles.json3'), 'utf8');
+    const result = parseJson3Transcript(content);
+    assert.ok(!result.includes('undefined'));
+  });
+
+  test('returns empty string for malformed content', () => {
+    assert.strictEqual(parseJson3Transcript('not json'), '');
+    assert.strictEqual(parseJson3Transcript(''), '');
+    assert.strictEqual(parseJson3Transcript('{}'), '');
+  });
+});
+
+describe('parseVttTranscript', () => {
+  test('produces clean text from fixture', () => {
+    const content = fs.readFileSync(path.join(__dirname, 'fixtures/sample-subtitles.vtt'), 'utf8');
+    const result = parseVttTranscript(content);
+    assert.ok(result.includes('Hello and welcome to this video.'));
+    assert.ok(result.includes('artificial intelligence.'));
+    assert.ok(result.includes("Let's dive right in."));
+  });
+
+  test('strips VTT formatting tags', () => {
+    const content = fs.readFileSync(path.join(__dirname, 'fixtures/sample-subtitles.vtt'), 'utf8');
+    const result = parseVttTranscript(content);
+    assert.ok(!result.includes('<c.colorCCCCCC>'), 'color tags should be stripped');
+    assert.ok(!result.includes('<00:'), 'inline timestamps should be stripped');
+  });
+
+  test('decodes HTML entities', () => {
+    const content = fs.readFileSync(path.join(__dirname, 'fixtures/sample-subtitles.vtt'), 'utf8');
+    const result = parseVttTranscript(content);
+    assert.ok(result.includes('&s define'), 'should decode &amp; to &');
+    assert.ok(result.includes("Let's"), 'should decode &#39; to apostrophe');
+  });
+
+  test('excludes WEBVTT header and timestamp lines', () => {
+    const content = fs.readFileSync(path.join(__dirname, 'fixtures/sample-subtitles.vtt'), 'utf8');
+    const result = parseVttTranscript(content);
+    assert.ok(!result.includes('WEBVTT'), 'header should be excluded');
+    assert.ok(!result.includes('-->'), 'timestamps should be excluded');
+    assert.ok(!result.includes('Kind:'), 'metadata should be excluded');
+  });
+
+  test('deduplicates adjacent lines', () => {
+    const content = fs.readFileSync(path.join(__dirname, 'fixtures/sample-subtitles.vtt'), 'utf8');
+    const result = parseVttTranscript(content);
+    const lines = result.split('\n');
+    for (let i = 1; i < lines.length; i++) {
+      assert.notStrictEqual(lines[i], lines[i - 1], `adjacent lines should not be identical: "${lines[i]}"`);
+    }
+  });
+
+  test('returns empty string for null/empty input', () => {
+    assert.strictEqual(parseVttTranscript(null), '');
+    assert.strictEqual(parseVttTranscript(''), '');
+    assert.strictEqual(parseVttTranscript(undefined), '');
+  });
+});
+
+describe('fetchTranscriptContent', () => {
+  test('returns null with log when no tools available', async () => {
+    const result = await fetchTranscriptContent(
+      'https://www.youtube.com/watch?v=dQw4w9WgXcQ',
+      { transcribeTimeouts: {} },
+      { ytdlp: null, whisper: null }
+    );
+    assert.strictEqual(result, null);
+  });
+
+  test('returns null when toolPaths is undefined', async () => {
+    const result = await fetchTranscriptContent(
+      'https://www.youtube.com/watch?v=dQw4w9WgXcQ',
+      {},
+      undefined
+    );
+    assert.strictEqual(result, null);
+  });
+});
+
+describe('podcast URL classification', () => {
+  test('Apple Podcasts URL is detected as podcast type', () => {
+    const url = 'https://podcasts.apple.com/us/podcast/episode-name/id123456';
+    assert.ok(url.includes('podcasts.apple.com'));
+  });
+
+  test('Spotify episode URL is detected as podcast type', () => {
+    const url = 'https://open.spotify.com/episode/abc123';
+    assert.ok(url.includes('spotify.com/episode'));
+  });
+
+  test('Overcast URL is detected as podcast type', () => {
+    const url = 'https://overcast.fm/+abc123';
+    assert.ok(url.includes('overcast.fm'));
+  });
+
+  test('YouTube URL is NOT classified as podcast', () => {
+    const url = 'https://www.youtube.com/watch?v=abc123';
+    assert.ok(!url.includes('podcasts.apple.com'));
+    assert.ok(!url.includes('spotify.com/episode'));
+    assert.ok(!url.includes('overcast.fm'));
+    assert.ok(!url.includes('pocketcasts.com'));
+    assert.ok(!url.includes('castro.fm'));
+  });
+});
+
 // Integration tests - only run when bird CLI has valid credentials
 describe('X article integration tests (requires bird credentials)', { skip: !BIRD_AVAILABLE }, () => {
   // These tests require actual Twitter/X API access via bird CLI