The YouTube module (src/journal_utilities/youtube/) handles the discovery and categorization of content from the Active Inference Institute channel.
This module does not use the YouTube Data API v3 for enumeration, avoiding quota limits. Instead, it uses yt-dlp's flat-playlist extraction features.
Enumerates all videos on the channel.
- Method: Extracts from the "Uploads" playlist (`UU...`), which contains every public video.
- Output: `ChannelManifest` containing `VideoInfo` objects.
- Performance: Can list 1000+ videos in seconds without downloading media.
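As a rough illustration of the method above, the sketch below derives the Uploads playlist ID from a channel ID (swapping the `UC` prefix for `UU`) and builds a `yt-dlp` command that lists the playlist without downloading media. The helper names `uploads_playlist_id`, `build_flat_playlist_command`, and `list_uploads` are hypothetical and are not taken from this module; the real implementation may call `yt-dlp` differently.

```python
import json
import subprocess


def uploads_playlist_id(channel_id: str) -> str:
    """Map a channel ID (UC...) to its Uploads playlist ID (UU...)."""
    if not channel_id.startswith("UC"):
        raise ValueError(f"expected a channel ID starting with 'UC', got {channel_id!r}")
    return "UU" + channel_id[2:]


def build_flat_playlist_command(playlist_id: str) -> list[str]:
    """Build a yt-dlp invocation that emits playlist metadata as one JSON document."""
    url = f"https://www.youtube.com/playlist?list={playlist_id}"
    # --flat-playlist: list entries without resolving each video
    # --dump-single-json: print a single JSON object for the whole playlist
    return ["yt-dlp", "--flat-playlist", "--dump-single-json", url]


def list_uploads(channel_id: str) -> list[dict]:
    """Run yt-dlp (hypothetical helper; requires yt-dlp on PATH and network access)."""
    cmd = build_flat_playlist_command(uploads_playlist_id(channel_id))
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(result.stdout).get("entries", [])
```

Only `list_uploads` touches the network; the other two helpers are pure functions and can be checked offline.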
```python
from journal_utilities.youtube.channel import enumerate_channel_videos

manifest = enumerate_channel_videos("UCbPq2w41ZaJSWtpCq4BE6Dg")
print(f"Found {manifest.total_videos} videos")
```

Enumerates all playlists created by the channel.
- Method: Scrapes the `/playlists` tab via `yt-dlp`.
- Output: `PlaylistManifest` containing playlist metadata and video lists.
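A minimal sketch of how the playlist listing might be processed, assuming `yt-dlp` is run in flat-playlist mode against the channel's `/playlists` URL and returns a JSON object with an `entries` list whose items carry `id`, `title`, and `url` keys. That JSON shape, the `PlaylistSummary` class, and both helper functions are assumptions for illustration; they are not the module's actual `PlaylistManifest` API.

```python
from dataclasses import dataclass


@dataclass
class PlaylistSummary:
    """Hypothetical record for one playlist found on the /playlists tab."""
    playlist_id: str
    title: str
    url: str


def playlists_tab_url(channel_id: str) -> str:
    """URL of the channel's playlists tab."""
    return f"https://www.youtube.com/channel/{channel_id}/playlists"


def parse_playlists(flat_json: dict) -> list[PlaylistSummary]:
    """Convert flat-playlist JSON (assumed shape) into PlaylistSummary records."""
    summaries = []
    for entry in flat_json.get("entries") or []:
        # Skip empty or ID-less entries rather than failing the whole listing.
        if not entry or not entry.get("id"):
            continue
        playlist_id = entry["id"]
        fallback_url = f"https://www.youtube.com/playlist?list={playlist_id}"
        summaries.append(
            PlaylistSummary(
                playlist_id=playlist_id,
                title=entry.get("title") or "",
                url=entry.get("url") or fallback_url,
            )
        )
    return summaries


# Made-up sample data standing in for yt-dlp output.
sample = {
    "entries": [
        {"id": "PL_example_1", "title": "Livestreams", "url": None},
        {"id": None, "title": "broken entry"},
    ]
}
playlists = parse_playlists(sample)
```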
Heuristic engine to parse video titles into structured metadata (Category, Series, Episode).
- Logic: Regex pattern matching against known show formats.
- Supported Formats:
  - Livestreams (`Livestream #001.1`)
  - GuestStreams
  - OrgStreams
  - MathStreams
  - ModelStreams
  - Textbook Groups
  - Symposia
| Input Title | Category | Series | Episode |
|---|---|---|---|
| `Active Inference Livestream #042.1` | `Livestream` | `Livestream_042` | `1` |
| `GuestStream #015.1: John Doe` | `GuestStream` | `GuestStream_015` | `1` |
| `OrgStream #003.1` | `OrgStream` | `OrgStream_003` | `1` |
| `MathStream #001.2: Category Theory` | `MathStream` | `MathStream_001` | `2` |
| `Applied Active Inference Symposium 2021 part 1` | `Symposium` | `2021` | `1` |
| `Textbook Group Cohort 3 Meeting 5` | `TextbookGroup` | `Cohort_3` | `Meeting_005` |
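The mapping in the table can be reproduced with a few regular expressions. The following is a minimal sketch, not the module's actual parser: the patterns, the `ParsedTitle` type, and the `parse_title` function are hypothetical and cover only the title shapes shown in the examples.

```python
import re
from typing import NamedTuple, Optional


class ParsedTitle(NamedTuple):
    category: str
    series: str
    episode: str


_SHOWS = ("Livestream", "GuestStream", "OrgStream", "MathStream", "ModelStream")
_CANONICAL = {name.lower(): name for name in _SHOWS}

# "<Show> #NNN.E", e.g. "Livestream #042.1"
_NUMBERED = re.compile(
    r"(?P<show>" + "|".join(_SHOWS) + r")\s*#(?P<num>\d+)\.(?P<ep>\d+)",
    re.IGNORECASE,
)
# "Symposium YYYY part N"
_SYMPOSIUM = re.compile(
    r"Symposium\s+(?P<year>\d{4})\s+part\s+(?P<part>\d+)", re.IGNORECASE
)
# "Textbook Group Cohort C Meeting M"
_TEXTBOOK = re.compile(
    r"Textbook Group\s+Cohort\s+(?P<cohort>\d+)\s+Meeting\s+(?P<meeting>\d+)",
    re.IGNORECASE,
)


def parse_title(title: str) -> Optional[ParsedTitle]:
    """Return (category, series, episode) for a recognized title, else None."""
    m = _NUMBERED.search(title)
    if m:
        show = _CANONICAL[m["show"].lower()]
        series = f"{show}_{int(m['num']):03d}"
        return ParsedTitle(show, series, str(int(m["ep"])))
    m = _SYMPOSIUM.search(title)
    if m:
        return ParsedTitle("Symposium", m["year"], str(int(m["part"])))
    m = _TEXTBOOK.search(title)
    if m:
        series = f"Cohort_{int(m['cohort'])}"
        episode = f"Meeting_{int(m['meeting']):03d}"
        return ParsedTitle("TextbookGroup", series, episode)
    return None
```

Titles that match none of the patterns return `None`, which leaves it to the caller to decide how to handle uncategorized videos.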
`VideoInfo` fields:

- `id`: YouTube ID (11 chars)
- `title`: Video title
- `upload_date`: YYYYMMDD
- `duration`: Seconds
- `view_count`: Approximate views

`ChannelManifest` fields:

- `channel_id`: Source channel
- `enumerated_at`: Timestamp
- `videos`: List of `VideoInfo`
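The two records above could be modeled as dataclasses. This is a sketch under assumptions: the field names come from the lists above and `total_videos` from the usage example, but the types, the optional defaults, and the `upload_datetime` helper are guesses and may not match the module's actual definitions.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class VideoInfo:
    id: str                             # YouTube ID (11 chars)
    title: str                          # Video title
    upload_date: Optional[str] = None   # YYYYMMDD
    duration: Optional[int] = None      # Seconds
    view_count: Optional[int] = None    # Approximate views

    def upload_datetime(self) -> Optional[datetime]:
        """Hypothetical helper: parse the YYYYMMDD string into a datetime."""
        if not self.upload_date:
            return None
        return datetime.strptime(self.upload_date, "%Y%m%d")


@dataclass
class ChannelManifest:
    channel_id: str                     # Source channel
    enumerated_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
    videos: list[VideoInfo] = field(default_factory=list)

    @property
    def total_videos(self) -> int:
        return len(self.videos)


# Made-up sample data for illustration only.
manifest = ChannelManifest(
    channel_id="UCbPq2w41ZaJSWtpCq4BE6Dg",
    videos=[VideoInfo(id="abcdefghijk", title="OrgStream #003.1", upload_date="20210315")],
)
```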