What needs to be done
Extend ingest.py to accept RSS feed URLs and automatically extract article links from the feed.
Which file(s) to modify
ingest.py — add RSS parsing logic
Proposed approach
- Detect whether an input URL is an RSS/Atom feed (check the Content-Type header, or try parsing the response as XML)
- Parse the feed using xml.etree.ElementTree (stdlib, no new dependencies)
- Extract <link> elements from each <item> / <entry>
- Pass extracted URLs through the existing normalize_url() → ingest_urls() flow
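The detection and extraction steps above could look roughly like the sketch below. It is only an illustration, not the actual ingest.py code: the names looks_like_feed and extract_feed_links are hypothetical, and it assumes the caller has already fetched the response body and its Content-Type header. One detail it makes explicit is that RSS 2.0 stores the URL as the text of <link>, while Atom stores it in the href attribute of a namespaced <link> element.

```python
import xml.etree.ElementTree as ET

# Atom elements live in this XML namespace; RSS 2.0 elements have none.
ATOM_NS = "{http://www.w3.org/2005/Atom}"


def looks_like_feed(content_type, body):
    """Hypothetical helper: guess whether a fetched response is an RSS/Atom feed."""
    mime = content_type.split(";")[0].strip().lower()
    if mime in ("application/rss+xml", "application/atom+xml"):
        return True
    # Generic or mislabelled types: fall back to parsing and checking the root tag.
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return False
    return root.tag in ("rss", ATOM_NS + "feed")


def extract_feed_links(body):
    """Hypothetical helper: return article URLs from an RSS 2.0 or Atom document."""
    root = ET.fromstring(body)
    links = []
    # RSS 2.0: <item><link>URL</link></item>
    for item in root.iter("item"):
        text = item.findtext("link")
        if text and text.strip():
            links.append(text.strip())
    # Atom: <entry><link href="URL" rel="alternate"/></entry>
    for entry in root.iter(ATOM_NS + "entry"):
        for link in entry.findall(ATOM_NS + "link"):
            # A missing rel attribute means "alternate" in Atom.
            if link.get("rel", "alternate") == "alternate" and link.get("href"):
                links.append(link.get("href"))
                break  # one article URL per entry
    return links
```

The returned list would then go through the existing normalize_url() → ingest_urls() flow. This sketch does not handle RSS 1.0 (RDF) feeds, whose <item> elements are namespaced.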
Example usage
python3 ingest.py https://simonwillison.net/atom/everything/
python3 ingest.py --rss https://news.ycombinator.com/rss
Acceptance criteria
- […] xml.etree.ElementTree)
- […] tests/test_ingest.py