Problem
Echo's evolve analyser cannot measure how quickly a reply was sent after the original tweet was posted. This is critical because X's algorithm heavily rewards engagement velocity — the first 30 minutes of a tweet's life are when replies get the most algorithmic boost.
Currently, baseline reply nodes store `posted_at` (from the X Analytics CSV, date-only granularity) and `post_id` (which is the reply's own tweet ID, not the parent tweet ID). There is no `in_reply_to_status_id` or parent tweet timestamp stored.
What We Have
- Reply snowflake ID (`node.title`) — encodes the exact timestamp the reply was created (millisecond precision via `(id >> 22) + 1288834974657`)
- Parent tweet ID — NOT stored. The `post_id` field on baseline replies is the reply's own ID, not the parent's
What We Need
- Store `in_reply_to_id` on reply nodes — the parent tweet's snowflake ID
- Compute `time_to_reply_seconds` from the two snowflake IDs (no API call needed, pure math: `((reply_id >> 22) - (parent_id >> 22)) / 1000`)
- Backfill existing replies — scrape `in_reply_to_id` for the ~311 baseline replies already in Cortex
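The snowflake arithmetic above can be sketched directly — the custom-epoch offset cancels in the subtraction, so no API call or clock lookup is needed:

```python
# Sketch: derive reply latency from two snowflake IDs alone.
# Twitter snowflake IDs store a millisecond timestamp in the top bits,
# offset from the custom epoch 1288834974657 (2010-11-04 UTC).

TWITTER_EPOCH_MS = 1288834974657

def snowflake_to_ms(snowflake_id: int) -> int:
    """Milliseconds since the Unix epoch encoded in a snowflake ID."""
    return (snowflake_id >> 22) + TWITTER_EPOCH_MS

def time_to_reply_seconds(reply_id: int, parent_id: int) -> int:
    """Whole seconds between parent tweet and reply. The epoch offset
    cancels in the subtraction, so only the shifted bits matter."""
    return ((reply_id >> 22) - (parent_id >> 22)) // 1000
```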
Approaches to Get Parent Tweet ID
Option A: Browser scrape via xbot-browser
For each reply URL (e.g., `https://x.com/DarlingtonDev/status/2028479083399516226`), navigate to it and extract the parent tweet link from the conversation thread. Slow (~3s per reply), but works without API access.
Option B: X API v2
`GET /2/tweets/:id?tweet.fields=conversation_id,in_reply_to_user_id,referenced_tweets` returns a `referenced_tweets` array; the entry with type `replied_to` holds the parent tweet ID. Fast and batch-capable (100 tweets per request), but requires API access.
Option C: Embed in csv_import pipeline
X Analytics CSV doesn't include `in_reply_to_id`, so this can't be done at import time without a supplementary API/scrape call.
Recommendation: Option B if API access is available, Option A as fallback. Either way, build as a one-time backfill script + ongoing enrichment in the csv_import pipeline.
Impact on Evolve Analysis
Once timing data is available:
- Correlate reply speed with engagement — do replies within 5 minutes of the original tweet get more impressions than replies sent hours later?
- Optimal reply window — find the sweet spot (e.g., "replies sent 2-15 min after original tweet average 3x more impressions")
- Feed into compose prioritisation — Echo should prioritise replying to tweets that are fresh (< 30 min old) over older ones
- Add `time_to_reply_seconds` to the digest prompt so Claude can factor timing into its pattern analysis
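The first two analyses boil down to bucketing replies by latency and comparing mean impressions. A sketch, assuming reply records carry `time_to_reply_seconds` and `impressions` fields (bucket edges here are illustrative, not tuned):

```python
# Sketch: mean impressions per reply-latency bucket.
# Replies with unknown timing (None) are skipped, matching the
# always-None baseline records until the backfill runs.
from statistics import mean

BUCKETS = [(0, 300, "<5 min"), (300, 1800, "5-30 min"),
           (1800, 3600, "30-60 min"), (3600, float("inf"), ">1 hr")]

def timing_buckets(replies: list[dict]) -> dict[str, float]:
    """Mean impressions keyed by latency bucket; empty buckets omitted."""
    grouped: dict[str, list[int]] = {label: [] for *_, label in BUCKETS}
    for r in replies:
        t = r.get("time_to_reply_seconds")
        if t is None:
            continue
        for lo, hi, label in BUCKETS:
            if lo <= t < hi:
                grouped[label].append(r["impressions"])
                break
    return {label: mean(vals) for label, vals in grouped.items() if vals}
```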
Data Model Changes
```python
# On reply nodes in Cortex:
{
    "in_reply_to_id": "2028475000000000000",  # parent tweet snowflake ID
    "time_to_reply_seconds": 342,             # computed from snowflake delta
    # ... existing fields
}
```
```python
# In ReplyRecord (echo/evolve/collector.py):
@dataclass
class ReplyRecord:
    # ... existing fields
    time_to_reply_seconds: int | None  # already exists, just always None
    in_reply_to_id: str | None = None  # new
```
Files Affected
- `echo/analytics/csv_import.py` — enrich with `in_reply_to_id` during import (if API available)
- `echo/evolve/collector.py` — pass `time_to_reply_seconds` through to ReplyRecord
- `echo/evolve/analyser.py` — add timing correlation analysis
- `echo/evolve/digest.py` — include timing data in the Claude digest prompt
- New: `echo/scripts/backfill_reply_timing.py` — one-time backfill for existing replies