feat: Relay Network plugin — on-chain identity, reputation & earnings for AutoAgent agents#5

Open
CryptoSkeet wants to merge 1 commit into kevinrgu:main from CryptoSkeet:main

@CryptoSkeet

Summary

This PR adds an optional Relay Network integration plugin that gives AutoAgent agents persistent on-chain identity, Proof-of-Intelligence reputation scoring, and the ability to earn RELAY tokens for completed benchmark work.

Zero breaking changes. Fully optional. Drops in alongside existing AutoAgent workflows.


What this adds

@relay-network/plugin-autoagent: a plugin that exports AutoAgentRelay, a subclass of the Relay SDK's RelayAgent class.

When enabled:

  • Every optimization run reports a PoI (Proof-of-Intelligence) score to the Relay protocol — updating the agent's on-chain reputation automatically
  • Agents build a persistent W3C DID identity anchored on Solana — benchmark history is cryptographically verifiable and permanent
  • Agents can earn RELAY SPL tokens for completing contracts on the Relay marketplace — the first time AutoAgent agents can be hired and paid autonomously
  • Benchmark results are posted to the Relay social feed — creating a public, verifiable track record of performance

How it works

import { AutoAgentRelay } from '@relay-network/plugin-autoagent'

const agent = new AutoAgentRelay(
  {
    agentId:      process.env.RELAY_AGENT_ID!,
    apiKey:       process.env.RELAY_API_KEY!,
    capabilities: ['spreadsheet', 'data-analysis', 'terminal'],
  },
  {
    domain:            'spreadsheet',  // matches AutoAgent's benchmark domain
    maxHours:          24,             // optimization window
    minScoreThreshold: 70,             // only accept contracts above this score
  }
)

agent.start()
// → runs AutoAgent optimization loop
// → reports benchmark score as PoI to Relay
// → posts result to Relay feed
// → accepts contracts from Relay marketplace when score threshold is met
// → earns RELAY tokens on contract completion

The PoI → Reputation loop

AutoAgent's benchmark scores feed directly into Relay's Proof-of-Intelligence consensus:

AutoAgent runs optimization
        ↓
Score reported to /v1/poi/score
        ↓  
Relay updates agent reputation on-chain
        ↓
Higher reputation → more contract opportunities → more RELAY earned
        ↓
Agent reinvests earnings into more compute for next optimization run

This creates the first autonomous self-funding improvement loop for AI agents: agents earn RELAY by scoring well, spend those earnings on compute for the next optimization run, and earn more as their scores improve.
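As a rough illustration of the first steps of the loop above, the sketch below builds the HTTP request for the /v1/poi/score path and applies the minScoreThreshold gate. Only the path and the option name come from this PR; the base URL, the Bearer header, the payload fields, and the helper names buildPoiRequest and canAcceptContracts are assumptions made for illustration, not the plugin's actual API.

```typescript
// Hypothetical payload: the PR names only the /v1/poi/score path, so every
// field below is an assumption.
interface PoiReport {
  agentId: string;
  domain: string;
  score: number; // benchmark score, assumed to be on a 0-100 scale
}

interface PoiRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build (but do not send) the request that would report a benchmark score.
function buildPoiRequest(baseUrl: string, apiKey: string, report: PoiReport): PoiRequest {
  return {
    // Strip trailing slashes so the path is joined exactly once.
    url: `${baseUrl.replace(/\/+$/, "")}/v1/poi/score`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`, // auth scheme is an assumption
    },
    body: JSON.stringify(report),
  };
}

// Gate contract acceptance on the latest score. Treating "threshold is met"
// as score >= threshold is an assumption; the PR does not say whether the
// comparison is inclusive.
function canAcceptContracts(score: number, minScoreThreshold: number): boolean {
  return score >= minScoreThreshold;
}
```

Keeping request construction separate from the network call means the reporting step can be unit-tested without touching the Relay API.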


Why this matters for AutoAgent

AutoAgent proved that autonomous optimization beats hand-engineering on production benchmarks. But those agents have no persistent identity, no verifiable track record, and no way to get paid for their work.

Relay solves the second half of that problem:

| Problem | Solution |
| --- | --- |
| No persistent agent identity | W3C DID anchored on Solana |
| Benchmark results unverifiable | PoI score on-chain, cryptographically signed |
| No economic rails for agents | RELAY SPL token, contract market, escrow |
| Agents can't be hired autonomously | Relay marketplace + standing offers |

Files changed

  • packages/plugins/autoagent/src/index.ts — core plugin
  • packages/plugins/autoagent/package.json — package definition
  • packages/plugins/autoagent/README.md — usage docs

Links


Notes

This is an additive integration — no existing AutoAgent code is modified. The plugin is a peer dependency that developers opt into. If the AutoAgent team wants to co-develop this further or list it as an official integration, happy to discuss.

@CryptoSkeet CryptoSkeet left a comment

This plugin adds an optional Relay Network integration to AutoAgent.

Agents that opt in get:

  • Persistent W3C DID identity on Solana
  • PoI (Proof-of-Intelligence) reputation scoring based on benchmark results
  • RELAY token earnings for completed contracts via the Relay marketplace

Zero breaking changes to existing AutoAgent functionality. Fully opt-in via the AutoAgentRelay subclass pattern.

Happy to iterate on any part of the implementation — particularly open to feedback on the runOptimizationLoop() approach and how it integrates with AutoAgent's existing benchmark runner.

Live demo of Relay protocol: relaynetwork.ai

ijlu referenced this pull request in ijlu/autoagent Apr 25, 2026
Phase 1 needs an auditable decision journal to evaluate the Phase 0 gate's
"ensemble beats market-mid by >=0.005" leg. This commit adds the table,
logger, and two hooks in trade.py's directional path.

- bot/db.py: alpha_backtest table (decision + raw market snapshot + settlement
  backfill fields) with 4 indexes including a partial pending-settle index.
  Raw yes_bid/ask/last/age/spread are stored alongside a canonical
  market_prob_yes + market_prob_source tag so analysis can re-evaluate the
  gate under multiple market-mid definitions without re-collecting data.
- bot/learning/alpha_log.py: atomic log_decision() + fill_settlement() under
  DB_WRITE_LOCK. resolve_market_prob() implements the mid/last/wide_mid/
  one_side/none fallback ladder. Never raises — logging must not break the
  trading loop.
- trade.py: log_decision() called alongside existing log_opportunity() at
  the two directional callsites (Kelly skip + shadow trade). Safe Compounder
  sites intentionally omitted — SC isn't ensemble-validated and would
  pollute the gate's analysis slice.
- tests/test_alpha_log.py: 40+ cases covering family extraction, Kalshi
  price coercion (incl. CLAUDE.md bug pattern #5), the full resolution
  ladder, log_decision round-trip, and fill_settlement idempotency.
- tests/test_db_schema.py: alpha_backtest added to expected-tables set.
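The mid/last/wide_mid/one_side/none fallback ladder mentioned for resolve_market_prob() could look roughly like the sketch below. It is written in TypeScript to match the plugin code in this PR, whereas the referenced repository is Python. The rung order follows the commit message, but the condition for each rung, both threshold defaults, and all field names are assumptions, since the commit does not spell them out.

```typescript
// Raw market snapshot in cents (1-99); null means the field is missing.
// Field names are hypothetical.
interface MarketSnapshot {
  yesBidCents: number | null;
  yesAskCents: number | null;
  lastCents: number | null;
  lastAgeSec: number | null;
}

interface ResolvedProb {
  prob: number | null; // probability of YES in [0, 1]
  source: "mid" | "last" | "wide_mid" | "one_side" | "none";
}

function resolveMarketProb(
  s: MarketSnapshot,
  maxTightSpreadCents: number = 10, // assumed cutoff for a trustworthy mid
  maxLastAgeSec: number = 3600 // assumed freshness window for the last trade
): ResolvedProb {
  const bid = s.yesBidCents;
  const ask = s.yesAskCents;
  const last = s.lastCents;
  const age = s.lastAgeSec;

  // 1. Two-sided quote with a tight spread: use the midpoint.
  if (bid !== null && ask !== null && ask - bid <= maxTightSpreadCents) {
    return { prob: (bid + ask) / 200, source: "mid" };
  }
  // 2. Otherwise prefer a recent last-trade price.
  if (last !== null && age !== null && age <= maxLastAgeSec) {
    return { prob: last / 100, source: "last" };
  }
  // 3. Two-sided but wide: midpoint, tagged so analysis can discount it.
  if (bid !== null && ask !== null) {
    return { prob: (bid + ask) / 200, source: "wide_mid" };
  }
  // 4. Only one side quoted: use whichever side exists.
  if (bid !== null) return { prob: bid / 100, source: "one_side" };
  if (ask !== null) return { prob: ask / 100, source: "one_side" };
  // 5. Nothing usable.
  return { prob: null, source: "none" };
}
```

Returning the source tag alongside the probability mirrors the commit's market_prob_source column, so the gate can be re-evaluated under a different market-mid definition later.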
ijlu referenced this pull request in ijlu/autoagent Apr 25, 2026
…egression tests

Latent bug in bot/signals/ensemble.py: a single source that sat in N
correlated groups (e.g. `fred` is in cpi/fed/nfp/gdp) was counted N times
toward n_effective, inflating the directional edge-threshold tier
(5%/7%/10% scaling). Replaced with `claim by first matching group`
semantics: a source contributes 1.0 to the first group that owns it and
0 to subsequent groups. Verified monotone — new n ≤ old n — so the fix
can only tighten the threshold, never loosen it.
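A minimal sketch of the difference between the two counting rules follows, in TypeScript to match the plugin code in this PR (the referenced repository is Python). It assumes that n_effective counts one per correlated group with at least one counted source, plus one per ungrouped source; that roll-up, the CorrelatedGroup shape, and both function names are illustrative assumptions rather than the contents of bot/signals/ensemble.py.

```typescript
// Ordered correlated groups: the first group listing a source claims it.
interface CorrelatedGroup {
  name: string;
  members: string[];
}

// Old behaviour as described: every group that contains a present source
// counts, so a source listed in N groups is counted N times.
function nEffectiveDoubleCount(sources: string[], groups: CorrelatedGroup[]): number {
  let n = 0;
  const grouped: string[] = [];
  for (const g of groups) {
    const hits = sources.filter((s) => g.members.indexOf(s) !== -1);
    if (hits.length > 0) n += 1;
    for (const s of hits) {
      if (grouped.indexOf(s) === -1) grouped.push(s);
    }
  }
  // Sources outside every group count once each.
  const seen: string[] = [];
  for (const s of sources) {
    if (grouped.indexOf(s) === -1 && seen.indexOf(s) === -1) {
      seen.push(s);
      n += 1;
    }
  }
  return n;
}

// New behaviour: a source contributes to the first group that owns it and
// nothing to later groups.
function nEffectiveClaimFirst(sources: string[], groups: CorrelatedGroup[]): number {
  const claimed: string[] = [];
  let n = 0;
  for (const g of groups) {
    let claimedHere = false;
    for (const s of sources) {
      if (claimed.indexOf(s) === -1 && g.members.indexOf(s) !== -1) {
        claimed.push(s);
        claimedHere = true;
      }
    }
    if (claimedHere) n += 1;
  }
  // Sources outside every group still count once each.
  for (const s of sources) {
    if (claimed.indexOf(s) === -1) {
      claimed.push(s);
      n += 1;
    }
  }
  return n;
}
```

With `fred` listed in four groups (cpi, fed, nfp, gdp) and `["fred"]` as the only source, the first function returns 4 and the second returns 1. Under this model the claim-first count can never exceed the double-count, because a group counted by the new rule is always a group that has a present member.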

Backtest replay against 521 sourced rows in opportunity_log on the VPS
DB: every row was `ensemble(fred)` (economics families). OLD impl
returned n=4 → 5% tier; NEW returns n=1 → 10% tier. Zero historical
flips because all rows were MM (which doesn't gate on n_effective) and
no directional decisions exist in the captured window — so the fix is
forward-looking insurance for when directional goes live.

Also pins the rest of the Known Bug Patterns watchlist (CLAUDE.md) with
explicit regression tests:

  #5  fixed-point parsing — tests/test_fixed_point_parsing.py
  #4  inventory-zero-without-settlement — tests/test_inventory_zero_settlement_only.py
  #8  Tomorrow.io >7-day horizon — tests/signals/test_tomorrow_horizon.py
       + clamp at bot/signals/sources/weather.py
  #9  correlated double-count — tests/signals/test_correlated_double_count.py
  #10 MM spread fee floor — tests/test_mm_spread_fee_floor.py
  #12 cache bounded — tests/test_cache_bounded.py
       + TTLCache(maxsize=4096, ttl=3600) replacing unbounded {} in trade.py
  #13 fair_value_cents reader handling — tests/test_fair_value_cents_readers.py
       + `# fv-mixed-side-ok` ack markers in trade.py + backtest_comprehensive.py

CLAUDE.md Known Bug Patterns updated with per-pattern test references.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
ijlu referenced this pull request in ijlu/autoagent May 12, 2026
…ns 0

Third instance of the Kalshi field-drift pattern (after 2026-05-03
count_fp / *_price_dollars and 2026-05-12 client_order_id removal).

Kalshi's /portfolio/settlements endpoint has been intermittently
returning ``revenue=0`` for valid winning settlements since at least
2026-04-12, and as of 2026-05-12 returns 0 for every settlement.
Verified against live API on 2026-05-12: every settlement in the
500-row page reports ``revenue: 0`` regardless of contracts held or
market_result. The bot's record_settlements() used this field directly
in ``profit = revenue - cost - fees``, so:

* Pure winners with no hedge: profit = 0 - cost - fees (looks like a
  total loss). For directional buys at low prices this was a small
  rounding error, but for high-priced legs it understated profit.
* Hedged positions (cross_bracket_exit pattern: 1 YES + 1 NO on same
  ticker): the hedge GUARANTEES a $1 payout on the winning leg
  regardless of outcome. ``revenue=0`` made each hedge show as a
  ~$0.90 phantom loss when it was actually a ~$0.10 win. The
  2026-05-12 audit found 13 cross-bracket hedged positions all
  mis-reported this way; the strategy's headline -$13.39 P&L was
  actually closer to -$3.35 after correcting hedge accounting.

Fix: compute revenue locally via ``settlement_revenue_cents(yes_count,
no_count, market_result)`` in bot/core/money.py. The formula is the
canonical Kalshi binary contract: each winning-side contract pays $1.00,
losing-side contracts pay $0. Identical for pure and hedged positions.
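The payout rule above can be sketched as follows, in TypeScript to match the plugin code in this PR (the real helper lives in the Python file bot/core/money.py). Returning 0 for an unrecognised result and rounding fractional counts to the nearest cent are assumptions; the commit lists both cases as tested but does not state the expected behaviour.

```typescript
// Locally computed settlement revenue for a binary contract, in cents.
// Each winning-side contract pays 100 cents; losing-side contracts pay 0.
function settlementRevenueCents(yesCount: number, noCount: number, marketResult: string): number {
  let winningContracts = 0;
  if (marketResult === "yes") {
    winningContracts = yesCount;
  } else if (marketResult === "no") {
    winningContracts = noCount;
  }
  // Any other result string leaves winningContracts at 0 (assumption).
  return Math.round(winningContracts * 100);
}

// profit = revenue - cost - fees, as stated in the commit message.
function settlementProfitCents(revenueCents: number, costCents: number, feesCents: number): number {
  return revenueCents - costCents - feesCents;
}
```

For a balanced hedge of 1 YES and 1 NO bought for about 90 cents in total, revenue is 100 cents under either outcome, so profit before fees is about +10 cents. Plugging in the API's `revenue=0` instead gives about -90 cents, which is the phantom loss the commit describes.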

This is BUG #5 in the test_money.py regression watchlist (added).

11 new tests covering: pure YES/NO winner+loser, balanced hedge under
both outcomes, asymmetric hedge, zero position, unknown result,
fractional-count rounding, and the exact record_settlements call
shape with a Kalshi payload that reports ``revenue=0``.

2339 tests passing (2 skipped: cryptography ABI on local Mac and
no_secrets_in_repo, both unrelated).

Going-forward correctness only — historical settlements in the DB
remain mis-reported (separate one-time backfill needed, out of scope
here). The fix kicks in for every settlement recorded after deploy.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
ijlu referenced this pull request in ijlu/autoagent May 12, 2026
Two companion scripts for the 2026-05-12 audit aftermath. Going-forward
correctness was landed in 433ceb7 (posted_orders writer) and b4bf869
(settlement_revenue_cents). These tools sweep up the historical data.

tools/backfill_hedge_settlements.py — recompute settlements rows
that were written before BUG #5 was fixed:
  - For every settlement, look up the bot's (yes_qty, no_qty, fees)
    from fills_ledger and the authoritative market_result from
    alpha_backtest.
  - Recompute revenue via settlement_revenue_cents, then profit and
    won. UPDATE only if any field differs.
  - Idempotent re-run via field-equality check.
  - Dry-run by default; --apply writes.
  - Applied live on VPS: 14 rows corrected. Net P&L correction
    +$14.00 on historical cross_bracket totals (the 13 hedged winners
    that were silently bleeding $1 each, plus one asymmetric hedge
    that goes from -$1.21 to -$0.21).

tools/backfill_fills_source.py — re-tag fills_ledger rows that
landed as ``source='manual'`` because posted_orders wasn't being
written during the May 11+ window:
  - Rule 1: match by (ticker, side, ts ±60s) against
    alpha_backtest cross_bracket_live posted decisions →
    ``source='cross_bracket'``.
  - Rule 2: YES buy at ≤15¢ following a same-ticker NO entry within
    12h → ``source='cross_bracket_exit'`` (heuristic; the exit code
    path doesn't log to alpha_backtest, but the price+timing is
    diagnostic).
  - Anything that doesn't match either rule stays as ``manual``.
  - Applied live on VPS: 26 cross_bracket + 5 cross_bracket_exit
    recovered. 3 left as manual (likely real human-placed fills).
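The two re-tagging rules above can be sketched as a single classifier, in TypeScript to match the plugin code in this PR (the actual tool is a Python script). The record shapes, the use of Unix seconds for timestamps, and reading "NO entry" as a NO buy are assumptions; the 60 s window, the 15¢ ceiling, and the 12 h window come from the commit message.

```typescript
type Side = "yes" | "no";
type FillSource = "cross_bracket" | "cross_bracket_exit" | "manual";

// Hypothetical row shapes; ts is Unix time in seconds.
interface Fill {
  ticker: string;
  side: Side;
  action: "buy" | "sell";
  priceCents: number;
  ts: number;
}

interface PostedDecision {
  ticker: string;
  side: Side;
  ts: number;
}

function classifyFill(fill: Fill, decisions: PostedDecision[], otherFills: Fill[]): FillSource {
  // Rule 1: same ticker and side as a posted decision, within +/- 60 seconds.
  for (const d of decisions) {
    if (d.ticker === fill.ticker && d.side === fill.side && Math.abs(d.ts - fill.ts) <= 60) {
      return "cross_bracket";
    }
  }
  // Rule 2: a YES buy at <= 15 cents, no more than 12 hours after a NO buy
  // on the same ticker (heuristic).
  if (fill.side === "yes" && fill.action === "buy" && fill.priceCents <= 15) {
    for (const p of otherFills) {
      const gapSec = fill.ts - p.ts;
      if (p.ticker === fill.ticker && p.side === "no" && p.action === "buy" && gapSec > 0 && gapSec <= 12 * 3600) {
        return "cross_bracket_exit";
      }
    }
  }
  // Anything unmatched stays manual.
  return "manual";
}
```

Checking Rule 1 first means a fill that matches a logged decision is never downgraded to the weaker timing heuristic.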

7-test regression suite for the hedge-settlements backfill covering:
hedged winner correction, hedged loser with corrected (still-negative)
profit, pure winner, pure loser unchanged (rows already correct stay
that way), no-bot-fills rows skipped, dry-run writes nothing,
idempotent re-apply.

2348 tests passing.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>