fix(wrap): detect stale proxy on target port and auto-cleanup before starting #1356

Open
lennney wants to merge 1 commit into headroomlabs-ai:main from lennney:fix/wrap-port-conflict-cleanup

lennney commented Jun 24, 2026

Description

When a terminal running headroom wrap <agent> is killed without proper cleanup (window close, SSH timeout, crash), the background proxy process becomes orphaned and continues holding the proxy port. The next headroom wrap on the same port waits 30-45 seconds (the proxy cold-start window) and then fails with a confusing RuntimeError("Proxy failed to start...").

This PR adds stale proxy detection and auto-cleanup at the top of _start_proxy(). Before spawning uvicorn, it checks whether the target port is already occupied by a headroom proxy (orphaned or otherwise). If so, it kills the old proxy and starts fresh. If the port is held by a non-headroom process, it reports a clear error.
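
The decision flow described above can be sketched roughly as follows. This is an illustrative outline, not the code in this PR: the name `ensure_port_free_sketch` and the injected callables (`port_in_use`, `find_pid`, `is_headroom`, `kill`) are hypothetical stand-ins for the real helpers, and the return convention (True when the port is usable) is an assumption inferred from the behavior reported under Real Behavior Proof.

```python
from typing import Callable, Optional


def ensure_port_free_sketch(
    port: int,
    port_in_use: Callable[[int], bool],
    find_pid: Callable[[int], Optional[int]],
    is_headroom: Callable[[int], bool],
    kill: Callable[[int], None],
) -> bool:
    """Return True if the port is free (or was freed), False otherwise.

    All four callables are hypothetical; they stand in for the PR's helpers.
    """
    if not port_in_use(port):
        return True  # nothing to clean up

    pid = find_pid(port)
    if pid is None:
        return False  # port is busy but the owner could not be identified

    if not is_headroom(pid):
        return False  # never kill a process that is not a headroom proxy

    kill(pid)
    # Re-probe instead of trusting the kill: the port must actually be free.
    return not port_in_use(port)
```

Passing the helpers in as callables is only a convenience for the sketch; it keeps the branch logic testable without touching real processes.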

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update
  • Performance improvement
  • Code refactoring (no functional changes)

Changes Made

  • headroom/cli/wrap.py: Add _ensure_port_free(), _find_process_on_port(), _linux_find_process_on_port(), _resolve_inode_to_pid(), _is_headroom_proxy(), and _kill_process(). These new functions detect a stale headroom proxy on the target port and clean it up before _start_proxy() spawns uvicorn. They use only /proc/net/tcp + /proc/*/fd/ (Linux) or lsof (macOS fallback), so no new dependencies are added.
  • tests/test_cli/test_wrap_helpers.py: Add 11 new tests covering all helper functions, /proc/net/tcp parsing, socket symlink matching, tcp6 fallback, stale detection, non-headroom rejection, and kill escalation.
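
For readers unfamiliar with the /proc approach, here is a minimal sketch of the two lookups involved: finding the socket inode of a listener in /proc/net/tcp, then matching that inode against the socket:[inode] symlinks under /proc/*/fd/. The function names `listening_inodes` and `resolve_inode_to_pid`, and the `proc_root` parameter, are hypothetical and chosen for this example; the PR's own helpers may be structured differently.

```python
import os
from typing import List, Optional

TCP_LISTEN = "0A"  # value of the "st" column for a listening socket


def listening_inodes(proc_net_tcp_text: str, port: int) -> List[str]:
    """Return socket inodes of listeners on `port` from /proc/net/tcp content."""
    inodes = []
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 10:
            continue  # malformed or truncated row
        local_address, state, inode = fields[1], fields[3], fields[9]
        # local_address is HEXIP:HEXPORT; rpartition also handles the
        # longer addresses found in /proc/net/tcp6.
        port_hex = local_address.rpartition(":")[2]
        if state == TCP_LISTEN and int(port_hex, 16) == port:
            inodes.append(inode)
    return inodes


def resolve_inode_to_pid(inode: str, proc_root: str = "/proc") -> Optional[int]:
    """Find the PID holding an fd that links to socket:[inode], if any."""
    target = f"socket:[{inode}]"
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            for fd in os.listdir(fd_dir):
                if os.readlink(os.path.join(fd_dir, fd)) == target:
                    return int(entry)
        except OSError:
            continue  # process exited, or we lack permission to inspect it
    return None
```

On a Linux host this could be used as `listening_inodes(open("/proc/net/tcp").read(), 18794)`. Note that fd directories of processes owned by other users are typically unreadable without elevated privileges, so a lookup like this can return None even when the port is held.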

Testing

  • Unit tests pass (pytest): python -m pytest tests/test_cli/test_wrap_helpers.py::TestEnsurePortFree -v — 11 passed
  • Linting passes (ruff check .)
  • Type checking passes (mypy headroom)
  • New tests added for new functionality
  • Manual testing performed

Test Output

> python -m pytest tests/test_cli/test_wrap_helpers.py::TestEnsurePortFree -v --no-header
============================= 11 passed in 0.34s ==============================
test_ensure_port_free_port_is_free PASSED
test_ensure_port_free_stale_headroom PASSED
test_ensure_port_free_non_headroom PASSED
test_linux_find_process_on_port_empty PASSED
test_linux_find_process_on_port_found PASSED
test_is_headroom_proxy_true PASSED
test_is_headroom_proxy_false PASSED
test_kill_process_terminates PASSED
test_kill_process_force_kill PASSED
test_resolve_inode_to_pid_matches_symlink PASSED
test_linux_find_process_on_port_tcp6 PASSED

Real Behavior Proof

  • Environment: Ubuntu 24.04 x86_64, Python 3.12.3, headroom 0.26.0 editable install in Hermes venv
  • Exact command / steps: Start a headroom proxy on port 18794 via headroom proxy --port 18794; verify it is bound via a socket probe; call _ensure_port_free(18794) from the modified code; verify the return value is True; verify the socket probe now returns None (port free); verify the old process is dead via os.kill(pid, 0) → OSError.
  • Observed result: _ensure_port_free detected the stale proxy via /proc/net/tcp + /proc/ PID resolution, sent SIGTERM, waited 3s, verified port free, returned True. Port 18794 was free and the old PID was confirmed dead. For non-headroom processes (e.g. HTTP server), _ensure_port_free returned False with no kill.
  • Not tested: Windows (/proc/net/tcp unavailable, lsof not guaranteed — gracefully returns None). macOS with lsof fallback (code path exists but no macOS CI). Headroom proxy on macOS (no macOS runner available).
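
The SIGTERM-then-wait behavior observed above, together with the force-kill case covered by the tests, might look roughly like the sketch below. It is POSIX-only and is not the PR's _kill_process(): the name `terminate_with_escalation`, the `poll_interval` default, and the injectable `kill` parameter are assumptions made for illustration. The 3-second grace period mirrors the wait reported in the observed result.

```python
import os
import signal
import time
from typing import Callable


def _alive(pid: int, kill: Callable[[int, int], None]) -> bool:
    """Probe a PID with signal 0, which checks existence without signaling."""
    try:
        kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # exists, but is owned by another user
    return True


def _wait_gone(pid: int, timeout: float, poll: float,
               kill: Callable[[int, int], None]) -> bool:
    """Poll until the PID disappears or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if not _alive(pid, kill):
            return True
        time.sleep(poll)
    return not _alive(pid, kill)


def terminate_with_escalation(
    pid: int,
    grace_seconds: float = 3.0,
    poll_interval: float = 0.1,
    kill: Callable[[int, int], None] = os.kill,  # injectable for tests
) -> bool:
    """Send SIGTERM, then SIGKILL if needed. Return True once the PID is gone."""
    for sig in (signal.SIGTERM, signal.SIGKILL):
        try:
            kill(pid, sig)
        except ProcessLookupError:
            return True  # already gone
        if _wait_gone(pid, grace_seconds, poll_interval, kill):
            return True
    return False
```

One caveat with probing via os.kill(pid, 0): a terminated child that its parent has not yet reaped still exists as a zombie, so the probe keeps succeeding. That should matter less for an orphaned proxy, which is reaped by whichever process adopted it.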

Review Readiness

  • I have performed a self-review
  • This PR is ready for human review

Checklist

  • My code follows the project's style guidelines
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have updated the CHANGELOG.md if applicable

github-actions Bot commented Jun 24, 2026

PR governance

This PR follows the template and is marked ready for human review.

github-actions Bot added the labels "status: needs author action" (Pull request body or readiness checklist still needs author updates) and "status: has conflicts" (Pull request has merge conflicts with the base branch), Jun 24, 2026
lennney force-pushed the fix/wrap-port-conflict-cleanup branch from 1e8bafa to 72513ea, June 24, 2026 07:30
github-actions Bot removed the label "status: has conflicts" (Pull request has merge conflicts with the base branch), Jun 24, 2026
…starting

Adds _ensure_port_free() and helpers to _start_proxy() that:
- Detect if target port is already in use (via existing _port_bind_error)
- Find the owning process via /proc/net/tcp (Linux) or lsof (macOS)
- Only kills processes identified as stale headroom proxies
- Reports clear error for non-headroom processes
- Uses zero new dependencies

Fixes the case where a terminal is killed, leaving an orphaned headroom
proxy on the port — the next wrap would wait 30s then fail with a
confusing RuntimeError. Now it auto-cleans and restarts.

Tests: 11 new tests for all helpers + edge cases (50 total, all pass)
lennney force-pushed the fix/wrap-port-conflict-cleanup branch from 72513ea to 630ca1c, June 24, 2026 07:33
github-actions Bot added the label "status: ready for review" (Pull request body is complete and the author marked it ready for human review) and removed the label "status: needs author action" (Pull request body or readiness checklist still needs author updates), Jun 24, 2026