Insufficient test coverage and CI coverage for production validator#81
Conversation
WalkthroughA comprehensive testing and monitoring update adds GitHub Actions CI/CD with automated testing, a health monitoring script with Discord alerting, enhanced API health endpoint with database status, extensive test suite covering configuration/database/API/models/exceptions, and pytest fixtures for test infrastructure. Changes
Sequence Diagram(s)sequenceDiagram
participant HealthMonitor as Health Monitor<br/>(Script)
participant ValidatorAPI as Validator API<br/>(/health)
participant Database as Database
participant Discord as Discord<br/>Webhook
loop Every CHECK_INTERVAL_SECONDS
HealthMonitor->>ValidatorAPI: GET /health (timeout: 10s)
ValidatorAPI->>Database: get_validator_status()
Database-->>ValidatorAPI: status, evaluations_1h, queue_depth
ValidatorAPI-->>HealthMonitor: {status, db_connected, ...}
alt Response OK
HealthMonitor->>HealthMonitor: evaluate_health(response)
alt Healthy
alt Was Alerting
HealthMonitor->>Discord: POST recovery (green)
HealthMonitor->>HealthMonitor: is_alerting = false
else Normal
HealthMonitor->>HealthMonitor: log healthy status
end
else Unhealthy
alt Cooldown Expired
HealthMonitor->>Discord: POST alert (red)
HealthMonitor->>HealthMonitor: is_alerting = true
else In Cooldown
HealthMonitor->>HealthMonitor: log failure (no notify)
end
end
else Network/Error
HealthMonitor->>HealthMonitor: evaluate_health({error: "Unreachable"})
alt Cooldown Expired
HealthMonitor->>Discord: POST alert (red)
end
end
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (7)
tests/test_config.py (1)
15-19:netuid == 3is brittle for a defaults-loading test.Asserting an exact repo config value at Line 18 can create noisy failures when defaults are intentionally updated. Consider asserting validity constraints instead.
Example adjustment
def test_load_from_project_hparams(self): """Loads real hparams.json from repository.""" hparams = HParams.load() - assert hparams.netuid == 3 + assert isinstance(hparams.netuid, int) + assert hparams.netuid >= 0 assert hparams.evaluation_runs >= 1 assert hparams.eval_timeout > 0 assert hparams.benchmark_model_name != ""🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_config.py` around lines 15 - 19, The test test_load_from_project_hparams currently asserts a brittle exact value (hparams.netuid == 3); instead update the assertions to check validity constraints: use HParams.load() to get hparams and replace the exact equality with checks like isinstance(hparams.netuid, int) and hparams.netuid >= 0 (or membership in an allowed set) while keeping the existing assertion for hparams.evaluation_runs >= 1; locate this in the test_load_from_project_hparams function and modify the hparams.netuid assertion accordingly.docker/compose.yml (1)
123-125: Avoid startup race for monitor alerts.Using
condition: service_startedcan start the monitor before the validator endpoint is actually ready, which may cause false alerts right after boot.Suggested compose tweak
depends_on: validator: - condition: service_started + condition: service_healthy🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docker/compose.yml` around lines 123 - 125, The monitor currently depends on the validator using depends_on: validator with condition: service_started which can cause a race; add a healthcheck to the validator service (e.g., a curl/HTTP check against its readiness endpoint) and change the monitor's depends_on to wait for condition: service_healthy for the validator so the monitor only starts after the validator is actually ready; update the validator service block (add a healthcheck section) and update the monitor depends_on entry accordingly.scripts/health_monitor.py (2)
154-159: Consider handling SIGTERM for graceful Docker shutdown.Docker sends
SIGTERMbeforeSIGKILLwhen stopping containers. Currently onlyKeyboardInterrupt(SIGINT) is handled. The container will still stop, but adding SIGTERM handling would allow for cleaner log messages.🔧 Suggested SIGTERM handling
+import signal + +def handle_signal(signum, frame): + logger.info(f"Received signal {signum}, shutting down") + sys.exit(0) + if __name__ == "__main__": + signal.signal(signal.SIGTERM, handle_signal) try: run() except KeyboardInterrupt: logger.info("Monitor stopped") sys.exit(0)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/health_monitor.py` around lines 154 - 159, Add SIGTERM handling in the module's __main__ shutdown path so Docker stops are logged and cleaned up like KeyboardInterrupt; import the signal module, define a handler (e.g., shutdown_handler) that logs "Monitor stopped" via logger and calls sys.exit(0), register it with signal.signal(signal.SIGTERM, shutdown_handler), and keep the existing try/except around run() for KeyboardInterrupt so both run() and logger are used for graceful shutdown.
87-106: Consider handling unknown status explicitly.An
"unknown"status (the fallback on line 96) currently returns(True, "healthy"), which may mask issues if the endpoint returns an unexpected status. Consider treating unknown statuses as unhealthy or at least logging a warning.🔧 Alternative: treat unknown as unhealthy
if status == "degraded": return False, ( f"Validator is degraded: 0 evaluations in last hour, " f"queue_depth={data.get('queue_depth', '?')}" ) + if status != "healthy": + return False, f"Unexpected status: {status}" + return True, "healthy"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/health_monitor.py` around lines 87 - 106, The evaluate_health function currently treats the fallback status "unknown" as healthy; update evaluate_health(data: dict) to explicitly handle unexpected statuses by treating any status not in {"healthy","degraded","unhealthy"} as unhealthy (or at least return False) and return a descriptive reason including the unexpected status (use the local variable status and the input data for context); optionally emit or call a logger/warning when status is unknown to surface the anomaly instead of returning True, "healthy".tests/test_health_monitor.py (3)
49-53: Consider explicitly asserting no network calls in noop test.The test verifies no error is raised, but doesn't explicitly confirm
urlopenwasn't called. This would make the test more robust.🔧 Suggested improvement
def test_noop_when_no_webhook_url(self): """post_discord is a no-op when DISCORD_WEBHOOK_URL is empty.""" with patch.object(health_monitor, "DISCORD_WEBHOOK_URL", ""): - # Should return without error and without making any network calls - health_monitor.post_discord("test alert") + with patch.object(health_monitor, "urlopen") as mock_urlopen: + health_monitor.post_discord("test alert") + mock_urlopen.assert_not_called()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_health_monitor.py` around lines 49 - 53, Update the test_noop_when_no_webhook_url to not only patch health_monitor.DISCORD_WEBHOOK_URL to "" but also patch the network call (e.g., urllib.request.urlopen) and assert it was not called; specifically, wrap health_monitor.post_discord("test alert") with a patch on urllib.request.urlopen (or the exact urlopen symbol used by health_monitor) and add an assertion like urlopen.assert_not_called() after calling post_discord to ensure no network activity occurred.
90-100: Move URLError import to module level.The
URLErrorimport inside the test method works but is non-idiomatic. Moving it to the module-level imports improves consistency.🔧 Suggested change
Add to imports at top of file:
from urllib.error import URLErrorThen remove line 92.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_health_monitor.py` around lines 90 - 100, Move the URLError import from inside the test method to the module-level imports: add "from urllib.error import URLError" to the top of the test file and remove the inline import in test_returns_error_on_url_error; this keeps the test_returns_error_on_url_error function unchanged (it still patches health_monitor.urlopen and asserts probe_health returns an error) while making URLError a top-level import for consistency.
10-12: Consider using a pytest plugin orconftest.pyfor path setup.The
sys.path.insertapproach works but is fragile and duplicates path setup if other test files need the same import. Consider:
- Adding a
conftest.pyin the tests directory that does this once- Using
pytest-pythonpathor similar plugin- Making
scripts/a proper package with__init__.py🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_health_monitor.py` around lines 10 - 12, The test currently mutates sys.path inside tests/test_health_monitor.py via sys.path.insert to import health_monitor; remove that direct insertion and centralize path setup by either (A) adding a tests/conftest.py that performs the same sys.path.insert so all tests can import health_monitor without duplication, or (B) make scripts a proper package by adding an __init__.py in scripts and adjusting imports to use the package, or (C) configure pytest to use a pythonpath plugin; implement one of these options and delete the sys.path.insert line from tests/test_health_monitor.py so imports rely on the centralized setup instead of per-test manipulation of sys.path.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/test.yml:
- Around line 30-36: The CI coverage gate currently sets thresholdAll: 60 which
doesn't meet the linked objective of >80%; update the workflow action block
using orgoro/coverage@v3.2 (the Upload coverage step that references
coverageFile: coverage.xml) to set thresholdAll to a value greater than 80
(e.g., 81) so the coverage gate enforces the >80% requirement.
In `@scripts/health_monitor.py`:
- Around line 37-38: Remove the unused environment variable binding
STALE_THRESHOLD (from STALE_THRESHOLD_SECONDS) and any references in the module
docstring: delete the line defining STALE_THRESHOLD =
int(os.getenv("STALE_THRESHOLD_SECONDS", "3600")) and remove the
STALE_THRESHOLD_SECONDS mention in the file-level docstring so only
actually-used config (e.g., ALERT_COOLDOWN) remains; confirm no other code
references STALE_THRESHOLD or STALE_THRESHOLD_SECONDS before removing.
In `@src/crusades/api/server.py`:
- Around line 132-145: The health endpoint currently sets "db_connected": True
even when get_db_client() returns a MockClient; update the logic in the health
handler (where get_db_client(), client.get_validator_status(),
client.get_queue_stats() and evaluations_1h are used) to detect mocked clients
and report db_connected=False for them—e.g., check isinstance(client,
MockClient) or a boolean flag like client.is_mock (prefer the existing property
if present), and set db_connected = False in the returned dict when the client
is a MockClient; otherwise keep db_connected = True and preserve the existing
status/evaluations_1h logic.
In `@tests/test_api.py`:
- Around line 12-32: The fixtures client and authed_client mutate the
module-level singleton server_module.app.state.api_key which can leak between
tests; change each fixture to create or clone a fresh FastAPI app instance (or
restore the original state in a finalizer) instead of directly mutating
server_module.app.state, e.g., obtain a new app copy for TestClient or save the
original value of server_module.app.state.api_key and restore it after the yield
(using a try/finally or fixture teardown) while continuing to patch
server_module._db_client with MockClient so tests remain isolated.
---
Nitpick comments:
In `@docker/compose.yml`:
- Around line 123-125: The monitor currently depends on the validator using
depends_on: validator with condition: service_started which can cause a race;
add a healthcheck to the validator service (e.g., a curl/HTTP check against its
readiness endpoint) and change the monitor's depends_on to wait for condition:
service_healthy for the validator so the monitor only starts after the validator
is actually ready; update the validator service block (add a healthcheck
section) and update the monitor depends_on entry accordingly.
In `@scripts/health_monitor.py`:
- Around line 154-159: Add SIGTERM handling in the module's __main__ shutdown
path so Docker stops are logged and cleaned up like KeyboardInterrupt; import
the signal module, define a handler (e.g., shutdown_handler) that logs "Monitor
stopped" via logger and calls sys.exit(0), register it with
signal.signal(signal.SIGTERM, shutdown_handler), and keep the existing
try/except around run() for KeyboardInterrupt so both run() and logger are used
for graceful shutdown.
- Around line 87-106: The evaluate_health function currently treats the fallback
status "unknown" as healthy; update evaluate_health(data: dict) to explicitly
handle unexpected statuses by treating any status not in
{"healthy","degraded","unhealthy"} as unhealthy (or at least return False) and
return a descriptive reason including the unexpected status (use the local
variable status and the input data for context); optionally emit or call a
logger/warning when status is unknown to surface the anomaly instead of
returning True, "healthy".
In `@tests/test_config.py`:
- Around line 15-19: The test test_load_from_project_hparams currently asserts a
brittle exact value (hparams.netuid == 3); instead update the assertions to
check validity constraints: use HParams.load() to get hparams and replace the
exact equality with checks like isinstance(hparams.netuid, int) and
hparams.netuid >= 0 (or membership in an allowed set) while keeping the existing
assertion for hparams.evaluation_runs >= 1; locate this in the
test_load_from_project_hparams function and modify the hparams.netuid assertion
accordingly.
In `@tests/test_health_monitor.py`:
- Around line 49-53: Update the test_noop_when_no_webhook_url to not only patch
health_monitor.DISCORD_WEBHOOK_URL to "" but also patch the network call (e.g.,
urllib.request.urlopen) and assert it was not called; specifically, wrap
health_monitor.post_discord("test alert") with a patch on urllib.request.urlopen
(or the exact urlopen symbol used by health_monitor) and add an assertion like
urlopen.assert_not_called() after calling post_discord to ensure no network
activity occurred.
- Around line 90-100: Move the URLError import from inside the test method to
the module-level imports: add "from urllib.error import URLError" to the top of
the test file and remove the inline import in test_returns_error_on_url_error;
this keeps the test_returns_error_on_url_error function unchanged (it still
patches health_monitor.urlopen and asserts probe_health returns an error) while
making URLError a top-level import for consistency.
- Around line 10-12: The test currently mutates sys.path inside
tests/test_health_monitor.py via sys.path.insert to import health_monitor;
remove that direct insertion and centralize path setup by either (A) adding a
tests/conftest.py that performs the same sys.path.insert so all tests can import
health_monitor without duplication, or (B) make scripts a proper package by
adding an __init__.py in scripts and adjusting imports to use the package, or
(C) configure pytest to use a pythonpath plugin; implement one of these options
and delete the sys.path.insert line from tests/test_health_monitor.py so imports
rely on the centralized setup instead of per-test manipulation of sys.path.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: ad5763a3-dad0-40bd-9751-754375e16108
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (13)
.github/workflows/test.ymldocker/compose.ymlpyproject.tomlscripts/health_monitor.pysrc/crusades/api/server.pytests/__init__.pytests/conftest.pytests/test_api.pytests/test_config.pytests/test_database.pytests/test_exceptions.pytests/test_health_monitor.pytests/test_models.py
| - name: Upload coverage | ||
| if: github.event_name == 'pull_request' | ||
| uses: orgoro/coverage@v3.2 | ||
| with: | ||
| coverageFile: coverage.xml | ||
| token: ${{ secrets.GITHUB_TOKEN }} | ||
| thresholdAll: 60 |
There was a problem hiding this comment.
Coverage gate does not meet the linked objective.
Line 36 sets thresholdAll: 60, but the linked issue objective specifies coverage target >80%. This leaves the requirement only partially implemented.
Minimal fix
- name: Upload coverage
if: github.event_name == 'pull_request'
uses: orgoro/coverage@v3.2
with:
coverageFile: coverage.xml
token: ${{ secrets.GITHUB_TOKEN }}
- thresholdAll: 60
+ thresholdAll: 80📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| - name: Upload coverage | |
| if: github.event_name == 'pull_request' | |
| uses: orgoro/coverage@v3.2 | |
| with: | |
| coverageFile: coverage.xml | |
| token: ${{ secrets.GITHUB_TOKEN }} | |
| thresholdAll: 60 | |
| - name: Upload coverage | |
| if: github.event_name == 'pull_request' | |
| uses: orgoro/coverage@v3.2 | |
| with: | |
| coverageFile: coverage.xml | |
| token: ${{ secrets.GITHUB_TOKEN }} | |
| thresholdAll: 80 |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/test.yml around lines 30 - 36, The CI coverage gate
currently sets thresholdAll: 60 which doesn't meet the linked objective of >80%;
update the workflow action block using orgoro/coverage@v3.2 (the Upload coverage
step that references coverageFile: coverage.xml) to set thresholdAll to a value
greater than 80 (e.g., 81) so the coverage gate enforces the >80% requirement.
| STALE_THRESHOLD = int(os.getenv("STALE_THRESHOLD_SECONDS", "3600")) | ||
| ALERT_COOLDOWN = int(os.getenv("ALERT_COOLDOWN_SECONDS", "300")) |
There was a problem hiding this comment.
STALE_THRESHOLD_SECONDS is defined but unused.
The environment variable STALE_THRESHOLD_SECONDS is documented and loaded but never referenced in the code. The staleness detection appears to be handled by the /health endpoint itself (which returns status: "degraded" when evaluations_1h == 0), making this variable dead code.
🗑️ Suggested fix: Remove unused variable
DISCORD_WEBHOOK_URL = os.getenv("DISCORD_WEBHOOK_URL", "")
CHECK_INTERVAL = int(os.getenv("CHECK_INTERVAL_SECONDS", "60"))
-STALE_THRESHOLD = int(os.getenv("STALE_THRESHOLD_SECONDS", "3600"))
ALERT_COOLDOWN = int(os.getenv("ALERT_COOLDOWN_SECONDS", "300"))Also remove from the docstring (line 15).
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| STALE_THRESHOLD = int(os.getenv("STALE_THRESHOLD_SECONDS", "3600")) | |
| ALERT_COOLDOWN = int(os.getenv("ALERT_COOLDOWN_SECONDS", "300")) | |
| DISCORD_WEBHOOK_URL = os.getenv("DISCORD_WEBHOOK_URL", "") | |
| CHECK_INTERVAL = int(os.getenv("CHECK_INTERVAL_SECONDS", "60")) | |
| ALERT_COOLDOWN = int(os.getenv("ALERT_COOLDOWN_SECONDS", "300")) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/health_monitor.py` around lines 37 - 38, Remove the unused
environment variable binding STALE_THRESHOLD (from STALE_THRESHOLD_SECONDS) and
any references in the module docstring: delete the line defining STALE_THRESHOLD
= int(os.getenv("STALE_THRESHOLD_SECONDS", "3600")) and remove the
STALE_THRESHOLD_SECONDS mention in the file-level docstring so only
actually-used config (e.g., ALERT_COOLDOWN) remains; confirm no other code
references STALE_THRESHOLD or STALE_THRESHOLD_SECONDS before removing.
| client = get_db_client() | ||
| if client is None: | ||
| return {"status": "healthy", "db_connected": False} | ||
|
|
||
| validator_status = client.get_validator_status() | ||
| queue = client.get_queue_stats() | ||
|
|
||
| evaluations_1h = validator_status.get("evaluations_completed_1h", 0) | ||
| status = "healthy" if evaluations_1h > 0 else "degraded" | ||
|
|
||
| return { | ||
| "status": status, | ||
| "db_connected": True, | ||
| "evaluations_1h": evaluations_1h, |
There was a problem hiding this comment.
db_connected can be incorrectly true when running on MockClient.
At Line 132, get_db_client() may return MockClient (DB missing path), but Line 144 still reports "db_connected": True. That can mask real DB outages in /health.
Suggested fix
try:
client = get_db_client()
if client is None:
return {"status": "healthy", "db_connected": False}
+ db_connected = not isinstance(client, MockClient)
validator_status = client.get_validator_status()
queue = client.get_queue_stats()
evaluations_1h = validator_status.get("evaluations_completed_1h", 0)
status = "healthy" if evaluations_1h > 0 else "degraded"
return {
"status": status,
- "db_connected": True,
+ "db_connected": db_connected,
"evaluations_1h": evaluations_1h,
"queue_depth": queue.get("queued_count", 0),
"current_evaluation": validator_status.get("current_evaluation"),
"validator_status": validator_status.get("status", "unknown"),
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| client = get_db_client() | |
| if client is None: | |
| return {"status": "healthy", "db_connected": False} | |
| validator_status = client.get_validator_status() | |
| queue = client.get_queue_stats() | |
| evaluations_1h = validator_status.get("evaluations_completed_1h", 0) | |
| status = "healthy" if evaluations_1h > 0 else "degraded" | |
| return { | |
| "status": status, | |
| "db_connected": True, | |
| "evaluations_1h": evaluations_1h, | |
| client = get_db_client() | |
| if client is None: | |
| return {"status": "healthy", "db_connected": False} | |
| db_connected = not isinstance(client, MockClient) | |
| validator_status = client.get_validator_status() | |
| queue = client.get_queue_stats() | |
| evaluations_1h = validator_status.get("evaluations_completed_1h", 0) | |
| status = "healthy" if evaluations_1h > 0 else "degraded" | |
| return { | |
| "status": status, | |
| "db_connected": db_connected, | |
| "evaluations_1h": evaluations_1h, | |
| "queue_depth": queue.get("queued_count", 0), | |
| "current_evaluation": validator_status.get("current_evaluation"), | |
| "validator_status": validator_status.get("status", "unknown"), | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/crusades/api/server.py` around lines 132 - 145, The health endpoint
currently sets "db_connected": True even when get_db_client() returns a
MockClient; update the logic in the health handler (where get_db_client(),
client.get_validator_status(), client.get_queue_stats() and evaluations_1h are
used) to detect mocked clients and report db_connected=False for them—e.g.,
check isinstance(client, MockClient) or a boolean flag like client.is_mock
(prefer the existing property if present), and set db_connected = False in the
returned dict when the client is a MockClient; otherwise keep db_connected =
True and preserve the existing status/evaluations_1h logic.
| @pytest.fixture | ||
| def client(): | ||
| """TestClient using the module-level app with MockClient injected.""" | ||
| mock = MockClient() | ||
| app = server_module.app | ||
| # Ensure no API key is required | ||
| app.state.api_key = None | ||
| with patch.object(server_module, "_db_client", mock): | ||
| with TestClient(app) as c: | ||
| yield c | ||
|
|
||
|
|
||
| @pytest.fixture | ||
| def authed_client(): | ||
| """TestClient with API key required.""" | ||
| mock = MockClient() | ||
| app = server_module.app | ||
| app.state.api_key = "test-secret-key" | ||
| with patch.object(server_module, "_db_client", mock): | ||
| with TestClient(app) as c: | ||
| yield c |
There was a problem hiding this comment.
Potential test isolation issue with shared app state.
Both fixtures directly modify server_module.app.state.api_key. Since app is a module-level singleton, this mutation persists across tests. If tests run in parallel or in unexpected order, they may interfere with each other.
Consider using a fresh app instance per test or resetting the state in a fixture finalizer.
🔧 Suggested improvement using fixture cleanup
`@pytest.fixture`
def client():
"""TestClient using the module-level app with MockClient injected."""
mock = MockClient()
app = server_module.app
+ original_api_key = app.state.api_key
# Ensure no API key is required
app.state.api_key = None
with patch.object(server_module, "_db_client", mock):
with TestClient(app) as c:
yield c
+ app.state.api_key = original_api_key📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| @pytest.fixture | |
| def client(): | |
| """TestClient using the module-level app with MockClient injected.""" | |
| mock = MockClient() | |
| app = server_module.app | |
| # Ensure no API key is required | |
| app.state.api_key = None | |
| with patch.object(server_module, "_db_client", mock): | |
| with TestClient(app) as c: | |
| yield c | |
| @pytest.fixture | |
| def authed_client(): | |
| """TestClient with API key required.""" | |
| mock = MockClient() | |
| app = server_module.app | |
| app.state.api_key = "test-secret-key" | |
| with patch.object(server_module, "_db_client", mock): | |
| with TestClient(app) as c: | |
| yield c | |
| `@pytest.fixture` | |
| def client(): | |
| """TestClient using the module-level app with MockClient injected.""" | |
| mock = MockClient() | |
| app = server_module.app | |
| original_api_key = app.state.api_key | |
| # Ensure no API key is required | |
| app.state.api_key = None | |
| with patch.object(server_module, "_db_client", mock): | |
| with TestClient(app) as c: | |
| yield c | |
| app.state.api_key = original_api_key | |
| `@pytest.fixture` | |
| def authed_client(): | |
| """TestClient with API key required.""" | |
| mock = MockClient() | |
| app = server_module.app | |
| app.state.api_key = "test-secret-key" | |
| with patch.object(server_module, "_db_client", mock): | |
| with TestClient(app) as c: | |
| yield c |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/test_api.py` around lines 12 - 32, The fixtures client and
authed_client mutate the module-level singleton server_module.app.state.api_key
which can leak between tests; change each fixture to create or clone a fresh
FastAPI app instance (or restore the original state in a finalizer) instead of
directly mutating server_module.app.state, e.g., obtain a new app copy for
TestClient or save the original value of server_module.app.state.api_key and
restore it after the yield (using a try/finally or fixture teardown) while
continuing to patch server_module._db_client with MockClient so tests remain
isolated.
Summary
/healthendpoint insrc/crusades/api/server.pyto report DB connectivity, evaluations in the last hour, queue depth, and degraded/unhealthy status instead of a static{"status": "healthy"}scripts/health_monitor.py: a standalone loop that probes the validator/healthendpoint on a configurable interval and posts Discord webhook alerts on failure, with cooldown and recovery notificationshealth-monitorservice todocker/compose.ymlunder amonitoringprofile, wired to the validator container.github/workflows/test.ymlrunning lint (ruff), pytest with coverage (--cov), and coverage reporting on PRs (threshold 60%)tests/test_api.py,test_health_monitor.py,test_config.py,test_database.py,test_exceptions.py,test_models.py,conftest.py) — 147 tests passing covering API endpoints, health monitor logic, config loading, database operations, exception handling, and modelsValidation
pytest tests/ -v— 147 passed, 0 failedevaluate_health()across 4 status paths,post_discord()with and without webhook, andprobe_health()for success/URLError/generic exception scenariosRisks
health-monitorDocker service depends onscripts/health_monitor.pybeing present at the repo root; renaming or moving the script without updating the compose volume mount will silently break monitoringCloses #79
Summary by CodeRabbit
Release Notes
New Features
Chores