Security fix#23
Conversation
- Add env, __main__, miner_train, logging to forbidden modules - Add AST Attribute.attr scanning against forbidden strings - Hide env module from sys.modules during inner_steps() execution - Add CUDA event-based timing as third untamperable timer source - Add three-way timer cross-check (perf_counter vs monotonic vs CUDA events) - Add timer function identity check (id() comparison post-execution) - Block torch._C access and torch._dynamo/_inductor config writes - Expand forbidden strings with _CACHE, initial_state, _hidden_modules, _sensitive_keys - Add MFU sanity cap (75%) to reject physically impossible scores - Formatting cleanup (black/ruff)
- Add env, __main__, miner_train, logging to forbidden modules - Add AST Attribute.attr scanning against forbidden strings - Block torch._C access and torch._dynamo/_inductor config writes - Expand forbidden strings with _CACHE, initial_state, _hidden_modules, _sensitive_keys
…alog - Document full validation pipeline (4 layers) - Document known timer manipulation attack and how it evaded each defense - Catalog all anticheat mechanisms and verification checks - List potential bypass vectors with risk ratings - Track completed and open security recommendations
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. WalkthroughAdds a new security analysis document and implements runtime integrity hardening: expanded static AST/string anti-tamper checks, three-way timing (perf_counter, monotonic, CUDA) with reconciliation and MFU cap, module-hiding/cleanup, and SHA‑256 code-hash propagation and verification across submission/evaluation flows. Changes
Sequence DiagramsequenceDiagram
participant Validator
participant Miner
participant CUDA
participant Torch
Validator->>Validator: Hide sensitive modules (env, logging, etc.)
Validator->>Validator: Start perf_counter & monotonic timers
Validator->>CUDA: Record CUDA start event
Validator->>Miner: Execute miner.inner_steps()
Miner->>Torch: Run training ops (forward/backward/optim)
Validator->>CUDA: Record CUDA end event
Validator->>Validator: Stop perf_counter & monotonic timers
Validator->>Validator: Extract CUDA elapsed time
Validator->>Validator: Reconcile perf_counter, monotonic, CUDA
alt timers inconsistent
Validator->>Validator: Abort with timer_tampering error
else timers consistent
Validator->>Validator: Compute canonical wall_time and MFU
alt MFU > MAX_PLAUSIBLE_MFU
Validator->>Validator: Flag as tampering / fail
else
Validator->>Validator: Continue gradient & weight validation
end
end
Validator->>Validator: Restore sys.modules and cleanup
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Fix all issues with AI agents
In `@docs/security-analysis.md`:
- Around line 149-163: The table and text incorrectly state a 6% gradient
relative-error threshold for Check 3, but the code defaults use
gradient_norm_ratio_max = 1.10 (10%); update the docs to either state the actual
default (10%) or explicitly note that the table values are config-dependent and
reference the runtime parameter gradient_norm_ratio_max (used in env.py and
verify.py) so readers know to consult hparams.json or that variable for the
deployed threshold.
- Around line 39-45: Update the markdown fenced code blocks that currently lack
language specifiers by adding appropriate languages: change the plain backtick
fences around the identifier lists (the blocks containing ctypes, _ctypes, gc,
subprocess, sys, os, pathlib, io, socket, http, urllib, requests, shutil,
tempfile, signal, threading, multiprocessing, inspect, ast, dis, code, codeop,
compileall, pickle, shelve, marshal, builtins, _builtins, operator, types,
codecs, base64, pdb, pprint, env, __main__, miner_train, logging and the other
similar identifier lists) to use ```text, and change the Docker/CLI block to use
```bash so rendering and accessibility improve (look for the triple-backtick
blocks in the document and update their opening fence to the appropriate
language).
- Around line 105-116: The docker example uses inline “#” comments placed after
backslash line continuations (e.g., the "docker run" block and flags like
"--network none \", "--cap-drop ALL \", "--read-only \"), which is invalid Bash
and will break when copy-pasted; update the snippet by moving each explanatory
comment to its own line immediately above the corresponding flag (or
remove/comment that the snippet is pseudocode), ensuring no inline comments
follow a trailing "\" and preserving the same flags and tmpfs mounts like
"--tmpfs /tmp:rw,exec,nosuid,size=4g" and volume mount "-v
miner_train.py:/app/scripts/miner_train.py:ro" so the example is valid shell
syntax.
In `@environments/templar/env.py`:
- Around line 706-710: The blocklist _FORBIDDEN_STRINGS currently includes
"initial_state" which, combined with the attribute-name scanning logic, will
falsely reject legitimate miner attributes like self.initial_state; remove
"initial_state" from _FORBIDDEN_STRINGS (or alternatively exclude it from the
attribute-name scan) and rely on the string-constant/getattr scanning to catch
accesses to the validator's internal variable; if you need attribute-level
protection keep the attribute-name scan but narrow its blocklist to
validator-specific identifiers (or only flag attributes used in getattr calls)
so that common names like initial_state are not rejected by the attribute-name
scanner.
In `@local_test/verify.py`:
- Around line 415-424: The attribute-scan is too broad: it currently flags any
ast.Attribute whose node.attr equals an entry in _FORBIDDEN_STRINGS (e.g.,
initial_state) and causes false positives; narrow the check so only truly
private/dangerous attribute names are matched — for example introduce a focused
set like _FORBIDDEN_ATTRIBUTE_NAMES (or require pattern.startswith("_")) and use
that for the ast.Attribute loop (the code using scan_tree, node.attr,
_FORBIDDEN_STRINGS, and violations), leaving the string-constant getattr(...)
scan to catch explicit reflective access; apply the same change where the
duplicate logic exists (the second occurrence mentioned).
🧹 Nitpick comments (5)
local_test/verify.py (1)
281-298:torch._dynamo.config/torch._inductor.configcheck doesn't verify the root istorch.The AST pattern walks three levels of
ast.Attributenesting but never verifies that the outermost object istorch(i.e.,target.value.value.valueisast.Namewithid == "torch"). This meansany_object._dynamo.config.foo = 1would also be flagged. Acceptable as a conservative security measure, but noting it for awareness since thetorch._Ccheck on line 283 does verify the root.This same pattern exists in
environments/templar/env.py(lines 467-480).🔒 Optional: add root-object check for consistency with `torch._C` pattern
# Block torch._dynamo.config and torch._inductor.config writes if isinstance(node, ast.Assign): for target in node.targets: if isinstance(target, ast.Attribute) and isinstance(target.value, ast.Attribute): if target.value.attr == "config" and isinstance( target.value.value, ast.Attribute ): if target.value.value.attr in ("_dynamo", "_inductor"): - line = getattr(node, "lineno", "?") - violations.append( - f"Line {line}: modifying torch.{target.value.value.attr}.config is forbidden" - ) + if isinstance(target.value.value.value, ast.Name) and target.value.value.value.id == "torch": + line = getattr(node, "lineno", "?") + violations.append( + f"Line {line}: modifying torch.{target.value.value.attr}.config is forbidden" + )docs/security-analysis.md (1)
1-378: Consider whether this detailed security analysis should be in a public repository.This document comprehensively describes the validator's security mechanisms, known attack vectors, mitigations, and remaining weaknesses. While transparency is valuable, it also provides a detailed roadmap for attackers (e.g., the "Potential Bypass Vectors" section,
str.join()evasion,torch.opssurface). If this repo is public, consider moving sensitive details (especially open bypass vectors and TODO items) to a private internal doc.environments/templar/env.py (3)
459-480:torch._dynamo.config/torch._inductor.configpattern doesn't verify the root istorch.Same issue as noted in
local_test/verify.py: the check walks three levels ofast.Attributenesting but doesn't verifytarget.value.value.valueisast.Name(id="torch"). Anyx._dynamo.config.y = zwould be flagged. As a security sandbox, over-matching is acceptable, but it's inconsistent with thetorch._Ccheck (line 460-463) which does verify the root.
2028-2042: Module hidingforloop sits outside thetry/finally— popped modules could be lost on unexpected failure.If an exception occurs during the
forloop (lines 2030-2032) after some modules are popped but before thetryat line 2033, those modules won't be restored by thefinally. The risk is extremely low (iterating and popping known keys fromsys.modulesshouldn't fail), but for a security-critical path, wrapping the pop loop inside thetry/finallywould be more robust. Same pattern appears for the timed eval (lines 2174-2187).🛡️ Optional: wrap pop loop inside try/finally
- _hidden_warmup = {} - for _mk in list(sys.modules): - if _mk == "env" or _mk == __name__ or _mk.endswith(".env"): - _hidden_warmup[_mk] = sys.modules.pop(_mk) - try: - warmup_result = miner_module.inner_steps( + _hidden_warmup = {} + try: + for _mk in list(sys.modules): + if _mk == "env" or _mk == __name__ or _mk.endswith(".env"): + _hidden_warmup[_mk] = sys.modules.pop(_mk) + warmup_result = miner_module.inner_steps( model=model, data_iterator=data_iter_warmup, optimizer=optimizer_warmup,Apply the same change to lines 2174-2187.
2189-2212: CUDA end event is recorded after the timer identity check — minor timing skew.The CUDA end event (
_cuda_end_event.record(), line 2209) is placed after the identity check (lines 2191-2206) and module restoration (line 2187). This means the CUDA elapsed time includes a small amount of post-miner overhead (identity check, module restore). For any non-trivial workload the skew is negligible, but for maximum measurement accuracy, the end event could be recorded immediately afterinner_stepsreturns (inside thefinallyblock, after module restore but before the identity check):This is fine as-is given the 10% divergence threshold, but worth noting for future tightening.
| ``` | ||
| docker run --rm \ | ||
| --network none \ # NO network access | ||
| --cap-drop ALL \ # Drop all Linux capabilities | ||
| --security-opt no-new-privileges \ | ||
| --read-only \ # Read-only filesystem | ||
| --pids-limit 1024 \ # Limit process spawning | ||
| --tmpfs /tmp:rw,exec,nosuid,size=4g \ # Writable /tmp (torch.compile) | ||
| --tmpfs /home/appuser/.triton:rw,exec,size=2g # Triton cache | ||
| --memory 80g --shm-size 32g \ | ||
| -v miner_train.py:/app/scripts/miner_train.py:ro # Read-only mount | ||
| ``` |
There was a problem hiding this comment.
Docker command example uses inline comments after line continuations — this is invalid bash syntax.
In bash, a # comment after a \ continuation does not work as expected. The \ escapes the # or the comment breaks the continuation. This example would fail if copy-pasted into a shell.
Consider reformatting with comments on their own lines above each flag, or note that this is pseudocode.
🧰 Tools
🪛 markdownlint-cli2 (0.20.0)
[warning] 105-105: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
In `@docs/security-analysis.md` around lines 105 - 116, The docker example uses
inline “#” comments placed after backslash line continuations (e.g., the "docker
run" block and flags like "--network none \", "--cap-drop ALL \", "--read-only
\"), which is invalid Bash and will break when copy-pasted; update the snippet
by moving each explanatory comment to its own line immediately above the
corresponding flag (or remove/comment that the snippet is pseudocode), ensuring
no inline comments follow a trailing "\" and preserving the same flags and tmpfs
mounts like "--tmpfs /tmp:rw,exec,nosuid,size=4g" and volume mount "-v
miner_train.py:/app/scripts/miner_train.py:ro" so the example is valid shell
syntax.
| | # | Check | Threshold | What It Catches | | ||
| |---|-------|-----------|----------------| | ||
| | 1 | Token count == expected | Exact match | Skipping batches, changing batch size | | ||
| | 2 | Loss valid & close to reference | `\|diff\| < 0.3` | NaN injection, wrong loss function, skipping backward | | ||
| | 3 | Gradient relative error | `\|g - g_ref\| / \|g_ref\| < 6%` | Modified backward pass, frozen layers, gradient manipulation | | ||
| | 4 | Final weight relative error | `\|w - w_ref\| / \|w_ref\| < 0.8%` | Wrong optimizer, skipped steps, modified optimizer state | | ||
| | 5 | All layers have gradients | 100% coverage | Layer freezing | | ||
| | 6 | Parameter elements changed | ≥80% | No-op training | | ||
| | 7 | Sequence length matches expected | Exact match | Sequence truncation (cheap speedup) | | ||
| | 8 | Logits shape is 3D | `(batch, seq, vocab)` | Returning dummy data | | ||
| | 9 | Backend settings unchanged | Exact match | Disabling flash attention, changing precision | | ||
| | 10 | Timer integrity (3-way) | All pairs within 10% | Timer function replacement (see below) | | ||
| | 11 | Timer function identity | `id()` unchanged | Direct timer reference replacement | | ||
| | 12 | Optimizer wrapper type unchanged | `type(opt) is GradientCapturingOptimizer` | Replacing the wrapper | | ||
| | 13 | MFU sanity cap | ≤75% | Physically impossible scores from any timing bypass | |
There was a problem hiding this comment.
Gradient tolerance stated as 6% may not match code defaults (10%).
Check 3 in the table states |g - g_ref| / |g_ref| < 6%, and line 269 repeats "6% relative error." However, the code default in both env.py and verify.py is gradient_norm_ratio_max = 1.10, which yields a 10% threshold. If this 6% comes from production hparams.json, consider noting that the table values are config-dependent, or verify the doc matches the deployed config.
🤖 Prompt for AI Agents
In `@docs/security-analysis.md` around lines 149 - 163, The table and text
incorrectly state a 6% gradient relative-error threshold for Check 3, but the
code defaults use gradient_norm_ratio_max = 1.10 (10%); update the docs to
either state the actual default (10%) or explicitly note that the table values
are config-dependent and reference the runtime parameter gradient_norm_ratio_max
(used in env.py and verify.py) so readers know to consult hparams.json or that
variable for the deployed threshold.
| # Prevent access to validator env internals | ||
| "_CACHE", | ||
| "initial_state", | ||
| "_hidden_modules", | ||
| "_sensitive_keys", |
There was a problem hiding this comment.
initial_state in _FORBIDDEN_STRINGS combined with new attribute-name scanning may cause false positives.
Same concern as in local_test/verify.py: with the attribute-name scan (lines 752-761), any miner code using self.initial_state as an attribute name will be rejected. This is a common name in training code for saving pre-training weights.
If the intent is solely to block access to the validator's internal initial_state variable, the string-constant scan (catching getattr(x, "initial_state")) may suffice. Consider whether the attribute-name scan should use a narrower blocklist than _FORBIDDEN_STRINGS.
🤖 Prompt for AI Agents
In `@environments/templar/env.py` around lines 706 - 710, The blocklist
_FORBIDDEN_STRINGS currently includes "initial_state" which, combined with the
attribute-name scanning logic, will falsely reject legitimate miner attributes
like self.initial_state; remove "initial_state" from _FORBIDDEN_STRINGS (or
alternatively exclude it from the attribute-name scan) and rely on the
string-constant/getattr scanning to catch accesses to the validator's internal
variable; if you need attribute-level protection keep the attribute-name scan
but narrow its blocklist to validator-specific identifiers (or only flag
attributes used in getattr calls) so that common names like initial_state are
not rejected by the attribute-name scanner.
| # Scan attribute names (e.g. _e._perf_counter) — AST string scan only | ||
| # catches ast.Constant values, not ast.Attribute.attr names | ||
| for node in ast.walk(scan_tree): | ||
| if isinstance(node, ast.Attribute): | ||
| for pattern in _FORBIDDEN_STRINGS: | ||
| if node.attr == pattern: | ||
| line = getattr(node, "lineno", "?") | ||
| violations.append( | ||
| f"Line {line}: forbidden attribute name '{node.attr}'" | ||
| ) |
There was a problem hiding this comment.
Attribute-name scan uses exact match — initial_state may cause false positives for legitimate miner code.
The new attribute scan checks node.attr == pattern against _FORBIDDEN_STRINGS. Since initial_state is in that list, any miner code using self.initial_state or result.initial_state will be rejected. This is a common variable name in training code. If the intent is only to block access to the validator's initial_state, the string-constant scan (which catches getattr(x, "initial_state")) may be sufficient, while the attribute scan may be too aggressive for this particular entry.
Same consideration applies in environments/templar/env.py (lines 752-761).
🤖 Prompt for AI Agents
In `@local_test/verify.py` around lines 415 - 424, The attribute-scan is too
broad: it currently flags any ast.Attribute whose node.attr equals an entry in
_FORBIDDEN_STRINGS (e.g., initial_state) and causes false positives; narrow the
check so only truly private/dangerous attribute names are matched — for example
introduce a focused set like _FORBIDDEN_ATTRIBUTE_NAMES (or require
pattern.startswith("_")) and use that for the ast.Attribute loop (the code using
scan_tree, node.attr, _FORBIDDEN_STRINGS, and violations), leaving the
string-constant getattr(...) scan to catch explicit reflective access; apply the
same change where the duplicate logic exists (the second occurrence mentioned).
Miner side:
- validate_code_url() now computes sha256 of downloaded code
- commit_to_chain() includes code_hash in commitment JSON
- Backward-compatible: code_hash is optional in commitment
Commitment format:
{"code_url": "...", "code_hash": "sha256hex..."}
Validator side:
- CodeUrlInfo gains code_hash field (parsed from commitment)
- Submission stores committed hash in code_hash column
- After downloading code, validator verifies sha256 matches
- Old commitments without code_hash are skipped (backward-compatible)
- Hash mismatch fails submission with clear error message
Detects patterns like: "".join(["s","e","t","a","t","t","r"]) Reconstructs the joined string at AST analysis time and checks it against _FORBIDDEN_STRINGS. Handles both list and tuple args with any separator string. Mirrored to local_test/verify.py.
…completed - Bypass vector #2 (no code hash): updated from REAL RISK to MITIGATED - Bypass vector #5 (string construction): updated from PARTIALLY MITIGATED to MITIGATED - Moved code hash and str.join to Completed recommendations - Removed code hash from Moderate Concerns - Added backward-compat note to Moderate Concerns
…Layer 1, add str.join scan section - Layer 1 now documents commitment format with code_hash field - Verification checks table split into pre/post-evaluation, reordered to match actual code flow - Added code hash verification as check #0 (pre-evaluation) - Added str.join() obfuscation detection section under AST analysis - Added code hash and str.join to Strong posture summary - Fixed timing log to include CUDA event time
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@docs/security-analysis.md`:
- Line 29: Add a deprecation timeline and migration guidance for commitments
missing the code_hash to the security analysis: update the section that
currently states "Backward-compatible: old commitments without `code_hash` are
accepted" and the "moderate concern" note to include (1) a clear deprecation
date or staged schedule for rejecting commitments that omit `code_hash`, (2)
required client/server changes and upgrade windows, and (3) interim mitigations
(audit logs, warnings, optional enforcement flags) and rollout steps to prevent
a permanent code-swap attack vector; reference the existing phrases
"Backward-compatible: old commitments without `code_hash` are accepted" and the
"moderate concern" entry when inserting this timeline and migration guidance.
🧹 Nitpick comments (3)
environments/templar/env.py (3)
502-544:_forbidden_modulesand_forbidden_builtinsare re-allocated on every AST node iteration.Both sets are defined inside the
for node in ast.walk(tree)loop body (starting at line 381), so they're reconstructed for every node visited. Hoist them above the loop (or to module level) to avoid repeated allocation.Not a correctness issue since this runs once per evaluation, but it's wasteful for large ASTs.
♻️ Suggested refactor (hoist sets before the loop)
def _scan_for_dangerous_patterns(tree: ast.AST) -> tuple[bool, str | None]: """AST scan to reject forbidden code patterns.""" main_guard_nodes = _collect_main_guard_nodes(tree) + + _forbidden_modules = { + "ctypes", "_ctypes", "gc", "subprocess", "sys", "os", "pathlib", + "io", "socket", "http", "urllib", "requests", "shutil", "tempfile", + "signal", "threading", "multiprocessing", "inspect", "ast", "dis", + "code", "codeop", "compileall", "pickle", "shelve", "marshal", + "builtins", "_builtins", "operator", "types", "codecs", "base64", + "pdb", "pprint", "env", "__main__", "miner_train", "logging", + } + + _forbidden_builtins = { + "setattr", "getattr", "delattr", "vars", "dir", "globals", + "locals", "type", "memoryview", "open", "chr", "ord", + "breakpoint", "input", "classmethod", "staticmethod", "property", + } + for node in ast.walk(tree): if id(node) in main_guard_nodes: continue # ... existing checks ... - _forbidden_modules = { - "ctypes", - ... - } # (use _forbidden_modules and _forbidden_builtins defined above)
652-711: Consider converting_FORBIDDEN_STRINGSto afrozensetfor O(1) lookups.
_FORBIDDEN_STRINGSis alistscanned linearly in multiple places (lines 745-746, 756-757, 790-791, 811-812, 843-844). Since the checks use exact==for attribute names andinfor substring matching, you'd need to keep the list for the substring checks, but the attribute-name scan at lines 754-761 uses exact equality (node.attr == pattern) and would benefit from a set lookup.Consider splitting into a
frozensetfor exact-match checks and keeping the list for substring searches.
2257-2258:_timer_divergenceshould guard against both arguments being near-zero.If both
aandbare very small (e.g., an extremely fast evaluation),max(b, 1e-9)prevents division by zero, butabs(a - b)could be dominated by floating-point noise, causing a spurious tampering detection. In practice, training steps take seconds so this is unlikely, but amin(a, b) < some_floorguard would be more robust.
|
|
||
| - Miner computes `sha256(code)` at submit time and includes it in the timelock-encrypted commitment | ||
| - Validator verifies hash matches downloaded code before evaluation (rejects on mismatch) | ||
| - Backward-compatible: old commitments without `code_hash` are accepted |
There was a problem hiding this comment.
Backward-compatible hash skip weakens code integrity guarantees.
Line 29 notes that old commitments without code_hash are accepted (hash check skipped). Line 382 lists this as a "moderate concern." Consider documenting a deprecation timeline for commitments without code_hash — without one, the code-swap attack vector remains permanently open for anyone who omits the field.
🤖 Prompt for AI Agents
In `@docs/security-analysis.md` at line 29, Add a deprecation timeline and
migration guidance for commitments missing the code_hash to the security
analysis: update the section that currently states "Backward-compatible: old
commitments without `code_hash` are accepted" and the "moderate concern" note to
include (1) a clear deprecation date or staged schedule for rejecting
commitments that omit `code_hash`, (2) required client/server changes and
upgrade windows, and (3) interim mitigations (audit logs, warnings, optional
enforcement flags) and rollout steps to prevent a permanent code-swap attack
vector; reference the existing phrases "Backward-compatible: old commitments
without `code_hash` are accepted" and the "moderate concern" entry when
inserting this timeline and migration guidance.
Replaces JSON commitment format with packed binary-ish protocol: <32 hex hash>:<url> e.g. cf4817d9793e92a0...1354f9:https://example.com/train.py - 33 bytes overhead (vs 46-78 for JSON), leaves 95 bytes for URL - SHA256 truncated to 128-bit (32 hex chars) — 2^128 collision resistance - CLI rejects commitments exceeding 128 bytes at submit time - Parser detects packed format via regex, falls back to JSON (legacy) - Validator truncates computed hash to match committed length - Updated security-analysis.md with packed format documentation
Packed format (hash:url) is now the only accepted format: - Miner: always packs as hash:url, removed JSON fallback - Parser: only accepts packed regex, rejects everything else - Validator: always verifies 32-char truncated hash, no skip logic - Removed unused json import from miner.py and commitments.py - Updated docs to reflect strict format requirement
Summary by CodeRabbit
Documentation
New Features
Bug Fixes