Skip to content

🛡️ Sentinel: [HIGH] Harden Unit Test Gate and Fix Telemetry#319

Open
heidi-dang wants to merge 1 commit into
feat/bootstrap-scaffoldfrom
sentinel/harden-unit-test-gate-16036854680546890242
Open

🛡️ Sentinel: [HIGH] Harden Unit Test Gate and Fix Telemetry#319
heidi-dang wants to merge 1 commit into
feat/bootstrap-scaffoldfrom
sentinel/harden-unit-test-gate-16036854680546890242

Conversation

@heidi-dang
Copy link
Copy Markdown
Owner

🚨 Severity: HIGH
💡 Vulnerability: Environment leakage and sandbox escape risks in the unit test gate.
🎯 Impact: Maliciously generated code could exfiltrate sensitive environment variables (e.g., OPENAI_API_KEY) or escape the basic regex sandbox.
🔧 Fix: Implemented strict environment variable filtering for the test subprocess and expanded the dangerous pattern list. Also fixed several bugs preventing the unit test gate and telemetry system from functioning correctly.
✅ Verification: Verified via pytest tests/ and a custom hardening verification script.


PR created automatically by Jules for task 16036854680546890242 started by @heidi-dang

- Filter subprocess environment in 03_unit_test_gate.py to prevent secret leakage
- Expand DANGEROUS_PATTERNS to include common sandbox escapes
- Fix IndentationError in code wrapping logic using textwrap.indent
- Fix NameError in telemetry.py get_state() due to undefined variable
- Fix directory creation bug in save_jsonl when path has no directory part
- Improve code block extraction regex robustness
@google-labs-jules
Copy link
Copy Markdown
Contributor

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request addresses security vulnerabilities and improves the robustness of the unit test gate and telemetry modules. Key changes include implementing environment variable filtering for subprocesses to prevent secret leakage, adding sandbox escape detection patterns, and refining code block extraction regex. Additionally, a caching optimization was removed from telemetry, and directory creation logic was hardened. Feedback focuses on further securing code execution by using exec() with restricted namespaces, refining regex patterns with word boundaries and negative lookaheads to avoid redundant matches, and simplifying code indentation logic.


# Execute the user's code
{code}
{indented_code}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

The current implementation executes untrusted code by directly injecting it into the wrapper script's source. This causes namespace leakage, allowing the generated code to access and potentially manipulate the wrapper's internal variables like sys, io, stdout_capture, and original_stdout.

A more secure approach is to execute the code in a restricted global namespace using exec(). By passing an empty dictionary (or one containing only __builtins__), you isolate the user code from the wrapper's scope.

Suggested change
{indented_code}
exec({repr(code)}, {{'__builtins__': __builtins__}})

CODE_BLOCK_PATTERNS = [
# Markdown code blocks: ```python ... ```
r"```python\n(.*?)```",
r"```python\s*(.*?)\s*```",
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This regex can be improved to handle optional whitespace between the backticks and the python identifier, making it more robust against different markdown formatting styles.

Suggested change
r"```python\s*(.*?)\s*```",
r"```\s*python\s*(.*?)\s*```",

r"```python\s*(.*?)\s*```",
# Markdown code blocks without language: ``` ... ```
r"```\n(.*?)```",
r"```\s*(.*?)\s*```",
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The current generic code block pattern will also match blocks that start with python, but it will include the word python as part of the extracted code. This leads to redundant tests and syntax errors when the script tries to execute python\n... as Python code. Using a negative lookahead ensures this pattern only matches blocks that do not specify the language.

Suggested change
r"```\s*(.*?)\s*```",
r"```(?!\s*python)\s*(.*?)\s*```",

Comment on lines +89 to +91
r"__subclasses__",
r"__globals__",
r"__builtins__",
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-medium medium

Adding word boundaries (\b) to these patterns is recommended to prevent false positives (e.g., matching a variable named my__globals__) while still effectively catching sandbox escape attempts.

Suggested change
r"__subclasses__",
r"__globals__",
r"__builtins__",
r"\b__subclasses__\b",
r"\b__globals__\b",
r"\b__builtins__\b",

Comment on lines +221 to +222
# Indent the code to fit into the try block
indented_code = textwrap.indent(code, " ")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Manual indentation of the user code is unnecessary if the code is executed via exec() using a string literal, which also provides better isolation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant