🛡️ Sentinel: [CRITICAL] Fix LaTeX injection RCE vulnerability in PDF compilation #274
Co-authored-by: anchapin <[email protected]>
Reviewer's Guide
Adds defense-in-depth against LaTeX-based RCE and DoS in PDF compilation by enforcing no-shell-escape flags for pdflatex/pandoc, introducing 30s timeouts with cleanup for all LaTeX compilation subprocesses, and documenting the incident in the Sentinel security log.
Sequence diagram for hardened LaTeX PDF compilation subprocess handling
sequenceDiagram
participant Caller as Caller
participant Subproc as Subprocess
participant Pdflatex as pdflatex
participant Pandoc as pandoc
Caller->>Subproc: Popen(["pdflatex","-interaction=nonstopmode","-no-shell-escape",tex_path_name])
activate Subproc
Subproc->>Pdflatex: execute pdflatex
alt pdflatex finishes within 30s
Subproc-->>Caller: communicate(timeout=30)
Caller->>Caller: check returncode or output_path_exists
alt success
Caller->>Caller: pdf_created True
else failure
Caller->>Subproc: Popen(["pandoc",tex_path,"-o",output_path,"--pdf-engine=xelatex","--pdf-engine-opt=-no-shell-escape"])
activate Subproc
Subproc->>Pandoc: execute pandoc
alt pandoc finishes within 30s
Subproc-->>Caller: communicate(timeout=30)
Caller->>Caller: check returncode or output_path_exists
alt success
Caller->>Caller: pdf_created True
else failure
Caller->>Caller: pdf_created False
end
else pandoc timeout
Subproc-->>Caller: TimeoutExpired
Caller->>Subproc: kill()
Caller->>Subproc: communicate()
Caller->>Caller: pdf_created False
end
end
else pdflatex timeout
Subproc-->>Caller: TimeoutExpired
Caller->>Subproc: kill()
Caller->>Subproc: communicate()
Caller->>Caller: pdf_created False
Caller->>Caller: optionally try pandoc fallback
end
Flow diagram for LaTeX PDF compilation with no_shell_escape and timeout
flowchart TD
Start([Start PDF compilation]) --> PDFA[Run pdflatex with flags]
PDFA["Popen pdflatex with -interaction=nonstopmode and -no-shell-escape"] --> PDFCOMM["communicate(timeout=30)"]
PDFCOMM -->|finished| PDFRC{pdflatex success or output_path exists}
PDFCOMM -->|TimeoutExpired| PDFTO[Kill pdflatex process and communicate]
PDFTO --> PDFFAIL[Mark pdflatex attempt failed]
PDFRC -->|yes| OK[Return success]
PDFRC -->|no| PANDOCCHK{Use pandoc fallback available}
PANDOCCHK -->|no| FAIL[Return failure]
PANDOCCHK -->|yes| PANDA["Popen pandoc with --pdf-engine=xelatex and --pdf-engine-opt=-no-shell-escape"]
PANDA --> PANCOMM["communicate(timeout=30)"]
PANCOMM -->|finished| PANRC{pandoc success or output_path exists}
PANCOMM -->|TimeoutExpired| PANTO[Kill pandoc process and communicate]
PANTO --> PANFAIL[Mark pandoc attempt failed]
PANFAIL --> FAIL
PANRC -->|yes| OK
PANRC -->|no| FAIL
OK --> End([End: PDF compiled safely])
FAIL --> End
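The flow above can be sketched in Python as follows. This is a minimal sketch, not the PR's code: `compile_pdf`, `pdflatex_command`, `pandoc_command`, and the injected `run` callable are hypothetical names, and it assumes a pdflatex timeout is treated like any other failure before the pandoc fallback is considered. The command-line flags are the ones named in the diagrams.

```python
import os


def pdflatex_command(tex_path):
    """Build the pdflatex argument list with shell escape disabled."""
    return ["pdflatex", "-interaction=nonstopmode", "-no-shell-escape", tex_path]


def pandoc_command(tex_path, output_path):
    """Build the pandoc fallback argument list; the flag is passed on to xelatex."""
    return [
        "pandoc", tex_path, "-o", output_path,
        "--pdf-engine=xelatex",
        "--pdf-engine-opt=-no-shell-escape",
    ]


def compile_pdf(tex_path, output_path, run, pandoc_available=True):
    """Try pdflatex first, then pandoc.

    `run` is a hypothetical callable that executes a command with a timeout
    and returns True on success, False on failure or timeout.
    """
    # Either a clean exit or an existing output file counts as success.
    if run(pdflatex_command(tex_path)) or os.path.exists(output_path):
        return True
    if not pandoc_available:
        return False
    return run(pandoc_command(tex_path, output_path)) or os.path.exists(output_path)
```

Passing the runner in as an argument keeps the fallback logic testable without pdflatex or pandoc installed.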
Hey - I've left some high-level feedback:
- Consider extracting the 30-second timeout value into a shared constant so it’s easier to tune and keep consistent across both pdflatex and pandoc call sites.
- The timeout handling logic (try/except TimeoutExpired, kill, communicate) is duplicated in multiple places; you could factor this into a small helper function to reduce repetition and keep behavior consistent.
- On timeout you currently just return False or set pdf_created = False; adding some minimal logging or error propagation from the subprocess stderr/stdout would make diagnosing problematic inputs or environments much easier.
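One way the three suggestions above could fit together, as a sketch only: `LATEX_TIMEOUT_SECONDS`, `run_latex_tool`, and `_tail` are hypothetical names, not identifiers from the PR, and the 500-character log tail is an arbitrary choice.

```python
import logging
import subprocess

logger = logging.getLogger(__name__)

# Single place to tune the limit for both the pdflatex and pandoc call sites.
LATEX_TIMEOUT_SECONDS = 30


def _tail(data, limit=500):
    """Decode captured bytes and keep only the last `limit` characters."""
    return data.decode("utf-8", errors="replace")[-limit:]


def run_latex_tool(cmd, timeout=LATEX_TIMEOUT_SECONDS):
    """Run cmd with a timeout; log diagnostics and return True on exit status 0."""
    process = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    try:
        stdout, stderr = process.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Kill the stuck process, then drain its pipes so it is fully reaped.
        process.kill()
        stdout, stderr = process.communicate()
        logger.warning(
            "%s timed out after %ss; stdout tail: %r; stderr tail: %r",
            cmd[0], timeout, _tail(stdout), _tail(stderr),
        )
        return False
    if process.returncode != 0:
        logger.warning(
            "%s exited with status %s; stdout tail: %r; stderr tail: %r",
            cmd[0], process.returncode, _tail(stdout), _tail(stderr),
        )
        return False
    return True
```

Both call sites would then reduce to a single `run_latex_tool([...])` call, and a changed timeout or logging policy only needs to be edited in one place.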
🚨 Severity: CRITICAL
💡 Vulnerability: The cli/pdf/converter.py and cli/generators/cover_letter_generator.py modules used subprocess.Popen to call pdflatex and pandoc without -no-shell-escape flags or process timeouts, enabling Remote Code Execution (RCE) and Denial of Service (DoS) attacks from untrusted LaTeX inputs.
🎯 Impact: Attackers could execute arbitrary system commands on the host machine or hang the application indefinitely.
🔧 Fix: Appended -no-shell-escape and --pdf-engine-opt=-no-shell-escape to the pdflatex and pandoc subprocess arguments respectively, added a 30-second timeout to process.communicate(), and ensured process cleanup.
✅ Verification: Ran unit tests via pytest to confirm successful timeout detection and proper argument forwarding.
PR created automatically by Jules for task 7895040161886264509 started by @anchapin
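The PR's actual tests are not shown here. As an illustration of how timeout detection and argument forwarding might be checked with pytest, here is a sketch in which `compile_tex` is a hypothetical stand-in for the function under test and `subprocess.Popen` is replaced by a mock, so no LaTeX installation is needed.

```python
import subprocess
from unittest import mock


def compile_tex(tex_name, timeout=30):
    """Hypothetical stand-in for the hardened compilation function."""
    process = subprocess.Popen(
        ["pdflatex", "-interaction=nonstopmode", "-no-shell-escape", tex_name],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    try:
        process.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        process.kill()
        process.communicate()
        return False
    return process.returncode == 0


def test_no_shell_escape_is_forwarded():
    with mock.patch("subprocess.Popen") as popen:
        popen.return_value.communicate.return_value = (b"", b"")
        popen.return_value.returncode = 0
        assert compile_tex("doc.tex") is True
    # The first positional argument to Popen is the command list.
    args = popen.call_args[0][0]
    assert "-no-shell-escape" in args


def test_timeout_kills_and_reaps_process():
    with mock.patch("subprocess.Popen") as popen:
        proc = popen.return_value
        # First communicate() call times out; the second one drains the pipes.
        proc.communicate.side_effect = [
            subprocess.TimeoutExpired(cmd="pdflatex", timeout=30),
            (b"", b""),
        ]
        assert compile_tex("doc.tex") is False
    proc.kill.assert_called_once()
    assert proc.communicate.call_count == 2
```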
Summary by Sourcery
Harden PDF compilation against LaTeX-based RCE and DoS by constraining external tools and enforcing execution time limits.