5 changes: 5 additions & 0 deletions .jules/sentinel.md
@@ -7,3 +7,8 @@
**Vulnerability:** The `CoverLetterGenerator` used a standard Jinja2 environment (intended for HTML/XML or plain text) to render LaTeX templates. This allowed malicious user input (or AI hallucinations) containing LaTeX control characters (e.g., `\input{...}`) to be injected directly into the LaTeX source, leading to potential Local File Inclusion (LFI) or other exploits.
**Learning:** Jinja2's `autoescape` targets HTML/XML only; even when it is enabled per file extension via `select_autoescape`, it does NOT escape LaTeX special characters. Relying on manual filters (like `| latex_escape`) in templates is brittle, as developers might forget to apply them to every variable.
**Prevention:** Always use a dedicated Jinja2 environment for LaTeX generation that enforces auto-escaping via a `finalize` hook (e.g., `tex_env.finalize = latex_escape`). This ensures *all* variable output is sanitized by default, providing defense-in-depth even if the template author forgets explicit filters.
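
A minimal sketch of what a `latex_escape` function of this kind might look like, assuming the ten standard LaTeX special characters are the ones to neutralize. The replacement table and the handling of `None` are illustrative assumptions, not the project's actual implementation. It accepts any value and returns a string, so it could be passed as the `finalize` callable of a Jinja2 environment.

```python
from typing import Any

# Assumed replacement table covering LaTeX's special characters.
# str.translate works in a single pass, so the backslashes and braces
# introduced by one replacement are never escaped a second time.
_LATEX_ESCAPES = str.maketrans({
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
})


def latex_escape(value: Any) -> str:
    """Return `value` as text that LaTeX typesets literally."""
    if value is None:
        # Assumption: a missing value renders as nothing, not as "None".
        return ""
    return str(value).translate(_LATEX_ESCAPES)
```

With this table, an injected `\input{/etc/passwd}` becomes `\textbackslash{}input\{/etc/passwd\}`, which LaTeX prints as text instead of executing.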

## 2025-05-07 - [Critical] RCE Vulnerability in PDF Compilation
**Vulnerability:** The application used `subprocess.Popen` to compile LaTeX files with `pdflatex` and `pandoc` without the `-no-shell-escape` flag. Although `-interaction=nonstopmode` was set, omitting this flag leaves shell escape governed by the TeX installation's configuration; where unrestricted shell escape is enabled, malicious `.tex` content (e.g., injected via unsanitized user input or AI hallucinations) can execute arbitrary shell commands via `\write18` or similar LaTeX features. Furthermore, the compilation lacked process timeouts, allowing potential Denial of Service (DoS) attacks via compilations that never terminate.
**Learning:** External compilation tools that interpret complex document formats (like LaTeX or Markdown) often have built-in shell execution capabilities for advanced features. These must be explicitly disabled when compiling untrusted input. Additionally, blocking processes must always have bounded execution times to prevent resource exhaustion.
**Prevention:** Always append `-no-shell-escape` to `pdflatex` commands (and equivalent options like `--pdf-engine-opt=-no-shell-escape` for `pandoc`). Always implement explicit timeouts (e.g., `process.communicate(timeout=30)`) combined with cleanup logic (`process.kill()`) when calling external blocking executables.
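
The prevention rule above can be sketched as a small helper. The names `run_bounded` and `pdflatex_command` are hypothetical and do not appear in the codebase; the sketch only shows the timeout-then-kill pattern and the hardened argument list under those assumptions.

```python
import subprocess
import sys
from typing import List, Optional, Sequence, Tuple


def pdflatex_command(tex_name: str) -> List[str]:
    """Hypothetical helper: pdflatex argument list with shell escape disabled."""
    return ["pdflatex", "-interaction=nonstopmode", "-no-shell-escape", tex_name]


def run_bounded(
    cmd: Sequence[str], timeout: float = 30.0, cwd: Optional[str] = None
) -> Tuple[Optional[int], bytes, bytes]:
    """Run `cmd`, killing it if it exceeds `timeout` seconds.

    Returns (returncode, stdout, stderr); returncode is None on timeout.
    """
    process = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, cwd=cwd
    )
    try:
        stdout, stderr = process.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        process.kill()
        # Second communicate() reaps the killed process and drains the pipes.
        stdout, stderr = process.communicate()
        return None, stdout, stderr
    return process.returncode, stdout, stderr
```

Returning `None` for the exit code on timeout, instead of raising, is a design choice of this sketch: it lets the caller tell a timeout apart from a non-zero exit without catching a generic exception.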
31 changes: 25 additions & 6 deletions cli/generators/cover_letter_generator.py
@@ -771,30 +771,49 @@ def _compile_pdf(self, output_path: Path, tex_content: str) -> bool:
         try:
             # Use Popen with explicit cleanup to avoid double-free issues
             process = subprocess.Popen(
-                ["pdflatex", "-interaction=nonstopmode", tex_path.name],
+                ["pdflatex", "-interaction=nonstopmode", "-no-shell-escape", tex_path.name],
                 stdout=subprocess.PIPE,
                 stderr=subprocess.PIPE,
                 cwd=tex_path.parent,
             )
-            stdout, stderr = process.communicate()
+            try:
+                stdout, stderr = process.communicate(timeout=30)
+            except subprocess.TimeoutExpired:
+                process.kill()
+                stdout, stderr = process.communicate()
+                raise RuntimeError("PDF compilation timed out")
+
             if process.returncode == 0 or output_path.exists():
                 pdf_created = True
-        except (subprocess.CalledProcessError, FileNotFoundError):
+        except (subprocess.CalledProcessError, FileNotFoundError, RuntimeError):
             # Check if PDF was created anyway
             if output_path.exists():
                 pdf_created = True
             else:
                 # Fallback to pandoc
                 try:
                     process = subprocess.Popen(
-                        ["pandoc", str(tex_path), "-o", str(output_path), "--pdf-engine=xelatex"],
+                        [
+                            "pandoc",
+                            str(tex_path),
+                            "-o",
+                            str(output_path),
+                            "--pdf-engine=xelatex",
+                            "--pdf-engine-opt=-no-shell-escape",
+                        ],
                         stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE,
                     )
-                    stdout, stderr = process.communicate()
+                    try:
+                        stdout, stderr = process.communicate(timeout=30)
+                    except subprocess.TimeoutExpired:
+                        process.kill()
+                        stdout, stderr = process.communicate()
+                        raise RuntimeError("PDF compilation timed out")
+
                     if process.returncode == 0 or output_path.exists():
                         pdf_created = True
-                except (subprocess.CalledProcessError, FileNotFoundError):
+                except (subprocess.CalledProcessError, FileNotFoundError, RuntimeError):
                     pass

         if not pdf_created or not output_path.exists():
29 changes: 23 additions & 6 deletions cli/pdf/converter.py
@@ -86,16 +86,21 @@ def _compile_pdflatex(
         """
         try:
             process = subprocess.Popen(
-                ["pdflatex", "-interaction=nonstopmode", tex_path.name],
+                ["pdflatex", "-interaction=nonstopmode", "-no-shell-escape", tex_path.name],
                 stdout=subprocess.PIPE,
                 stderr=subprocess.PIPE,
                 cwd=working_dir,
             )
-            stdout, stderr = process.communicate()
+            try:
+                stdout, stderr = process.communicate(timeout=30)
+            except subprocess.TimeoutExpired:
+                process.kill()
+                stdout, stderr = process.communicate()
+                raise RuntimeError("PDF compilation timed out")

             if process.returncode == 0 or output_path.exists():
                 return True
-        except (subprocess.CalledProcessError, FileNotFoundError):
+        except (subprocess.CalledProcessError, FileNotFoundError, RuntimeError):

suggestion (bug_risk): Timeout and other failures are fully suppressed, which makes diagnosing compilation issues difficult.

In this pdflatex path, RuntimeError (including timeouts) and FileNotFoundError are now caught without any record of what happened, and if no PDF was produced the caller only sees False. Please at least log the exception type and a brief message (and possibly stderr) so timeouts vs. missing binaries/misconfigurations can be distinguished in production without modifying code.

Suggested implementation:

        except (subprocess.CalledProcessError, FileNotFoundError, RuntimeError) as exc:
            # Log the failure so timeouts, missing binaries, etc. can be diagnosed
            logger.error(
                "PDF compilation failed (%s): %s",
                type(exc).__name__,
                str(exc),
                exc_info=True,
            )
            # Log stderr if it was captured (it is bytes here, so decode it for readable logs)
            if "stderr" in locals() and stderr:
                logger.error("PDF compilation stderr:\n%s", stderr.decode(errors="replace"))

            # Check if PDF was created anyway (pdflatex returns non-zero for warnings)
            if output_path.exists():
                return True

  1. Ensure there is a module-level logger defined in `cli/pdf/converter.py`, for example: `logger = logging.getLogger(__name__)`.
  2. If not already present, import the `logging` module at the top of the file: `import logging`.
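
As a rough illustration of the distinction this comment asks for, the sketch below logs a missing binary, a timeout, and a non-zero exit as three separate messages. `run_and_log` is a hypothetical helper, not code from this PR, and it omits the "was a PDF produced anyway" check that the real functions perform.

```python
import logging
import subprocess
import sys
from typing import Sequence

logger = logging.getLogger(__name__)


def run_and_log(cmd: Sequence[str], timeout: float = 30.0) -> bool:
    """Hypothetical helper: run `cmd` and log why it failed, if it did."""
    try:
        # subprocess.run kills the child itself when the timeout expires.
        result = subprocess.run(cmd, capture_output=True, timeout=timeout)
    except FileNotFoundError:
        logger.error("Command not found: %s", cmd[0])
        return False
    except subprocess.TimeoutExpired as exc:
        logger.error("Command timed out after %ss: %s", timeout, cmd[0])
        if exc.stderr:
            logger.error("Partial stderr:\n%s", exc.stderr.decode(errors="replace"))
        return False
    if result.returncode != 0:
        logger.error(
            "Command exited with code %d; stderr:\n%s",
            result.returncode,
            result.stderr.decode(errors="replace"),
        )
        return False
    return True
```

Catching each exception type in its own branch keeps the log messages specific, so an operator can tell a missing `pdflatex` install from a hung compilation at a glance.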

             # Check if PDF was created anyway (pdflatex returns non-zero for warnings)
             if output_path.exists():
                 return True
@@ -121,16 +126,28 @@ def _compile_pandoc(
         """
         try:
             process = subprocess.Popen(
-                ["pandoc", str(tex_path), "-o", str(output_path), "--pdf-engine=xelatex"],
+                [
+                    "pandoc",
+                    str(tex_path),
+                    "-o",
+                    str(output_path),
+                    "--pdf-engine=xelatex",
+                    "--pdf-engine-opt=-no-shell-escape",
+                ],
                 stdout=subprocess.PIPE,
                 stderr=subprocess.PIPE,
                 cwd=working_dir,
             )
-            stdout, stderr = process.communicate()
+            try:
+                stdout, stderr = process.communicate(timeout=30)
+            except subprocess.TimeoutExpired:
+                process.kill()
+                stdout, stderr = process.communicate()
+                raise RuntimeError("PDF compilation timed out")

             if process.returncode == 0 or output_path.exists():
                 return True
-        except (subprocess.CalledProcessError, FileNotFoundError):
+        except (subprocess.CalledProcessError, FileNotFoundError, RuntimeError):
             pass

         return False