Skip to content

fix(generate-pdf): inline local fonts as data: URLs so they actually embed#952

Open
shuhia wants to merge 3 commits into
santifer:mainfrom
shuhia:fix/951-inline-fonts
Open

fix(generate-pdf): inline local fonts as data: URLs so they actually embed#952
shuhia wants to merge 3 commits into
santifer:mainfrom
shuhia:fix/951-inline-fonts

Conversation

@shuhia

@shuhia shuhia commented Jun 11, 2026

Copy link
Copy Markdown

Fixes #951

Stacked on #949 — this branch includes that PR's commit (14e65cc) because both touch the font-injection lines. Once #949 merges, this PR reduces to the single font-inlining commit (32fe58b). Happy to rebase either way.

What

Custom fonts have never been embedded in generated PDFs (issue #951): renderHtmlToPdf() uses page.setContent(), which leaves the page at about:blank, and Chromium blocks file:// subresource loads from non-file pages — so the file:// font URLs injected by generatePDF() were always silently dropped and every PDF fell back to system fonts, on all platforms.

This PR takes option 1 from the issue:

  • inlineLocalFonts(html) (new, exported): replaces url('./fonts/<file>') references with base64 data: URLs read from the repo's fonts/ directory. data: URLs carry no origin restriction, so they load from any page — no change to how the document is loaded. Missing files keep their original reference (with a warning); .. traversal outside fonts/ is refused.
  • Called from renderHtmlToPdf(), not generatePDF(), so generate-cover-letter.mjs — which calls renderHtmlToPdf() directly and previously had no font handling at all — is covered too.
  • The old two-step file:// regex injection in generatePDF() (prefix replace + quote-fixup) is removed; it was both the broken mechanism and dead weight once inlining happens in the renderer.
  • 3 new checks in test-all.mjs (section 17): inlining produces data:font/woff2 URLs, missing fonts keep their reference, traversal is refused.

Verification (Windows 11, Node v24.14.0, Playwright 1.58.1)

Embedded fonts in a PDF rendered from templates/cv-template.html, extracted from the PDF's font descriptors:

/FontName entries
before (main and #949) AAAAAA+Arial-BoldMT, BAAAAA+ArialMT — system fallback
after AAAAAA+Space-Grotesk-Light-Light, BAAAAA+DM-Sans-9pt-14pt-Thin, CAAAAA+Space-Grotesk-Light-Light, DAAAAA+DM-Sans-9pt-14pt-Thin

Chromium emits the variable woff2 fonts as Type3 glyph programs with /ToUnicode maps (4 present). ATS parseability is unchanged: text extracted with pdf.js from the before/after PDFs is byte-identical (386 chars, same content).

node test-all.mjs: 248 passed (245 + the 3 new checks) — remaining failures on this machine are the same pre-existing Windows-environment issues as on unmodified main (Go toolchain, SKILL.md symlink checkout, chmod); Linux CI should be green.

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • PDF generation now inlines locally stored font files into documents as base64 data URLs, replacing external file references while leaving missing fonts unchanged and preventing directory-escape inlining.
  • Tests

    • Added tests validating font inlining, correct data URL embedding, preservation of missing references, and enforcement of path-based access restrictions.

shuhia and others added 2 commits June 11, 2026 20:28
The main-module guard compared import.meta.url against a hand-built
`file://${resolve(process.argv[1])}` string. On Windows, resolve()
returns backslash paths (C:\... or \\wsl.localhost\... UNC), so the
comparison never matched: the CLI entry point never ran and the script
exited 0 silently without generating anything.

Use pathToFileURL() like the other CLI scripts already do (scan.mjs,
scan-ats-full.mjs, update-system.mjs, followup-cadence.mjs), and fix
the two other hand-built file:// URLs in the same file (fonts
injection and the Playwright baseURL), which produced invalid
backslash URLs on Windows.

Fixes santifer#948

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…embed

renderHtmlToPdf() loads documents with page.setContent(), which leaves
the page at about:blank. Chromium blocks file:// subresource loads from
non-file pages ('Not allowed to load local resource'), so the file://
font URLs injected by generatePDF() never loaded and every PDF silently
fell back to system fonts -- on all platforms, since the tool shipped.

Replace the file:// URL injection with inlineLocalFonts(), which inlines
url('./fonts/...') references as base64 data: URLs inside
renderHtmlToPdf(). data: URLs carry no origin restriction, and doing it
in the renderer also covers generate-cover-letter.mjs, which calls
renderHtmlToPdf() directly without any font handling.

Verified on Windows: PDFs now embed Space Grotesk + DM Sans subsets
(Type3 with ToUnicode maps; extracted text is byte-identical to the
fallback PDFs, so ATS parseability is unchanged).

Fixes santifer#951

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 11, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 68ce86ea-e3a8-44ea-8308-fd5392bdd8a1

📥 Commits

Reviewing files that changed from the base of the PR and between 32fe58b and ae5f4d5.

📒 Files selected for processing (2)
  • generate-pdf.mjs
  • test-all.mjs

📝 Walkthrough

Walkthrough

This PR inlines local ./fonts/* references into base64 data: URLs via a new exported inlineLocalFonts(html), removes the previous file:// rewrite, integrates inlining into renderHtmlToPdf() (using pathToFileURL(baseDir).href), updates the isMain check, and adds tests covering inlining, missing files, and path-escape rejection.

Changes

Font inlining for PDF generation

Layer / File(s) Summary
Imports and path helpers
generate-pdf.mjs
Adds pathToFileURL import and path helper utilities used by the font inlining routine.
inlineLocalFonts implementation
generate-pdf.mjs
Exports inlineLocalFonts(html) that scans for url('./fonts/<name>'), reads matching local font files, maps MIME types, replaces references with data: base64 URLs, and warns/keeps originals when missing or when paths escape fonts/.
Remove previous file:// font rewrite
generate-pdf.mjs
Deletes prior code and comment that rewrote ./fonts/ URL references into absolute file://... paths; inlining replaces that approach.
PDF rendering integration and isMain update
generate-pdf.mjs
renderHtmlToPdf() calls inlineLocalFonts() before rendering and uses pathToFileURL(baseDir).href for page.setContent() baseURL. The module entry check (isMain) now compares against pathToFileURL(resolve(process.argv[1])).href.
Font inlining tests
test-all.mjs
Adds test #17 asserting: existing ./fonts/*.woff2 is converted to data:font/woff2;base64, missing font files are left unchanged, and path-traversal attempts such as ./fonts/../... or ./fonts//etc/passwd are not inlined.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • santifer/career-ops#929: Both PRs modify generate-pdf.mjs's PDF rendering pipeline—this PR adds font inlining and adjusts baseURL construction, while that PR changes the waitUntil strategy for the same rendering step.
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: converting font URLs to base64 data: URLs for embedding in PDFs.
Linked Issues check ✅ Passed The PR implements the preferred solution from #951 by inlining fonts as base64 data: URLs, with proper handling of missing files and path traversal security.
Out of Scope Changes check ✅ Passed All changes directly address the font embedding issue from #951; modifications to generate-pdf.mjs and test-all.mjs are within scope.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@generate-pdf.mjs`:
- Around line 244-261: inlineLocalFonts currently only checks for '..' in the
filename which allows absolute/path-traversal escapes; fix by computing fontsDir
= resolve(__dirname, 'fonts') and for each name compute resolved =
resolve(fontsDir, name) and verify containment (e.g. path.relative(fontsDir,
resolved) doesn't start with '..' or using startsWith on resolved normalized
paths) before calling readFile; if the resolved path is outside fontsDir, skip
and warn (keep original reference), otherwise readFile(resolved) and continue
using the resolved buffer when building data URL.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 198c69ba-7824-4b5a-b17d-3863a32c941a

📥 Commits

Reviewing files that changed from the base of the PR and between 0d57994 and 32fe58b.

📒 Files selected for processing (2)
  • generate-pdf.mjs
  • test-all.mjs

Comment thread generate-pdf.mjs
The previous guard only rejected names containing '..', but the regex
admits names starting with '/', and resolve(fontsDir, '/etc/passwd')
returns the absolute path verbatim -- escaping fonts/. Resolve each
reference and require it to stay inside fonts/ via path.relative().

Flagged by CodeRabbit on santifer#952.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Custom fonts are never embedded in PDFs - setContent() leaves the page at about:blank, Chromium blocks all file:// font loads

1 participant