POC: Convert static HTML output to PDF #8266

cwarnermm · 2025-08-01T18:58:47Z

This PR introduces a proof-of-concept workflow that converts the static HTML output generated by Sphinx into styled PDF documents for customers in air-gapped and offline environments.

Key Technologies and Approach

WeasyPrint is used as the core HTML-to-PDF rendering engine. It offers:

Full support for modern HTML and CSS
Clean typography and print layout control
Offline operation (no dependency on external assets or CDNs) for air-gapped and restricted-access environments
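
As a rough illustration of the offline rendering step, here is a minimal sketch rather than the code in this PR. The names `PRINT_CSS`, `is_local_url`, `offline_fetcher`, and `render_pdf` are hypothetical, and the sketch assumes a WeasyPrint release that still exposes `default_url_fetcher`. It passes a custom `url_fetcher` that refuses anything other than `file:` and `data:` URLs, so a stray CDN reference cannot trigger a network request.

```python
from urllib.parse import urlparse

# Assumed print stylesheet; the real rules live in the PR's own CSS.
PRINT_CSS = "@page { size: A4; margin: 2cm; }"


def is_local_url(url: str) -> bool:
    """Return True for file: and data: URLs, which need no network access."""
    return urlparse(url).scheme in ("file", "data")


def offline_fetcher(url: str):
    """URL fetcher that rejects every non-local resource."""
    if not is_local_url(url):
        raise ValueError(f"blocked non-local URL: {url}")
    # Imported lazily so this module loads even where WeasyPrint is absent.
    from weasyprint import default_url_fetcher

    return default_url_fetcher(url)


def render_pdf(html_path: str, pdf_path: str) -> None:
    """Render one local HTML file to PDF with the print stylesheet applied."""
    from weasyprint import CSS, HTML

    HTML(filename=html_path, url_fetcher=offline_fetcher).write_pdf(
        pdf_path, stylesheets=[CSS(string=PRINT_CSS)]
    )
```

Calling `render_pdf("build/html/guide.html", "pdfs/guide.pdf")` (both paths are placeholders) would write one PDF. Blocking at the fetcher level is one option; stripping remote references from the HTML beforehand is another.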

BeautifulSoup is used to:

Strip out internet-only elements (e.g., deployment badges, external links)
Remove navigation components like "On This Page" sidebars
Normalize inline image behavior and heading structures
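
The cleanup pass could look roughly like the sketch below. The CSS selectors in `WEB_ONLY_SELECTORS` and the function name `clean_html` are assumptions for illustration; the classes the Sphinx theme actually emits may differ. External links are unwrapped so that their text survives without the unreachable URL.

```python
# Hypothetical selectors: adjust to the classes the Sphinx theme really emits.
WEB_ONLY_SELECTORS = [".deployment-badge", ".toc-drawer"]


def clean_html(html: str) -> str:
    """Drop web-only elements and unwrap external links in an HTML string."""
    # Third-party dependency (beautifulsoup4), imported lazily.
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")

    # Remove badges, sidebars, and similar nodes together with their content.
    for selector in WEB_ONLY_SELECTORS:
        for node in soup.select(selector):
            node.decompose()

    # Keep the visible text of external links but drop the anchor itself.
    for link in soup.find_all("a", href=True):
        if link["href"].startswith(("http://", "https://")):
            link.unwrap()

    return str(soup)
```

Relative links such as `intro.html` are left alone here, on the assumption that they can be rewritten to in-document anchors once the sections are merged.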

Custom PDF builder script (generate_pdfs.py):

Merges multiple HTML sections into a single printable HTML document per guide
Injects a styled table of contents with page estimates
Applies print-optimized CSS and layout rules
Outputs clean PDFs into a dedicated /pdfs directory
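
To make the merge and table-of-contents steps concrete, here is a small standard-library sketch. It is not the logic in `generate_pdfs.py`: the function names, the `section-N` anchor scheme, and the word-count heuristic behind the page estimate (`WORDS_PER_PAGE`) are all assumptions.

```python
import html
import math
import re

WORDS_PER_PAGE = 400  # assumed text density; tune against real output


def estimate_pages(body_html: str) -> int:
    """Estimate the page count of an HTML fragment from its word count."""
    text = re.sub(r"<[^>]+>", " ", body_html)  # crude tag stripping
    return max(1, math.ceil(len(text.split()) / WORDS_PER_PAGE))


def build_guide_html(title: str, sections: list[tuple[str, str]]) -> str:
    """Merge (heading, body_html) pairs into one printable HTML document."""
    toc, bodies = [], []
    for index, (heading, body) in enumerate(sections, start=1):
        anchor = f"section-{index}"
        pages = estimate_pages(body)
        toc.append(
            f'<li><a href="#{anchor}">{html.escape(heading)}</a> '
            f"(~{pages} p.)</li>"
        )
        bodies.append(
            f'<section id="{anchor}"><h1>{html.escape(heading)}</h1>'
            f"{body}</section>"
        )
    return (
        "<!DOCTYPE html><html><head><meta charset='utf-8'>"
        f"<title>{html.escape(title)}</title></head><body>"
        f"<nav class='toc'><h1>Contents</h1><ol>{''.join(toc)}</ol></nav>"
        f"{''.join(bodies)}</body></html>"
    )
```

The resulting string can be written to disk and handed to the PDF renderer. Because the page estimates come from word counts alone, they are approximate and ignore images, tables, and page breaks.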

Output Guides (two PDFs are generated):

Operations content (Deployment, Security, Administration)
Application content (Use Cases, End User, Integrations)

Each PDF is self-contained, styled, and optimized for offline distribution. More iteration is needed on PDF look and feel.

cwarnermm added 2 commits August 1, 2025 14:54

Initial iteration: generate PDFs from HTML output

6eb10e3

Sample generated PDFs

945e51f

cwarnermm added the "Work In Progress" (not yet ready for review) and "Guidance" labels Aug 1, 2025

Merge branch 'master' into export-as-pdf

2b5d52e

cwarnermm changed the title ~~Convert static HTML output to PDF~~ POC: Convert static HTML output to PDF Aug 1, 2025
