
feat(compiler): WIP - HTML export with CDN interactivity#13

Open
2233admin wants to merge 1 commit into main from feature/html-export-draft

Conversation

@2233admin (Owner) commented May 12, 2026

Summary

WIP Draft PR for HTML export feature - integrates huashu-md-html patterns.

Features

  • HTML export via Pandoc with 4 themes (article/report/reading/interactive)
  • Wikilink conversion: [[note]] → <a href="...">
  • CDN libraries: Prism.js (code highlighting), Mermaid.js (diagrams), Chart.js (charts)
  • JS enhancements: copy buttons, callouts, tabs, dark mode toggle

CLI Usage

python compiler/compile.py vault/topic --export-html --theme reading

Known Issues (TODO)

  • Mermaid blocks need pre-processing to <div class="mermaid"> for proper rendering
  • Chart blocks (```chart``` fences) not yet implemented
  • Offline mode not supported (requires CDN)
  • Need more test coverage
  • The 4 themes still need differentiated styling
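The Mermaid pre-processing flagged in the list above could be sketched roughly as follows. This is a minimal illustration, not the PR's code: the function name and regex are assumptions, and the fence literal is assembled from a constant so this example's own code fence stays intact.

```python
import re

FENCE = "`" * 3  # literal ``` built indirectly to keep this example fence-safe
MERMAID_RE = re.compile(FENCE + r"mermaid\n(.*?)" + FENCE, re.DOTALL)

def mermaid_fences_to_divs(markdown: str) -> str:
    """Rewrite mermaid code fences as <div class="mermaid"> blocks.

    Mermaid.js auto-renders elements carrying the "mermaid" class, so the
    diagram source must reach the browser as plain text inside such a div
    rather than as a Pandoc-highlighted code block.
    """
    return MERMAID_RE.sub(
        lambda m: '<div class="mermaid">\n' + m.group(1).strip() + "\n</div>\n",
        markdown,
    )
```

This step would need to run before Pandoc so the diagram source is never treated as a code block.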

References

WIP draft - integrates huashu-md-html patterns with CDN libraries:

- HTML export via Pandoc with 4 themes (article/report/reading/interactive)
- Wikilink conversion: [[note]] → <a href=...>
- CDN libraries: Prism.js, Mermaid.js, Chart.js
- JS enhancements: copy buttons, callouts, tabs
- CLI: --export-html --theme <name>

Known issues (TODO):
- Mermaid/Chart blocks need pre-processing to <div class="mermaid">
- Offline mode not supported (requires CDN)
- More testing needed

Refs: https://thariqs.github.io/html-effectiveness

coderabbitai Bot commented May 12, 2026

📝 Walkthrough

This PR introduces a complete HTML export feature to the wiki compiler. Users trigger HTML export via new CLI flags (--export-html, --theme, --html-output-dir). The export pipeline preprocesses markdown (converting wikilinks to anchors, transforming callouts/tabs, removing Obsidian syntax), runs Pandoc to produce HTML, injects theme CSS and interactive JavaScript, and reports results. Browser-side enhancements add copy buttons, collapsible callouts, tabbed panels, TOC scroll spy, slide navigation, and dark-mode support. Four distinct visual themes (article, interactive, reading, report) provide different presentations via CSS.

Changes

HTML Export Pipeline

| Layer / File(s) | Summary |
| --- | --- |
| CLI Integration & Orchestration — compiler/compile.py | New CLI flags --export-html, --theme, --html-output-dir control export. The step_html_export() function wraps the export, handles graceful degradation when Pandoc is unavailable, and integrates results into the final report. HTML export runs even when no files are dirty. _print_report() now accepts optional export statistics. |
| Package API — compiler/html_export/__init__.py | Package initializer re-exports public types (ExportOptions, ExportReport) and functions (export_to_html, wikilinks_to_html) from submodules. |
| Export Core Engine & Wikilinks — compiler/html_export/exporter.py, compiler/html_export/wikilink_converter.py | Main export workflow: validates Pandoc, resolves theme CSS, preprocesses markdown (wikilink→anchor conversion, callout/tab rewriting, Obsidian cleanup), runs Pandoc with theme metadata, post-processes HTML to inject assets, and collects export statistics. The wikilink converter transforms Obsidian-style [[target]] and [[target\|display]] into HTML anchors with automatic slug-based paths. |
| Browser Enhancements — compiler/html_export/static/wiki.js, compiler/html_export/static/wiki.css | JavaScript enhances rendered pages with Prism syntax highlighting, copy-code buttons, collapsible callouts (with typed icons), tabbed navigation, TOC scroll spy, keyboard-driven slide navigation, smooth anchor scrolling, and a dark-mode toggle with persistence. CSS styles interactive elements and defines print-friendly overrides to hide UI controls. |
| Theme Templates — compiler/html_export/templates/{article,interactive,reading,report}/style.css | Four independent visual themes using CSS custom properties and responsive media queries. Article: editorial/Tufte-inspired with sidenotes. Interactive: book/tutorial with fixed sidebar. Reading: clean document layout with dark-mode support. Report: professional whitepaper with numbered sections and TOC. |
| Tests — compiler/tests/test_html_export.py | Unit tests covering wikilink conversion (simple, aliased, sections), slug generation, callout transformation, and end-to-end markdown processing. |
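The wikilink→anchor conversion described in the changes above could be sketched like this. The slug rules and the concepts/ prefix here are assumptions mirroring the review discussion, not the converter's exact behavior.

```python
import re

def slugify(name: str) -> str:
    """Illustrative slug: lowercase, non-alphanumerics collapsed to hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")

def wikilinks_to_html(text: str, folder: str = "concepts/") -> str:
    """Convert [[target]] and [[target|display]] into HTML anchors."""
    link_re = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]")

    def replace(match: re.Match) -> str:
        target = match.group(1).strip()
        display = (match.group(2) or target).strip()
        return f'<a href="{folder}{slugify(target)}.html">{display}</a>'

    return link_re.sub(replace, text)
```

Aliased links fall out naturally: the optional second capture group supplies the display text, otherwise the target itself is shown.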

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 A wizard's quill dips into markdown wells,
Weaving wikilinks into glowing HTML spells,
Callouts collapse, tabs dance, dark modes gleam—
Four themes paint the wiki's creative dream!
Copy buttons whisper, slides glide and flow,
Watch the rabbit's warren light up the show! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 78.85%, below the required 80.00% threshold. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly indicates the main feature being added: HTML export with CDN interactivity. It accurately reflects the primary changes in the changeset. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description check | ✅ Passed | The pull request description clearly relates to the changeset, explaining the HTML export feature with themes, wikilink conversion, CLI usage, and known issues. |


@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request adds an optional HTML export module to the compiler, enabling the generation of styled documentation from wiki content using Pandoc. The update includes several CSS themes, interactive JavaScript features, and pre-processing for Obsidian-specific syntax like callouts and wikilinks. Review feedback identifies several high-severity issues, including hardcoded relative paths for static assets and CSS that will fail in subdirectories, and flawed logic for processing tabs and collapsed callouts. Additionally, the reviewer pointed out that wikilinks are incorrectly hardcoded to the concepts directory and that code block language detection is currently non-functional.


def _inject_assets(html_content: str, has_interactive: bool = False) -> str:
    """Inject wiki.js, wiki.css, and CDN libraries into HTML content."""
    static_url = "static/"

high

The static_url is hardcoded as a relative path static/. This will break for HTML files generated in subdirectories like concepts/ or summaries/, as they would expect ../static/. Consider calculating the relative path based on the file's depth relative to the output root.
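One way to compute that depth-based prefix, as a sketch (the helper name and signature are hypothetical):

```python
from pathlib import Path

def static_prefix(output_file: Path, output_root: Path) -> str:
    """Return the "../" prefix needed to reach output_root from the
    directory containing output_file, e.g. "" for index.html and "../"
    for concepts/foo.html."""
    depth = len(output_file.parent.relative_to(output_root).parts)
    return "../" * depth
```

The exporter would then emit f'{static_prefix(...)}static/wiki.css' instead of the fixed "static/".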

]

if css_file and css_file.exists():
    cmd.append(f"--css=css/style.css")

high

The CSS path is hardcoded as css/style.css. Similar to the static assets issue, this relative path will be broken for HTML files located in subdirectories. It should be calculated relative to the output_file location.

Comment on lines +281 to +297
if tabs:
    output.append(text[last_end:])
    remaining = "".join(output)

    # Wrap all consecutive tabs in a tab-set div
    # This is a simplified approach - full implementation would need
    # to preserve surrounding content properly
    tab_content = ['<div class="tab-set">']
    for label, content in tabs:
        safe_label = re.sub(r"[^\w\s-]", "", label)
        tab_content.append(f'<div class="tab" data-label="{label}">')
        tab_content.append(content)
        tab_content.append("</div>")
    tab_content.append("</div>")

    return "".join(tab_content) + remaining


high

The logic for converting tabs is flawed. It collects all tab blocks found in the file and appends them to the top of the document (before the rest of the content), which breaks the original document structure and merges unrelated tab sets. Additionally, it leaves the outer ```tabs markers in the text. This should be implemented using re.sub with a callback to replace tab sets in their original positions.
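A re.sub-with-callback version along the lines the reviewer suggests might look like this. It is a sketch under the assumption that consecutive ```tab: fences form one group; the fence literal is assembled from a constant so this example's own code fence stays intact.

```python
import re

FENCE = "`" * 3  # literal ``` built indirectly to keep this example fence-safe

TAB_RE = re.compile(FENCE + r"tab:([^\n]+)\n(.*?)" + FENCE, re.DOTALL)
# A "tab set" is a run of tab fences separated by at most whitespace.
TAB_SET_RE = re.compile("(?:" + TAB_RE.pattern + r"\s*)+", re.DOTALL)

def convert_tab_sets(text: str) -> str:
    """Replace each run of consecutive tab fences in place, preserving
    surrounding content and keeping separate runs as separate tab sets."""
    def replace_set(match: re.Match) -> str:
        parts = ['<div class="tab-set">']
        for label, content in TAB_RE.findall(match.group(0)):
            parts.append(f'<div class="tab" data-label="{label.strip()}">')
            parts.append(content.strip())
            parts.append("</div>")
        parts.append("</div>")
        return "\n".join(parts) + "\n"
    return TAB_SET_RE.sub(replace_set, text)
```

Because the substitution happens at each match position, document order is preserved and two tab groups separated by prose stay separate.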

    lang = match.group(1).lower()
    content = match.group(2)
    # Use language-xxx class for Prism
    return f'```python\n{content}```'  # Let Pandoc handle the language

medium

The code block conversion hardcodes the language as python, ignoring the language detected in the markdown. Additionally, the replace_code_block function is defined but never called within _convert_code_blocks, which currently returns the original text unchanged. If the intention is to let the JavaScript library handle highlighting, this dead code should be removed; otherwise, it should be corrected to use the detected language.

Comment on lines +324 to +336
        # Convert > content lines to paragraphs
        paragraphs: list[str] = []
        for line in content.strip().split("\n"):
            line = line.lstrip("> ").strip()
            if line:
                paragraphs.append(f"<p>{line}</p>")

        return (
            f'<div class="callout callout-{callout_type}">\n'
            f"<strong>{title}</strong>\n"
            + "\n".join(paragraphs) +
            f"\n</div>\n"
        )

medium

Wrapping every line of a callout in <p> tags prevents Pandoc from correctly parsing block-level markdown elements (like lists or multiple paragraphs) inside the callout. It is better to strip the blockquote markers and let Pandoc process the inner content as a single block.

        # Strip blockquote markers and let Pandoc handle the markdown
        inner = "\n".join(l.lstrip("> ").strip() for l in content.strip().split("\n"))

        return (
            f'<div class="callout callout-{callout_type}">\n'
            f"<strong>{title}</strong>\n"
            f"{inner}\n"
            f"\n</div>\n"
        )

Comment on lines +346 to +349
collapsed_re = re.compile(
    r'<div class="callout callout-collapsed">\s*<p></p>\s*(.*?)\s*</div>',
    re.DOTALL
)

medium

The regex for identifying collapsed callouts expects <p></p> which is not produced by _convert_callouts. Additionally, _convert_callouts includes a <strong> tag for the title that this regex does not account for, causing the conversion to details/summary tags to fail.
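The looser matching the reviewer describes could be sketched as below. It assumes the upstream markup starts with a <strong> title (as _convert_callouts is said to emit); the function name and the "Details" fallback are illustrative.

```python
import re

# Tolerate arbitrary inner HTML instead of requiring a literal <p></p>.
COLLAPSED_RE = re.compile(
    r'<div class="callout callout-collapsed">\s*(.*?)\s*</div>',
    re.DOTALL,
)

def collapse_to_details(html: str) -> str:
    """Rewrite collapsed callouts as native <details>/<summary> blocks."""
    def repl(match: re.Match) -> str:
        inner = match.group(1)
        # Promote a leading <strong>title</strong> to the summary if present.
        title_m = re.match(r"<strong>(.*?)</strong>\s*", inner, re.DOTALL)
        title = title_m.group(1) if title_m else "Details"
        body = inner[title_m.end():] if title_m else inner
        return (
            '<details class="callout">'
            f"<summary>{title}</summary>{body}</details>"
        )
    return COLLAPSED_RE.sub(repl, html)
```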

return text


def _generate_index(wiki_dir: Path, output_dir: Path) -> str:

medium

The _generate_index function is defined but never called. The current export process relies on a pre-existing _index.md file, but the logic to dynamically generate a list of concepts and summaries is unused and would likely be more robust for automated exports.

    folder = parts[0] + "/"
    name = parts[1]
else:
    folder = "concepts/"

medium

Wikilinks are hardcoded to point to the concepts/ folder. This will result in broken links if the target is actually a summary (which are stored in summaries/). The converter should verify the target type or use a more flexible path resolution strategy.
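A more flexible resolution could probe the actual source tree, for example as below. This is a sketch: the folder names and the concepts/ fallback mirror the review comment, and real code would slugify the target first.

```python
from pathlib import Path

def resolve_target_folder(target: str, wiki_dir: Path) -> str:
    """Pick the folder prefix for a wikilink target by checking which
    source directory actually contains the note, instead of assuming
    concepts/. Falls back to concepts/ when the note is not found."""
    slug = target  # real code would slugify; kept simple here
    for folder in ("concepts", "summaries"):
        if (wiki_dir / folder / f"{slug}.md").exists():
            return f"{folder}/"
    return "concepts/"
```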


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 14

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@compiler/html_export/exporter.py`:
- Around line 114-145: The injected asset URLs use a relative "static/" path
which breaks for HTML files in subdirectories; in the _inject_assets function
(and the other injection sites referenced), change static_url to an absolute
path ("/static/") or otherwise build root-absolute URLs so links to "wiki.css"
and "wiki.js" always resolve (update the static_url variable and any occurrences
that construct f'{static_url}wiki.css' / f'{static_url}wiki.js' and the similar
injections at the other locations mentioned).
- Around line 261-297: The current loop over tab_block_re finds all tab blocks
and accumulates them into a single tabs list, which reorders content and merges
separate tab groups; change the logic in the for match in
tab_block_re.finditer(text) loop to build the output incrementally: append
non-matching slices to output as before, but detect contiguous runs of tab
blocks (use a flag like in_group and a current_group list) so when you see the
first tab in a consecutive run you start a new group, keep adding subsequent tab
matches to that group while their match.start() == last_end (or they are
immediately adjacent in the source), and when a non-adjacent match or gap is
encountered flush the current_group by emitting a single <div
class="tab-set">...</div> (constructing safe_label and inner tab divs using the
same label/content handling) into output, then continue; after the loop flush
any open group and append the trailing text (text[last_end:]) so original order
and separate tab groups are preserved (refer to tab_block_re, tabs/last_end
variables, and the tab-set construction).
- Line 10: The file has Ruff lint failures: remove unused imports and locals and
eliminate redundant f-string prefixes; specifically, if dataclass and field
(from the top import) are not used anywhere in exporter.py, delete that import,
otherwise apply them where intended (e.g., on any classes meant to be
dataclasses). Search for unused local variables flagged by Ruff and either use
them or remove them, and replace any f"..." strings that contain no
interpolations with regular string literals. Update symbols like dataclass,
field, and any functions/classes in exporter.py that currently declare unused
locals or use redundant f-strings so the linter errors for
unused-import/unused-variable and redundant-fstring are resolved.
- Around line 422-424: The code in exporter.py always yields wiki_dir /
"_index.md" when options.include_index is true, causing noise if that file
doesn't exist; update the branch that checks options.include_index to first
verify that (wiki_dir / "_index.md").exists() (or .is_file()) and only yield the
path and content when the file is present, leaving behavior unchanged otherwise.
- Around line 193-195: The hardcoded "--css=css/style.css" breaks for nested
outputs; instead compute the CSS path relative to the HTML output location and
append that to cmd. When css_file.exists() is true, compute relative_css =
os.path.relpath(css_file, start=html_output_parent) (or
css_file.relative_to(html_output_parent) using pathlib) and then
cmd.append(f"--css={relative_css}"), ensuring you use the actual output
directory/parent for the start path so nested files in concepts/ or summaries/
resolve the theme correctly.
- Around line 346-355: The current collapsed_re and replace_collapsed in
exporter.py expect an exact "<p></p>" inside div.callout-collapsed but upstream
_convert_callouts emits other tags (e.g. <strong> and <p> blocks), so the regex
never matches; update the regex used by collapsed_re (and keep
replace_collapsed) to allow any leading inner HTML (use DOTALL with a non-greedy
capture for the inner content instead of requiring "<p></p>") so it will match
the structure produced by _convert_callouts (reference collapsed_re and
replace_collapsed).
- Around line 153-157: The subprocess.run calls that invoke Pandoc (e.g., the
call with args ["pandoc", "--version"] and the other Pandoc invocation around
line ~199) must include a timeout to avoid hangs; add a timeout parameter
(preferably via a module-level constant like PANDOC_TIMEOUT) to both
subprocess.run(...) calls and wrap them to catch subprocess.TimeoutExpired so
you can log an error (or raise a controlled exception) and clean up rather than
blocking indefinitely. Ensure you update both occurrences that call
subprocess.run with ["pandoc", ...] and use the same timeout and exception
handling pattern for consistency.

In `@compiler/html_export/static/wiki.css`:
- Around line 124-130: The CSS rule under selector details.callout
summary::before uses the keyword "currentColor" which violates the stylelint
keyword-case rule; change the background property value from "currentColor" to
the lowercase "currentcolor" in that rule (selector: details.callout
summary::before) so it complies with the linter.

In `@compiler/html_export/static/wiki.js`:
- Around line 58-70: The click handler assumes navigator.clipboard.writeText
exists; wrap that call in a feature check and fallback so offline/non-secure
contexts don't break the UI: in the btn click listener (where btn and code are
used) first test if navigator.clipboard && typeof navigator.clipboard.writeText
=== "function" and call it, otherwise perform a safe fallback copy (e.g., create
a temporary textarea, select its contents and use document.execCommand('copy'))
and resolve/reject the same success/error flows so the existing success SVG swap
and failure "Failed" text still run; also ensure any thrown errors are caught
and routed to the failure branch instead of letting them bubble.
- Around line 82-117: The current loop in
document.querySelectorAll(".callout").forEach processes elements that are
already converted into collapsible blocks (i.e., <details class="callout ...">),
causing double-wrapping; update the iteration to skip nodes that are already
<details> (check callout.tagName or callout.matches('details') and
continue/return for those) so only non-<details> .callout elements are
transformed; keep the rest of the transformation (creating
details/summary/content, using getCalloutIcon, and replacing the node)
unchanged.
- Around line 137-170: The tabpanel IDs and aria-controls are reused per tab set
causing duplicate IDs; update the tab-set initialization to generate unique IDs
per set (e.g., include the tab set index or a unique prefix). Use the outer
forEach index (tabSet, setIndex) or a counter and change both the
btn.setAttribute("aria-controls", ...) and tab.id assignments from "tabpanel-" +
i to a unique name like "tabpanel-" + setIndex + "-" + i so each tab and its
control remain uniquely linked across multiple .tab-set elements.
- Around line 31-36: The Mermaid initialization currently sets securityLevel to
"loose" which allows HTML and click directives and can enable XSS; update the
mermaid configuration in the mermaid.initialize call to set securityLevel to
"strict" (i.e., change the securityLevel property in the
mermaid.initialize({...}) block) so Mermaid disables embedded HTML/click
handlers and prevents stored XSS in exported HTML.

In `@compiler/tests/test_html_export.py`:
- Line 9: The import statement importing wikilinks_to_html and slugify from
html_export.wikilink_converter is not alphabetically ordered; update the import
order so it sorts identifiers alphabetically (e.g., import slugify before
wikilinks_to_html) in the test file (referencing the symbols slugify and
wikilinks_to_html) to satisfy Ruff I001.
- Around line 92-93: Remove the redundant local import of Path inside the
test_markdown_processing test: locate the test function
(test_markdown_processing) and delete the inner "from pathlib import Path"
statement since Path is already imported at the top of the file; ensure there
are no remaining references that require re-importing and run the linter/tests
to confirm the Ruff I001 issue is resolved.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 049ba549-a458-436d-ad09-e3b2c347d019

📥 Commits

Reviewing files that changed from the base of the PR and between 51d3659 and c261391.

📒 Files selected for processing (11)
  • compiler/compile.py
  • compiler/html_export/__init__.py
  • compiler/html_export/exporter.py
  • compiler/html_export/static/wiki.css
  • compiler/html_export/static/wiki.js
  • compiler/html_export/templates/article/style.css
  • compiler/html_export/templates/interactive/style.css
  • compiler/html_export/templates/reading/style.css
  • compiler/html_export/templates/report/style.css
  • compiler/html_export/wikilink_converter.py
  • compiler/tests/test_html_export.py


import subprocess
import sys
from dataclasses import dataclass, field

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Resolve Ruff CI blockers before merge.

Lines 10, 63-76, 194, 221, 290-291, 335, and 598 match the current lint failures (unused import/locals and redundant f prefixes). This blocks CI.

Also applies to: 63-76, 194-194, 221-221, 290-291, 335-335, 598-598

🧰 Tools
🪛 GitHub Actions: CI / 3_lint-python.txt

[error] 10-10: Ruff F401: dataclasses.field imported but unused. Remove unused import dataclasses.field.

🪛 GitHub Actions: CI / lint-python

[error] 10-10: Ruff F401: dataclasses.field imported but unused. Remove unused import dataclasses.field.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@compiler/html_export/exporter.py` at line 10, The file has Ruff lint
failures: remove unused imports and locals and eliminate redundant f-string
prefixes; specifically, if dataclass and field (from the top import) are not
used anywhere in exporter.py, delete that import, otherwise apply them where
intended (e.g., on any classes meant to be dataclasses). Search for unused local
variables flagged by Ruff and either use them or remove them, and replace any
f"..." strings that contain no interpolations with regular string literals.
Update symbols like dataclass, field, and any functions/classes in exporter.py
that currently declare unused locals or use redundant f-strings so the linter
errors for unused-import/unused-variable and redundant-fstring are resolved.

Comment on lines +114 to +145
def _inject_assets(html_content: str, has_interactive: bool = False) -> str:
    """Inject wiki.js, wiki.css, and CDN libraries into HTML content."""
    static_url = "static/"

    # Build CDN links based on needs
    cdn_links: list[str] = []

    # Prism.js for code highlighting
    cdn_links.append(f' <link rel="stylesheet" href="{CDN_LINKS["prism"]["css"]}">')
    cdn_links.append(f' <script src="{CDN_LINKS["prism"]["js"]}"></script>')

    # Mermaid.js for diagrams
    cdn_links.append(f' <script src="{CDN_LINKS["mermaid"]["js"]}"></script>')

    # Chart.js for charts
    cdn_links.append(f' <script src="{CDN_LINKS["chart"]["js"]}"></script>')

    cdn_html = "\n ".join(cdn_links) + "\n"

    # Inject CSS before </head>
    css_link = f' <link rel="stylesheet" href="{static_url}wiki.css">\n'
    if "</head>" in html_content:
        html_content = html_content.replace("</head>", css_link + cdn_html + "</head>")
    else:
        html_content = css_link + cdn_html + html_content

    # Inject JS before </body>
    js_script = f' <script src="{static_url}wiki.js" defer></script>\n'
    if "</body>" in html_content:
        html_content = html_content.replace("</body>", js_script + "</body>")
    else:
        html_content = html_content + "\n" + js_script

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Injected static/ asset URLs break on subdirectory pages.

Lines 116/134/141 assume static/ relative to every HTML file. Files under concepts/ and summaries/ need ../static/..., otherwise wiki.css and wiki.js fail to load.

Proposed fix
-def _inject_assets(html_content: str, has_interactive: bool = False) -> str:
+def _inject_assets(html_content: str, static_prefix: str = "") -> str:
     """Inject wiki.js, wiki.css, and CDN libraries into HTML content."""
-    static_url = "static/"
+    static_url = f"{static_prefix}static/"
...
-def _run_pandoc(
+def _run_pandoc(
     input_file: Path,
     output_file: Path,
     css_file: Path | None,
     title: str = "",
+    static_prefix: str = "",
 ) -> bool:
...
-        html_content = _inject_assets(html_content)
+        html_content = _inject_assets(html_content, static_prefix=static_prefix)
-            if _run_pandoc(temp_md, output_file, css_dest, title):
+            static_prefix = "../" if subdir else ""
+            if _run_pandoc(temp_md, output_file, css_dest, title, static_prefix):

Also applies to: 204-208, 503-510

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@compiler/html_export/exporter.py` around lines 114 - 145, The injected asset
URLs use a relative "static/" path which breaks for HTML files in
subdirectories; in the _inject_assets function (and the other injection sites
referenced), change static_url to an absolute path ("/static/") or otherwise
build root-absolute URLs so links to "wiki.css" and "wiki.js" always resolve
(update the static_url variable and any occurrences that construct
f'{static_url}wiki.css' / f'{static_url}wiki.js' and the similar injections at
the other locations mentioned).

Comment on lines +153 to +157
result = subprocess.run(
    ["pandoc", "--version"],
    capture_output=True,
    text=True,
)

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify subprocess calls and whether they include timeout in this file.
rg -n 'subprocess\.run\(' compiler/html_export/exporter.py -C3

Repository: 2233admin/obsidian-llm-wiki

Length of output: 624


🏁 Script executed:

# Get full context around the subprocess calls
cat -n compiler/html_export/exporter.py | sed -n '140,210p'

Repository: 2233admin/obsidian-llm-wiki

Length of output: 2569


🏁 Script executed:

# Count all subprocess.run calls in the file to ensure we identified all of them
rg -c 'subprocess\.run\(' compiler/html_export/exporter.py

Repository: 2233admin/obsidian-llm-wiki

Length of output: 73


🏁 Script executed:

# Check if subprocess is imported and how it's used
head -20 compiler/html_export/exporter.py

Repository: 2233admin/obsidian-llm-wiki

Length of output: 464


🏁 Script executed:

# Check error handling context around line 199 to see if TimeoutError should be caught
rg -n 'except|raise' compiler/html_export/exporter.py -B2 -A2

Repository: 2233admin/obsidian-llm-wiki

Length of output: 1126


Add timeout parameters to Pandoc subprocess calls to prevent indefinite hangs.

The subprocess.run() calls at lines 153 and 199 lack timeout parameters, which can cause the export pipeline to block indefinitely if Pandoc becomes unresponsive.

Proposed fix
         result = subprocess.run(
             ["pandoc", "--version"],
             capture_output=True,
             text=True,
+            timeout=10,
         )
...
-    result = subprocess.run(cmd, capture_output=True, text=True)
+    result = subprocess.run(cmd, capture_output=True, text=True, timeout=120)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-result = subprocess.run(
-    ["pandoc", "--version"],
-    capture_output=True,
-    text=True,
-)
+result = subprocess.run(
+    ["pandoc", "--version"],
+    capture_output=True,
+    text=True,
+    timeout=10,
+)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@compiler/html_export/exporter.py` around lines 153 - 157, The subprocess.run
calls that invoke Pandoc (e.g., the call with args ["pandoc", "--version"] and
the other Pandoc invocation around line ~199) must include a timeout to avoid
hangs; add a timeout parameter (preferably via a module-level constant like
PANDOC_TIMEOUT) to both subprocess.run(...) calls and wrap them to catch
subprocess.TimeoutExpired so you can log an error (or raise a controlled
exception) and clean up rather than blocking indefinitely. Ensure you update
both occurrences that call subprocess.run with ["pandoc", ...] and use the same
timeout and exception handling pattern for consistency.

Comment on lines +193 to +195
if css_file and css_file.exists():
    cmd.append(f"--css=css/style.css")


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Theme CSS path is incorrect for nested HTML outputs.

Line 194 hardcodes --css=css/style.css. For outputs in concepts/ and summaries/, this resolves to a non-existent path and drops theme styling.

Proposed fix
+import os
...
     if css_file and css_file.exists():
-        cmd.append(f"--css=css/style.css")
+        rel_css = Path(os.path.relpath(css_file, output_file.parent)).as_posix()
+        cmd.append(f"--css={rel_css}")
🧰 Tools
🪛 GitHub Actions: CI / 3_lint-python.txt

[error] 194-194: Ruff F541: f-string without any placeholders (f"--css=css/style.css"). Remove extraneous f prefix.

🪛 GitHub Actions: CI / lint-python

[error] 194-195: Ruff F541: f-string without any placeholders. Remove extraneous f prefix from f"--css=css/style.css".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@compiler/html_export/exporter.py` around lines 193 - 195, The hardcoded
"--css=css/style.css" breaks for nested outputs; instead compute the CSS path
relative to the HTML output location and append that to cmd. When
css_file.exists() is true, compute relative_css = os.path.relpath(css_file,
start=html_output_parent) (or css_file.relative_to(html_output_parent) using
pathlib) and then cmd.append(f"--css={relative_css}"), ensuring you use the
actual output directory/parent for the start path so nested files in concepts/
or summaries/ resolve the theme correctly.

Comment on lines +261 to +297
tab_block_re = re.compile(
    r"```tab:([^\n]+)\n(.*?)```",
    re.DOTALL
)

tabs: list[tuple[str, str]] = []
output: list[str] = []
last_end = 0

for match in tab_block_re.finditer(text):
    # Collect any text before this match
    if match.start() > last_end:
        output.append(text[last_end:match.start()])

    label = match.group(1).strip()
    content = match.group(2).strip()
    tabs.append((label, content))
    last_end = match.end()

# If we found tabs, wrap them
if tabs:
    output.append(text[last_end:])
    remaining = "".join(output)

    # Wrap all consecutive tabs in a tab-set div
    # This is a simplified approach - full implementation would need
    # to preserve surrounding content properly
    tab_content = ['<div class="tab-set">']
    for label, content in tabs:
        safe_label = re.sub(r"[^\w\s-]", "", label)
        tab_content.append(f'<div class="tab" data-label="{label}">')
        tab_content.append(content)
        tab_content.append("</div>")
    tab_content.append("</div>")

    return "".join(tab_content) + remaining


⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Tab conversion currently reorders content and merges unrelated tab groups.

This implementation collects every tab: block in the document, then prepends a single <div class="tab-set">...</div> and appends all remaining text. It changes the original order and collapses separate tab groups into one.

🧰 Tools
🪛 GitHub Actions: CI / 3_lint-python.txt

[error] 290-291: Ruff F841: Local variable safe_label is assigned to but never used. Remove assignment to unused variable safe_label.


🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@compiler/html_export/exporter.py` around lines 261 - 297, The current loop
over tab_block_re finds all tab blocks and accumulates them into a single tabs
list, which reorders content and merges separate tab groups; change the logic in
the for match in tab_block_re.finditer(text) loop to build the output
incrementally: append non-matching slices to output as before, but detect
contiguous runs of tab blocks (use a flag like in_group and a current_group
list) so when you see the first tab in a consecutive run you start a new group,
keep adding subsequent tab matches to that group while their match.start() ==
last_end (or they are immediately adjacent in the source), and when a
non-adjacent match or gap is encountered flush the current_group by emitting a
single <div class="tab-set">...</div> (constructing safe_label and inner tab
divs using the same label/content handling) into output, then continue; after
the loop flush any open group and append the trailing text (text[last_end:]) so
original order and separate tab groups are preserved (refer to tab_block_re,
tabs/last_end variables, and the tab-set construction).

Comment on lines +58 to +70
btn.addEventListener("click", function () {
  var text = code.textContent || code.innerText;
  navigator.clipboard.writeText(text).then(
    function () {
      btn.innerHTML = '<svg xmlns="http://www.w3.org/2000/svg" width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"><polyline points="20 6 9 17 4 12"/></svg>';
      setTimeout(function () {
        btn.innerHTML = '<svg xmlns="http://www.w3.org/2000/svg" width="14" height="14" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2"><rect x="9" y="9" width="13" height="13" rx="2"/><path d="M5 15H4a2 2 0 0 1-2-2V4a2 2 0 0 1 2-2h9a2 2 0 0 1 2 2v1"/></svg>';
      }, 2000);
    },
    function () {
      btn.textContent = "Failed";
    }
  );

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard clipboard writes for offline/non-secure contexts.

Line 60 assumes navigator.clipboard.writeText is always available. In file:// exports or non-secure contexts this can fail at runtime and break copy UX.

Proposed fix
       btn.addEventListener("click", function () {
         var text = code.textContent || code.innerText;
+        if (
+          !window.isSecureContext ||
+          !navigator.clipboard ||
+          !navigator.clipboard.writeText
+        ) {
+          btn.textContent = "Unavailable";
+          return;
+        }
         navigator.clipboard.writeText(text).then(
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@compiler/html_export/static/wiki.js` around lines 58 - 70, The click handler
assumes navigator.clipboard.writeText exists; wrap that call in a feature check
and fallback so offline/non-secure contexts don't break the UI: in the btn click
listener (where btn and code are used) first test if navigator.clipboard &&
typeof navigator.clipboard.writeText === "function" and call it, otherwise
perform a safe fallback copy (e.g., create a temporary textarea, select its
contents and use document.execCommand('copy')) and resolve/reject the same
success/error flows so the existing success SVG swap and failure "Failed" text
still run; also ensure any thrown errors are caught and routed to the failure
branch instead of letting them bubble.

Comment on lines +82 to +117
document.querySelectorAll(".callout").forEach(function (callout) {
  var type = "";
  if (callout.classList.contains("callout-note")) type = "Note";
  else if (callout.classList.contains("callout-tip")) type = "Tip";
  else if (callout.classList.contains("callout-warning"))
    type = "Warning";
  else if (callout.classList.contains("callout-info")) type = "Info";
  else if (callout.classList.contains("callout-example"))
    type = "Example";

  // Check if already converted
  if (callout.querySelector("details")) return;

  var details = document.createElement("details");
  details.className = callout.className;

  var summary = document.createElement("summary");
  summary.innerHTML =
    '<span class="callout-icon">' +
    getCalloutIcon(type) +
    "</span> " +
    "<strong>" +
    type +
    "</strong>";

  // Move callout content into details
  var content = document.createElement("div");
  content.className = "callout-content";
  while (callout.firstChild) {
    content.appendChild(callout.firstChild);
  }

  details.appendChild(summary);
  details.appendChild(content);
  callout.parentNode.replaceChild(details, callout);
});

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Skip already-collapsible <details.callout> blocks to avoid double transformation.

Line 82 targets all .callout nodes, including <details class="callout ..."> produced by collapsed-callout conversion. Re-wrapping these breaks the intended structure.

Proposed fix
   function initCollapsibleCallouts() {
     document.querySelectorAll(".callout").forEach(function (callout) {
+      if (callout.tagName === "DETAILS") return;
       var type = "";
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@compiler/html_export/static/wiki.js` around lines 82 - 117, The current loop
in document.querySelectorAll(".callout").forEach processes elements that are
already converted into collapsible blocks (i.e., <details class="callout ...">),
causing double-wrapping; update the iteration to skip nodes that are already
<details> (check callout.tagName or callout.matches('details') and
continue/return for those) so only non-<details> .callout elements are
transformed; keep the rest of the transformation (creating
details/summary/content, using getCalloutIcon, and replacing the node)
unchanged.

Comment on lines +137 to +170
document.querySelectorAll(".tab-set").forEach(function (tabSet) {
  if (tabSet.querySelector(".tab-nav")) return; // Already initialized

  var tabs = tabSet.querySelectorAll(".tab");
  if (tabs.length === 0) return;

  var nav = document.createElement("div");
  nav.className = "tab-nav";
  nav.setAttribute("role", "tablist");

  tabs.forEach(function (tab, i) {
    var btn = document.createElement("button");
    btn.textContent = tab.getAttribute("data-label") || "Tab " + (i + 1);
    btn.setAttribute("role", "tab");
    btn.setAttribute("aria-selected", i === 0 ? "true" : "false");
    btn.setAttribute("aria-controls", "tabpanel-" + i);
    btn.className = i === 0 ? "active" : "";
    btn.addEventListener("click", function () {
      nav.querySelectorAll("button").forEach(function (b) {
        b.classList.remove("active");
        b.setAttribute("aria-selected", "false");
      });
      tabs.forEach(function (t) {
        t.style.display = "none";
      });
      btn.classList.add("active");
      btn.setAttribute("aria-selected", "true");
      tab.style.display = "block";
    });
    nav.appendChild(btn);

    tab.id = "tabpanel-" + i;
    tab.setAttribute("role", "tabpanel");
    if (i > 0) tab.style.display = "none";

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Generate unique tabpanel IDs across tab sets.

Lines 152 and 168 reuse tabpanel-0, tabpanel-1, etc. per tab set. Multiple tab sets on a page create duplicate IDs and invalid ARIA relationships.

Proposed fix
-    document.querySelectorAll(".tab-set").forEach(function (tabSet) {
+    document.querySelectorAll(".tab-set").forEach(function (tabSet, tabSetIndex) {
...
-        btn.setAttribute("aria-controls", "tabpanel-" + i);
+        var panelId = "tabpanel-" + tabSetIndex + "-" + i;
+        btn.setAttribute("aria-controls", panelId);
...
-        tab.id = "tabpanel-" + i;
+        tab.id = panelId;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@compiler/html_export/static/wiki.js` around lines 137 - 170, The tabpanel IDs
and aria-controls are reused per tab set causing duplicate IDs; update the
tab-set initialization to generate unique IDs per set (e.g., include the tab set
index or a unique prefix). Use the outer forEach index (tabSet, setIndex) or a
counter and change both the btn.setAttribute("aria-controls", ...) and tab.id
assignments from "tabpanel-" + i to a unique name like "tabpanel-" + setIndex +
"-" + i so each tab and its control remain uniquely linked across multiple
.tab-set elements.

# Add parent to path for imports
sys.path.insert(0, str(Path(__file__).parent.parent))

from html_export.wikilink_converter import wikilinks_to_html, slugify

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix import sorting on line 9.

The import order is not alphabetical, causing a Ruff I001 linter failure.

🔧 Proposed fix
-from html_export.wikilink_converter import wikilinks_to_html, slugify
+from html_export.wikilink_converter import slugify, wikilinks_to_html
🧰 Tools
🪛 GitHub Actions: CI / 3_lint-python.txt

[error] 9-9: Ruff I001: Import block is un-sorted or un-formatted. Organize imports.


🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@compiler/tests/test_html_export.py` at line 9, The import statement importing
wikilinks_to_html and slugify from html_export.wikilink_converter is not
alphabetically ordered; update the import order so it sorts identifiers
alphabetically (e.g., import slugify before wikilinks_to_html) in the test file
(referencing the symbols slugify and wikilinks_to_html) to satisfy Ruff I001.

Comment on lines +92 to +93
from html_export.exporter import _process_markdown
from pathlib import Path

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Remove duplicate import inside function.

Path is already imported at line 4. The import on line 93 inside test_markdown_processing is redundant and causes a Ruff I001 linter failure.

🔧 Proposed fix
 def test_markdown_processing():
     """Test full markdown processing pipeline."""
     from html_export.exporter import _process_markdown
-    from pathlib import Path
 
     text = """# Test Document
🧰 Tools
🪛 GitHub Actions: CI / 3_lint-python.txt

[error] 92-93: Ruff I001: Import block is un-sorted or un-formatted. Organize imports.


🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@compiler/tests/test_html_export.py` around lines 92 - 93, Remove the
redundant local import of Path inside the test_markdown_processing test: locate
the test function (test_markdown_processing) and delete the inner "from pathlib
import Path" statement since Path is already imported at the top of the file;
ensure there are no remaining references that require re-importing and run the
linter/tests to confirm the Ruff I001 issue is resolved.
