
feat: add new PDF document creation and image rendering capabilities #20

Merged
balazs-szucs merged 3 commits into grimmory-tools:main from balazs-szucs:new-stuff
Mar 29, 2026

Conversation

@balazs-szucs balazs-szucs commented Mar 29, 2026

Summary by CodeRabbit

  • New Features

    • Create new PDF documents.
    • Export pages as JPEG/PNG bytes (adjustable JPEG quality).
    • Read arbitrary metadata keys from PDFs.
    • Detect/enumerate embedded images and render individual embedded images.
    • Page-level blank detection (isBlank) and improved page rendering APIs.
  • Tests

    • Added coverage for rendering, encoding, metadata, image introspection, and format validation.
  • Chores

    • Bumped project version to 0.9.0 and improved internal parsing and regex handling.

@balazs-szucs balazs-szucs merged commit 0ee1f5e into grimmory-tools:main Mar 29, 2026
1 check passed

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/main/java/org/grimmory/pdfium4j/PdfDocument.java`:
- Around line 621-622: The Javadoc for renderPageToBytes promises an
IllegalArgumentException for invalid pageIndex but the implementation calls
page(pageIndex) which throws PdfiumException; update the implementation to
explicitly validate pageIndex (use getPageCount()/pageCount or existing
pageCount field) before calling page(pageIndex) and throw an
IllegalArgumentException with a clear message when out of range, so the method
contract matches the docs. Apply the same explicit validation to any other
methods in this file that call page(pageIndex), such as the overloads around
lines 635–648.
- Around line 609-611: renderAllPages currently calls renderPages(0, pageCount()
- 1, dpi) which throws for an empty document; modify PdfDocument.renderAllPages
to check pageCount() first and return Collections.emptyMap() (or new
HashMap<>()) when pageCount() == 0, otherwise call renderPages(0, pageCount() -
1, dpi). Reference: renderAllPages, renderPages, and pageCount().
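
The two PdfDocument findings above can be illustrated with a minimal sketch. `PageGuardSketch` is a hypothetical stand-in, not the real `PdfDocument`: the map's key and value types, the placeholder byte array, and the exception message are assumptions made only to keep the example self-contained.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for PdfDocument; only the guard logic is illustrated. */
final class PageGuardSketch {
  private final int pageCount;

  PageGuardSketch(int pageCount) {
    this.pageCount = pageCount;
  }

  int pageCount() {
    return pageCount;
  }

  /** Validates the index up front so the documented exception type is thrown. */
  byte[] renderPageToBytes(int pageIndex) {
    if (pageIndex < 0 || pageIndex >= pageCount) {
      throw new IllegalArgumentException(
          "pageIndex " + pageIndex + " out of range [0, " + pageCount + ")");
    }
    // Placeholder payload; the real method would render the page here.
    return new byte[] {(byte) pageIndex};
  }

  /** Returns an empty map for an empty document instead of asking for the range 0..-1. */
  Map<Integer, byte[]> renderAllPages() {
    if (pageCount == 0) {
      return Collections.emptyMap();
    }
    Map<Integer, byte[]> rendered = new LinkedHashMap<>();
    for (int i = 0; i < pageCount; i++) {
      rendered.put(i, renderPageToBytes(i));
    }
    return rendered;
  }
}
```

With this shape, `new PageGuardSketch(0).renderAllPages()` yields an empty map, and an out-of-range index fails fast with `IllegalArgumentException` before any native call would be attempted.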

In `@src/main/java/org/grimmory/pdfium4j/PdfSaver.java`:
- Around line 32-37: Reformat the Pattern constant declarations in PdfSaver
(METADATA_REF_PATTERN, ROOT_REF_PATTERN, INFO_REF_PATTERN) to satisfy project
Spotless rules—either run ./gradlew spotlessApply or update the three
declarations to match the code style (spacing/line breaks) enforced by Spotless
so the spotlessJavaCheck passes; after reformatting, verify the PdfSaver class
compiles and the constants remain initialized with Pattern.compile("/...") as
shown.
- Line 38: The OBJ_NUM_PATTERN in PdfSaver only matches object declarations with
generation 0 and can miss higher-generation objects; update the regex to match
any generation (e.g. change "(\\d+)\\s+0\\s+obj\\b" to a pattern that allows any
generation like "(\\d+)\\s+\\d+\\s+obj\\b") and ensure code that computes
nextObj (and any other usages at the other occurrence around the 412-416 region)
continues to use the first capture group (object number) to compute the maximum
object id.
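
A small sketch of the generation-agnostic pattern suggested above. The class name `ObjNumSketch` and the helper `nextObjectNumber` are hypothetical; how PdfSaver actually computes its next object number is not shown in this review, so treat this as an illustration of the regex only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ObjNumSketch {
  // Matches "<objectNumber> <generation> obj"; group 1 is the object number.
  static final Pattern OBJ_NUM_PATTERN = Pattern.compile("(\\d+)\\s+\\d+\\s+obj\\b");

  /** Scans the text for object headers and returns the highest object number plus one. */
  static int nextObjectNumber(String pdfText) {
    int max = 0;
    Matcher m = OBJ_NUM_PATTERN.matcher(pdfText);
    while (m.find()) {
      max = Math.max(max, Integer.parseInt(m.group(1)));
    }
    return max + 1;
  }
}
```

For the input `"1 0 obj\n<<>>\nendobj\n7 2 obj\n<<>>\nendobj\n"`, the generation-0-only pattern would see only object 1, whereas this version also sees object 7 (generation 2) and returns 8.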

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 261765b6-e292-43d7-91d8-b9391096ef8d

📥 Commits

Reviewing files that changed from the base of the PR and between 8cdafc3 and a6f9cac.

📒 Files selected for processing (5)
  • src/main/java/org/grimmory/pdfium4j/PdfDocument.java
  • src/main/java/org/grimmory/pdfium4j/PdfPage.java
  • src/main/java/org/grimmory/pdfium4j/PdfSaver.java
  • src/main/java/org/grimmory/pdfium4j/model/PdfBookMetadata.java
  • src/main/java/org/grimmory/pdfium4j/model/RenderResult.java
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/main/java/org/grimmory/pdfium4j/model/RenderResult.java
  • src/main/java/org/grimmory/pdfium4j/PdfPage.java
📜 Review details
🧰 Additional context used
🪛 GitHub Actions: CI
src/main/java/org/grimmory/pdfium4j/PdfSaver.java

[error] 31-34: spotlessJavaCheck failed: formatting violations detected (ROOT_REF_PATTERN and INFO_REF_PATTERN line breaks/formatting differ from Spotless requirements). Run './gradlew spotlessApply' to fix.

🔇 Additional comments (7)
src/main/java/org/grimmory/pdfium4j/model/PdfBookMetadata.java (6)

45-47: Regex constants are clearer and better scoped.

Good rename/refinement of the metadata parsing patterns; this improves readability and intent.


103-109: Keyword parsing now consistently uses the shared separator pattern.

Nice consistency update; behavior remains clear and dedup logic is preserved.


158-165: Author fallback splitting is now aligned with keyword splitting.

Using the same separator pattern here reduces parsing drift between fields.


178-178: Locale-stable ISBN identifier matching looks correct.

toLowerCase(Locale.ROOT) is the right choice for deterministic metadata normalization.
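
To show why the locale matters, here is a hedged sketch. `LocaleRootSketch` and both helper methods are hypothetical and do not reproduce the matching logic in PdfBookMetadata; they only contrast `Locale.ROOT` with a Turkish locale, where an uppercase `I` lowercases to a dotless `ı`.

```java
import java.util.Locale;

final class LocaleRootSketch {
  /** Locale-stable check: the result does not depend on the JVM default locale. */
  static boolean isIsbnIdentifier(String scheme) {
    return scheme != null && scheme.toLowerCase(Locale.ROOT).contains("isbn");
  }

  /** Same check with an explicit locale, to demonstrate the pitfall. */
  static boolean naiveMatch(String scheme, Locale locale) {
    return scheme != null && scheme.toLowerCase(locale).contains("isbn");
  }
}
```

`isIsbnIdentifier("ISBN")` is true on any machine, while `naiveMatch("ISBN", Locale.forLanguageTag("tr"))` is false because the lowercased string starts with U+0131 rather than `i`.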


191-192: Precompiled year pattern is a solid improvement.

Reusing FOUR_DIGIT_YEAR_PATTERN avoids repeated regex compilation in date parsing fallback.

Also applies to: 248-253
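
A minimal sketch of the precompiled-pattern idea. The regex assigned to `FOUR_DIGIT_YEAR_PATTERN` here is an assumption (the real constant's expression is not shown in this review), and `extractYear` is a hypothetical helper.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class YearFallbackSketch {
  // Compiled once at class load; the expression itself is assumed for illustration.
  private static final Pattern FOUR_DIGIT_YEAR_PATTERN = Pattern.compile("\\b(\\d{4})\\b");

  /** Returns the first standalone four-digit number in the input, if any. */
  static OptionalInt extractYear(String rawDate) {
    if (rawDate == null) {
      return OptionalInt.empty();
    }
    Matcher m = FOUR_DIGIT_YEAR_PATTERN.matcher(rawDate);
    return m.find() ? OptionalInt.of(Integer.parseInt(m.group(1))) : OptionalInt.empty();
  }
}
```

Calling `Pattern.compile` inside the method would recompile the expression on every call; a `static final` constant pays that cost once.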


301-305: ISBN cleanup/validation updates are consistent with the renamed patterns.

This keeps the ISBN sanitation path coherent and easy to follow.

src/main/java/org/grimmory/pdfium4j/PdfDocument.java (1)

895-899: Nice fix on pending standard metadata key lookup.

Resolving through MetadataTag.fromKey(key) before checking pendingMetadata restores read-your-writes behavior for case-insensitive standard keys.
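
The read-your-writes behavior can be sketched as follows. Everything here is hypothetical: the nested `MetadataTag` enum, its two constants, the `Optional`-returning `fromKey`, and the `setMetadata`/`metadata` methods are assumptions standing in for the real pdfium4j types, whose signatures are not shown in this review.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.Optional;

final class PendingMetadataSketch {
  /** Assumed shape of a standard-key enum with case-insensitive lookup. */
  enum MetadataTag {
    TITLE("Title"),
    AUTHOR("Author");

    final String key;

    MetadataTag(String key) {
      this.key = key;
    }

    static Optional<MetadataTag> fromKey(String key) {
      for (MetadataTag tag : values()) {
        if (tag.key.equalsIgnoreCase(key)) {
          return Optional.of(tag);
        }
      }
      return Optional.empty();
    }
  }

  private final Map<MetadataTag, String> pendingMetadata = new EnumMap<>(MetadataTag.class);

  void setMetadata(MetadataTag tag, String value) {
    pendingMetadata.put(tag, value);
  }

  /** Normalizes the raw key to a tag first, so "title" and "Title" hit the same pending entry. */
  Optional<String> metadata(String key) {
    return MetadataTag.fromKey(key).map(pendingMetadata::get);
  }
}
```

After `setMetadata(MetadataTag.TITLE, "Draft")`, a lookup with `metadata("title")` returns the pending value even though the casing differs from the canonical key.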
