feat: add new PDF document creation and image rendering capabilities by balazs-szucs · Pull Request #20 · grimmory-tools/PDFium4j

balazs-szucs · 2026-03-29T13:01:53Z

Summary by CodeRabbit

New Features
- Create new PDF documents.
- Export pages as JPEG/PNG bytes (adjustable JPEG quality).
- Read arbitrary metadata keys from PDFs.
- Detect/enumerate embedded images and render individual embedded images.
- Page-level blank detection (isBlank) and improved page rendering APIs.

Tests
- Added coverage for rendering, encoding, metadata, image introspection, and format validation.

Chores
- Project version bumped to 0.9.0 and internal parsing/regex improvements.

coderabbitai

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/main/java/org/grimmory/pdfium4j/PdfDocument.java`:
- Around line 621-622: The Javadoc for renderPageToBytes promises an
IllegalArgumentException for invalid pageIndex but the implementation calls
page(pageIndex) which throws PdfiumException; update the implementation to
explicitly validate pageIndex (use getPageCount()/pageCount or existing
pageCount field) before calling page(pageIndex) and throw an
IllegalArgumentException with a clear message when out of range, so the method
contract matches the docs (apply the same explicit validation to any other
methods in this file that call page(pageIndex) such as the overloads at lines
around 635–648).
- Around line 609-611: renderAllPages currently calls renderPages(0, pageCount()
- 1, dpi) which throws for an empty document; modify PdfDocument.renderAllPages
to check pageCount() first and return Collections.emptyMap() (or new
HashMap<>()) when pageCount() == 0, otherwise call renderPages(0, pageCount() -
1, dpi). Reference: renderAllPages, renderPages, and pageCount().
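
The two PdfDocument findings above can be sketched as follows. This is a minimal illustration under stated assumptions, not the PDFium4j implementation: `SketchDocument`, its string-backed pages, and the method bodies are hypothetical, and the `dpi` parameter is omitted for brevity. Only the names `renderPageToBytes`, `renderAllPages`, `renderPages`, and `pageCount()` come from the review.

```java
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for PdfDocument; each "page" is just a string.
final class SketchDocument {
  private final List<String> pages;

  SketchDocument(List<String> pages) {
    this.pages = List.copyOf(pages);
  }

  int pageCount() {
    return pages.size();
  }

  // Validate the index up front so the documented exception type is the one thrown.
  byte[] renderPageToBytes(int pageIndex) {
    if (pageIndex < 0 || pageIndex >= pageCount()) {
      throw new IllegalArgumentException(
          "pageIndex " + pageIndex + " out of range [0, " + pageCount() + ")");
    }
    return pages.get(pageIndex).getBytes(StandardCharsets.UTF_8);
  }

  // Renders the inclusive range first..last; rejects ranges outside the document.
  Map<Integer, byte[]> renderPages(int first, int last) {
    if (first < 0 || last >= pageCount() || first > last) {
      throw new IllegalArgumentException("invalid page range " + first + ".." + last);
    }
    Map<Integer, byte[]> out = new LinkedHashMap<>();
    for (int i = first; i <= last; i++) {
      out.put(i, renderPageToBytes(i));
    }
    return out;
  }

  // An empty document short-circuits instead of requesting the range 0..-1.
  Map<Integer, byte[]> renderAllPages() {
    if (pageCount() == 0) {
      return Collections.emptyMap();
    }
    return renderPages(0, pageCount() - 1);
  }
}
```

With this shape, an out-of-range index surfaces as `IllegalArgumentException` before any page lookup happens, and `renderAllPages()` on a zero-page document returns an empty map rather than throwing.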

In `@src/main/java/org/grimmory/pdfium4j/PdfSaver.java`:
- Around line 32-37: Reformat the Pattern constant declarations in PdfSaver
(METADATA_REF_PATTERN, ROOT_REF_PATTERN, INFO_REF_PATTERN) to satisfy project
Spotless rules—either run ./gradlew spotlessApply or update the three
declarations to match the code style (spacing/line breaks) enforced by Spotless
so the spotlessJavaCheck passes; after reformatting, verify the PdfSaver class
compiles and the constants remain initialized with Pattern.compile("/...") as
shown.
- Line 38: The OBJ_NUM_PATTERN in PdfSaver only matches object declarations with
generation 0 and can miss higher-generation objects; update the regex to match
any generation (e.g. change "(\\d+)\\s+0\\s+obj\\b" to a pattern that allows any
generation like "(\\d+)\\s+\\d+\\s+obj\\b") and ensure code that computes
nextObj (and any other usages at the other occurrence around the 412-416 region)
continues to use the first capture group (object number) to compute the maximum
object id.
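
To illustrate the OBJ_NUM_PATTERN finding, the sketch below compares the generation-0-only regex quoted in the review with the suggested any-generation form. The class `ObjectNumberScan` and the method `nextObjectNumber` are hypothetical names for illustration; how PdfSaver actually computes `nextObj` is not shown in the review.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; not part of PdfSaver.
final class ObjectNumberScan {
  // Form quoted in the review: only matches "N 0 obj".
  static final Pattern GEN_ZERO_ONLY = Pattern.compile("(\\d+)\\s+0\\s+obj\\b");
  // Suggested form: matches "N G obj" for any generation G.
  static final Pattern ANY_GENERATION = Pattern.compile("(\\d+)\\s+\\d+\\s+obj\\b");

  // Returns one past the largest object number matched, or 1 if nothing matches.
  static int nextObjectNumber(Pattern pattern, CharSequence pdfText) {
    int max = 0;
    Matcher m = pattern.matcher(pdfText);
    while (m.find()) {
      // Group 1 is the object number in both patterns.
      max = Math.max(max, Integer.parseInt(m.group(1)));
    }
    return max + 1;
  }
}
```

For the input `"1 0 obj\n<<>>\nendobj\n7 2 obj\n<<>>\nendobj\n"`, the generation-0-only pattern sees only object 1 and yields 2, which would collide with the existing object 7 once enough objects are added; the any-generation pattern yields 8.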

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 261765b6-e292-43d7-91d8-b9391096ef8d

📥 Commits

Reviewing files that changed from the base of the PR and between 8cdafc3 and a6f9cac.

📒 Files selected for processing (5)

src/main/java/org/grimmory/pdfium4j/PdfDocument.java
src/main/java/org/grimmory/pdfium4j/PdfPage.java
src/main/java/org/grimmory/pdfium4j/PdfSaver.java
src/main/java/org/grimmory/pdfium4j/model/PdfBookMetadata.java
src/main/java/org/grimmory/pdfium4j/model/RenderResult.java

🚧 Files skipped from review as they are similar to previous changes (2)

src/main/java/org/grimmory/pdfium4j/model/RenderResult.java
src/main/java/org/grimmory/pdfium4j/PdfPage.java

📜 Review details

🧰 Additional context used

🪛 GitHub Actions: CI

src/main/java/org/grimmory/pdfium4j/PdfSaver.java

[error] 31-34: spotlessJavaCheck failed: formatting violations detected (ROOT_REF_PATTERN and INFO_REF_PATTERN line breaks/formatting differ from Spotless requirements). Run './gradlew spotlessApply' to fix.

🔇 Additional comments (7)

src/main/java/org/grimmory/pdfium4j/model/PdfBookMetadata.java (6)

45-47: Regex constants are clearer and better scoped.

Good rename/refinement of the metadata parsing patterns; this improves readability and intent.

103-109: Keyword parsing now consistently uses the shared separator pattern.

Nice consistency update; behavior remains clear and dedup logic is preserved.

158-165: Author fallback splitting is now aligned with keyword splitting.

Using the same separator pattern here reduces parsing drift between fields.
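
A minimal sketch of what sharing one separator pattern between keyword and author parsing might look like. The separator set (comma or semicolon), the class name `SeparatorSplit`, and the method `split` are assumptions for illustration; the project's actual pattern and dedup code are not shown in the review.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical helper; the real pattern in PdfBookMetadata may differ.
final class SeparatorSplit {
  // Assumed separators: comma or semicolon, with optional surrounding whitespace.
  static final Pattern SEPARATOR_PATTERN = Pattern.compile("\\s*[,;]\\s*");

  // Splits, trims, drops empty entries, and de-duplicates in first-seen order.
  static List<String> split(String raw) {
    Set<String> seen = new LinkedHashSet<>();
    if (raw != null) {
      for (String part : SEPARATOR_PATTERN.split(raw)) {
        String trimmed = part.trim();
        if (!trimmed.isEmpty()) {
          seen.add(trimmed);
        }
      }
    }
    return new ArrayList<>(seen);
  }
}
```

Routing both fields through one helper means a change to the separator set applies to keywords and authors at the same time, which is the drift the comment refers to.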

178-178: Locale-stable ISBN identifier matching looks correct.

toLowerCase(Locale.ROOT) is the right choice for deterministic metadata normalization.
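
The reason `Locale.ROOT` matters here can be shown with the Turkish locale, where an uppercase `I` lowercases to a dotless `ı` (U+0131). The class and method names below are hypothetical; only `toLowerCase(Locale.ROOT)` comes from the review.

```java
import java.util.Locale;

// Hypothetical helper for illustration.
final class LocaleStableLowercase {
  // Locale-independent lowercasing, as used in the reviewed change.
  static String normalize(String identifier) {
    return identifier.toLowerCase(Locale.ROOT);
  }

  // Locale-sensitive lowercasing, shown only for comparison.
  static String normalizeWith(String identifier, Locale locale) {
    return identifier.toLowerCase(locale);
  }
}
```

`normalize("ISBN")` always yields `"isbn"`, whereas `normalizeWith("ISBN", Locale.forLanguageTag("tr"))` yields `"ısbn"`, which would no longer match a literal `"isbn"` comparison.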

191-192: Precompiled year pattern is a solid improvement.

Reusing FOUR_DIGIT_YEAR_PATTERN avoids repeated regex compilation in date parsing fallback.

Also applies to: 248-253
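
As a rough sketch of the precompiled-pattern idea: the constant is compiled once at class initialization and reused on every call, instead of calling `Pattern.compile` (or `String.matches`) inside the parsing method. The regex itself, the class `YearFallback`, and the method `firstYear` are assumptions; only the constant name `FOUR_DIGIT_YEAR_PATTERN` appears in the review.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the project's actual regex is not shown in the review.
final class YearFallback {
  // Compiled once; assumed to match a standalone run of exactly four digits.
  static final Pattern FOUR_DIGIT_YEAR_PATTERN = Pattern.compile("\\b(\\d{4})\\b");

  // Returns the first standalone four-digit number in the text, if any.
  static OptionalInt firstYear(String rawDate) {
    if (rawDate == null) {
      return OptionalInt.empty();
    }
    Matcher m = FOUR_DIGIT_YEAR_PATTERN.matcher(rawDate);
    return m.find() ? OptionalInt.of(Integer.parseInt(m.group(1))) : OptionalInt.empty();
  }
}
```

Because of the word boundaries, this assumed regex only fits free-text dates such as `"Published March 2019"`; a packed string like `"D:20190329"` does not match and would need its own parsing path.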

301-305: ISBN cleanup/validation updates are consistent with the renamed patterns.

This keeps the ISBN sanitation path coherent and easy to follow.

src/main/java/org/grimmory/pdfium4j/PdfDocument.java (1)

895-899: Nice fix on pending standard metadata key lookup.

Resolving through MetadataTag.fromKey(key) before checking pendingMetadata restores read-your-writes behavior for case-insensitive standard keys.
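
The read-your-writes behavior described above can be sketched like this. Everything here is a hypothetical stand-in: the `Tag` enum imitates `MetadataTag`, `fromKey` is assumed to be case-insensitive and to return an `Optional`, and the `stored` map stands in for values already in the PDF.

```java
import java.util.EnumMap;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for the pending-metadata lookup in PdfDocument.
final class PendingMetadataSketch {
  // Imitates MetadataTag; the real enum's members and signature may differ.
  enum Tag {
    TITLE("Title"),
    AUTHOR("Author");

    final String key;

    Tag(String key) {
      this.key = key;
    }

    // Case-insensitive resolution of a raw key to a standard tag.
    static Optional<Tag> fromKey(String key) {
      for (Tag t : values()) {
        if (t.key.equalsIgnoreCase(key)) {
          return Optional.of(t);
        }
      }
      return Optional.empty();
    }
  }

  private final Map<Tag, String> pendingMetadata = new EnumMap<>(Tag.class);
  private final Map<String, String> stored = new HashMap<>();

  PendingMetadataSketch(Map<String, String> stored) {
    this.stored.putAll(stored);
  }

  void setMetadata(Tag tag, String value) {
    pendingMetadata.put(tag, value);
  }

  // Resolve the key to a tag first, so unsaved writes win regardless of key casing.
  Optional<String> metadata(String key) {
    Optional<Tag> tag = Tag.fromKey(key);
    if (tag.isPresent() && pendingMetadata.containsKey(tag.get())) {
      return Optional.of(pendingMetadata.get(tag.get()));
    }
    return Optional.ofNullable(stored.get(key));
  }
}
```

After `setMetadata(Tag.TITLE, "New")`, both `metadata("Title")` and `metadata("title")` return the pending value rather than the stored one.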

feat: add new PDF document creation and image rendering capabilities

8cdafc3

This comment was marked as resolved.

coderabbitai bot added backend feature labels Mar 29, 2026

This comment was marked as resolved.

balazs-szucs added 2 commits March 29, 2026 15:26

refactor: rename regex patterns for clarity and improve PDF metadata extraction logic

a6f9cac

refactor: fix formatting

482cc9b

balazs-szucs merged commit 0ee1f5e into grimmory-tools:main Mar 29, 2026
1 check passed

coderabbitai bot reviewed Mar 29, 2026

balazs-szucs merged 3 commits into grimmory-tools:main from balazs-szucs:new-stuff
