@patrick91
Contributor

We were discussing a bug with @bellini666 and decided to spin up a reproduction and throw it at Claude Code to see if it found the issue, and I think it did :)

You can find the full Claude Code conversation here: https://gist.github.com/patrick91/beb523de5f3942ad6ce7636a2150c8e0

I think this closes #3450 😊

Related #2558

I can't share the PDFs we used because they are templates, but I'm pretty sure it reproduces with any PDFs. This is the script that we used to reproduce it:

import pathlib
import sys

from pypdf import PaperSize, PdfReader, PdfWriter, Transformation

here = pathlib.Path(__file__).parent

# This comes from https://github.com/py-pdf/pypdf/issues/3450#issuecomment-3275913390
should_do_workaround = "workaround" in sys.argv
print(f"Using workaround: {should_do_workaround}")

output = PdfWriter()
readers = []
for i in range(0, 40, 4):
    page = output.add_blank_page(width=PaperSize.A4[0], height=PaperSize.A4[1])
    for j in range(4):
        if i + j >= 40:
            break
        pdf = PdfReader(str(here / f"pdfs/output{i + j}.pdf"))
        if should_do_workaround:
            readers.append(pdf)
        src_page = pdf.pages[0]
        ctm = Transformation()
        # scale if needed, considering the original size, or rotate if it is landscape
        if src_page.mediabox.width > src_page.mediabox.height:
            ctm = ctm.rotate(90)
            scale = min(
                PaperSize.A4[1] / src_page.mediabox.width,
                PaperSize.A4[0] / src_page.mediabox.height,
            )
        else:
            ctm = ctm.rotate(0)
            scale = min(
                PaperSize.A4[0] / 2 / src_page.mediabox.width,
                PaperSize.A4[1] / 2 / src_page.mediabox.height,
            )

        ctm = ctm.scale(scale)
        if j == 1:
            ctm = ctm.translate(PaperSize.A4[0] / 2, 0)
        elif j == 2:
            ctm = ctm.translate(0, PaperSize.A4[1] / 2)
        elif j == 3:
            ctm = ctm.translate(PaperSize.A4[0] / 2, PaperSize.A4[1] / 2)

        page.merge_transformed_page(src_page, ctm, expand=True)

python_version = f"{sys.version_info.major}_{sys.version_info.minor}_{sys.version_info.micro}"

if should_do_workaround:
    python_version += "_workaround"

output.write(f"output_{python_version}.pdf")
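
A minimal sketch of the failure class the workaround seems to point at. This is an assumption drawn from the `readers.append(pdf)` workaround and the PR title, not something confirmed in this thread: bookkeeping that is keyed by `id()` does not keep the source object alive, so its entry can outlive the object. `FakeReader`, `id_translated` and `stale_entry_demo` are hypothetical names used only for illustration and no pypdf API is called.

```python
import gc
import weakref


class FakeReader:
    """Hypothetical stand-in for a source document object."""

    def __init__(self, name: str) -> None:
        self.name = name


def stale_entry_demo(keep_alive: bool) -> tuple[bool, int]:
    """Return (reader_still_alive, cached_entries) once the local name is dropped."""
    # Keyed by id(): an int key holds no reference to the object it came from.
    id_translated: dict[int, dict[int, int]] = {}
    # Mirrors the `readers` list that the workaround in the script fills.
    readers: list[FakeReader] = []

    reader = FakeReader("output0.pdf")
    probe = weakref.ref(reader)  # lets us observe whether the object was freed
    id_translated[id(reader)] = {1: 101}  # pretend object 1 was copied as 101
    if keep_alive:
        readers.append(reader)

    del reader
    gc.collect()
    return probe() is not None, len(id_translated)
```

On CPython, `stale_entry_demo(keep_alive=False)` reports the reader as gone while its table entry is still present, so a later object that happens to receive the same `id()` could pick up the stale entry. With `keep_alive=True` the reader stays alive, which is what the workaround relies on.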

@patrick91 changed the title from 'Fix missing "PreventGC" when cloning' to 'BUG: Fix missing "PreventGC" when cloning' on Nov 13, 2025
codecov bot commented Nov 13, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 97.14%. Comparing base (5dd8a42) to head (5f590cf).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #3520   +/-   ##
=======================================
  Coverage   97.14%   97.14%           
=======================================
  Files          57       57           
  Lines        9791     9792    +1     
  Branches     1775     1775           
=======================================
+ Hits         9511     9512    +1     
  Misses        168      168           
  Partials      112      112           

Contributor

@bellini666 bellini666 left a comment

This fixes the issue for me! 🙏

Collaborator

@stefan6419846 stefan6419846 left a comment

Thanks for the PR.

Instead of a generic type ignore, could you please use the actual error category there? Additionally, do you see any way to test this (in a manner that would fail before the change) to avoid accidentally breaking it during a refactoring, especially given that there are only a few occasions where this can be reproduced?

@patrick91
Contributor Author

> Thanks for the PR.
>
> Instead of a generic type ignore, could you please use the actual error category there? Additionally, do you see any way to test this (in a manner that would fail before the change) to avoid accidentally breaking it during a refactoring, especially given that there are only a few occasions where this can be reproduced?

I can try :)

For the type ignore, do you want me to update the other existing one?

@stefan6419846
Collaborator

> For the type ignore, do you want me to update the other existing one?

At least new code should use the "correct" approach, so updating the existing one is not necessary, but you are welcome to improve it as well.
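
To illustrate the difference between a generic and a scoped ignore, here is a small sketch. The function and variable names are hypothetical, and the bracketed codes are the ones I would expect mypy to report for this particular line, not necessarily the ones used in the PR; mypy prints the real category in square brackets at the end of each error message, and that is the value to copy.

```python
def build_table(source: object) -> dict[int, int]:
    """Store a keep-alive reference in a table whose declared types do not allow it."""
    table: dict[int, int] = {}

    # Generic form: hides every error the type checker reports on the line.
    # table["PreventGC"] = source  # type: ignore

    # Scoped form: hides only the named categories, so a new, unrelated
    # error on the same line would still be reported.
    table["PreventGC"] = source  # type: ignore[index, assignment]
    return table
```

At runtime both forms behave identically, since the comment only affects the type checker.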

@bellini666
Contributor

@stefan6419846 I pushed 2 tests, one for the new fix and another one for the old one.

Both will fail if I comment out the PreventGC line.
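
For context, one way such a regression test can be shaped is sketched below. It is not the test that was pushed: `FakeSource`, `register_source` and the `prevent_gc` switch are hypothetical stand-ins, and the switch only simulates commenting out the line under test.

```python
import gc
import weakref


class FakeSource:
    """Hypothetical stand-in for the object being cloned from."""


def register_source(tables: dict[int, dict], source: FakeSource, prevent_gc: bool = True) -> dict:
    """Create or fetch the per-source table and optionally pin the source in it."""
    table = tables.setdefault(id(source), {})
    if prevent_gc:
        table["PreventGC"] = source  # the line under test
    return table


def source_outlives_caller(prevent_gc: bool) -> bool:
    """Report whether the source survives after the caller drops its own reference."""
    tables: dict[int, dict] = {}
    source = FakeSource()
    probe = weakref.ref(source)
    register_source(tables, source, prevent_gc)
    del source
    gc.collect()
    return probe() is not None


def test_source_is_pinned() -> None:
    # Fails as soon as the pinning line is removed or disabled.
    assert source_outlives_caller(prevent_gc=True)
```

Probing with a weak reference keeps the check deterministic: it asserts that the object is still alive, and does not depend on the interpreter reusing an `id()`.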

@stefan6419846 stefan6419846 merged commit 103f0f9 into py-pdf:main Nov 14, 2025
16 checks passed
stefan6419846 added a commit that referenced this pull request Nov 16, 2025
## What's new

### New Features (ENH)
- Wrap and align text in flattened PDF forms (#3465) by @PJBrs

### Bug Fixes (BUG)
- Fix missing "PreventGC" when cloning (#3520) by @patrick91
- Preserve JPEG image quality by default (#3516) by @Lucas-C

[Full Changelog](6.2.0...6.3.0)

Development

Successfully merging this pull request may close these issues.

Generated PDF is corrupted when using Python 3.13.7 but not with 3.12.10
