tests: look for regressions when converting PDFs #1089

Draft: wants to merge 1 commit into main

almet
Member

@almet almet commented Mar 5, 2025

I convert all the documents we have in our test suite and store the results in a reference folder. The tests then compare each newly converted document against its reference pixel by pixel, using PyMuPDF pixel buffers.
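As a rough illustration of the pixel-buffer comparison, here is a minimal sketch. It is not the code in this PR: the function names `render_pages` and `first_difference` are hypothetical, and it assumes rendering at PyMuPDF's default resolution is good enough for a regression check.

```python
def render_pages(pdf_path):
    """Render every page of a PDF to (width, height, raw pixel bytes)."""
    # Imported lazily so the comparison helper below works without PyMuPDF.
    try:
        import pymupdf  # module name in recent PyMuPDF releases
    except ImportError:
        import fitz as pymupdf  # module name in older PyMuPDF releases
    with pymupdf.open(pdf_path) as doc:
        pages = []
        for page in doc:
            pix = page.get_pixmap()
            pages.append((pix.width, pix.height, pix.samples))
        return pages


def first_difference(new_pages, ref_pages):
    """Return a description of the first mismatch, or None if identical."""
    if len(new_pages) != len(ref_pages):
        return f"page count differs: {len(new_pages)} != {len(ref_pages)}"
    for number, (new, ref) in enumerate(zip(new_pages, ref_pages), start=1):
        if new[:2] != ref[:2]:
            return f"page {number}: dimensions differ"
        if new[2] != ref[2]:
            return f"page {number}: pixels differ"
    return None
```

A test could then assert that `first_difference(render_pages(new), render_pages(reference))` is `None` and print the returned message on failure.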

We could use another diffing tool and tell it what's acceptable (if such a tool exists), but I believe that in the end what matters most is the developer experience when the output changes.

Two things come to mind:

  1. Inspecting the differences
  2. Updating the reference version(s)

Inspecting the diff

There are multiple tools for this, but I found diff-pdf to be good, and it can generate an output we can look at without having to run a GUI.

diff-pdf /tmp/pytest-of-alexis/pytest-current/sample-docx0.pdf ./tests/test_docs/reference/sample-docx.pdf -m --output-diff=diff.pdf

This produces this diff.pdf file for the changes between the 0.8.1 release and this commit.
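If we want to call diff-pdf from the tests or from CI, the command above could be wrapped along these lines. This is only a sketch: the helper names are hypothetical, it assumes `diff-pdf` is on the `PATH`, and it relies on diff-pdf's documented behaviour of exiting with 0 when the files match and 1 when they differ.

```python
import subprocess


def diff_pdf_command(new_pdf, reference_pdf, output_diff):
    """Build the diff-pdf invocation shown above as an argument list."""
    return [
        "diff-pdf",
        str(new_pdf),
        str(reference_pdf),
        "-m",  # mark the differences in the generated PDF
        f"--output-diff={output_diff}",
    ]


def pdfs_differ(new_pdf, reference_pdf, output_diff):
    """Run diff-pdf and report whether it exited with a non-zero status."""
    # Note: a non-zero status can also mean diff-pdf itself failed,
    # so a real helper may want to inspect the exit code more closely.
    result = subprocess.run(
        diff_pdf_command(new_pdf, reference_pdf, output_diff), check=False
    )
    return result.returncode != 0
```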

Update the reference version

We should have a command to bump all the reference documents (or a specific one).
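Such a command does not exist yet. One possible shape for it, as a sketch only: the function `update_references` and its parameters are hypothetical, and it assumes the freshly converted PDFs sit in one directory under the same file names as their references.

```python
import shutil
from pathlib import Path


def update_references(converted_dir, reference_dir, names=None):
    """Copy converted PDFs over the reference ones.

    If `names` is given, only those file names are updated;
    otherwise every PDF found in `converted_dir` is copied.
    Returns the sorted list of file names that were updated.
    """
    converted_dir, reference_dir = Path(converted_dir), Path(reference_dir)
    reference_dir.mkdir(parents=True, exist_ok=True)
    updated = []
    for pdf in sorted(converted_dir.glob("*.pdf")):
        if names is not None and pdf.name not in names:
            continue
        shutil.copyfile(pdf, reference_dir / pdf.name)
        updated.append(pdf.name)
    return updated
```

Passing `names={"sample-docx.pdf"}` would bump a single reference document, while omitting `names` bumps all of them.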


Status:

This PR currently only fails tests when there is a change in the output. I plan to do the following:

  • Check that PDF outputs are the same (pixel comparison) in our tests
  • Collect all differences and publish them as an artifact so we can inspect them, probably as part of the CI.
  • Add a tool to update all the reference documents.

Fixes #321

This stores a reference version of the converted PDFs and diffs them against
the newly converted documents during the tests.
@almet almet changed the title tests: test for regressions when converting PDFs when running the tests tests: look for regressions when converting PDFs when running the tests Mar 6, 2025
@almet almet changed the title tests: look for regressions when converting PDFs when running the tests tests: look for regressions when converting PDFs Mar 6, 2025
Successfully merging this pull request may close these issues.

Verify PDF output of CI tests