Is there any way that I can identify whether the PDF is edited/tampered and the exact location where the PDF is edited/tampered using Python? #892

In your example cases, the question "Are the PDFs equal?" is easy to answer.
Compare the files' creation / modification timestamps, their sizes, or both. Python's built-in os module is the best tool for this.
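
A minimal sketch of that file-level check, assuming both PDFs are on the local disk. The helper names `file_fingerprint` and `files_look_identical` are made up for illustration. Note that this only tells you whether the two files differ in size or modification time, not what was changed.

```python
import os


def file_fingerprint(path):
    """Return (size in bytes, last-modification time) for a file."""
    st = os.stat(path)
    return st.st_size, st.st_mtime


def files_look_identical(path_a, path_b):
    """Cheap check only: True if size and modification time both match."""
    return file_fingerprint(path_a) == file_fingerprint(path_b)
```

For the two files in this thread you would call `files_look_identical("sbi statment_out2.pdf", "sbi statment_out2_Sejda_edited.pdf")`.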

On the PDF level, you can look at some core indicators:

>>> import fitz
>>> doc1 = fitz.open("sbi statment_out2.pdf")
>>> doc2 = fitz.open("sbi statment_out2_Sejda_edited.pdf")
>>> from pprint import pprint
>>> pprint(doc1.metadata)
{'author': '',
 'creationDate': "D:20200911140637+05'30'",
 'creator': '',
 'encryption': None,
 'format': 'PDF 1.4',
 'keywords': '',
 'modDate': "D:20200911140637+05'30'",
 'producer': 'iText 2.0.4 (by lowagie.com)',
 'subject': '',
 'ti…
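
Rather than comparing the two metadata dictionaries by eye, you could diff them. The following is only a sketch: `diff_metadata` and `compare_pdf_metadata` are hypothetical helper names, and the only PyMuPDF features assumed are `fitz.open()`, `Document.metadata` and `Document.close()`. A differing 'producer' or 'modDate' is a hint that the file was re-saved by another tool, not proof of tampering.

```python
def diff_metadata(meta_a, meta_b):
    """Return {key: (value_a, value_b)} for every key whose values differ."""
    keys = sorted(set(meta_a) | set(meta_b))
    return {
        k: (meta_a.get(k), meta_b.get(k))
        for k in keys
        if meta_a.get(k) != meta_b.get(k)
    }


def compare_pdf_metadata(path_a, path_b):
    """Open both PDFs with PyMuPDF and diff their metadata dictionaries."""
    import fitz  # PyMuPDF; imported lazily so diff_metadata works without it

    doc_a = fitz.open(path_a)
    doc_b = fitz.open(path_b)
    try:
        return diff_metadata(doc_a.metadata, doc_b.metadata)
    finally:
        doc_a.close()
        doc_b.close()
```

An empty result means the metadata gives no indication of a change; the page contents may still differ.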

Answer selected by JorjMcKie