Error Message: MuPDF error: format error: cmsOpenProfileFromMem failed #3572
I am processing over 35,000 PDFs and keep getting the error message "MuPDF error: format error: cmsOpenProfileFromMem failed". I cannot find any reference to this error in the documentation, and Google is not much help. Is anyone familiar with A) what this error refers to and B) the ramifications of this error for data extraction? Thanks in advance!
Replies: 2 comments 3 replies
This message is issued when unsuccessfully trying to interpret ICC color profiles. Presumably there exist erroneous profiles in some PDFs.
I got the following advice on Discord, which worked well for my use case (I don't care about PDF images at all).
The message is issued from our base library, MuPDF, so you may want to ask for more background in the public MuPDF Discord channel.
As a quick workaround, why don't you wrap the processing in try/except clauses?
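A minimal sketch of the try/except idea for a large batch, so that one bad file does not stop the whole run. The helper names `process_batch` and `extract_text_pymupdf` are hypothetical, and the sketch assumes a PyMuPDF version recent enough to be imported as `pymupdf` (older releases are imported as `fitz`). It only catches errors that are raised as Python exceptions; messages that MuPDF merely prints are not affected.

```python
from typing import Callable, Dict, Iterable, Tuple


def extract_text_pymupdf(path: str) -> str:
    """Hypothetical per-file worker: return the plain text of one PDF."""
    import pymupdf  # imported lazily; assumes the `pymupdf` module name

    with pymupdf.open(path) as doc:
        return "\n".join(page.get_text() for page in doc)


def process_batch(
    paths: Iterable[str],
    extract: Callable[[str], str],
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Run `extract` on every path, recording failures instead of aborting."""
    results: Dict[str, str] = {}
    failures: Dict[str, str] = {}
    for path in paths:
        try:
            results[path] = extract(path)
        except Exception as exc:  # deliberately broad: keep the batch going
            failures[path] = repr(exc)
    return results, failures
```

Calling `process_batch(pdf_paths, extract_text_pymupdf)` returns the extracted text per file plus a dictionary of the files that raised, which can be inspected or retried later.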
Another option might be to switch off ICC support by executing `pymupdf.TOOLS.set_icc(False)`.
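For illustration, switching off ICC support once at program start, before any documents are opened, might look like the sketch below. The wrapper `disable_icc` is a hypothetical helper, not part of PyMuPDF; only `pymupdf.TOOLS.set_icc` comes from the library. Disabling ICC handling may change how colors in rendered or extracted images come out, which should not matter if you only need the text.

```python
def disable_icc() -> bool:
    """Turn off ICC color management in PyMuPDF, if it is installed.

    Returns True when the setting was applied and False when PyMuPDF
    could not be imported (hypothetical convenience behavior).
    """
    try:
        import pymupdf  # assumes the `pymupdf` module name (older: `fitz`)
    except ImportError:
        return False
    pymupdf.TOOLS.set_icc(False)  # global setting for this process
    return True


if __name__ == "__main__":
    if disable_icc():
        print("ICC support switched off")
    else:
        print("PyMuPDF not available")
```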