What's Changed
- Fix typo in README.md by @konstantinschulz in #25
- Remove unused batch files by @stweil in #26
- Fix dockerimage creation by @joschrew in #27
- rewrite: Pythonic, ocrd v3, utilise page-level annotation by @bertsky in #28
  - use Python instead of bashlib: faster, more flexible
  - include a distribution of the PRImA PDF converter as package data
  - instead of just the original image files, extract image data from the PAGE annotation, including any AlternativeImage
  - for that, introduce params `image_feature_selector` and `image_feature_filter` (e.g. `cropped,deskewed,binarized`)
  - support processing with METS Server and all new `ocrd>=3.0` user-configurable features (page-parallel processing, page timeouts, error handling)
  - extend `negative2zero` to full PAGE validation and repairs for coordinates
  - back the `font` parameter by downloadable resources (`ocrd resmgr`); provide a variety of preconfigured fonts
  - `multipage`: add setting `pagelabels=pagelabels` for `@ORDER` and `@ORDERLABEL` from physical structMap
  - `multipage`: add parameter `multipage_only` to only keep the document-wide PDF, not the page-wise PDF files
  - `multipage`: add logical structMap divs as outline labels (PDF bookmarks)
  - `multipage`: improve and add more metadata, use proper formatting (string encoding, dates)
  - `multipage`: add MODS as extra XMP metadata payload
  - improve logging and relaying error messages
  - add processor `ocrd-altotopdf` (with limited features) besides `ocrd-pagetopdf`
  - add regression tests, CI and CD
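
As a rough illustration of how the new parameters might be combined, here is a minimal Python sketch that assembles an `ocrd-pagetopdf` command line. The file group names (`OCR-D-OCR`, `OCR-D-PDF`), the parameter values, the assumption that `multipage_only` is a boolean, and the helper `build_pagetopdf_command` are all hypothetical and not taken from this release; check the processor's `--help` output for the actual parameter types and defaults.

```python
import shlex


def build_pagetopdf_command(input_grp, output_grp, params):
    """Assemble an ocrd-pagetopdf call; each parameter becomes a -P NAME VALUE pair."""
    cmd = ["ocrd-pagetopdf", "-I", input_grp, "-O", output_grp]
    for name, value in params.items():
        # Assumption: booleans are passed in JSON style (true/false)
        if isinstance(value, bool):
            value = "true" if value else "false"
        cmd += ["-P", name, str(value)]
    return cmd


# Hypothetical values, chosen only for illustration
params = {
    "image_feature_selector": "binarized",  # use binarized images from the PAGE annotation
    "multipage": "document",                # assumed name for the document-wide PDF
    "multipage_only": True,                 # assumed boolean: drop the page-wise PDFs
}

command = build_pagetopdf_command("OCR-D-OCR", "OCR-D-PDF", params)
print(shlex.join(command))
```

The sketch only prints the command instead of running it, so it can be inspected before use in a workspace.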
Full Changelog: v1.1.0...v0.2.0