
v2.0.0

Latest
@bertsky released this 12 Mar 12:35
· 2 commits to master since this release
ee1fe28

What's Changed

  • Fix typo in README.md by @konstantinschulz in #25
  • Remove unused batch files by @stweil in #26
  • Fix dockerimage creation by @joschrew in #27
  • rewrite: Pythonic, ocrd v3, utilise page-level annotation by @bertsky in #28
    • use Python instead of bashlib: faster, more flexible
    • include a distribution of the PRImA PDF converter as package data
    • instead of just the original image files, extract image data from the PAGE annotation, including any AlternativeImage
    • for that, introduce params image_feature_selector and image_feature_filter (e.g. cropped,deskewed,binarized)
    • support processing with METS Server and all new ocrd>=3.0 user-configurable features (page-parallel processing, page timeouts, error handling)
    • extend negative2zero to full PAGE validation and repairs for coordinates
    • back the font parameter by downloadable resources (ocrd resmgr); provide a variety of preconfigured fonts
    • multipage: add setting pagelabels=pagelabels for @ORDER and @ORDERLABEL from physical structMap
    • multipage: add parameter multipage_only to only keep the document-wide PDF, not the page-wise PDF files
    • multipage: add logical structMap divs as outline labels (PDF bookmarks)
    • multipage: improve and add more metadata, use proper formatting (string encoding, dates)
    • multipage: add MODS as extra XMP metadata payload
    • improve logging and relaying error messages
    • add processor ocrd-altotopdf (with limited features) besides ocrd-pagetopdf
    • add regression tests, CI and CD
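
As an illustration of the image_feature_selector and image_feature_filter parameters mentioned above, here is a minimal sketch of how such a selection could work. It assumes that each image derived for a page carries a comma-separated feature list (e.g. cropped,deskewed,binarized), that the selector names features an image must have, that the filter names features it must not have, and that the most recently derived matching image is preferred. The function names and the candidate list are hypothetical and are not taken from the processor's code.

```python
from __future__ import annotations


def parse_features(value: str) -> set[str]:
    """Split a comma-separated feature string such as 'cropped,deskewed' into a set."""
    return {item.strip() for item in value.split(",") if item.strip()}


def pick_image(candidates: list[tuple[str, str]],
               selector: str = "",
               filter_: str = "") -> str | None:
    """Return the name of the last candidate that has all selector features
    and none of the filter features, or None if no candidate qualifies.

    candidates is a list of (name, features) pairs in derivation order
    (original image first); this ordering is an assumption of the sketch.
    """
    required = parse_features(selector)
    forbidden = parse_features(filter_)
    for name, comments in reversed(candidates):
        features = parse_features(comments)
        if required <= features and not (forbidden & features):
            return name
    return None


# Hypothetical images for one page: the original plus two derived versions.
candidates = [
    ("orig.png", ""),
    ("bin.png", "binarized"),
    ("bin-deskewed.png", "binarized,deskewed"),
]

print(pick_image(candidates, selector="binarized"))
print(pick_image(candidates, selector="binarized", filter_="deskewed"))
```

With these hypothetical inputs, requiring binarized picks the most derived image, while additionally filtering out deskewed falls back to the image that was only binarized.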

Full Changelog: v1.1.0...v2.0.0