1. 07 Jan, 2022 1 commit
  2. 23 Dec, 2021 1 commit
  3. 14 Sep, 2019 1 commit
  4. 03 Sep, 2019 1 commit
  5. 09 May, 2019 1 commit
      add many more scripts, document recipes: · 1b6702d4
      Robert Schubert authored
      - improve tesserocr-batch: add multiprocessing
        and probability / confmat output
      - new script dta-txt2gt for DTA text-only GT
        (including line wrapping with hyphenation)
      - new script ocrd-gt2pkl to reduce METS/PAGE
        workspaces into plaintext pickle dumps
        (including alignment)
      - new script prob2pkl to combine OCR results
        for plaintext string and probabilities into
        pickle dump format
      - new script confmat2pkl to combine OCR results
        for plaintext string with alternatives and
        probabilities into pickle dump format
      - add proper module installation
  6. 04 Nov, 2018 1 commit
  7. 18 Oct, 2018 1 commit
      improved dta19-reduced script · 538c52de
      Robert Sachunsky authored
      - faster (batch processing, parallel)
      - tesseract OCR uses custom CLI (for batch processing, for confidence output)
      - added ocropus OCR (2 models)
      - generate confidence output (1-best only), encapsulate as tuples into pickle files
  8. 12 Sep, 2018 2 commits
  9. 11 Sep, 2018 2 commits
  10. 13 Jul, 2018 1 commit
  11. 14 Feb, 2018 1 commit
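
Commit 538c52de describes encapsulating 1-best confidence output as tuples in pickle files. The following is a minimal sketch of that idea, not the repository's actual code: it assumes one confidence value per character and a list of `(character, confidence)` tuples per line. The function names `dump_line` and `load_line` and the file name are hypothetical, and the real scripts may use a different layout.

```python
import os
import pickle
import tempfile


def dump_line(path, text, confidences):
    """Pair each character with its confidence and pickle the resulting list.

    Assumed layout (hypothetical): [(char, confidence), ...] for one text line.
    """
    if len(text) != len(confidences):
        raise ValueError("need exactly one confidence value per character")
    pairs = list(zip(text, confidences))
    with open(path, "wb") as handle:
        pickle.dump(pairs, handle)
    return pairs


def load_line(path):
    """Read back the list of (char, confidence) tuples written by dump_line."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmpdir:
        target = os.path.join(tmpdir, "line.pkl")  # hypothetical file name
        dump_line(target, "abc", [0.9, 0.8, 0.7])
        print(load_line(target))
```

Storing text and confidences together as tuples keeps the two aligned by construction, so a consumer never has to re-synchronise separate files.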