improve evaluation:

Robert Sachunsky requested to merge improved-evaluation into master
  • recombine combining characters with the preceding base letter, apply this in all metrics, and remove the combining-e-umlauts metric
  • introduce parameter gt_level for historic_latin:
      • when gt_level < 3, add multi-character normalizations to historic_latin (historic ligatures and MUFI)
      • when gt_level == 1, use single-character equivalences beyond NFKC
  • encapsulate edit-distance counting in a class, and use a parallel aggregation algorithm for accurate mean and variance estimates
  • expose the metric, gt_level and confusion parameters in the standalone CLI eval
  • expose the confusion size parameter in the OCR-D CLI evaluate
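
For readers unfamiliar with the parallel aggregation mentioned above, here is a minimal sketch of one common way to do it: Welford's online update for single values plus the pairwise merge formula by Chan et al. for combining partial results. The class name `RunningStats` and its fields are hypothetical and are not taken from this merge request; whether the actual implementation uses exactly this form is an assumption.

```python
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Hypothetical accumulator for per-line error rates (illustration only)."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the current mean

    def add(self, value: float) -> None:
        # Welford's online update: numerically stable for a single new value.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    def merge(self, other: "RunningStats") -> "RunningStats":
        # Pairwise combination (Chan et al.): merges two partial aggregates
        # without revisiting the individual values.
        total = self.count + other.count
        if total == 0:
            return RunningStats()
        delta = other.mean - self.mean
        mean = self.mean + delta * other.count / total
        m2 = self.m2 + other.m2 + delta * delta * self.count * other.count / total
        return RunningStats(total, mean, m2)

    @property
    def variance(self) -> float:
        # Population variance; returns 0.0 for an empty aggregate.
        return self.m2 / self.count if self.count else 0.0
```

Partial aggregates computed per file or per worker can then be merged in any order, which avoids the loss of precision that comes from accumulating a plain sum of squares.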