Merged
Robert Sachunsky requested to merge `improved-evaluation` into `master`
- recombine combining characters with the preceding letter character, incorporate that into all metrics, and remove the metric `combining-e-umlauts`
- introduce parameter `gt_level` for `historic_latin`:
  - add multi-character normalizations to `historic_latin` (historic ligatures and MUFI) when `gt_level < 3`
  - use single-character equivalences beyond NFKC when `gt_level == 1`
- encapsulate counting edit distances into a class, and use a parallel aggregation algorithm for accurate mean and variance estimates
- expose metric, gtlevel and confusion params to standalone CLI `eval`
- expose confusion size param to OCR-D CLI `evaluate`
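
To illustrate the first item, here is a minimal sketch of how combining characters could be reattached to the preceding character before alignment. The function name `split_with_combining` is hypothetical and not taken from this branch; it only assumes that "combining" means a non-zero canonical combining class as reported by Python's `unicodedata`.

```python
import unicodedata


def split_with_combining(text):
    """Split text into units, gluing each combining mark onto the
    character before it (hypothetical helper, not the branch's code)."""
    units = []
    for char in text:
        # unicodedata.combining() returns the canonical combining class,
        # which is non-zero for combining marks such as U+0364
        if units and unicodedata.combining(char):
            units[-1] += char
        else:
            # base characters (and a stray leading mark) start a new unit
            units.append(char)
    return units
```

With units like these, a letter plus its combining small e (as in historic umlaut spellings) would count as a single symbol in any metric, which is presumably why a dedicated `combining-e-umlauts` metric is no longer needed.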
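
The mean and variance aggregation can be sketched as follows. This uses the pairwise update commonly attributed to Chan, Golub and LeVeque, which is one well-known "parallel" formulation; whether the branch uses exactly this variant is an assumption, and the class name `RunningStats` and its attributes are hypothetical.

```python
class RunningStats:
    """Numerically stable count/mean/variance accumulator
    (hypothetical sketch, not the class introduced by this branch)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the current mean

    def add(self, value):
        # single-sample (Welford) update
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    def merge(self, other):
        # pairwise update: combine two partial aggregates
        total = self.count + other.count
        if total == 0:
            return
        delta = other.mean - self.mean
        self.mean += delta * other.count / total
        self.m2 += other.m2 + delta * delta * self.count * other.count / total
        self.count = total

    @property
    def variance(self):
        # population variance; 0.0 when nothing has been added
        return self.m2 / self.count if self.count else 0.0
```

Partial aggregates (for example one per line or per page) can be merged in any order, which avoids the cancellation errors of accumulating a sum and a sum of squares separately.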