Verónica Romero, Joan-Andreu Sánchez, Alejandro H. Toselli. Active Learning in Handwritten Text Recognition using the Derivational Entropy. 2018 16th International Conference on Frontiers in Handwriting Recognition (ICFHR), 2018. pp. 291-296.

Handwritten Text Recognition (HTR) systems are based on statistical models, such as recurrent neural networks or hidden Markov models, for the optical modeling of characters. These models need large training corpora consisting of text line images with their corresponding transcripts. Manually annotating this training data is expensive because it is carried out by experts in paleography, who specialize in reading ancient scripts. One way to reduce the human annotation effort is to use Active Learning techniques to select the most informative samples for training. In this paper we study such an Active Learning technique in an HTR scenario, in which the expert paleographer transcribes only the most informative samples at each stage. The technique is based on the derivational entropy computed from the word-graphs obtained during the recognition process.
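
As an illustration of the selection criterion described in the abstract, the following is a minimal sketch of how an entropy over the paths of a word-graph could be computed and used to rank line images. It is not the authors' implementation: the abstract does not give the exact formulation, so the graph representation (a list of weighted edges with integer nodes in topological order), the function names `derivational_entropy` and `select_most_informative`, and the normalization by the total path weight are all assumptions made for this example.

```python
import math
from collections import defaultdict


def derivational_entropy(edges, start, end):
    """Entropy (in nats) of the distribution over start-to-end paths.

    Assumed representation (hypothetical, not from the paper):
    edges is a list of (src, dst, weight) tuples, nodes are integers
    numbered in topological order (src < dst), and weights are positive.
    A path's weight is the product of its edge weights; path
    probabilities are obtained by dividing by the total weight Z.
    """
    alpha = defaultdict(float)  # sum of path weights reaching each node
    r = defaultdict(float)      # sum of w(path) * log w(path) per node
    alpha[start] = 1.0
    # Sorting by source node processes edges in topological order.
    for src, dst, w in sorted(edges):
        if w <= 0.0:
            continue  # skip edges that carry no probability mass
        alpha[dst] += alpha[src] * w
        r[dst] += r[src] * w + alpha[src] * w * math.log(w)
    z = alpha[end]
    if z <= 0.0:
        raise ValueError("no path with positive weight from start to end")
    # H = log Z - (1/Z) * sum_paths w(path) * log w(path)
    return math.log(z) - r[end] / z


def select_most_informative(word_graphs, k):
    """Return the ids of the k lines whose word-graphs have highest entropy.

    word_graphs maps a line id to an (edges, start, end) triple.
    """
    scores = {
        line_id: derivational_entropy(edges, start, end)
        for line_id, (edges, start, end) in word_graphs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

Under these assumptions, a word-graph with a single path has entropy 0, and one with two equally likely paths has entropy log 2. Lines whose graphs spread probability over many competing hypotheses score higher and would be sent to the paleographer first.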