Publications

Abstract

Mauricio Villegas, Joan-Andreu Sánchez, Enrique Vidal. Optical Modelling and Language Modelling Trade-off for Handwritten Text Recognition. 13th International Conference on Document Analysis and Recognition, 2015. IEEE Computer Society.

Training the models needed for Automatic Handwritten Text Recognition of historical documents generally requires a significant amount of human effort. This is mainly due to the great differences that often exist between collections and to the lack of linguistic resources from the period when the documents were written, which creates a need for manual data labelling. This paper presents a study on the reuse of models trained with data from a different collection, focusing on the contributions that the language model and the optical models make to recognition performance. An empirical evaluation is performed using data from Jeremy Bentham manuscripts with the aim of recognising a manuscript about a very different topic written by Jane Austen.