Publications

Abstract

Moisés Pastor-i-Gadea, Enrique Vidal, Francisco Casacuberta. A Bi-modal Handwritten Text Corpus. Instituto Tecnológico de Informática. 2009.

Handwritten text is generally captured through two main modalities: off-line and on-line. Each modality has advantages and disadvantages, but it seems clear that smart approaches to handwritten text recognition (HTR) should use both modalities in order to take advantage of the positive aspects of each one. A particularly interesting case where the need for this bi-modal processing arises is when an off-line text, written by one writer, is considered along with the on-line modality of the same text written by another writer. This happens, for example, in computer-assisted transcription of text images, where on-line text can be used to interactively correct errors made by a main off-line HTR system. In order to develop adequate techniques to deal with this challenging bi-modal HTR task, a suitable corpus is needed. We have collected such a corpus using data (word segments) from the publicly available off-line and on-line IAM data sets. In order to establish baseline performance figures, we have also obtained uni-modal results for each modality, as well as bi-modal results using basic naive Bayes modality fusion techniques.
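
As an illustration of what naive Bayes modality fusion can look like at the word level, the following minimal sketch combines per-word posteriors from an off-line and an on-line recognizer under the assumption that the two observations are conditionally independent given the word, so that P(w | x_off, x_on) is proportional to P(w | x_off) P(w | x_on) / P(w). The function name `naive_bayes_fusion`, the dictionary-based inputs, the uniform default prior, and the probability floor `eps` are all assumptions made for this example; the abstract does not specify the exact formulation used for the reported baselines.

```python
import math


def naive_bayes_fusion(off_post, on_post, prior=None, eps=1e-12):
    """Fuse word posteriors from two modalities (hypothetical helper).

    off_post, on_post: dicts mapping word -> P(word | observation).
    prior: dict mapping word -> P(word); uniform if omitted (assumption).
    Returns a dict of normalized fused posteriors.
    """
    words = sorted(off_post.keys() & on_post.keys())
    if not words:
        raise ValueError("the two modalities share no candidate words")
    if prior is None:
        prior = {w: 1.0 / len(words) for w in words}

    # Work in log space: log P(w|x_off) + log P(w|x_on) - log P(w).
    # Probabilities are floored at eps to avoid log(0).
    scores = {
        w: math.log(max(off_post[w], eps))
        + math.log(max(on_post[w], eps))
        - math.log(max(prior[w], eps))
        for w in words
    }

    # Normalize with log-sum-exp for numerical stability.
    top = max(scores.values())
    log_z = top + math.log(sum(math.exp(s - top) for s in scores.values()))
    return {w: math.exp(s - log_z) for w, s in scores.items()}


if __name__ == "__main__":
    off_line = {"cat": 0.6, "cot": 0.4}   # made-up off-line posteriors
    on_line = {"cat": 0.3, "cot": 0.7}    # made-up on-line posteriors
    fused = naive_bayes_fusion(off_line, on_line)
    best = max(fused, key=fused.get)
    print(best, fused)
```

In this toy example the on-line evidence outweighs the off-line evidence, so the fused decision switches from "cat" to "cot". A weighted variant (raising each modality's posterior to an exponent) is a common extension, but it is not something the abstract states.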