Multimodal Computer Assisted Transcription of Handwriting Images
- I Introduction
- icfhrInteracTranscrTut2010-I-2p.pdf (235K)
- I-p Off-line HTR in practice
- icfhrInteracTranscrTut2010-Ip-2p.pdf (2.0M)
- II Computer-Assisted Transcription of Text Images (CATTI)
- icfhrInteracTranscrTut2010-II-2p.pdf (181K)
- II-p CATTI in practice
- icfhrInteracTranscrTut2010-IIp-2p.pdf (178K)
- III Multimodality in CATTI (MM-CATTI)
- icfhrInteracTranscrTut2010-III-2p.pdf (949K)
- III-p Demonstration of a complete MM-CATTI System in a real HTR task
- MM-CATTI demo overview
- MM-CATTI demo screencast overview
Practical Guide
The aim of this practical guide is to become familiar with the use of HTK (the Hidden Markov Model Toolkit) for handwritten text recognition (HTR) and, further, for computer-assisted transcription of handwritten text (CATTI). In addition, it briefly explains how to use some in-house tools for image preprocessing and feature extraction implemented for HTR.
By far the most important software for this practical session is the Hidden Markov Model Toolkit (HTK), version 3.4, which can be downloaded together with its documentation from http://htk.eng.cam.ac.uk.
In addition, the SRI Language Modeling Toolkit (SRILM) is required to train n-gram language models.
Furthermore, since this practical session is carried out entirely on Linux, prior knowledge of and experience with this operating system and the standard GNU/Linux tools (bash, awk, netpbm, xv, etc.) is assumed.
Guide: Exp-Guide.pdf (177K)
IAM Handwriting Database, http://www.iam.unibe.ch/fki/databases/iam-handwriting-database
Spanish-Number Corpus, SpanishNumbers.tar.bz2 (1.1M)
HTR processing tools, HTR-toolsUtils.tar.bz2 (24K)
IAMDB CATTI output, CATTI_IAMDB.log (379K)
Dr. Alejandro H. Toselli, Dr. Moises Pastor and Dr. Verónica Romero.