ICFHR-2010 Tutorial

Multimodal Computer Assisted Transcription of Handwriting Images

Tutorial Slides


I Introduction
icfhrInteracTranscrTut2010-I-2p.pdf (235K)
I-p Off-line HTR in practice
icfhrInteracTranscrTut2010-Ip-2p.pdf (2.0M)
II Computer-Assisted Transcription of Text Images (CATTI)
icfhrInteracTranscrTut2010-II-2p.pdf (181K)
II-p CATTI in practice
icfhrInteracTranscrTut2010-IIp-2p.pdf (178K)
III Multimodality in CATTI (MM-CATTI)
icfhrInteracTranscrTut2010-III-2p.pdf (949K)
III-p Demonstration of a complete MM-CATTI System in a real HTR task
MM-CATTI demo overview
MM-CATTI demo screencast overview

Practical Guide

The aim of this practical guide is to become familiar with HTK (the Hidden Markov Model Toolkit) as applied to handwritten text recognition (HTR) and, further, to computer-assisted transcription of handwritten text (CATTI). In addition, it briefly explains how to use some in-house tools for image preprocessing and feature extraction that were implemented for HTR.

By far the most important software for this practical session is "The Hidden Markov Model Toolkit (HTK), version 3.4", which can be downloaded, together with its documentation, from http://htk.eng.cam.ac.uk.

In addition, the SRI Language Modeling Toolkit (SRILM) is required to train n-gram language models.
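
As a rough illustration of what an n-gram language model estimates, the sketch below computes unsmoothed maximum-likelihood bigram probabilities from a toy corpus. It is not how SRILM is implemented or invoked: SRILM applies smoothing and writes standard model files, none of which is attempted here. The function name `train_bigram` and the three-sentence corpus are hypothetical, chosen only for the example.

```python
from collections import Counter


def train_bigram(sentences):
    """Return unsmoothed maximum-likelihood estimates P(w2 | w1).

    Illustrative only: a real toolkit adds smoothing and back-off.
    """
    pair_counts = Counter()     # counts of (w1, w2)
    context_counts = Counter()  # counts of w1 as a left context
    for sentence in sentences:
        # Sentence-boundary markers let the model score first and last words.
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for w1, w2 in zip(tokens, tokens[1:]):
            pair_counts[(w1, w2)] += 1
            context_counts[w1] += 1
    return {
        (w1, w2): count / context_counts[w1]
        for (w1, w2), count in pair_counts.items()
    }


# Hypothetical toy corpus of Spanish number phrases.
corpus = ["dos mil", "dos mil diez", "mil diez"]
model = train_bigram(corpus)

for (w1, w2), prob in sorted(model.items()):
    print(f"P({w2} | {w1}) = {prob:.3f}")
```

With so little data many bigrams never occur and would receive zero probability, which is why toolkits such as SRILM rely on smoothing.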

Furthermore, since this practical session is carried out entirely on Linux, prior knowledge of and experience with this operating system and with standard GNU/Linux tools such as bash, awk, netpbm, and xv are assumed.

Guide: Exp-Guide.pdf (177K)

IAM Handwriting Database, http://www.iam.unibe.ch/fki/databases/iam-handwriting-database
Spanish-Number Corpus, SpanishNumbers.tar.bz2 (1.1M)
HTR processing tools, HTR-toolsUtils.tar.bz2 (24K)
IAMDB CATTI output, CATTI_IAMDB.log (379K)

Dr. Alejandro H. Toselli, Dr. Moises Pastor and Dr. Verónica Romero.