Automatic speech recognition (ASR) systems rely on several models that determine their performance, chiefly the acoustic models (usually hidden Markov models, HMMs) and the language models. HMM parameters are estimated from training data (acoustic segments) and are then used to decode an unknown acoustic sequence. Several tools exist for both the estimation and the decoding processes, with different features and performance. This project explored the combination of existing tools and the implementation of new ones. The new recognition tool was applied to an assisted transcription task in which two information sources are available: handwritten text and speech. Combining these two sources improved on the performance of the classic systems previously used for this task.
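The decoding step mentioned above is typically carried out with the Viterbi algorithm, which finds the most likely hidden-state sequence for an observed acoustic sequence. The following is a minimal sketch of Viterbi decoding over an HMM in log-probability space; the toy states, observations, and parameter values are hypothetical and only illustrate the mechanics, not the project's actual models or tools.

```python
import math

def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return the most likely state sequence and its log-probability.

    obs:        list of observation symbols
    states:     list of hidden states
    log_start:  log_start[s]    = log P(state s at t=0)
    log_trans:  log_trans[p][s] = log P(s | previous state p)
    log_emit:   log_emit[s][o]  = log P(observation o | state s)
    """
    # V[t][s] = best log-probability of any path ending in state s at time t
    V = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    back = [{}]  # back[t][s] = predecessor of s on the best path
    for t in range(1, len(obs)):
        V.append({})
        back.append({})
        for s in states:
            best_prev = max(states, key=lambda p: V[t - 1][p] + log_trans[p][s])
            V[t][s] = V[t - 1][best_prev] + log_trans[best_prev][s] + log_emit[s][obs[t]]
            back[t][s] = best_prev
    # Trace the best final state back to t=0
    last = max(states, key=lambda s: V[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return list(reversed(path)), V[-1][last]

# Hypothetical two-state HMM with three observation symbols
states = ['H', 'C']
log_start = {'H': math.log(0.8), 'C': math.log(0.2)}
log_trans = {'H': {'H': math.log(0.7), 'C': math.log(0.3)},
             'C': {'H': math.log(0.4), 'C': math.log(0.6)}}
log_emit = {'H': {1: math.log(0.2), 2: math.log(0.4), 3: math.log(0.4)},
            'C': {1: math.log(0.5), 2: math.log(0.4), 3: math.log(0.1)}}
path, score = viterbi([3, 1, 3], states, log_start, log_trans, log_emit)
```

In a real ASR system the emission probabilities come from acoustic models over feature vectors and the transitions are constrained by the language model, but the dynamic-programming recursion is the same.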
Duration: 31 January 2007 to 31 January 2009
Supported by: Universitat Politècnica de València (UPV) under reference PAID2006-20070315