Vicent Alabau, Verónica Romero, Antonio L. Lagarda, Carlos D. Martínez-Hinarejos. A Multimodal Approach to Dictation of Handwritten Historical Documents. Proceedings of the Interspeech 2011, 2011. pp. 2245-2248.

Interactive machine translation (IMT) is an increasingly popular paradigm for semi-automated machine translation in which a human expert is integrated into the core of an automatic machine translation system. The expert interacts with the IMT system by partially correcting the errors in its output, and the system then proposes a new solution. This process is repeated until the output reaches the desired quality. In this scenario, the interaction is typically performed with the keyboard and the mouse. However, speech is also an attractive input modality, since the user does not need to leave the keyboard to use it. In this work, we present a new approach to speech interaction in which the translation and speech inputs are tightly fused. This integration is performed early, in the speech recognition step, so that the information from the translation models allows the speech recognition system to recover from errors that would otherwise be impossible to amend. In addition, the technique can be used with currently available speech recognition technology. The proposed system achieves a substantial improvement in performance over previous approaches.
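The correct-and-repropose cycle described in the abstract can be illustrated with a minimal sketch. It is not the authors' system: it assumes a prefix-based interaction protocol (the user fixes the first wrong word, and the system completes the validated prefix), it simulates the user with a known target sentence, and it does not model the speech input or its fusion with the translation models. The names `imt_session` and `toy_propose` and the candidate sentences are hypothetical.

```python
from typing import Callable, List, Tuple

Words = List[str]


def imt_session(propose: Callable[[Words], Words], reference: Words) -> Tuple[Words, int]:
    """Simulate an interactive session with a user who wants `reference`.

    `propose(prefix)` is assumed to return a suffix that completes the
    validated prefix. Returns the final output and the number of corrections.
    """
    prefix: Words = []
    corrections = 0
    while True:
        hypothesis = prefix + propose(prefix)
        if hypothesis == reference:
            return hypothesis, corrections
        # Find the first position where the hypothesis departs from the target.
        i = 0
        while i < len(hypothesis) and i < len(reference) and hypothesis[i] == reference[i]:
            i += 1
        corrections += 1
        if i == len(reference):
            # The hypothesis only has extra trailing words: the user cuts them.
            return list(reference), corrections
        # The user corrects one word; the validated prefix grows by at least
        # one word per round, so the loop always terminates.
        prefix = reference[: i + 1]


def toy_propose(prefix: Words) -> Words:
    """Hypothetical stand-in for the translation system: returns the
    completion of the first stored candidate compatible with the prefix."""
    candidates = [
        "the house is small".split(),
        "the home is very small".split(),
    ]
    for cand in candidates:
        if cand[: len(prefix)] == prefix:
            return cand[len(prefix):]
    return []  # No compatible candidate: the user supplies the next word.


if __name__ == "__main__":
    target = "the home is very small".split()
    final, n = imt_session(toy_propose, target)
    print(" ".join(final), "| corrections:", n)
```

In this toy run the first proposal is wrong at the second word; once the user corrects it, the system's next proposal matches the target, so a single correction suffices. In the approach the abstract describes, that correction would be spoken rather than typed, with the recognizer constrained by the translation models.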