Míriam Luján-Mares, Carlos D. Martínez-Hinarejos, Vicent Alabau. A study on bilingual speech recognition involving a minority language. 3rd Language & Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics, 2007. pp. 138-142.

Multilingual Automatic Speech Recognition (ASR) systems are of great interest in multilingual environments. We study the case of the Comunitat Valenciana, whose two official languages are Spanish and Valencian. The two languages share most of their phonemes, and their syntax and vocabulary are also quite similar, since they have influenced each other for many years. In this work, we present the design of the language models and the acoustic models for this bilingual situation. Acoustic models can be separate for each language or shared by both, and they can be obtained directly from a training corpus or by adapting a previous set of acoustic models. Language models can be separate for each language (monolingual recognition) or mixed for both languages (bilingual recognition). We performed experiments with a small corpus to determine which of these options performs better. The results of these experiments show that our Spanish-Valencian bilingual speech recognizer is feasible.