Publications

Abstract

José-Miguel Benedí. Estudio de un sistema de reconocimiento automático del habla: TABARCA [Study of an automatic speech recognition system: TABARCA]. Universidad Politécnica de Valencia. 1989. Advisor(s): Dr. E. Vidal and Dr. F. Casacuberta

This work was developed within the area of Automatic Speech Recognition (ASR), which in turn is part of the "perception" area of Artificial Intelligence. The main objective of ASR is to enable spoken human-computer communication. This work deals with the study of an ASR system named TABARCA, one of the projects of the ASR group at the Universidad Politécnica de Valencia. TABARCA is an ASR system based on a homogeneous and modular architecture, built on the essential assumption that both the data and the knowledge are uncertain. This uncertainty stems from the variability and noise of the speech signal, and from the ambiguity and vagueness of the available knowledge about the speech production and perception processes. The main characteristics of TABARCA are: knowledge representation is based on weighted networks; communication between levels is hierarchical and bottom-up; the interpretation strategy is passive and data-driven; and the search for the most acceptable interpretation(s) is adapted to the proposed model of vagueness and includes an error-correcting mechanism. In general, this kind of system produces substantial information reductions at each level by "interpreting" the data coming from the lower level(s). The alternative proposed in this work aims to reduce the degradation of results by minimizing the loss of information associated with the interpretation process at each level. This process focuses the incoming data on the most likely interpretations in terms of the associated categories, while the necessary information about other interpretations with less evidence remains available to the higher levels. The validity of this model was tested through a simple implementation example, which led to the detection of the weak points toward which further investigation should be directed in order to obtain more realistic systems.
These investigations are the main target of the present thesis, whose principal contributions can be summarized as follows. The first contribution is mainly experimental and deals with the development of all TABARCA levels, with the goal of supporting realistic applications. This implies the need for lower levels (microphonetic and sublexical) that are robust, reliable, and independent of the application. TABARCA has then been tested in three different applications. The second contribution addresses the problem of obtaining a new generalized formalism that takes into account the uncertainty of both rules and data. To this end, we introduce the concepts of L-symbols, extended L-symbols, L-symbol automata, and a mechanism for interpreting uncertain data (strings of extended L-symbols) with uncertain knowledge (L-symbol automata). Basically, these L-symbol automata are weighted automata in which each symbol of the input string is a set of weighted symbols (drawn from a certain alphabet of input categories). The last contribution tackles the problem of learning the knowledge associated with the several understanding levels. A new method is proposed for the automatic inductive learning of the weights of syntactic structures from samples by means of reestimation methods. This relates syntactic and decision-theoretic methods. In this work, we have used an extension of Generalized Linear Discriminant Functions.
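
As an illustration only, the idea of a weighted automaton reading a string whose symbols are sets of weighted categories can be sketched as follows. This is not the formalism defined in the thesis: the dictionary representation, the function name `best_score`, and in particular the choice to combine weights by multiplication and keep the maximum over paths are all assumptions made for the sketch, since the abstract does not specify how weights are combined.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical representations (not taken from the thesis):
# an "extended symbol" maps each candidate category to a weight.
ExtendedSymbol = Dict[str, float]
# (state, category) -> list of (next_state, transition_weight)
Transitions = Dict[Tuple[int, str], List[Tuple[int, float]]]


def best_score(transitions: Transitions, start: int, finals: Set[int],
               observations: List[ExtendedSymbol]) -> float:
    """Weight of the best accepting path, assuming max-product combination."""
    scores = {start: 1.0}  # best weight found so far for each reachable state
    for obs in observations:
        nxt: Dict[int, float] = {}
        for state, score in scores.items():
            # Every weighted category in the set is tried, not only the top one.
            for category, cat_w in obs.items():
                for target, trans_w in transitions.get((state, category), []):
                    cand = score * cat_w * trans_w
                    if cand > nxt.get(target, 0.0):
                        nxt[target] = cand
        scores = nxt
    return max((s for st, s in scores.items() if st in finals), default=0.0)


# Toy automaton: state 0 --a/b--> state 1 --c--> state 2 (final).
transitions: Transitions = {
    (0, "a"): [(1, 0.9)],
    (0, "b"): [(1, 0.2)],
    (1, "c"): [(2, 1.0)],
}
# The first input position is uncertain: "b" has more evidence than "a".
observations = [{"a": 0.3, "b": 0.7}, {"c": 0.8}]
result = best_score(transitions, 0, {2}, observations)
```

In this toy case the category with less evidence at the first position ("a") ends up on the best path, because the automaton weights its transition more heavily. This is the kind of situation in which keeping the lower-ranked interpretations available to the higher level makes a difference.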