Duration: 1 September 2025 to 31 August 2028
Grant PID2024-161104OB-C21 funded by MICIU/AEI/10.13039/501100011033 and by ERDF/EU
PIs: Joan Andreu Sánchez, Moisés Pastor
Members: Roberto Paredes, Dan Anitei, Manuel Villarreal, David Villanova, Aitana Menárguez, Enrique Vidal, Miguel Domingo

EMBRACING UNCERTAINTY IN DEEP NEURONAL MODELS (EUDEEP) is a three-year coordinated project proposal composed of two sub-projects involving five institutions: Universitat Politècnica de València, Universitat de València, Ministry of Culture General Archive of Simancas and National Historical Archive. The team brings together expert researchers in different fields: Machine Learning, Deep Neural Networks, Handwritten Text Recognition, Natural Language Processing, Machine Translation, Intelligent User Interfaces and Archival Science.

EUDEEP’s purpose is to investigate how to deal with the uncertainty that is exhibited by the training data of deep neural networks, as well as with the uncertainty associated with the output of these models. To this end, deep machine learning techniques will be studied. The hypothesis in EUDEEP is that uncertainty is inherent both to the Pattern Recognition tasks and to the deep NN models at all processing levels. For example, when working with big data it is practically impossible to obtain error-free, unambiguous input data and, similarly, no “perfect” output results can be assumed. Therefore, it is necessary to adopt uncertainty as a rich source of information and incorporate it into workflows that use Deep Neural Networks.

The EUDEEP advances will be applied in different use cases: Ancient Handwritten Text Recognition, Printed Mathematical Expression Recognition, Handwritten Sheet Music Recognition, Machine Translation and other Natural Language Processing tasks such as Named Entity Recognition. The project intends to work with massive data collections so that the end user can efficiently and effectively locate information in these collections.

To bring the project’s developments closer to the end user, EUDEEP will study adaptive user interaction techniques. Likewise, methods to understand user search requirements will be explored. The reason for this is that, in massive collections, it is difficult to know in advance how the relevant information is written, depicted, or described. Users searching for information in such collections must interact repeatedly until they locate helpful information. Search engines should consider the user feedback both to streamline and to speed up the search process.

The EUDEEP developments will be applied to data that are already available to the project: collections of historical documents from the General Archive of Simancas and the National Historical Archive; public collections of handwritten musical texts; and collections of scientific papers that contain innumerable mathematical expressions obtained in previous projects.
