Welcome to the PRHLT research center


The Pattern Recognition and Human Language Technology (PRHLT) research center is composed of researchers from the Universitat Politècnica de València (UPV) working in the areas of Multimodal Interaction, Pattern Recognition, Image Processing (Image Analysis, Computer Vision, Handwritten Text Recognition, Document Analysis) and Language Processing (Speech Recognition and Understanding, Machine Translation, Information Retrieval).

The PRHLT center is an active research entity with important ongoing research projects, technology transfer activities, and research publications.


Big data and deep learning

“Machine Learning is the new electricity.” Deep learning is a technique within the field of machine learning. Machine learning techniques learn from data, and the amount of available data grows exponentially year after year, so these techniques have great potential to solve very complex problems. Big data is the perfect partner, and deep learning techniques are becoming standard thanks to advances in hardware and software. At PRHLT we have [...]



Speech processing and dialogue systems

Speech processing covers a range of applications, such as speech recognition and understanding, speech-to-speech translation, speech interaction with mobile devices, speaker and domain adaptation, and multimodal speech recognition. Tasks related to dialogue systems include speech-based and multimodal dialogue systems, statistical dialogue models, and automatic dialogue annotation.



Handwritten Text Recognition

Both off-line HTR (document images) and on-line HTR (tablet or e-pen signals) are considered. No prior character or word segmentation is needed. The technology relies on character-level optical models based on Convolutional-Recurrent Neural Networks and Hidden Markov Models, along with Finite-State Lexical and N-Gram Language Models. After model training, for each given text line image, a holistic (“Viterbi”) search provides both an optimal transcription and the corresponding word and character segmentations. [...]
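As a rough illustration of how a Viterbi search can return a transcription and a segmentation in a single pass, here is a minimal sketch on a toy two-state Hidden Markov Model. It is not the PRHLT system: the function name `viterbi_segment`, the two "character" states, and all probabilities are hypothetical, and a real recognizer would combine neural optical models with lexical and N-gram language models.

```python
import math

NEG_INF = float("-inf")

def safe_log(p):
    """Natural log that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else NEG_INF

def viterbi_segment(log_emit, log_trans, log_init):
    """Return the most likely state path and its segmentation into runs.

    log_emit[t][s]  : log-probability of frame t under state s
    log_trans[p][s] : log-probability of moving from state p to state s
    log_init[s]     : log-probability of starting in state s
    """
    n_frames, n_states = len(log_emit), len(log_init)
    score = [log_init[s] + log_emit[0][s] for s in range(n_states)]
    back = []  # back[t-1][s] = best predecessor of state s at frame t
    for t in range(1, n_frames):
        new_score, pointers = [], []
        for s in range(n_states):
            best_prev = max(range(n_states),
                            key=lambda p: score[p] + log_trans[p][s])
            new_score.append(score[best_prev] + log_trans[best_prev][s]
                             + log_emit[t][s])
            pointers.append(best_prev)
        score = new_score
        back.append(pointers)

    # Trace back from the best final state to recover the full path.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()

    # Collapse runs of identical states into (state, first_frame, last_frame).
    segments, start = [], 0
    for t in range(1, n_frames + 1):
        if t == n_frames or path[t] != path[start]:
            segments.append((path[start], start, t - 1))
            start = t
    return path, segments

# Toy data (hypothetical): state 0 = "a", state 1 = "b", five frames.
emit = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.2, 0.8], [0.1, 0.9]]
trans = [[0.6, 0.4], [0.0, 1.0]]  # left-to-right: "b" never returns to "a"
init = [1.0, 0.0]

log_emit = [[safe_log(p) for p in row] for row in emit]
log_trans = [[safe_log(p) for p in row] for row in trans]
log_init = [safe_log(p) for p in init]

path, segments = viterbi_segment(log_emit, log_trans, log_init)
print(path)      # [0, 0, 0, 1, 1]
print(segments)  # [(0, 0, 2), (1, 3, 4)]
```

In this toy case the decoded path reads "a" then "b", and the segment boundaries (frames 0 to 2 and 3 to 4) come from the same search, with no segmentation step beforehand.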



Computer vision

General statistical and syntactic pattern recognition techniques for image analysis and recognition. Some applications: OCR and document analysis, medical diagnosis, biometric identification, and image and video retrieval. Topics: relevance-based image retrieval, biometrics.



Language translation

The activities of the Machine Translation group began some years ago with the use of finite-state models for speech-to-speech translation and for text-to-text translation in limited domains. The group has developed a number of translation models with their corresponding learning algorithms, as well as a number of prototypes for speech translation and computer-assisted translation. Currently, the Machine Translation group is devoted to the development of new interactive-predictive techniques for computer-assisted translation, techniques for [...]



Natural Language Processing

Social media data analysis: author profiling, stance detection, deceptive opinion detection, irony detection and sentiment analysis, mixed-script text analysis, plagiarism and social copying detection.

Author profiling: given a text, what are the author’s traits? The focus is on inferring traits such as gender, age, native language, language variety, and personality on the basis of the stylistic analysis of the author’s texts. This is of interest for areas such [...]


Current Projects

DeepHealth: Deep-Learning and HPC to Boost Biomedical Applications for Health

Health scientific discovery and innovation are expected to quickly move forward under the so-called “fourth paradigm of science”, which relies on unifying the traditionally separated and heterogeneous high-performance computing and big data analytics environments. Under this paradigm, the DeepHealth project will provide HPC computing power at the service of biomedical applications; and apply Deep Learning (DL) techniques on large and complex biomedical datasets to support new and more efficient ways of [...]

Duration: 1 January 2019 to 30 June 2022

Deep learning for adaptive and multimodal interaction in pattern recognition (DeepPattern)

Pattern recognition and machine learning systems are not free from errors in their predictions, so in many cases interaction with the user is necessary. The interactive pattern recognition paradigm exploits the multimodal feedback provided by the user in each interaction with the system, both to improve the system's predictions (predictive interactivity) and to improve the [...]

Duration: 1 January 2019 to 31 December 2022

Document Transcription with Interactive Ubiquitous Multimodal platforms (DocTIUM)

This project aims to take a step forward in the development of user-centric intelligent tools for extracting knowledge from historical data. Starting from the recognition of historical document images and including the user in the loop through engaging experiences, we will develop the concept of the big data of the past. As the use case driving the research, we will construct a browser of the memory of communities inspired by the [...]

Duration: 1 January 2019 to 30 September 2022

An automatic alert and recommendation system for food products using the nutrition label acquired by an image (FoodAlert)

The objective of this subproject is to develop the tools needed for the automatic understanding and categorization of food labeling for purchasing users. To this end, images acquired with a mobile device will be used to produce a literal transcription of the labeling of food products. Other project members may use this literal transcription to assess how well a product suits a given user profile. [...]

Duration: 1 January 2019 to 30 June 2022

SELENE: Self-monitored Dependable platform for Safety-Critical Systems

SELENE aims to propose a new family of safety-critical computing platforms that build upon open source components such as RISC-V cores, GNU/Linux, and the Jailhouse hypervisor. SELENE will develop an advanced computing platform that is able to: adapt the system to the specific requirements of different application domains, to the changing environmental conditions, and to the internal conditions of the system itself; allow the integration of applications of different criticalities [...]

Duration: 1 December 2019 to 30 November 2022

eXplainable AI for disinformation and conspiracy detection during infodemics (XAI-DisInfodemics)

The COVID-19 pandemic has increased the population’s time of exposure to digital content and produced a notable increase in disinformation. According to Eurobarometer 2018 no. 464, the majority of the Spanish population (88%) considers disinformation to be a problem, and 66% claim to encounter false information at least once a week (Eurobarometer 503, 2020). If we take into account that Spanish is the second most widely spoken language in the world [...]

Duration: 1 December 2021 to 30 November 2024

Iberian Digital Media Research and Fact-Checking Hub (IBERIFIER)

IBERIFIER is an Iberian hub that aims to tackle disinformation in Spain and Portugal by bringing together a consortium of 23 partners, composed of 12 universities, 5 independent fact-checking organisations and publicly-owned news agencies, and 6 leading institutions on strategic analysis, computer and data science, and media research. With the support of public authorities of both countries, relevant media organisations, several scientific and professional associations, as well as some other stakeholders, [...]

Duration: 1 September 2021 to 29 February 2024
Members: P. Rosso

Searching in the Simancas Archive (SimancasSearch)

The main objective of SimancasSearch is the development of innovative and efficient techniques for i) enriching large handwritten document collections with transcription and semantic information, and ii) searching and localizing logical records in large collections. These techniques will rely on the probabilistic lattices and the PrIx’s derived from the HTR process in order to represent the uncertainty associated not only with the recognition process but also with the handwritten text [...]

Duration: 1 September 2021 to 31 August 2024

Resources and Applications for Detecting and Classifying Polarized and Hate Speech in Arabic Social Media

Societies are increasingly divided and polarized. This polarization is driven by two connected issues: the lack of communication between groups, and the use of hate speech. With social media speeding up the spread of hateful ideologies, polarization and technology go hand in hand. Statistics reveal the scale of the problem: 41% of people have been the target of hate speech. As communities recede into themselves, the prospect of conflict grows. [...]

Duration: 4 October 2021 to 19 April 2024
Members: P. Rosso

Latest News


The PRHLT research center is co-organizing round 2 of the machine translation shared task from the Covid19-MLIA event

Covid-19 MLIA @ Eval organizes a community evaluation effort aimed at accelerating the creation of resources and tools for improved MultiLingual Information Access (MLIA) in the current emergency situation with [...]

PRHLT will participate in the IBERIFIER observatory on misinformation and fake news detection

The University of Navarra has obtained the support of the European Commission to lead a consortium of 23 Spanish and Portuguese institutions that will create an observatory to investigate digital [...]

Researchers from the PRHLT center are among the most cited on the planet

The founding researchers of the PRHLT center, Enrique Vidal and Francisco Casacuberta, are among the most cited scientists on the planet according to the report prepared by John Ioannidis, Kevin Boyack and Jeroen Baas, [...]


PRHLT Research Center
Universitat Politècnica de València
Ciudad Politécnica la Innovación
Edif. 8B Acceso N Planta 0
Camí de Vera, s/n
46022 Valencia (VLC), Spain
(+34) 96 387 81 70