Welcome to the PRHLT Research Center


The Pattern Recognition and Human Language Technology (PRHLT) research center is composed of researchers from the Universitat Politècnica de València (UPV) working in the areas of Multimodal Interaction, Pattern Recognition, Image Processing (Image Analysis, Computer Vision, Handwritten Text Recognition, Document Analysis) and Language Processing (Speech Recognition and Understanding, Machine Translation, Information Retrieval).

The PRHLT center is an active research entity with important ongoing research projects, technology transfer activities, and research publications.


Big data and deep learning

“Machine Learning is the new electricity.” Deep Learning is a technique that belongs to the Machine Learning field. Machine Learning techniques learn from data, and the amount of available data now grows exponentially year after year. Machine Learning techniques therefore have great potential to solve very complex problems. Big data is the perfect partner, and deep learning techniques are becoming a standard thanks to hardware and software advances. At PRHLT we have [...]



Speech processing and dialogue systems

Speech processing covers several applications, such as speech recognition and understanding, speech-to-speech translation, speech interaction with mobile devices, speaker and domain adaptation, and multimodal speech recognition. Tasks related to dialogue systems include speech-based and multimodal dialogue systems, statistical dialogue models, and automatic dialogue annotation.



Handwritten Text Recognition

Both off-line HTR (document images) and on-line HTR (tablet or e-pen signals) are considered. No prior character or word segmentation is needed. The technology relies on character-level optical models based on Convolutional-Recurrent Neural Networks and Hidden Markov Models, along with Finite-State Lexical and N-Gram Language Models. After model training, for each given text line image, a holistic (“Viterbi”) search provides both an optimal transcription and the corresponding word and character segmentations. [...]
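As a rough illustration of the kind of Viterbi search mentioned above, the following sketch decodes a toy discrete Hidden Markov Model. Everything in it is an assumption made for illustration (the two character states `a` and `b`, the `thin`/`wide` frame features, the probabilities, and the `viterbi` function itself); it is not the PRHLT software or its models. The point is that the single best state path gives both a label sequence and, through the frame-to-state alignment, a segmentation.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (best_log_prob, best_state_path) for a discrete HMM.

    Assumes all probabilities used are strictly positive, so that
    math.log is always defined.
    """
    # Best log-score of any path ending in each state at the first frame.
    scores = {
        s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
        for s in states
    }
    backpointers = []  # one dict per later frame: state -> best previous state

    for obs in observations[1:]:
        new_scores, pointers = {}, {}
        for s in states:
            # Pick the predecessor that maximizes the score of reaching s.
            prev = max(states, key=lambda p: scores[p] + math.log(trans_p[p][s]))
            new_scores[s] = (
                scores[prev] + math.log(trans_p[prev][s]) + math.log(emit_p[s][obs])
            )
            pointers[s] = prev
        scores = new_scores
        backpointers.append(pointers)

    # Trace back from the best final state to recover the full path.
    last = max(states, key=lambda s: scores[s])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return scores[last], path


# Hypothetical toy model: two "characters" emitting thin or wide frames.
states = ["a", "b"]
start_p = {"a": 0.6, "b": 0.4}
trans_p = {"a": {"a": 0.7, "b": 0.3}, "b": {"a": 0.4, "b": 0.6}}
emit_p = {"a": {"thin": 0.9, "wide": 0.1}, "b": {"thin": 0.2, "wide": 0.8}}
frames = ["thin", "thin", "wide", "wide"]

log_prob, path = viterbi(frames, states, start_p, trans_p, emit_p)
print(path)  # one state label per frame
```

In this toy case the frame-level path changes label between the second and third frames, which is the character boundary; a real recognizer would additionally constrain the path with lexical and language models.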



Computer vision

General statistical and syntactic pattern recognition techniques for image analysis and recognition. Some applications: OCR and document analysis, medical diagnosis, biometric identification, image and video retrieval. Subtopics include relevance-based image retrieval and biometrics.



Language translation

The activities of the Machine Translation group began some years ago with the use of finite-state models for speech-to-speech translation and for text-to-text translation in limited domains. The group has developed a number of translation models with their corresponding learning algorithms, as well as a number of prototypes for speech translation and computer-assisted translation. Currently, the Machine Translation group is devoted to the development of new interactive-predictive techniques for computer-assisted translation, techniques for [...]



Natural Language Processing

Social media data analysis: author profiling, stance detection, deceptive opinion detection, irony detection and sentiment analysis, mixed-script text analysis, plagiarism and social copying detection. Author profiling: given a text, what are the author’s traits? The focus is on inferring traits such as gender, age, native language, language variety, and personality on the basis of the stylistic analysis of the author’s texts. This is of interest for areas such [...]


Current Projects

Misinformation and Miscommunication in social media: FAKE news and HATE speech (MISMIS-FAKEnHATE)

Although social media are the default channel used by people to share information, ideas and opinions, they may contribute paradoxically to the polarization of society as we have recently witnessed in the last presidential elections in the USA and in the Brexit referendum. Every user ends up receiving only the information that matches her personal beliefs and viewpoints, with the risk of an intellectual isolation (filter bubble), where beliefs may [...]

Duration: 1 January 2019 to 31 December 2021

Arabic Author Profiling for Cyber-Security

Cyber-security has become a key priority for Qatar and for nations all over the world. Malicious actors from anywhere misuse cyberspace to perpetrate various crimes such as phishing, cyber-blackmailing, cyber-bullying, and communicating or planning terrorist attacks using social media. For instance, these cybercriminals tend to use similar writing styles in their messages, which makes it possible for security experts to detect and stop these threats [...]

Duration: 4 February 2017 to 4 February 2020
Members: P. Rosso

Carabela: probabilistic indexing of manuscript collections for the protection of underwater historic heritage

Please visit the Carabela project web site. The Carabela consortium includes researchers from the PRHLT center and the Centro de Arqueología Subacuática del Instituto Andaluz del Patrimonio Histórico. The goal of the project is to apply techniques that allow large-scale textual searches in manuscripts from the 15th to 16th centuries containing key information for identifying thousands [...]

Duration: 30 November 2017 to 30 November 2019

DeepHealth: Deep-Learning and HPC to Boost Biomedical Applications for Health

Health scientific discovery and innovation are expected to quickly move forward under the so-called “fourth paradigm of science”, which relies on unifying the traditionally separated and heterogeneous high-performance computing and big data analytics environments. Under this paradigm, the DeepHealth project will provide HPC computing power at the service of biomedical applications; and apply Deep Learning (DL) techniques on large and complex biomedical datasets to support new and more efficient ways of [...]

Duration: 1 January 2019 to 31 December 2021

IBEM: Indexing and search of mathematical expressions on a large scale in massive corpus of printed documents

Large databases of digitized printed scientific documents now exist, and many of these documents include mathematical expressions. Searching for textual information in these documents is already widely exploited by the search engines of the most popular web browsers. However, searching massive collections of digitized printed scientific documents with queries that are themselves mathematical expressions remains a scarcely explored research area. The methods that currently [...]

Duration: 1 November 2018 to 31 October 2020

Social profiling of users

The proliferation of social networks and the enormous amount of information they generate (big data) give companies a great opportunity to get to know their customers better. However, the amount of data is usually so unmanageable that the main challenge for companies lies in selecting, from that entire corpus, the useful information that can bring them the most value. The main objective of this project [...]

Duration: 13 March 2019 to 13 March 2020

HOME: History of Medieval Europe

Manuscripts are among the most important witnesses to our shared European cultural heritage and, while being increasingly digitized and published in large digital archives and libraries, they represent a valuable part of the European Digital Heritage. Its exploration, understanding, and dissemination need new tools for promoting community engagement with, and use of, heritage. Indeed, the wealth of information conveyed by the text captured in these images remains largely inaccessible, [...]

Duration: 1 September 2018 to 1 September 2020

HisClima: Two Centuries of Climate Data

The goal of the project is to create an intelligent platform for extracting information from thousands of handwritten logbook entries that contain, over a period of two hundred years and a wide geographic range, data on daily weather conditions that may be very useful to climate change researchers.

Duration: 30 April 2019 to 30 April 2021

Deep learning for adaptive and multimodal interaction in pattern recognition (DeepPattern)

Pattern recognition and machine learning systems are not free of errors in their predictions, so in many cases interaction with the user is necessary. The interactive pattern recognition paradigm exploits the multimodal feedback provided by the user in each interaction with the system, both to improve the system's predictions (predictive interactivity) and to improve the [...]

Duration: 1 January 2019 to 31 December 2022

Document Transcription with Interactive Ubiquitous Multimodal platforms (DocTIUM)

This project aims to take a step forward in the development of user-centric intelligent tools for extracting knowledge from historical data. Starting from the recognition of historical document images and including the user in the loop through engaging experiences, we will develop the concept of the big data of the past. As the use case driving the research, we will construct a browser of the memory of communities inspired by the [...]

Duration: 1 January 2019 to 31 December 2021

An automatic alert and recommendation system for food products using the nutrition label acquired by an image (FoodAlert)

The objective of this subproject is to develop the tools needed to help shoppers automatically understand and categorize food labeling. To this end, a literal transcription of the labeling of food products will be produced from images acquired with a mobile device. This literal transcription may be used by other project members to assess the suitability of the product for a given user profile. [...]

Duration: 1 January 2019 to 31 December 2019

Latest News


Degrees of influence on Twitter

The PRHLT research center will develop a tool to help Vodafone get to know its customers better. The proliferation of social networks and the enormous amount of information generated by [...]

Advances in the development of a hybrid neural machine translation platform

The development of a hybrid neural machine translation platform has reached its first milestone. Its goal is the design and development of advanced machine translation software using hybridization techniques over [...]

Collaboration with the company PANGEANIC on the development of a machine translation platform

The Machine Translation group of the PRHLT center is involved in the development of a machine translation platform based on neural networks (Neural Machine Translation, NMT). This development is being [...]


PRHLT Research Center
Universitat Politècnica de València
Ciudad Politécnica de la Innovación
Edif. 8B Acceso N Planta 0
Camí de Vera, s/n
46022 Valencia (VLC), Spain
(+34) 96 387 81 70