Welcome to the

PRHLT RESEARCH CENTER

The Pattern Recognition and Human Language Technology (PRHLT) research center is composed of researchers from the Universitat Politècnica de València (UPV) working in the areas of Multimodal Interaction, Pattern Recognition, Image Processing (Image Analysis, Computer Vision, Handwritten Text Recognition, Document Analysis) and Language Processing (Speech Recognition and Understanding, Machine Translation, Information Retrieval).

The PRHLT center is an active research entity with important ongoing research projects, technology transfer activities, and research publications.

RESEARCH AREAS

Big data and deep learning

“Machine Learning is the new electricity.” Deep learning is a technique within the field of machine learning, whose methods learn from data. The amount of available data grows exponentially year after year, which gives machine learning techniques great potential for solving very complex problems. Big data is the perfect partner, and deep learning techniques are becoming standard thanks to advances in hardware and software. In PRHLT we have [...]

Speech processing and dialogue systems

Speech-to-speech and text-to-text translation for limited domains fall within this kind of project. Finite-state and statistical transducers are used as the basis of the machine translation systems. These models can be learnt automatically from real examples of translation. Applications include (but are not limited to) the translation of technical reports and hotel services.

Topics: speech interaction with mobile devices, speaker and domain adaptation, statistical dialogue annotation models, multimodal speech recognition.

Handwritten Text Recognition

Both off-line HTR (document images) and on-line HTR (tablet or e-pen signals) are considered. No prior character or word segmentation is needed. The technology, borrowed from speech recognition, relies on character Hidden Markov Models, finite-state word models, and syntactic N-grams. After model training, a holistic (“Viterbi”) search over each text line image provides both an optimal transcription and the corresponding word and character segmentations. Applications: Transcription of ancient and legacy [...]
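
As a rough illustration of the Viterbi search mentioned above, the following Python sketch decodes a toy discrete HMM and then collapses the best state path into segments. It is not the center's implementation: the two states, the symbols "x" and "y", all probabilities, and the function names `viterbi` and `segments` are assumptions made for this example. Real HTR systems work on continuous feature vectors and combine character models with word models and N-grams.

```python
import math

def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return (best_log_prob, best_state_path) for a discrete HMM."""
    # score[s] = best log-probability of any state path ending in s
    score = {s: log_start[s] + log_emit[s][observations[0]] for s in states}
    backpointers = []
    for obs in observations[1:]:
        new_score, pointers = {}, {}
        for s in states:
            # Best predecessor of s at this frame
            prev = max(states, key=lambda p: score[p] + log_trans[p][s])
            new_score[s] = score[prev] + log_trans[prev][s] + log_emit[s][obs]
            pointers[s] = prev
        score = new_score
        backpointers.append(pointers)
    # Trace the best path backwards from the best final state
    last = max(states, key=lambda s: score[s])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return score[last], path

def segments(path):
    """Collapse a frame-level path into (state, first_frame, last_frame) runs."""
    runs = []
    for i, s in enumerate(path):
        if runs and runs[-1][0] == s:
            runs[-1] = (s, runs[-1][1], i)
        else:
            runs.append((s, i, i))
    return runs

def logs(table):
    """Convert a dict of probabilities into log-probabilities."""
    return {k: math.log(v) for k, v in table.items()}

# Hypothetical toy model: two "character" states emitting quantized features.
states = ["a", "b"]
log_start = logs({"a": 0.9, "b": 0.1})
log_trans = {"a": logs({"a": 0.7, "b": 0.3}), "b": logs({"a": 0.1, "b": 0.9})}
log_emit = {"a": logs({"x": 0.8, "y": 0.2}), "b": logs({"x": 0.2, "y": 0.8})}

log_prob, path = viterbi(["x", "x", "y", "y", "y"], states,
                         log_start, log_trans, log_emit)
print(path)            # ['a', 'a', 'b', 'b', 'b']
print(segments(path))  # [('a', 0, 1), ('b', 2, 4)]
```

The same search yields both the label sequence and its frame boundaries, which is why no prior segmentation step is required.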

Computer vision

General statistical and syntactic pattern recognition techniques for image analysis and recognition. Some applications: OCR and document analysis, medical diagnosis, biometric identification, image and video retrieval.

Topics: relevance-based image retrieval, biometrics.

Language translation

The activities of the Machine Translation group began some years ago with the use of finite-state models for speech-to-speech and text-to-text translation in limited domains. The group has developed a series of translation models with their corresponding learning algorithms, as well as a series of prototypes for speech translation and computer-assisted translation. Currently, the group [...]

Natural Language Processing

For many languages that use indigenous non-Roman scripts (e.g., Arabic, Greek and Indic languages), one can often find a large amount of user-generated transliterated content on the Web in the Roman script. Information retrieval (IR) in such a space is challenging because queries written in either the native or the Roman script need to be matched to documents written in both scripts. Moreover, transliterated content features extensive spelling variations. [...]

Current projects

READ: Recognition and Enrichment of Archival Documents

The overall objective of READ is to implement a Virtual Research Environment where archivists, humanities scholars, computer scientists and volunteers collaborate with the ultimate goal of boosting research, innovation, development and usage of cutting-edge technology for the automated recognition, transcription, indexing and enrichment of handwritten archival documents. This Virtual Research Environment will not be built from the ground up, but will benefit from research, tools, data and resources [...]

Duration: 1 February 2017 to 30 June 2019

ALMAMATER: Adaptive Learning and MultimodAlity in MAchine Translation and tExt tRanscription

ALMAMATER aims to keep advancing, and ultimately to consolidate, the research framework that explores the challenges and opportunities offered by adapting existing Pattern Recognition (PR) technology to an environment of interaction with the user. Specifically, building on the work carried out so far in the PROMETEO project on exploiting the feedback provided by the user in each interaction, work will be done [...]

Duration: 1 January 2014 to 31 December 2017

HIMANIS: HIstorical MANuscript Indexing for user-controlled Search

Manuscripts are among the most important witnesses to our shared European cultural heritage. In recent years, large quantities of historical handwritten documents have been scanned and made available through web portals. Yet, the wealth of information conveyed by the text captured in these images remains largely inaccessible. General users and researchers increasingly expect to query handwritten resources in plain text, as with printed books, but current handwritten text recognition [...]

Duration: 1 November 2015 to 31 October 2017

SOCOCODE: SOCIAL COPYING COMMUNITY DETECTION

Copying in social media is an important topic to investigate because it is becoming a standard way of communicating, and increased copying is considered positive because it is evidence of the greater influence of the information source. Although social copying is related to plagiarism, it has different personal and social dynamics and is not an ethical issue in social media. Therefore, it will have to be analyzed from a [...]

Duration: 9 June 2014 to 8 June 2017

SomEMBED: SOcial Media language understanding-EMBEDing contexts

SomEMBED (SOcial Media language understanding – EMBEDing contexts) is a coordinated project whose goal is to advance the areas of Computational Linguistics (CL) and Natural Language Processing (NLP) in order to address the challenges posed by the use of language in social media: (i) from CL, our goal is to develop techniques and methods for modeling non-standard language from representative corpora of the social [...]

Duration: 1 January 2016 to 31 December 2018

CoMUN-HaT: Context, multimodality and user collaboration in handwritten text processing

The processing of handwritten documents is of wide interest for many purposes, such as those related to preserving cultural heritage. Handwritten text recognition techniques have been successfully applied during the last decade to obtain transcriptions of handwritten documents, and keyword spotting techniques have been applied to search for specific terms in image collections of handwritten documents. However, results on transcription and indexing are far from perfect, although models [...]

Duration: 1 January 2016 to 31 December 2018

Arabic Author Profiling for Cyber-Security

Cyber-security has become a key priority for Qatar and for nations all over the world. Malicious actors from anywhere misuse cyberspace to perpetrate various crimes such as phishing, cyber-blackmailing, cyber-bullying, and communicating or planning terrorist attacks using social media. For instance, these cybercriminals tend to use similar writing styles in their messages, which makes it possible for security experts to detect and stop these threats [...]

Duration: 4 February 2017 to 4 February 2020
Members: P. Rosso

Latest news

Valencia hosts the largest European meeting on Computational Linguistics

05.04.2017
The 15th edition of the conference of the European Chapter of the Association for Computational Linguistics (EACL), held at the Palacio de Congresos de Valencia, is one of [...]

Laia: A deep learning toolkit for HTR

16.12.2016
Three members of this center, Joan Puigcerver, Daniel Martin-Albo and Mauricio Villegas, have released the first version of an open source deep learning toolkit for handwritten text recognition. Good job! Find [...]

BDVA 2016 Valencia Summit

The Universitat Politècnica de València will host the third edition of the Summit, organized by the Big Data Value Association (BDVA), which will take place in Valencia from 29 November to 2 [...]

Contact

PRHLT Research Center
Universitat Politècnica de València
Ciudad Politécnica la Innovación
Edif. 8B Acceso N Planta 0
Camí de Vera, s/n
46022 Valencia (VLC), Spain
(+34) 96 387 81 70