Álvaro Peris, Francisco Casacuberta. Interactive-predictive neural multimodal systems. Proceedings of the 9th Iberian Conference on Pattern Recognition and Image Analysis, 2019. pp. 16-28.

Despite the advances achieved by neural models in sequence-to-sequence learning, exploited in a variety of tasks, they still make errors. In many use cases, these are corrected by a human expert in a subsequent revision process. The interactive-predictive framework aims to minimize the human effort spent on this process by considering partial corrections for iteratively refining the hypothesis. In this work, we generalize the interactive-predictive approach, typically applied in the machine translation field, to tackle other multimodal problems, namely image and video captioning. We study the application of this framework to multimodal neural sequence-to-sequence models. We show that, following this framework, we approximately halve the effort spent on correcting the outputs generated by the automatic systems. Moreover, we deploy our systems in a publicly accessible demonstration, which allows a better understanding of the behavior of the interactive-predictive framework.
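
To illustrate the idea of refining a hypothesis through partial corrections, the following is a minimal sketch of one possible interaction protocol. It is not the authors' implementation. It assumes a prefix-based protocol (the user fixes the first wrong word and the system regenerates the rest) and replaces the neural captioning model with a hypothetical lookup function, `toy_predict_suffix`. The function names, the candidate captions, and the effort measure (number of word corrections) are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple


def toy_predict_suffix(prefix: List[str]) -> List[str]:
    """Hypothetical stand-in for a captioning model.

    Returns the continuation of the first canned caption that is
    compatible with the validated prefix, or nothing if none matches.
    """
    candidates = [
        "a man is riding a horse".split(),
        "a dog is running on the beach".split(),
        "a dog is playing with a ball".split(),
    ]
    for cand in candidates:
        if cand[: len(prefix)] == prefix:
            return cand[len(prefix):]
    return []


def interactive_session(
    predict_suffix: Callable[[List[str]], List[str]],
    reference: List[str],
) -> Tuple[List[str], int]:
    """Simulate a user who corrects the first wrong word at each step.

    Returns the final caption and the number of corrections made.
    """
    prefix: List[str] = []
    corrections = 0
    while True:
        hypothesis = prefix + predict_suffix(prefix)
        if hypothesis == reference:
            return hypothesis, corrections
        # Locate the first position where the hypothesis departs
        # from what the simulated user wants (the reference).
        i = len(prefix)
        while (
            i < len(hypothesis)
            and i < len(reference)
            and hypothesis[i] == reference[i]
        ):
            i += 1
        corrections += 1
        if i == len(reference):
            # The hypothesis only has extra trailing words:
            # a single correction cuts it at the right place.
            return list(reference), corrections
        # The user types the correct word; everything up to and
        # including it becomes the validated prefix.
        prefix = reference[: i + 1]


if __name__ == "__main__":
    target = "a dog is playing with a ball".split()
    caption, effort = interactive_session(toy_predict_suffix, target)
    print(" ".join(caption), "| corrections:", effort)
```

In this toy run the simulated user makes two corrections ("dog", then "playing"), and the system completes the remaining words after each one. The loop always terminates because every correction extends the validated prefix by at least one word.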