Publications

Abstract

Jesús González-Rubio, Daniel Ortiz-Martínez, Francisco Casacuberta. Fast incremental active learning for statistical machine translation. Avances en Inteligencia Artificial, proceedings of the Conferencia de la Asociación Española para la Inteligencia Artificial, 2011.

Several works show that applying active learning techniques to statistical machine translation improves the quality of the final translations while minimizing the number of bilingual sentences required to train the system. These previous works all look for the best sentence sampling strategy and rely on the batch learning paradigm to retrain the translation model. Unfortunately, batch learning for statistical machine translation typically requires many hours to train a system of reasonable size, which limits the practical application of active learning in this setting. In this work, we propose applying incremental learning techniques to retrain the translation model in an active learning scenario for statistical machine translation. Experiments show that incremental learning reduces the training time per sentence by several orders of magnitude while yielding improvements in translation quality similar to those obtained with batch learning.
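
The contrast between batch and incremental retraining inside an active learning loop can be illustrated with a minimal sketch. It is not the authors' system: the "model" here is only a table of word co-occurrence counts standing in for the sufficient statistics of a translation model, and the names `pair_counts`, `batch_train`, `incremental_update`, `select_most_novel`, and `active_learning`, as well as the novelty-based sampling strategy, are assumptions made purely for illustration.

```python
from collections import Counter


def pair_counts(src, tgt):
    """Toy sufficient statistics: co-occurrence counts for one sentence pair."""
    return Counter((s, t) for s in src.split() for t in tgt.split())


def batch_train(corpus):
    """Batch paradigm: recompute every statistic from the full corpus."""
    counts = Counter()
    for src, tgt in corpus:
        counts.update(pair_counts(src, tgt))
    return counts


def incremental_update(counts, src, tgt):
    """Incremental paradigm: fold a single new pair into the existing statistics."""
    counts.update(pair_counts(src, tgt))
    return counts


def select_most_novel(pool, counts):
    """Hypothetical sampling strategy: prefer the sentence with most unseen source words."""
    known = {s for s, _ in counts}
    return max(pool, key=lambda src: sum(w not in known for w in set(src.split())))


def active_learning(seed, pool, oracle, rounds):
    """Pick a sentence, ask the oracle for its translation, update incrementally."""
    counts = batch_train(seed)      # one-off initial training on the seed corpus
    labelled = list(seed)
    pool = list(pool)
    for _ in range(min(rounds, len(pool))):
        src = select_most_novel(pool, counts)
        pool.remove(src)
        tgt = oracle(src)           # in practice, a human translator
        labelled.append((src, tgt))
        incremental_update(counts, src, tgt)  # cost depends on this pair only
    return counts, labelled
```

In this toy setting the incremental update touches only the new sentence pair, whereas calling `batch_train` after every labelled sentence would reprocess the whole corpus each time. Both routes end with the same counts here; a real translation model would generally need an incremental estimation procedure to approximate the batch result.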