Publications

Abstract

Zuzanna Parcheta, Germán Sanchis-Trilles, Francisco Casacuberta, Robin Redahl. Multi-input CNN for Text Classification in Commercial Scenarios. Proceedings of the International Work-Conference on Artificial Neural Networks, 2019. pp. 596-608. Springer.

In this work we describe a multi-input Convolutional Neural Network (CNN) for text classification that combines text preprocessed at the word level, the byte pair encoding (BPE) level, and the character level. We apply the model to two practical use cases: (1) classifying ingredients into their corresponding classes, using a corpus provided by Northfork; and (2) classifying texts according to the English level of their writers, using a corpus provided by ProvenWord. Additionally, we perform experiments on standard classification tasks using the Yahoo! Answers and GermEval2017 Task A datasets. The architecture obtains satisfactory results on these corpora, and a comparison on each dataset with other classifiers, including different state-of-the-art approaches, yields very promising results.
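The abstract does not give layer sizes, filter widths, or the training procedure, so the following is only a minimal, untrained sketch of the general idea of a multi-input CNN: one convolutional branch per view of the text (word, byte pair encoding, character), with the pooled features concatenated before a softmax layer. Everything named here is an assumption made for illustration and is not taken from the paper: the helper functions, the `ConvBranch` class, the tiny `MERGES` table, the random weights, and the two-class output.

```python
import math
import random


def word_tokens(text):
    """Word-level view: lowercase, whitespace-split."""
    return text.lower().split()


def char_tokens(text):
    """Character-level view (spaces are kept as tokens)."""
    return list(text.lower())


def bpe_tokens(text, merges):
    """Subword view: apply an ordered list of (left, right) merges per word."""
    out = []
    for word in text.lower().split():
        symbols = list(word)
        for left, right in merges:
            i, merged = 0, []
            while i < len(symbols):
                if i + 1 < len(symbols) and symbols[i] == left and symbols[i + 1] == right:
                    merged.append(left + right)
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            symbols = merged
        out.extend(symbols)
    return out


class ConvBranch:
    """Embedding -> 1D convolution -> ReLU -> max-over-time pooling for one view."""

    def __init__(self, rng, emb_dim=4, n_filters=3, width=2):
        self.rng, self.emb_dim, self.width = rng, emb_dim, width
        self.emb = {}  # token -> random embedding, created lazily (untrained)
        self.filters = [
            [[rng.uniform(-1, 1) for _ in range(emb_dim)] for _ in range(width)]
            for _ in range(n_filters)
        ]

    def embed(self, token):
        if token not in self.emb:
            self.emb[token] = [self.rng.uniform(-1, 1) for _ in range(self.emb_dim)]
        return self.emb[token]

    def forward(self, tokens):
        x = [self.embed(t) for t in tokens]
        while len(x) < self.width:  # zero-pad inputs shorter than the filter
            x = x + [[0.0] * self.emb_dim]
        features = []
        for f in self.filters:
            best = 0.0  # starting at 0 makes the max equal to max over ReLU outputs
            for start in range(len(x) - self.width + 1):
                act = sum(
                    f[k][d] * x[start + k][d]
                    for k in range(self.width)
                    for d in range(self.emb_dim)
                )
                best = max(best, act)
            features.append(best)
        return features


def classify(text, branches, weights, merges):
    """Run each view through its own branch, concatenate, then apply softmax."""
    views = [word_tokens(text), bpe_tokens(text, merges), char_tokens(text)]
    features = []
    for branch, view in zip(branches, views):
        features.extend(branch.forward(view))
    logits = [sum(w * f for w, f in zip(row, features)) for row in weights]
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
MERGES = [("t", "o"), ("to", "m")]  # hypothetical merge table, not learned from data
branches = [ConvBranch(rng) for _ in range(3)]  # word, BPE, character
N_CLASSES = 2  # arbitrary choice for the sketch
weights = [[rng.uniform(-1, 1) for _ in range(9)] for _ in range(N_CLASSES)]  # 3 branches x 3 filters
probs = classify("fresh tomato puree", branches, weights, MERGES)
print(probs)
```

Because the weights are random, the printed probabilities carry no meaning; the sketch only shows how three differently tokenized inputs can feed a single classifier. In practice the branches would be trained jointly, and the BPE merges would be learned from the training corpus.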