Duration: 1 September 2022 to 31 August 2025
Grant PID2021-124361OB-C31 funded by MCIN/AEI/10.13039/501100011033 and by ERDF, EU ("A way of making Europe")
PI: Paolo Rosso
Members: José Miguel Benedí, Jon Ander Gómez, Jorge González, Reynier Ortega

One of the main digitalization challenges for citizens' civil rights is the defense of equality in the digital world. In social life, defending equality means working to increase the opportunities of minorities and of people with fewer economic and cultural resources. Social science research has shown from its very beginning that stereotypes and prejudices about ethnic minority groups and women are among the main difficulties that members of these groups must overcome to develop their full potential as human beings. These stereotypes and prejudices, which discriminate against women and minorities, are embedded in everyday language. This means that they are learned, and sometimes amplified, by AI systems that use NLP techniques. As a substantial body of research on social bias has shown, the automation of decisions may reproduce prejudices and stereotypes, leading to further discrimination and inequality. To tackle this problem, our FairTransNLP-Stereotypes project aims to develop a research strategy to mitigate the inclusion of prejudices and stereotypes in machine learning models. First, we will propose a new approach to identify stereotypes and prejudices in social media.
We aim to study several targets (e.g. immigrants and women) in texts and memes, considering both implicit and explicit rhetorical strategies used to discriminate against minorities in textual and visual information. Second, a comprehensive study of the problem will allow us to develop equitable systems that include different viewpoints, rather than representing only the majority view. These systems will be able to operate over data with “conflicting” labels, so as not to marginalise minority views. We will employ the learning with disagreement paradigm: instead of learning only from gold labels, the systems will learn from both the gold labels and the distribution of labels over multiple annotators, which we will treat as soft label distributions. Models trained with soft labels (i.e., probability distributions over the annotator labels) will reduce their confidence on unclear cases without affecting the prediction of clear cases. Finally, our third objective is to make all these systems explainable to humans, developing fair and transparent systems for the identification of stereotypes and prejudices in social media texts and memes. This is a crucial factor in real-life applications, both for developers to better understand their systems' behaviour and for users to gain trust in the system. We will incorporate techniques such as LIME, the masking technique, and attention mechanisms into the systems so that they can provide transparent and understandable explanations for their decisions. We will organise, together with members of the UB and UNED teams, two shared tasks on the identification of racial stereotypes and of sexism in memes. These benchmark scenarios will allow us to test systems under the learning with disagreement paradigm. Using the standard metrics for gold labels, we will be able to compare the effect of different loss functions for the soft labels.
We will evaluate the performance of the systems using metrics such as cross-entropy and Kullback-Leibler divergence. We will also evaluate their fairness and transparency using the diagnostic methodology that we will jointly propose in the first part of the coordinated FairTransNLP project.
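To make the soft-label idea concrete, here is a minimal sketch of how the votes of several annotators on one item could be turned into a soft label distribution, and how a system's predicted distribution could then be scored against it with cross-entropy and Kullback-Leibler divergence. It is an illustration under stated assumptions, not the project's implementation: the class names, the example votes, the predicted probabilities and the helper functions `soft_label`, `cross_entropy` and `kl_divergence` are all hypothetical.

```python
import math
from collections import Counter


def soft_label(annotations, classes):
    """Turn one item's annotator votes into a probability distribution over classes."""
    counts = Counter(annotations)
    total = len(annotations)
    return [counts[c] / total for c in classes]


def cross_entropy(target, predicted, eps=1e-12):
    """Cross-entropy H(target, predicted); eps guards against log(0)."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(target, predicted))


def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted); terms with zero target mass contribute nothing."""
    return sum(
        t * math.log(t / max(p, eps))
        for t, p in zip(target, predicted)
        if t > 0
    )


# Hypothetical example: four annotators disagree on a single item.
classes = ["stereotype", "no_stereotype"]
votes = ["stereotype", "stereotype", "no_stereotype", "stereotype"]

target = soft_label(votes, classes)   # three of four votes for "stereotype"
predicted = [0.6, 0.4]                # assumed output of some classifier

ce = cross_entropy(target, predicted)
kl = kl_divergence(target, predicted)
print(f"soft label: {target}, cross-entropy: {ce:.4f}, KL: {kl:.4f}")
```

Cross-entropy and KL divergence differ only by the entropy of the target distribution, so they rank systems identically on a fixed dataset; KL has the convenient property of being zero when the prediction matches the annotator distribution exactly.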