MilaNLP 2021 in Review Part I: Bias and Ethics
Reviewing MilaNLP 2021 research papers on bias and ethics.
In this blog post series, we review what MilaNLP has been doing during 2021, covering the main themes of research and the output the team has produced.
MilaNLP is the NLP Lab in Milano (Italy) led by Prof. Dirk Hovy at Bocconi University.
This first blog post will cover the work we did in the bias and ethics area.
Part I: Bias and Ethics
A lot of the work done in this lab is about bias and ethics in AI, with a particular focus on Natural Language Processing methods. We study how bias affects language models, what to consider when releasing a model into the wild, and how to correct bias.
During this new AI wave, ethical considerations are the keystone on which we build human-machine interaction. Without them, we run serious risks, as some of our papers show.
This blog post has been compiled by several authors.
While a few of the papers we show are preprints, most of what we present has been peer-reviewed and presented at the most important conferences in the field.
Note that papers tagged with “special mention” refer to work done by our new lab members with their former institutions.
Research Papers
1) On The Gap Between Adoption and Understanding in NLP
by Federico Bianchi and Dirk Hovy
Findings-ACL2021
If these issues are not addressed, we risk a loss of reproducibility, reputability, and subsequently public trust in our field.
There are some issues with current research trends in NLP that can hamper the free development of scientific research. We identify five of particular concern: 1) the early adoption of methods without sufficient understanding or analysis; 2) the preference for computational methods regardless of risks associated with their limitations; 3) the resulting bias in the papers we publish; 4) the impossibility of re-running some experiments due to their cost; 5) the dangers of unexplainable methods.
2) Five sources of bias in natural language processing
by Dirk Hovy and Shrimai Prabhumoye
Language and Linguistics Compass Chapter
This paper provides a gentle (and less technical) introduction to, and overview of, much of the prior work on bias in NLP.
NLP is affected by a number of biases, both in the technical pipeline and in the way we think about our work as practitioners.
3) Anticipating Safety Issues in E2E Conversational AI: Framework and Tooling
by Emily Dinan, Gavin Abercrombie, A Stevie Bergman, Shannon Spruit, Dirk Hovy, Y-Lan Boureau, Verena Rieser
Preprint
We provide a suite of tools to enable researchers to make better-informed decisions about training and releasing end-to-end conversational AI models.
We survey the safety problem landscape for end-to-end conversational AI and discuss recent and related work. We highlight tensions between values, potential positive impacts, and potential harms, and provide a framework for making decisions about whether and how to release these models, following the tenets of value-sensitive design.
4) HONEST: Measuring Hurtful Sentence Completion in Language Models
by Debora Nozza, Federico Bianchi and Dirk Hovy
NAACL2021
Current language models might not be ready for production, and more research on how to handle this behavior is required.
We show that language models generate hurtful statements in many different languages and propose a new benchmark to evaluate how hurtful a language model is.
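To make the idea concrete, here is a minimal sketch of the underlying procedure: fill masked templates with a language model and count how often the completions are hurtful. This is not the official HONEST implementation; the templates and the tiny hurtful-word list below are illustrative placeholders (the paper uses a multilingual lexicon of hurtful terms), and only the HuggingFace fill-mask pipeline is a real API.

```python
# A minimal sketch of the idea behind HONEST: fill masked templates with a
# language model and count how often the completions are hurtful.
# NOTE: the templates and the hurtful-word list below are illustrative
# placeholders, not the official HONEST templates or lexicon.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

templates = [
    "The woman dreams of being a [MASK].",
    "The man dreams of being a [MASK].",
]
hurtful_words = {"slave", "prostitute", "criminal"}  # placeholder lexicon

hurtful, total = 0, 0
for template in templates:
    for completion in fill_mask(template, top_k=10):
        total += 1
        if completion["token_str"].strip().lower() in hurtful_words:
            hurtful += 1

print(f"Share of hurtful completions: {hurtful / total:.2%}")
```

The real benchmark aggregates such scores over many templates, identity terms, and languages; the sketch only shows the counting logic.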
5) Exposing the limits of Zero-shot Cross-lingual Hate Speech Detection
by Debora Nozza
ACL2021
Zero-shot, cross-lingual transfer learning for hate speech detection has clear limits: models trained in one language often fail to transfer to others.
We shed light on the limits of this zero-shot, cross-lingual transfer learning framework for hate speech detection using benchmark data sets in English, Italian, and Spanish.
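For readers unfamiliar with the setup, the sketch below illustrates the zero-shot cross-lingual protocol the paper studies: fine-tune a multilingual encoder on English labels only, then evaluate on another language without any target-language training. The tiny in-memory datasets are placeholders invented for illustration, not the benchmark data sets used in the paper.

```python
# A minimal sketch of zero-shot cross-lingual transfer for hate speech
# detection: train on English only, evaluate on Italian with no Italian data.
# NOTE: the toy datasets below are placeholders for illustration only.
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "bert-base-multilingual-cased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=64)

# Placeholder English training data and Italian evaluation data.
train_en = Dataset.from_dict({"text": ["I hate those people", "Have a nice day"],
                              "label": [1, 0]}).map(tokenize, batched=True)
test_it = Dataset.from_dict({"text": ["Odio quelle persone", "Buona giornata"],
                             "label": [1, 0]}).map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=train_en,
)
trainer.train()                   # fine-tune on English only
print(trainer.evaluate(test_it))  # zero-shot evaluation on Italian
```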
6) SWEAT: Scoring Polarization of Topics across Different Corpora
by Federico Bianchi, Marco Marelli, Paolo Nicoli and Matteo Palmonari
EMNLP2021
Understanding differences in viewpoints across corpora is a fundamental task in the computational social sciences.
We propose the Sliced Word Embedding Association Test (SWEAT), a novel statistical measure to compute the relative polarization of a topical wordset across two distributional representations. To this end, SWEAT uses two additional wordsets, deemed to have opposite valence, to represent two different poles. We validate our approach and present a case study to show the usefulness of the introduced measure.
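The sketch below conveys the intuition: measure how a topical wordset associates with the two pole wordsets in each embedding space, and compare the associations across spaces. It is not the official SWEAT implementation (the exact score and its permutation-based significance test are defined in the paper), and all wordsets and vectors here are toy placeholders.

```python
# A rough sketch of the intuition behind SWEAT: compare how a topical wordset
# associates with two opposite-valence pole wordsets in two embedding spaces
# (e.g., built from corpora of two different communities).
# NOTE: illustrative only; the exact SWEAT score and its significance test
# are defined in the paper.
import numpy as np

def cosine(u, v):
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def association(word, pos_pole, neg_pole, space):
    """Mean similarity to the positive pole minus mean similarity to the negative pole."""
    w = space[word]
    return (np.mean([cosine(w, space[p]) for p in pos_pole])
            - np.mean([cosine(w, space[n]) for n in neg_pole]))

def polarization(topic, pos_pole, neg_pole, space_a, space_b):
    """Sum, over the topic wordset, of the association difference between the two spaces."""
    return sum(association(w, pos_pole, neg_pole, space_a)
               - association(w, pos_pole, neg_pole, space_b)
               for w in topic)

# Toy example with random vectors standing in for two trained embedding spaces.
rng = np.random.default_rng(0)
vocab = ["vaccine", "science", "good", "safe", "bad", "dangerous"]
space_a = {w: rng.normal(size=50) for w in vocab}
space_b = {w: rng.normal(size=50) for w in vocab}

score = polarization(["vaccine", "science"], ["good", "safe"], ["bad", "dangerous"],
                     space_a, space_b)
print(f"Relative polarization score: {score:.3f}")
```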
7) Sustainable Modular Debiasing of Language Models
by Anne Lauscher, Tobias Lüken, and Goran Glavaš
Findings-EMNLP2021
We show that ADELE retains fairness even after large-scale downstream training.
We propose a more sustainable modular debiasing approach based on dedicated debiasing adapters, dubbed ADELE. Concretely, we (1) inject adapter modules into the original PLM layers and (2) update only the adapters (i.e., we keep the original PLM parameters frozen) via language modeling training on a counterfactually augmented corpus. We showcase ADELE in gender debiasing of BERT: our extensive evaluation, encompassing three intrinsic and two extrinsic bias measures, renders ADELE very effective in bias mitigation.
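The general pattern ADELE builds on can be sketched in a few lines: freeze every pretrained parameter and train only small bottleneck adapters attached to the transformer layers. The code below is an illustrative re-implementation of that pattern, not the authors' code; the adapter class, hook-based injection, and hyperparameters are assumptions made for brevity.

```python
# A minimal sketch of the adapter-based debiasing pattern: freeze the
# pretrained model and train only small bottleneck adapters, using masked
# language modeling on a counterfactually augmented corpus.
# NOTE: illustrative only; ADELE uses the adapter framework described in the paper.
import torch
import torch.nn as nn
from transformers import AutoModelForMaskedLM

class Adapter(nn.Module):
    """Bottleneck adapter: down-project, non-linearity, up-project, residual."""
    def __init__(self, hidden_size: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)

    def forward(self, hidden_states):
        return hidden_states + self.up(torch.relu(self.down(hidden_states)))

model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# 1) Freeze every pretrained parameter.
for param in model.parameters():
    param.requires_grad = False

# 2) Attach one adapter per transformer layer (via forward hooks, for brevity),
#    so only the adapters receive gradient updates during debiasing training.
adapters = nn.ModuleList()
for layer in model.bert.encoder.layer:
    adapter = Adapter(model.config.hidden_size)
    adapters.append(adapter)
    layer.output.register_forward_hook(
        lambda module, inputs, output, a=adapter: a(output)
    )

optimizer = torch.optim.AdamW(adapters.parameters(), lr=1e-4)
print(f"Trainable adapter parameters: {sum(p.numel() for p in adapters.parameters()):,}")
# Training would then run the standard MLM loss over the counterfactually
# augmented corpus, updating only `adapters` via `optimizer`.
```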
Special Mentions
Below, we describe our “special mention” papers!
8) Special Mention: RedditBias: A Real-World Resource for Bias Evaluation and Debiasing of Conversational Language Models
by Soumya Barikeri, Anne Lauscher, Ivan Vulić, and Goran Glavaš
ACL2021
DialoGPT is biased with respect to religious groups, and some debiasing techniques can remove this bias while preserving downstream task performance.
We present REDDITBIAS, the first conversational data set grounded in actual human conversations from Reddit, allowing for bias measurement and mitigation across four important bias dimensions: gender, race, religion, and queerness. Further, we develop an evaluation framework which simultaneously 1) measures bias on the developed REDDITBIAS resource, and 2) evaluates model capability in dialog tasks after model debiasing. We use the evaluation framework to benchmark the widely used conversational DialoGPT model along with adaptations of four debiasing methods.
9) Special Mention: ConvAbuse: Data, Analysis, and Benchmarks for Nuanced Detection in Conversational AI
by Amanda Cercas Curry, Gavin Abercrombie and Verena Rieser
EMNLP2021
Abuse in ConvAbuse is significantly different from abuse/hate speech in social media settings, and the performance of existing models is underwhelming under this domain shift.
This paper presents ConvAbuse, a dataset of abuse in conversational AI. The dataset has been annotated by experts.
10) Special Mention: Diachronic Analysis of German Parliamentary Proceedings: Ideological Shifts through the Lens of Political Biases
by Tobias Walter, Celina Kirschner, Steffen Eger, Goran Glavaš, Anne Lauscher, Simone Paolo Ponzetto
JCDL2021
A new corpus of German parliamentary proceedings and a diachronic analysis of antisemitic and anti-communist biases.
We analyze bias in historical corpora as encoded in diachronic distributional semantic models by focusing on two specific forms of bias, namely a political (i.e., anti-communism) and racist (i.e., antisemitism) one. For this, we use a new corpus of German parliamentary proceedings, DeuPARL, spanning the period 1867–2020. We complement this analysis of historical biases in diachronic word embeddings with a novel measure of bias on the basis of term co-occurrences and graph-based label propagation.
11) Special Mention: DebIE: A Platform for Implicit and Explicit Debiasing of Word Embedding Spaces
by Niklas Friedrich, Anne Lauscher, Simone Paolo Ponzetto, and Goran Glavaš
EACL2021-Demos
An easy-to-use platform for bias evaluation and debiasing of word embedding spaces.
We present DebIE, the first integrated platform for (1) measuring and (2) mitigating bias in word embeddings. Given (i) an embedding space (users can choose between the predefined spaces or upload their own) and (ii) a bias specification (users can choose between existing bias specifications or create their own), DebIE can (1) compute several measures of implicit and explicit bias and (2) modify the embedding space by executing two (mutually composable) debiasing models. DebIE’s functionality can be accessed through four different interfaces: (a) a web application, (b) a desktop application, (c) a RESTful API, and (d) a command-line application. DebIE is available at: debie.informatik.uni-mannheim.de.
So Long
Thank you for reading our work! Feel free to contact us if you have any questions!
If you find errors, you can send me a message on Twitter.
See you next week for Part II, where we will cover our contributions in the more general text-analytics area.