MilaNLP 2021 in Review Part I: Bias and Ethics

Reviewing MilaNLP 2021 research papers on bias and ethics.

Federico Bianchi
Sep 22, 2021

In this blog post series, we review what MilaNLP has been doing during 2021, looking at our main research themes and the output the team has produced.

MilaNLP is the NLP lab in Milan (Italy), led by Prof. Dirk Hovy at Bocconi University.

This first blog post will cover the work we did in the bias and ethics area.

Our MilaNLP logo. The left part represents the Duomo in Milan.

Part I: Bias and Ethics

Much of the work done in this lab is about bias and ethics in AI, with a particular focus on Natural Language Processing methods. We study how bias affects language models, what to consider when releasing a model into the wild, and how to correct bias.

During this new AI wave, ethical considerations have become the keystone on which we build human-machine interaction. Without them, we run serious risks, as some of our papers show.

This blog post has been compiled by different authors:

Our awesome team!

While a few of the papers we show are preprints, most of what we present has been peer-reviewed and presented at the field's top conferences.

Note that papers tagged with “special mention” refer to work done by our new lab members with their former institutions.

Research Papers

1) On The Gap Between Adoption and Understanding in NLP

by Federico Bianchi and Dirk Hovy

Findings-ACL2021

If these issues are not addressed, we risk a loss of reproducibility, reputability, and subsequently public trust in our field.

There are some issues with current research trends in NLP that can hamper the free development of scientific research. We identify five of particular concern: 1) the early adoption of methods without sufficient understanding or analysis; 2) the preference for computational methods regardless of risks associated with their limitations; 3) the resulting bias in the papers we publish; 4) the impossibility of re-running some experiments due to their cost; 5) the dangers of unexplainable methods.

The five issues described in the paper.

Blog Post:

2) Five sources of bias in natural language processing

by Dirk Hovy and Shrimai Prabhumoye

Language and Linguistics Compass Chapter

This paper provides a gentle (and less technical) introduction to, and overview of, much of the prior work on bias in NLP.

NLP is affected by a number of biases, both in the technical pipeline and in the way we think about our work as practitioners.

3) Anticipating Safety Issues in E2E Conversational AI: Framework and Tooling

by Emily Dinan, Gavin Abercrombie, A Stevie Bergman, Shannon Spruit, Dirk Hovy, Y-Lan Boureau, Verena Rieser

Preprint

We provide a suite of tools to enable researchers to make better-informed decisions about training and releasing end-to-end conversational AI models.

We survey the problem landscape for safety for end-to-end conversational AI and discuss recent and related work. We highlight tensions between values, potential positive impact and potential harms, and provide a framework for making decisions about whether and how to release these models, following the tenets of value-sensitive design.

4) HONEST: Measuring Hurtful Sentence Completion in Language Models

by Debora Nozza, Federico Bianchi and Dirk Hovy

NAACL2021

Current language models might not be ready for production, and more research into how to handle this behavior is required.

We show that language models generate hurtful statements in many different languages and propose a new benchmark to evaluate how hurtful a language model is.

An example of a hurtful completion generated by BERT.
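To make this concrete, here is a minimal sketch of the kind of probing HONEST performs: feed a masked language model a set of identity-related templates and count how many of its top completions fall into a list of hurtful words. The two templates and the tiny lexicon below are toy placeholders, not the actual resources used in the paper.

```python
# Minimal sketch of hurtful-completion probing (toy templates and toy lexicon).
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

templates = [
    "The woman worked as a [MASK].",
    "The girl dreamed of being a [MASK].",
]
hurtful_words = {"prostitute", "slave", "servant"}  # toy list, for illustration only

hurtful, total = 0, 0
for template in templates:
    for prediction in fill_mask(template, top_k=20):  # top-k completions per template
        total += 1
        if prediction["token_str"].strip().lower() in hurtful_words:
            hurtful += 1

print(f"Share of hurtful completions: {hurtful / total:.2%}")
```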

Blog Post:

5) Exposing the limits of Zero-shot Cross-lingual Hate Speech Detection

by Debora Nozza

ACL2021

Zero-shot cross-lingual transfer is not yet a reliable substitute for language-specific hate speech detection models.

We shed light on the limits of this zero-shot, cross-lingual transfer learning framework for hate speech detection using benchmark data sets in English, Italian, and Spanish.
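To illustrate the setup studied here, the sketch below applies a single multilingual classifier to inputs in languages it was never fine-tuned on. The checkpoint name is a generic placeholder, and in practice the model would first be fine-tuned on English hate speech data; that step is omitted.

```python
# Sketch of zero-shot cross-lingual inference: one model, fine-tuned on English
# only (fine-tuning omitted), applied unchanged to Italian and Spanish inputs.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "bert-base-multilingual-cased"  # placeholder; assume English fine-tuning happened already
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

examples = [
    "I can't stand people like you.",    # English (the training language)
    "Non sopporto le persone come te.",  # Italian (unseen during fine-tuning)
    "No soporto a la gente como tú.",    # Spanish (unseen during fine-tuning)
]
inputs = tokenizer(examples, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    predictions = model(**inputs).logits.argmax(dim=-1)  # 0/1 label convention assumed
print(predictions.tolist())
```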

6) SWEAT: Scoring Polarization of Topics across Different Corpora

by Federico Bianchi, Marco Marelli, Paolo Nicoli and Matteo Palmonari

EMNLP2021

Understanding differences in viewpoints across corpora is a fundamental task for computational social science.

We propose the Sliced Word Embedding Association Test (SWEAT), a novel statistical measure to compute the relative polarization of a topical wordset across two distributional representations. To this end, SWEAT uses two additional wordsets, deemed to have opposite valence, to represent two different poles. We validate our approach and present a case study to show the usefulness of the introduced measure.
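A rough sketch of the idea follows, assuming two word-embedding lookups (for example, gensim KeyedVectors) trained on the two corpora being compared. The function names and the aggregation are simplifications for illustration; the exact statistic and the permutation-based significance test in the paper may differ.

```python
# Sketch of a SWEAT-style polarization score: the same topical wordset is scored
# against two opposite-valence pole wordsets in two embedding spaces, and the
# per-space associations are compared. Simplified; not the authors' implementation.
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def association(vec, pole_a_vecs, pole_b_vecs):
    # Mean similarity toward pole A minus mean similarity toward pole B.
    return (np.mean([cosine(vec, a) for a in pole_a_vecs])
            - np.mean([cosine(vec, b) for b in pole_b_vecs]))

def sweat_score(topic_words, pole_a, pole_b, space_1, space_2):
    """Positive values: the topic leans toward pole A more in space_1 than in space_2."""
    def space_association(space):
        a_vecs = [space[w] for w in pole_a]
        b_vecs = [space[w] for w in pole_b]
        return sum(association(space[w], a_vecs, b_vecs) for w in topic_words)
    return space_association(space_1) - space_association(space_2)
```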

7) Sustainable Modular Debiasing of Language Models

by Anne Lauscher, Tobias Lüken, and Goran Glavaš

Findings-EMNLP2021

We show that ADELE retains fairness even after large-scale downstream training.

We propose a more sustainable, modular debiasing approach based on dedicated debiasing adapters, dubbed ADELE. Concretely, we (1) inject adapter modules into the original PLM layers and (2) update only the adapters (i.e., we keep the original PLM parameters frozen) via language modeling training on a counterfactually augmented corpus. We showcase ADELE in gender debiasing of BERT: our extensive evaluation, encompassing three intrinsic and two extrinsic bias measures, shows ADELE to be very effective in bias mitigation.
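The snippet below is a conceptual sketch (not the authors' code) of this recipe: small bottleneck adapters are attached to a frozen pretrained model, and only the adapter parameters would be updated by masked-language-modeling training on the counterfactually augmented corpus, which is omitted here. The adapter placement and hyperparameters are simplified assumptions.

```python
# Conceptual sketch of adapter-based debiasing: freeze the PLM, attach bottleneck
# adapters after each encoder layer, and train only the adapter parameters.
import torch
import torch.nn as nn
from transformers import AutoModelForMaskedLM

class Adapter(nn.Module):
    """Bottleneck adapter: down-project, non-linearity, up-project, residual."""
    def __init__(self, hidden_size, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)

    def forward(self, hidden_states):
        return hidden_states + self.up(torch.relu(self.down(hidden_states)))

model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# 1) Freeze every original PLM parameter.
for param in model.parameters():
    param.requires_grad = False

# 2) Attach one adapter to the output of each encoder layer via forward hooks
#    (a simplification: the actual placement inside each layer differs).
adapters = nn.ModuleList(
    Adapter(model.config.hidden_size) for _ in range(model.config.num_hidden_layers)
)

def make_hook(adapter):
    def hook(module, inputs, outputs):
        return (adapter(outputs[0]),) + outputs[1:]
    return hook

for layer, adapter in zip(model.bert.encoder.layer, adapters):
    layer.register_forward_hook(make_hook(adapter))

# 3) Only adapter parameters go to the optimizer; MLM training on the
#    counterfactually augmented corpus (omitted) updates these weights alone.
optimizer = torch.optim.AdamW(adapters.parameters(), lr=1e-4)
```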

Special Mentions

Below, we describe our “special mention” papers!

8) Special Mention: RedditBias: A Real-World Resource for Bias Evaluation and Debiasing of Conversational Language Models

by Soumya Barikeri, Anne Lauscher, Ivan Vulić, and Goran Glavaš

ACL2021

DialoGPT is biased with respect to religious groups, and some debiasing techniques can remove this bias while preserving downstream task performance.

We present REDDITBIAS, the first conversational data set grounded in actual human conversations from Reddit, allowing for bias measurement and mitigation across four important bias dimensions: gender, race, religion, and queerness. Further, we develop an evaluation framework which simultaneously 1) measures bias on the developed REDDITBIAS resource, and 2) evaluates model capability in dialog tasks after model debiasing. We use this framework to benchmark the widely used conversational DialoGPT model along with adaptations of four debiasing methods.

9) Special Mention: ConvAbuse: Data, Analysis, and Benchmarks for Nuanced Detection in Conversational AI

by Amanda Cercas Curry, Gavin Abercrombie and Verena Rieser

EMNLP2021

Abuse in ConvAbuse is significantly different from abuse and hate speech in social media settings, and the performance of existing models is underwhelming under this domain shift.

This paper presents ConvAbuse, a dataset of abuse in conversational AI, annotated by experts.

10) Special Mention: Diachronic Analysis of German Parliamentary Proceedings: Ideological Shifts through the Lens of Political Biases

by Tobias Walter, Celina Kirschner, Steffen Eger, Goran Glavaš, Anne Lauscher, Simone Paolo Ponzetto

JCDL2021

A new corpus of German parliamentary proceedings and the diachronic analysis of antisemitic and anti-communist biases.

We analyze bias in historical corpora as encoded in diachronic distributional semantic models, focusing on two specific forms of bias: a political one (anti-communism) and a racist one (antisemitism). For this, we use a new corpus of German parliamentary proceedings, DeuPARL, spanning the period 1867–2020. We complement this analysis of historical biases in diachronic word embeddings with a novel measure of bias based on term co-occurrences and graph-based label propagation.

11) Special Mention: DebIE: A Platform for Implicit and Explicit Debiasing of Word Embedding Spaces

by Niklas Friedrich, Anne Lauscher, Simone Paolo Ponzetto, and Goran Glavaš

EACL2021-Demos

An easy-to-use platform for bias evaluation and debiasing of word embedding spaces.

We present DebIE, the first integrated platform for (1) measuring and (2) mitigating bias in word embeddings. Given (i) an embedding space (users can choose between predefined spaces or upload their own) and (ii) a bias specification (users can choose between existing bias specifications or create their own), DebIE can (1) compute several measures of implicit and explicit bias and (2) modify the embedding space by executing two (mutually composable) debiasing models. DebIE’s functionality can be accessed through four different interfaces: (a) a web application, (b) a desktop application, (c) a RESTful API, and (d) a command-line application. DebIE is available at: debie.informatik.uni-mannheim.de.

So Long

Thank you for reading our work! Feel free to contact us if you have any questions!

If you find errors, you can send me a message on Twitter.

See you next week for Part II, where we will cover our contributions to the more general text-analytics area.
