
This is the repo for the project in Ethics for AI at @unibo. We tackle the problem of gender discrimination present in various NLP tasks by exploiting notions and methods presented in the literature.


Erhtric/debiasing-gender-nlp


♂️ Gender discrimination in Natural Language Processing ♀️

This repository contains a project developed as part of the Ethics in Artificial Intelligence course of the Master's degree in Artificial Intelligence at the University of Bologna.

Description

This project is a proof of concept of how gender discrimination in NLP can be addressed. Two approaches have been investigated:

  • Hard-Debiasing on pre-trained Italian word embeddings
  • GN-GloVe, which reduces the bias during the training of the word embeddings

For a deeper understanding of the problem, see the presentation of the project (presentation.pdf).
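
As a rough illustration of the first approach, the snippet below sketches the "neutralize" step of hard debiasing on toy vectors. It is not the code in this repository: the function names `gender_direction` and `neutralize`, the word pairs, and the 3-dimensional vectors are all assumptions made for the example. The gender direction is approximated here by the mean difference of a few definitional pairs, which is a simplification of the PCA-based estimate described in the literature.

```python
import numpy as np

def gender_direction(emb, pairs):
    """Approximate the gender direction as the normalized mean of the
    difference vectors of definitional pairs (simplification: no PCA)."""
    diffs = [emb[m] - emb[f] for m, f in pairs]
    g = np.mean(diffs, axis=0)
    return g / np.linalg.norm(g)

def neutralize(v, g):
    """Remove the component of v along the unit vector g, then renormalize."""
    v_orth = v - np.dot(v, g) * g
    return v_orth / np.linalg.norm(v_orth)

# Toy 3-d vectors, invented for illustration only (hypothetical values).
emb = {
    "uomo": np.array([1.0, 0.2, 0.0]),
    "donna": np.array([-1.0, 0.2, 0.0]),
    "lui": np.array([0.9, 0.0, 0.1]),
    "lei": np.array([-0.9, 0.0, 0.1]),
    "ingegnere": np.array([0.5, 0.7, 0.3]),
}

g = gender_direction(emb, [("uomo", "donna"), ("lui", "lei")])
debiased = neutralize(emb["ingegnere"], g)

# Projection on the gender direction before and after neutralizing.
print(np.dot(emb["ingegnere"], g))
print(np.dot(debiased, g))
```

After neutralizing, the word vector has no component left along the estimated gender direction, while its remaining components keep their relative proportions.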

Repository structure

.
├── data                             # Contains the files of words used for the experiments
├── debiaswe                         # Contains debiasing functions
│   ├── co_occurrence.py             # Functions to compute the co-occurrence matrix for GN-GloVe
│   ├── data.py                      # Functions to load data files
│   ├── debias_glove.py              # Actual implementation of GN-GloVe debiasing
│   ├── metrics.py                   # Functions to compute metrics for the experiments
│   └── we.py                        # Auxiliary functions to load and manage word embeddings
├── embeddings                       # Contains the word embeddings file for the hard-debiasing approach
├── scripts                          # Contains the scripts to convert the original Twitter word embeddings to a TSV file and filter them
├── gn-glove_we_visualization.ipynb  # Visualization of the word embeddings generated by GN-GloVe
├── hard_debias_italian_we.ipynb     # Visualization of the word embeddings generated by Hard-Debiasing
├── presentation.pdf                 # Slides about the project
├── LICENSE
└── README.md
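
For readers unfamiliar with the co-occurrence matrix that GloVe-style training consumes, here is a minimal sketch of windowed co-occurrence counting. It is not the implementation in `debiaswe/co_occurrence.py`: the function name `cooccurrence`, the window size, and the toy corpus are assumptions for illustration. Pairs are counted symmetrically and weighted by the inverse of their distance.

```python
from collections import defaultdict

def cooccurrence(sentences, window=2):
    """Count symmetric word co-occurrences within `window` tokens,
    weighting each pair by 1 / distance."""
    counts = defaultdict(float)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            # Look back at most `window` positions from the current token.
            for j in range(max(0, i - window), i):
                weight = 1.0 / (i - j)
                counts[(word, tokens[j])] += weight
                counts[(tokens[j], word)] += weight
    return dict(counts)

# Hypothetical toy corpus, already tokenized.
corpus = [["lei", "è", "un", "medico"], ["lui", "è", "un", "medico"]]
counts = cooccurrence(corpus, window=2)

for pair, value in sorted(counts.items()):
    print(pair, value)
```

A dictionary keyed by word pairs keeps the sketch short; a real corpus would typically need a sparse matrix indexed by vocabulary ids.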

Results

The results of both approaches are presented below:

  • Hard-Debiasing: [figure: Hard-Debiasing results]

  • GN-GloVe: [figure: GN-GloVe results]

Versioning

We use Git for versioning.

Group members

Name       Surname    Email                                Username
Davide     Angelani   davide.angelani@studio.unibo.it      qnozo
Eric       Rossetto   eric.rossetto@studio.unibo.it        Erhtric
Giuseppe   Murro      giuseppe.murro@studio.unibo.it       gmurro
Salvatore  Pisciotta  salvatore.pisciotta@studio.unibo.it  SalvoPisciotta
Xiaowei    Wen        xiaowei.wen@studio.unibo.it          WenXiaowei

License

This project is licensed under the MIT License; see the LICENSE file for details.
