♂️ Gender discrimination in Natural Language Processing ♀️

This repository contains a project carried out for the Ethics in Artificial Intelligence course of the Master's degree in Artificial Intelligence at the University of Bologna.

Description

The aim of this project is to develop a proof of concept for addressing gender discrimination in NLP. Two approaches were investigated:

Hard-Debiasing on pre-trained Italian word embeddings
GN-GloVe, which reduces the bias during the training of word embeddings

For a deeper understanding of the problem, see the presentation of the project.
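
To make the Hard-Debiasing idea concrete, here is a minimal, dependency-free sketch of its "neutralize" step: a word vector has its component along a gender direction removed and is then re-normalized. This is an illustration only, not the code in this repository. The function names, the toy 3-dimensional vectors, and the choice of a single "lui"/"lei" pair to define the direction are all assumptions; the original method estimates the direction from several definitional pairs.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def gender_direction(vec_m, vec_f):
    """Simplified gender direction: normalized difference of one word pair."""
    return normalize([a - b for a, b in zip(vec_m, vec_f)])


def neutralize(vec, direction):
    """Remove the component of vec along direction, then re-normalize."""
    proj = dot(vec, direction)
    residual = [a - proj * d for a, d in zip(vec, direction)]
    return normalize(residual)


# Toy embeddings, invented for illustration (not taken from the project data).
toy = {
    "lui": [1.0, 0.2, 0.1],
    "lei": [-1.0, 0.2, 0.1],
    "ingegnere": [0.4, 0.8, 0.3],
}

g = gender_direction(toy["lui"], toy["lei"])
debiased = neutralize(toy["ingegnere"], g)

# After neutralizing, the vector has (numerically) no component along g.
print(abs(dot(debiased, g)) < 1e-9)
```

In a real setting the same projection would be applied to every gender-neutral word in the vocabulary, typically with a vectorized library rather than plain lists.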

Repository structure

.
├── data                             # Contains the word files used for the experiments
├── debiaswe                         # Contains the debiasing functions
│   ├── co_occurrence.py             # Functions to compute the co-occurrence matrix for GN-GloVe
│   ├── data.py                      # Functions to load data files
│   ├── debias_glove.py              # Implementation of GN-GloVe debiasing
│   ├── metrics.py                   # Functions to compute metrics for the experiments
│   └── we.py                        # Auxiliary functions to load and manage word embeddings
├── embeddings                       # Contains the word embeddings file for the hard-debiasing approach
├── scripts                          # Contains the scripts to convert the original Twitter word embeddings to a TSV file and filter them
├── gn-glove_we_visualization.ipynb  # Visualization of the word embeddings generated by GN-GloVe
├── hard-debias_italian_we.ipynb     # Visualization of the word embeddings generated by Hard-Debiasing
├── presentation.pdf                 # Slides about the project
├── LICENSE
└── README.md
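
As a rough picture of what a co-occurrence step for a GloVe-style model involves, the sketch below counts how often word pairs appear within a fixed window, weighting each pair by the inverse of the distance between the two words. It is an assumption-laden illustration and not the contents of `debiaswe/co_occurrence.py`: the function name `co_occurrence`, the `window` parameter, and the toy corpus are hypothetical.

```python
from collections import defaultdict


def co_occurrence(sentences, window=2):
    """Symmetric, distance-weighted co-occurrence counts.

    sentences: iterable of token lists.
    window: how many preceding tokens count as context.
    Returns a dict mapping (word, context_word) to a float weight.
    """
    counts = defaultdict(float)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            # Look back at up to `window` preceding tokens.
            for j in range(max(0, i - window), i):
                weight = 1.0 / (i - j)  # closer words contribute more
                counts[(word, tokens[j])] += weight
                counts[(tokens[j], word)] += weight
    return dict(counts)


# Toy corpus, invented for illustration.
corpus = [
    ["lei", "è", "ingegnere"],
    ["lui", "è", "ingegnere"],
]
counts = co_occurrence(corpus)
print(counts[("lei", "ingegnere")], counts[("lui", "ingegnere")])
```

A dictionary keyed by word pairs keeps the sketch short; for a full corpus a sparse matrix indexed by vocabulary ids would be the more usual choice.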

Results

The results of both approaches are presented below:

Hard-Debiasing:
GN-GloVe:

Versioning

We use Git for versioning.

Group members

Name	Surname	Email	Username
Davide	Angelani	`davide.angelani@studio.unibo.it`	qnozo
Eric	Rossetto	`eric.rossetto@studio.unibo.it`	Erhtric
Giuseppe	Murro	`giuseppe.murro@studio.unibo.it`	gmurro
Salvatore	Pisciotta	`salvatore.pisciotta@studio.unibo.it`	SalvoPisciotta
Xiaowei	Wen	`xiaowei.wen@studio.unibo.it`	WenXiaowei

License

This project is licensed under the MIT License. See the LICENSE file for details.
