privacy-glue

This repository documents PrivacyGLUE, an NLP benchmark consisting of legal-privacy-related tasks.

Dependencies 🔍

  1. This repository's code was tested with Python version 3.8.13. To sync dependencies, we recommend creating a virtual environment with the same Python version and installing the relevant packages with poetry:

    $ poetry install
    

    Alternatively, install dependencies in the virtual environment using pip:

    $ pip install -r requirements.txt
    
  2. This repository requires a working installation of Git LFS to access upstream task data. We used version 3.2.0 in our implementation.

  3. Optional: If you intend to develop this repository further, we recommend installing pre-commit to utilize local pre-commit hooks for various code-checks.
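
The prerequisites above can be checked before installing anything. The following is a minimal sketch and not part of this repository: the helper names `python_matches` and `git_lfs_version` are hypothetical, and the script assumes that `git lfs version` is available on the PATH when Git LFS is installed.

```python
import shutil
import subprocess
import sys

# The README reports testing with Python 3.8.13; only major.minor is compared.
TESTED_PYTHON = (3, 8)


def python_matches(version=None, tested=TESTED_PYTHON):
    """Return True if the major.minor part of `version` equals `tested`."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) == tuple(tested)


def git_lfs_version():
    """Return the output of `git lfs version`, or None if it is unavailable."""
    if shutil.which("git") is None:
        return None
    result = subprocess.run(
        ["git", "lfs", "version"], capture_output=True, text=True
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None


if __name__ == "__main__":
    print("Python matches tested version:", python_matches())
    print("Git LFS:", git_lfs_version() or "not found")
```

A mismatch in the Python version is not necessarily fatal; it only means the environment differs from the one the code was tested with.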

Initialization 🔥

  1. To prepare the necessary git submodules and data, simply execute:

    $ bash scripts/prepare.sh
    
  2. Optional: If you intend to further develop this repository, execute the following to initialize pre-commit hooks:

    $ pre-commit install
    

Tasks 🏃

| Task | Type | Study |
| --- | --- | --- |
| OPP-115 | Multi-label* sequence classification | Wilson et al. (2016)*** |
| PI-Extract | Joint multi-class** sequence tagging | Duc et al. (2021) |
| Policy-Detection | Binary sequence classification | Amos et al. (2021) |
| PolicyIE-A | Multi-class** sequence classification | Ahmad et al. (2021) |
| PolicyIE-B | Joint multi-class** sequence tagging | Ahmad et al. (2021) |
| PolicyQA | Reading comprehension | Ahmad et al. (2021) |
| PrivacyQA | Binary sequence classification | Ravichander et al. (2019) |

*Multi-label means that each classification task can have more than one gold-standard label.

**Multi-class means that each classification task has exactly one gold-standard label, chosen from multiple options.

***Data splits were not defined in Wilson et al. (2016) and were instead taken from Mousavi et al. (2020).
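
The difference between the multi-class and multi-label settings shows up in how gold labels are encoded. The sketch below is illustrative only: the label names and the functions `encode_multi_class` and `encode_multi_label` are hypothetical and do not reflect the label sets or code of any task in this benchmark.

```python
# Hypothetical label set, used only to illustrate the two encodings.
LABELS = ["label_a", "label_b", "label_c"]


def encode_multi_class(gold):
    """Multi-class: exactly one gold label, encoded as a single class index."""
    return LABELS.index(gold)


def encode_multi_label(golds):
    """Multi-label: any number of gold labels, encoded as a multi-hot vector."""
    return [1 if label in golds else 0 for label in LABELS]


if __name__ == "__main__":
    print(encode_multi_class("label_b"))
    print(encode_multi_label({"label_a", "label_c"}))
```

In the multi-class case the target is one index, whereas in the multi-label case each position of the vector is an independent yes/no decision.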

Test 🔬

  1. To run unit and integration tests, execute:

    $ pytest
    
  2. To run static type checks with mypy, execute:

    $ mypy