add licenses/copyrights classifier
kermitt2 committed Jan 28, 2024
1 parent 5e9d005 commit 86858eb
Showing 14 changed files with 1,324 additions and 12 deletions.
10 changes: 5 additions & 5 deletions Readme.md
@@ -37,7 +37,7 @@ Some contributions include:

A native Java integration of the library has been realized in [GROBID](https://github.com/kermitt2/grobid) via [JEP](https://github.com/ninia/jep).

-The latest DeLFT release has been tested successfully with python 3.8 and Tensorflow 2.9.3. As always, GPU(s) are required for decent training time. A GeForce GTX 1050 Ti (4GB) for instance is fine for running RNN models and BERT or RoBERTa base models. Using BERT large model is possible from a GeForce GTX 1080 Ti (11GB) with modest batch size. Using multiple GPUs (training and inference) is supported.
+The latest DeLFT release __0.3.4__ has been tested successfully with python 3.8 and Tensorflow 2.9.3. As always, GPU(s) are required for decent training time. For example, a GeForce GTX 1050 Ti (4GB) is working very well for running RNN models and BERT or RoBERTa base models. Using BERT large model is no problem with a GeForce GTX 1080 Ti (11GB), including training with modest batch size. Using multiple GPUs (training and inference) is supported.

## DeLFT Documentation

@@ -48,7 +48,7 @@ Visit the [DELFT documentation](https://delft.readthedocs.io) for detailed infor
PyPI packages are available for stable versions. Latest stable version is `0.3.4`:

```
-pip install delft==0.3.4
+python3 -m pip install delft==0.3.4
```
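
To confirm that the pinned release is the one actually installed, a minimal sketch using the standard library's `importlib.metadata` could look like this. The helper names `installed_version` and `matches_pin` are illustrative only and are not part of DeLFT.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(package: str, pinned: str) -> bool:
    """True only if `package` is installed at exactly the pinned version."""
    return installed_version(package) == pinned


if __name__ == "__main__":
    # Hypothetical check against the version pinned in the README.
    print("delft:", installed_version("delft"))
    print("pinned at 0.3.4:", matches_pin("delft", "0.3.4"))
```

Running the check with the same interpreter used for `python3 -m pip` avoids comparing against a different environment.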

## DeLFT Installation
@@ -70,13 +70,13 @@ source env/bin/activate
Install the dependencies:

```sh
-pip3 install -r requirements.txt
+python3 -m pip install -r requirements.txt
```

Finally install the project, preferably in editable state

```sh
-pip3 install -e .
+python3 -m pip install -e .
```

See the [DELFT documentation](https://delft.readthedocs.io) for usage.
@@ -98,7 +98,7 @@ If you want to this work, please refer to the present GitHub project, together w
title = {DeLFT},
howpublished = {\url{https://github.com/kermitt2/delft}},
publisher = {GitHub},
-year = {2018--2023},
+year = {2018--2024},
archivePrefix = {swh},
eprint = {1:dir:54eb292e1c0af764e27dd179596f64679e44d06e}
}
20 changes: 20 additions & 0 deletions data/models/textClassification/copyright_gru/config.json
@@ -0,0 +1,20 @@
{
    "model_name": "copyright_gru",
    "architecture": "gru",
    "embeddings_name": "glove-840B",
    "char_embedding_size": 25,
    "word_embedding_size": 300,
    "dropout": 0.5,
    "recurrent_dropout": 0.25,
    "maxlen": 300,
    "dense_size": 32,
    "use_char_feature": false,
    "list_classes": [
        "publisher",
        "authors",
        "undecided"
    ],
    "fold_number": 1,
    "batch_size": 256,
    "transformer_name": null
}
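
As a rough illustration of how `list_classes` might be used downstream, the sketch below maps a vector of per-class scores to a label. It assumes the classifier emits one score per entry of `list_classes`, in the same order; that ordering, the sample scores, and the helper `best_label` are assumptions for illustration and are not taken from the DeLFT code.

```python
import json

# Subset of the copyright_gru config shown above.
CONFIG_JSON = """
{
    "model_name": "copyright_gru",
    "list_classes": ["publisher", "authors", "undecided"]
}
"""


def best_label(scores, classes):
    """Return (label, score) for the highest-scoring class.

    Assumes scores[i] belongs to classes[i] (hypothetical ordering).
    """
    if len(scores) != len(classes):
        raise ValueError("expected one score per class")
    index = max(range(len(scores)), key=scores.__getitem__)
    return classes[index], scores[index]


config = json.loads(CONFIG_JSON)
# Made-up scores, for illustration only.
label, score = best_label([0.1, 0.7, 0.2], config["list_classes"])
print(label, score)
```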
Binary file not shown.
27 changes: 27 additions & 0 deletions data/models/textClassification/license_gru/config.json
@@ -0,0 +1,27 @@
{
    "model_name": "license_gru",
    "architecture": "gru",
    "embeddings_name": "glove-840B",
    "char_embedding_size": 25,
    "word_embedding_size": 300,
    "dropout": 0.5,
    "recurrent_dropout": 0.25,
    "maxlen": 300,
    "dense_size": 32,
    "use_char_feature": false,
    "list_classes": [
        "CC-0",
        "CC-BY",
        "CC-BY-NC",
        "CC-BY-NC-ND",
        "CC-BY-SA",
        "CC-BY-NC-SA",
        "CC-BY-ND",
        "copyright",
        "other",
        "undecided"
    ],
    "fold_number": 1,
    "batch_size": 256,
    "transformer_name": null
}
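
The Creative Commons labels in `list_classes` are compositional, so a consumer of the predictions may want to break them into their elements. The following is a minimal sketch under the assumption that `CC-0` denotes the CC0 public-domain dedication and that the other `CC-*` labels follow the usual BY/NC/ND/SA element codes. The `describe` helper and the element names are hypothetical.

```python
# Standard Creative Commons element codes mapped to illustrative names.
CC_ELEMENTS = {
    "BY": "attribution",
    "NC": "non_commercial",
    "ND": "no_derivatives",
    "SA": "share_alike",
}

# Labels copied from the license_gru config above.
LICENSE_CLASSES = [
    "CC-0", "CC-BY", "CC-BY-NC", "CC-BY-NC-ND", "CC-BY-SA",
    "CC-BY-NC-SA", "CC-BY-ND", "copyright", "other", "undecided",
]


def describe(label):
    """Return the set of CC conditions for a label, or None for non-CC labels."""
    if not label.startswith("CC-"):
        return None
    parts = label.split("-")[1:]
    if parts == ["0"]:
        # Assumed to mean CC0: no conditions attached.
        return set()
    return {CC_ELEMENTS[part] for part in parts}


for name in LICENSE_CLASSES:
    print(name, describe(name))
```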
Binary file not shown.

Large diffs are not rendered by default.