Updated/added model cards (#3435)

huggingface · Mar 25, 2020 · 010e046
1 parent ffa17fe
Showing 4 changed files with 127 additions and 34 deletions.
diff --git a/model_cards/twmkn9/albert-base-v2-squad2/README.md b/model_cards/twmkn9/albert-base-v2-squad2/README.md
@@ -1,22 +1,24 @@
-This model is ALBERT base v2 trained on SQuAD v2 as:
+This model is [ALBERT base v2](https://huggingface.co/albert-base-v2) trained on SQuAD v2 as:
 
 ```
-python run_squad.py 
---model_type albert 
---model_name_or_path albert-base-v2 
---do_train 
---do_eval 
---overwrite_cache 
---do_lower_case 
---version_2_with_negative 
---train_file $SQUAD_DIR/train-v2.0.json 
---predict_file $SQUAD_DIR/dev-v2.0.json 
---per_gpu_train_batch_size 8 
---num_train_epochs 3 
---learning_rate 3e-5 
---max_seq_length 384 
---doc_stride 128 
---output_dir ./tmp/albert_base_fine/
+export SQUAD_DIR=../../squad2
+python3 run_squad.py \
+    --model_type albert \
+    --model_name_or_path albert-base-v2 \
+    --do_train \
+    --do_eval \
+    --overwrite_cache \
+    --do_lower_case \
+    --version_2_with_negative \
+    --save_steps 100000 \
+    --train_file $SQUAD_DIR/train-v2.0.json \
+    --predict_file $SQUAD_DIR/dev-v2.0.json \
+    --per_gpu_train_batch_size 8 \
+    --num_train_epochs 3 \
+    --learning_rate 3e-5 \
+    --max_seq_length 384 \
+    --doc_stride 128 \
+    --output_dir ./tmp/albert_fine/
 ```
 
 Performance on a dev subset is close to the original paper:

diff --git a/model_cards/twmkn9/bert-base-uncased-squad2/README.md b/model_cards/twmkn9/bert-base-uncased-squad2/README.md
@@ -1,22 +1,24 @@
-This model is BERT base uncased trained on SQuAD v2 as:
+This model is [BERT base uncased](https://huggingface.co/bert-base-uncased) trained on SQuAD v2 as:
 
 ```
-python run_squad.py 
---model_type bert 
---model_name_or_path bert-base-uncased
---do_train 
---do_eval 
---overwrite_cache 
---do_lower_case 
---version_2_with_negative 
---train_file $SQUAD_DIR/train-v2.0.json 
---predict_file $SQUAD_DIR/dev-v2.0.json 
---per_gpu_train_batch_size 8 
---num_train_epochs 3 
---learning_rate 3e-5 
---max_seq_length 384 
---doc_stride 128 
---output_dir ./tmp/bert_base_fine/
+export SQUAD_DIR=../../squad2
+python3 run_squad.py \
+    --model_type bert \
+    --model_name_or_path bert-base-uncased \
+    --do_train \
+    --do_eval \
+    --overwrite_cache \
+    --do_lower_case \
+    --version_2_with_negative \
+    --save_steps 100000 \
+    --train_file $SQUAD_DIR/train-v2.0.json \
+    --predict_file $SQUAD_DIR/dev-v2.0.json \
+    --per_gpu_train_batch_size 8 \
+    --num_train_epochs 3 \
+    --learning_rate 3e-5 \
+    --max_seq_length 384 \
+    --doc_stride 128 \
+    --output_dir ./tmp/bert_fine_tuned/
 ```
 
 Performance on a dev subset is close to the original paper:

diff --git a/model_cards/twmkn9/distilbert-base-uncased-squad2/README.md b/model_cards/twmkn9/distilbert-base-uncased-squad2/README.md
@@ -0,0 +1,45 @@
+This model is [DistilBERT base uncased](https://huggingface.co/distilbert-base-uncased) trained on SQuAD v2 as:
+
+```
+export SQUAD_DIR=../../squad2
+python3 run_squad.py \
+    --model_type distilbert \
+    --model_name_or_path distilbert-base-uncased \
+    --do_train \
+    --do_eval \
+    --overwrite_cache \
+    --do_lower_case \
+    --version_2_with_negative \
+    --save_steps 100000 \
+    --train_file $SQUAD_DIR/train-v2.0.json \
+    --predict_file $SQUAD_DIR/dev-v2.0.json \
+    --per_gpu_train_batch_size 8 \
+    --num_train_epochs 3 \
+    --learning_rate 3e-5 \
+    --max_seq_length 384 \
+    --doc_stride 128 \
+    --output_dir ./tmp/distilbert_fine_tuned/
+```
+
+Performance on a dev subset is close to the original paper:
+
+```
+Results: 
+{
+    'exact': 64.88976637051661, 
+    'f1': 68.1776176526635, 
+    'total': 6078, 
+    'HasAns_exact': 69.7594501718213, 
+    'HasAns_f1': 76.62665295288285, 
+    'HasAns_total': 2910, 
+    'NoAns_exact': 60.416666666666664, 
+    'NoAns_f1': 60.416666666666664, 
+    'NoAns_total': 3168, 
+    'best_exact': 64.88976637051661, 
+    'best_exact_thresh': 0.0, 
+    'best_f1': 68.17761765266337, 
+    'best_f1_thresh': 0.0
+}
+```
+
+We are hopeful this might save you time, energy, and compute. Cheers!
diff --git a/model_cards/twmkn9/distilroberta-base-squad2/README.md b/model_cards/twmkn9/distilroberta-base-squad2/README.md
@@ -0,0 +1,44 @@
+This model is [DistilRoBERTa base](https://huggingface.co/distilroberta-base) trained on SQuAD v2 as:
+
+```
+export SQUAD_DIR=../../squad2
+python3 run_squad.py \
+    --model_type roberta \
+    --model_name_or_path distilroberta-base \
+    --do_train \
+    --do_eval \
+    --overwrite_cache \
+    --do_lower_case \
+    --version_2_with_negative \
+    --save_steps 100000 \
+    --train_file $SQUAD_DIR/train-v2.0.json \
+    --predict_file $SQUAD_DIR/dev-v2.0.json \
+    --per_gpu_train_batch_size 8 \
+    --num_train_epochs 3 \
+    --learning_rate 3e-5 \
+    --max_seq_length 384 \
+    --doc_stride 128 \
+    --output_dir ./tmp/distilroberta_fine_tuned/
+```
+
+Performance on a dev subset is close to the original paper:
+
+```
+Results: 
+{
+    'exact': 70.9279368213228, 
+    'f1': 74.60439802429168, 
+    'total': 6078, 
+    'HasAns_exact': 67.62886597938144, 
+    'HasAns_f1': 75.30774267754136, 
+    'HasAns_total': 2910, 
+    'NoAns_exact': 73.95833333333333, 
+    'NoAns_f1': 73.95833333333333, 'NoAns_total': 3168, 
+    'best_exact': 70.94438960184272, 
+    'best_exact_thresh': 0.0, 
+    'best_f1': 74.62085080481161, 
+    'best_f1_thresh': 0.0
+}
+```
+
+We are hopeful this might save you time, energy, and compute. Cheers!
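
Reviewer note: the `Results` blocks in the new cards can be sanity-checked, since the overall `exact` and `f1` values should equal the size-weighted average of the `HasAns_*` and `NoAns_*` values. The sketch below does this for the DistilRoBERTa numbers. The helper name `weighted_overall` is hypothetical and not part of `run_squad.py`; the numeric values are copied from the card above.

```python
def weighted_overall(has_ans_score, has_ans_total, no_ans_score, no_ans_total):
    """Combine two per-subset percentages into one, weighted by subset size."""
    total = has_ans_total + no_ans_total
    return (has_ans_score * has_ans_total + no_ans_score * no_ans_total) / total


# Values copied from the DistilRoBERTa results block in this diff.
results = {
    "exact": 70.9279368213228,
    "f1": 74.60439802429168,
    "total": 6078,
    "HasAns_exact": 67.62886597938144,
    "HasAns_f1": 75.30774267754136,
    "HasAns_total": 2910,
    "NoAns_exact": 73.95833333333333,
    "NoAns_f1": 73.95833333333333,
    "NoAns_total": 3168,
}

overall_exact = weighted_overall(
    results["HasAns_exact"], results["HasAns_total"],
    results["NoAns_exact"], results["NoAns_total"],
)
overall_f1 = weighted_overall(
    results["HasAns_f1"], results["HasAns_total"],
    results["NoAns_f1"], results["NoAns_total"],
)

# Both recomputed values should agree with the reported ones up to float rounding.
print(abs(overall_exact - results["exact"]) < 1e-6)
print(abs(overall_f1 - results["f1"]) < 1e-6)
```

The same check applies to the DistilBERT card by substituting its values.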