update documentation

lfoppiano · Feb 4, 2024
commit 81b2342 · 1 parent e5e4f01

Showing 3 changed files with 83 additions and 50 deletions.
diff --git a/README.md b/README.md
@@ -37,39 +37,9 @@ Spaces: https://lfoppiano-grobid-quantities.hf.space/
 
 ## Latest version
 
-The latest released version of grobid-quantities
-is [0.7.3](https://github.com/kermitt2/grobid-quantities/releases/tag/v0.7.3). The current development version is
-0.7.4-SNAPSHOT.
+The latest released version of grobid-quantities is [0.7.3](https://github.com/kermitt2/grobid-quantities/releases/tag/v0.7.3). The current development version is 0.7.4-SNAPSHOT.
+**Important**: before upgrading, check the [upgrade notes](https://grobid-quantities.readthedocs.io/gettingStarted.html#upgrade).
 
-### Update from 0.7.2 to 0.7.3
-
-#### Grobid models
-In version 0.7.3 we have updated the DeLFT models. The DL models must be updated by running `./gradlew copyModels`.
-
-#### JDK Update
-The version 0.7.3 enable the support for running with JDK > 11. We recommend to run it with JDK 17.
-Running grobid-quantities with gradle (`./gradlew clean run`) is already supported in the `build.gradle`.
-Running grobid-quantities via the JAR file requires an additional parameter to set the java.path: 
-- Linux: `-Djava.library.path=../grobid-home/lib/lin-64:../grobid-home/lib/lin-64/jep`
-- Mac (arm): `-Djava.library.path=.:/usr/lib/java:../grobid-home/lib/mac_arm-64:{MY_VIRTUAL_ENV}/jep/lib:{MY_VIRTUAL_ENV}/jep/lib/python3.9/site-packages/jep --add-opens java.base/java.lang=ALL-UNNAMED`
-- Mac (intel): `-Djava.library.path=.:/usr/lib/java:../grobid-home/lib/mac-64:{MY_VIRTUAL_ENV}/jep/lib:{MY_VIRTUAL_ENV}/jep/lib/python3.9/site-packages/jep --add-opens java.base/java.lang=ALL-UNNAMED`
-    With `MY_VIRTUAL_ENV` I use `/Users/lfoppiano/anaconda3/envs/jep`
-
-
-### Update from 0.7.1 to 0.7.2
-
-In version 0.7.2 we have updated the DeLFT models.   
-The DL models must be updated by running `./gradlew copyModels`.
-
-### Update from 0.7.0 to 0.7.1
-
-In version 0.7.1 a new version of DeLFT using Tensorflow 2.x is used.  
-The DL models must be updated by running `./gradlew copyModels`.
-
-### Update from 0.6.0 to 0.7.0
-
-In version 0.7.0 the models have been updated, therefore is required to run a `./gradlew copyModels` to have properly
-results especially for what concern the unit normalisation.
 
 ## Documentation
 

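The JDK notes that this commit moves out of the README give one `java.library.path` value per platform. As a minimal sketch of how a launcher script might pick the right options, the following maps a platform to the values quoted in those notes. The function name `jvm_options` and its parameters are hypothetical, and the sketch assumes Python reports Apple silicon as `arm64`. It is not part of grobid-quantities.

```python
# Values copied from the upgrade notes; {MY_VIRTUAL_ENV} is the placeholder used there.
LINUX_PATH = "../grobid-home/lib/lin-64:../grobid-home/lib/lin-64/jep"
MAC_ARM_PATH = (
    ".:/usr/lib/java:../grobid-home/lib/mac_arm-64:"
    "{MY_VIRTUAL_ENV}/jep/lib:{MY_VIRTUAL_ENV}/jep/lib/python3.9/site-packages/jep"
)
MAC_INTEL_PATH = (
    ".:/usr/lib/java:../grobid-home/lib/mac-64:"
    "{MY_VIRTUAL_ENV}/jep/lib:{MY_VIRTUAL_ENV}/jep/lib/python3.9/site-packages/jep"
)


def jvm_options(system, machine, virtual_env=None):
    """Return the extra JVM options for a platform (hypothetical helper).

    `system` and `machine` follow the conventions of platform.system() and
    platform.machine(); `virtual_env` is the environment that contains jep.
    """
    if system == "Linux":
        return ["-Djava.library.path=" + LINUX_PATH]
    if system == "Darwin":
        if not virtual_env:
            raise ValueError("a virtual environment path is required on macOS")
        # Assumption: Apple silicon is reported as "arm64"; anything else is treated as Intel.
        template = MAC_ARM_PATH if machine == "arm64" else MAC_INTEL_PATH
        return [
            "-Djava.library.path=" + template.replace("{MY_VIRTUAL_ENV}", virtual_env),
            "--add-opens",
            "java.base/java.lang=ALL-UNNAMED",
        ]
    raise ValueError("unsupported platform: " + system)


# Print the options for a Linux host; pass platform.system() and
# platform.machine() instead to detect the current machine.
print(" ".join(jvm_options("Linux", "x86_64")))
```

The returned list can be placed before `-jar` in a `java` command line. The JAR name and the remaining arguments are deliberately left out because the notes do not state them.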
diff --git a/doc/evaluation-scores.rst b/doc/evaluation-scores.rst
@@ -1,8 +1,34 @@
 .. topic:: Evaluation scores
 
-*****************
-Evaluation scores
-*****************
+**********
+Evaluation
+**********
+
+---------------------
+End-to-end evaluation
+---------------------
+
+The end-to-end evaluation was performed with the `MeasEval dataset <https://github.com/harperco/MeasEval>`_ (SemEval-2021 Task 8).
+The scores in the following table are micro averages.
+MeasEval was annotated to allow approximated entities, which grobid-quantities does not support.
+
++---------------------------+----------------+-----------+--------+---------+---------+
+| Type (Ref)                | Matching method| Precision | Recall | F1-score| Support |
++===========================+================+===========+========+=========+=========+
+| Quantities (QUANT)        | strict         | 53.05     | 54.74  | 53.88   | 1165    |
++---------------------------+----------------+-----------+--------+---------+---------+
+| Quantities (QUANT)        | soft           | 64.64     | 66.70  | 65.65   | 1165    |
++---------------------------+----------------+-----------+--------+---------+---------+
+| Quantified substance (ME) | strict         | 14.03     | 9.78   | 11.53   | 613     |
++---------------------------+----------------+-----------+--------+---------+---------+
+| Quantified substance (ME) | soft           | 21.53     | 15.02  | 17.69   | 613     |
++---------------------------+----------------+-----------+--------+---------+---------+
+
+Note: support for the ME (Measured Entity) type is still experimental in grobid-quantities.
+
+----------------------------------------------------
+Machine Learning Named Entity Recognition Evaluation
+----------------------------------------------------
 
 The scores (P: Precision, R: Recall, F1: F1-score) for all the models are computed either with 10-fold cross-validation or on a holdout dataset.
 The holdout dataset of Grobid-quantities is composed of the following examples:
@@ -18,14 +44,14 @@ The models are organised as follow:
  - BERT_CRF is a BERT-based model obtained by fine-tuning a SciBERT encoder. Like the others, its activation function consists of a CRF layer.
 
 
-=======================
+
 Results from 27/10/2022
-=======================
+~~~~~~~~~~~~~~~~~~~~~~~
 
 The evaluation was performed on the holdout dataset from the grobid-quantities dataset.
 Average values are computed as Micro average. 
 
-----------
+
 Quantities
 ----------
 
@@ -79,7 +105,6 @@ Quantities
 +------------------+--------------+--------+---------+-------------------------+--------+---------+
 
 
------
 Units
 -----
 
@@ -113,7 +138,7 @@ Units were evaluated using UNISCOR dataset. For more information check the secti
 | All (micro avg)  | 70.19        | 60.88  | 65.20   | 73.03                   | 65.31  | 68.94   |
 +------------------+--------------+--------+---------+-------------------------+--------+---------+
 
-------
+
 Values
 ------
 
@@ -150,9 +175,9 @@ Values
 | All (micro avg) | 98.90      | 99.17  | 99.03    | 98.86                   | 99.25   | 99.05    |
 +-----------------+------------+--------+----------+-------------------------+---------+----------+
 
-================
+
 Previous results 
-================
+~~~~~~~~~~~~~~~~
 
 The scores of this evaluation were obtained using n-fold cross-validation. The metrics are the micro average of n=10 folds.
 
@@ -163,7 +188,7 @@ Evaluation notes:
  - The `CRF` model was evaluated on the 30/04/2020.
  - The `BidLSTM_CRF_FEATURES` model was evaluated on the 28/11/2021
 
-----------
+
 Quantities
 ----------
 
@@ -191,7 +216,6 @@ Quantities
 | All (micro avg)     | 88.96      | 85.40  | 87.14    | 87.23                | 89.00  | 88.10    |
 +---------------------+------------+--------+----------+----------------------+--------+----------+
 
------
 Units
 -----  
 
@@ -212,7 +236,6 @@ CRF was updated on the 10/02/2021
 +------------------+------------+--------+----------+-----------+-------+-----------+
 
 
-------
 Values
 ------
 

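The F1 column of the end-to-end table added in `doc/evaluation-scores.rst` can be cross-checked against its precision and recall columns, since F1 is their harmonic mean. The snippet below is a small sanity check with the rows copied from that table. The function name `f1_score` is illustrative and is not taken from the grobid-quantities code base.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) as listed in the end-to-end evaluation table
rows = {
    "QUANT strict": (53.05, 54.74, 53.88),
    "QUANT soft": (64.64, 66.70, 65.65),
    "ME strict": (14.03, 9.78, 11.53),
    "ME soft": (21.53, 15.02, 17.69),
}

for name, (precision, recall, reported) in rows.items():
    computed = f1_score(precision, recall)
    print(f"{name}: computed {computed:.2f}, reported {reported:.2f}")
```

A recomputed value may differ from the reported one in the last digit, because the precision and recall in the table are themselves rounded to two decimals.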
diff --git a/doc/gettingStarted.rst b/doc/gettingStarted.rst
@@ -7,25 +7,65 @@
 .. _latest discussion: https://github.com/kermitt2/grobid/issues/1014
 
 
-
+###############
 Getting started
-===============
+###############
 
-Before you start
-~~~~~~~~~~~~~~~~
 .. warning:: Grobid and grobid-quantities are `not compatible with Windows`_ and limited on Apple M1. While Windows users can easily use Grobid and grobid-quantities through docker containers, the support for grobid on ARM is under development, see the `latest discussion`_. 
 
 .. warning:: Since grobid-quantities 0.7.3 (using grobid 0.7.3), we extended the support to JDK after version 11. This requires specifying the `java.library.path` explicitly. Obviously, *all these issues are solved by using Docker containers*.
 
 
+Upgrade
+~~~~~~~
+
+0.7.2 to 0.7.3
+==============
+
+Grobid models
+-------------
+
+In version 0.7.3, we have updated the DeLFT models. The DL models must be updated by running ``./gradlew copyModels``.
+
+JDK Update
+-----------
+
+Version 0.7.3 adds support for running with JDK versions above 11. We recommend JDK 17.
+Running grobid-quantities with gradle (``./gradlew clean run``) is already supported in the ``build.gradle``.
+Running grobid-quantities via the JAR file requires an additional parameter to set ``java.library.path``:
+
+- Linux: ``-Djava.library.path=../grobid-home/lib/lin-64:../grobid-home/lib/lin-64/jep``
+- Mac (arm): ``-Djava.library.path=.:/usr/lib/java:../grobid-home/lib/mac_arm-64:{MY_VIRTUAL_ENV}/jep/lib:{MY_VIRTUAL_ENV}/jep/lib/python3.9/site-packages/jep --add-opens java.base/java.lang=ALL-UNNAMED``
+- Mac (intel): ``-Djava.library.path=.:/usr/lib/java:../grobid-home/lib/mac-64:{MY_VIRTUAL_ENV}/jep/lib:{MY_VIRTUAL_ENV}/jep/lib/python3.9/site-packages/jep --add-opens java.base/java.lang=ALL-UNNAMED``
+    Replace ``{MY_VIRTUAL_ENV}`` with the path to your virtual environment, for example ``/Users/lfoppiano/anaconda3/envs/jep``.
+
+0.7.1 to 0.7.2
+==============
+
+In version 0.7.2, we have updated the DeLFT models.
+The DL models must be updated by running ``./gradlew copyModels``.
+
+0.7.0 to 0.7.1
+==============
+
+Version 0.7.1 uses a new version of DeLFT based on TensorFlow 2.x.
+The DL models must be updated by running ``./gradlew copyModels``.
+
+0.6.0 to 0.7.0
+==============
+
+In version 0.7.0, the models have been updated; running ``./gradlew copyModels`` is therefore required to obtain correct
+results, especially for unit normalization.
+
+
 Install and build
 ~~~~~~~~~~~~~~~~~
 
 Docker containers
 -----------------
 The simplest way to run grobid-quantities is via docker containers.
 
-The Grobid-quantities repository provides a configuration file for docker: `resources/config/config-docker.yml`, which should work out of the box, although we recommend to **check the configuration** (e.g., to enable modules using deep learning).
+The Grobid-quantities repository provides a configuration file for docker: ``resources/config/config-docker.yml``, which should work out of the box, although we recommend **checking the configuration** (e.g., to enable modules using deep learning).
 
 To run the container use:
 ::