ES TN Fix for Issue #166 (#224)
* ES TN Fix for Issue #166

Signed-off-by: Simon Zuberek <szuberek@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Updates the cache

Signed-off-by: Simon Zuberek <szuberek@nvidia.com>

* Unioning the lower and upper Roman graphs into one

Signed-off-by: Simon Zuberek <szuberek@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Removes all upper-case Roman numerals from data files

Signed-off-by: Simon Zuberek <szuberek@nvidia.com>

---------

Signed-off-by: Simon Zuberek <szuberek@nvidia.com>
Co-authored-by: Simon Zuberek <szuberek@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <alexcui1994@gmail.com>
3 people authored and BuyuanCui committed Oct 8, 2024
1 parent 889d957 commit 80da2b4
Showing 4 changed files with 11 additions and 6 deletions.
5 changes: 5 additions & 0 deletions Jenkinsfile
@@ -13,8 +13,13 @@ pipeline {

AR_TN_CACHE='/home/jenkinsci/TestData/text_norm/ci/grammars/04-24-24-0'
DE_TN_CACHE='/home/jenkinsci/TestData/text_norm/ci/grammars/06-03-24-0'
<<<<<<< HEAD
EN_TN_CACHE='/home/jenkinsci/TestData/text_norm/ci/grammars/09-04-24-0'
ES_TN_CACHE='/home/jenkinsci/TestData/text_norm/ci/grammars/09-25-24-0'
=======
EN_TN_CACHE='/home/jenkinsci/TestData/text_norm/ci/grammars/08-22-24-0'
ES_TN_CACHE='/home/jenkinsci/TestData/text_norm/ci/grammars/08-30-24-0'
>>>>>>> 208d055 (ES TN Fix for Issue #166 (#224))
ES_EN_TN_CACHE='/home/jenkinsci/TestData/text_norm/ci/grammars/08-30-24-0'
FR_TN_CACHE='/home/jenkinsci/TestData/text_norm/ci/grammars/06-04-24-0'
HU_TN_CACHE='/home/jenkinsci/TestData/text_norm/ci/grammars/07-16-24-0'
@@ -7,4 +7,8 @@ x
l
c
d
<<<<<<< HEAD
m
=======
m
>>>>>>> 208d055 (ES TN Fix for Issue #166 (#224))
6 changes: 1 addition & 5 deletions nemo_text_processing/text_normalization/es/graph_utils.py
@@ -44,11 +44,7 @@
ES_PLUS = pynini.union("más", "Más", "MÁS").optimize()


<<<<<<< HEAD
def strip_accent(fst: "pynini.FstLike") -> "pynini.FstLike":
=======
def strip_accent(fst: 'pynini.FstLike') -> 'pynini.FstLike':
>>>>>>> a8ca196 (es and es_en changes for unified models (#143))
"""
Converts all accented vowels to non-accented equivalents
@@ -216,4 +212,4 @@ def _load_roman(file: str, upper_casing: bool):
)
).optimize()

return graph @ fst
return graph @ fst
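
The `strip_accent` docstring above says the function converts accented vowels to their non-accented equivalents. The real function operates on pynini FSTs. The snippet below is only a plain-string analogue of that idea, written for illustration. The name `strip_accent_text` and the exact set of vowels covered are assumptions, not taken from the repository.

```python
# Hypothetical plain-Python analogue of the idea described in the
# strip_accent docstring. This is NOT the pynini implementation.
# Assumption: only the five Spanish acute-accented vowels (both cases)
# are mapped; other letters such as "ñ" or "ü" are left untouched.
_ACCENTED = "áéíóúÁÉÍÓÚ"
_PLAIN = "aeiouAEIOU"
_TABLE = str.maketrans(_ACCENTED, _PLAIN)


def strip_accent_text(text: str) -> str:
    """Return `text` with accented vowels replaced by plain vowels."""
    return text.translate(_TABLE)


if __name__ == "__main__":
    # The three spellings listed in ES_PLUS all lose their accent.
    for word in ("más", "Más", "MÁS"):
        print(word, "->", strip_accent_text(word))
```

A character-level translation table keeps the sketch short. The FST version in the repository composes a rewrite graph with the input FST instead, as the `return graph @ fst` line suggests.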
@@ -117,4 +117,4 @@ todo mi reconocimiento~todo mi reconocimiento
V~quinto
El texto de Li Qin en este libro ahora está disponible en forma de libro electrónico.~El texto de Li Qin en este libro ahora está disponible en forma de libro electrónico.
Xi Jinping es el actual presidente de China.~Xi Jinping es el actual presidente de China.
Matías fue el XI apóstol.~Matías fue el undécimo apóstol.
Matías fue el XI apóstol.~Matías fue el undécimo apóstol.
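
The test lines above pair a written input with its expected normalized form, separated by a tilde. A minimal sketch of reading that format follows, assuming one case per line and a split at the first `~`. The helper name `parse_case` is hypothetical and is not part of the repository's test harness.

```python
# Minimal sketch: split "written~normalized" test lines.
# Assumption: the first "~" separates the input from the expected output.
# parse_case is a hypothetical helper, not the repository's test loader.
from typing import List, Tuple


def parse_case(line: str) -> Tuple[str, str]:
    """Return (written, normalized) for one test-case line."""
    written, normalized = line.rstrip("\n").split("~", 1)
    return written, normalized


cases: List[str] = [
    "V~quinto",
    "Xi Jinping es el actual presidente de China.~Xi Jinping es el actual presidente de China.",
    "Matías fue el XI apóstol.~Matías fue el undécimo apóstol.",
]

parsed = [parse_case(c) for c in cases]

# Cases whose expected output equals the input must pass through unchanged.
unchanged = [written for written, normalized in parsed if written == normalized]

if __name__ == "__main__":
    for written, normalized in parsed:
        print(repr(written), "->", repr(normalized))
```

In these cases the capitalized name `Xi` is expected to pass through untouched, while the upper-case numeral `XI` is expected to become `undécimo`.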
