Fix memory leak add throughput tests #20

Merged
merged 12 commits into from
Mar 23, 2023

galv (Collaborator) commented Mar 23, 2023

No description provided.

galv added 11 commits March 23, 2023 09:10
I may remove them later.
Previously, it would always return an empty list.

Add support for blank penalty and length penalty, as wenet has them.
Fix GPU memory leak. My pybind11 DLPack integration was incorrect.
Previously, 5 GiB were being created for every test because the
librispeech test and dev sets were being loaded every time.
We don't check that RTFx is the same right now. It might be too
flakey.

Recorded throughput results here: #19
openfst_binaries = [f"fst{operation}" for operation in "arcsort closure compile concat connect convert determinize disambiguate encode epsnormalize equal equivalent invert isomorphic map minimize project prune push randgen relabel replace reverse reweight synchronize topsort union".split(" ")]
fst_binaries = "arpa2fst arpa-to-const-arpa fstdeterminizestar fstrmsymbols fstisstochastic fstminimizeencoded fstmakecontextfst fstmakecontextsyms fstaddsubsequentialloop fstaddselfloops fstrmepslocal fstcomposecontext fsttablecompose fstrand fstdeterminizelog fstphicompose fstcopy fstpushspecial fsts-to-transcripts fsts-project fsts-union fsts-concat transcripts-to-fsts".split(" ")
fst_binaries.extend(openfst_binaries)
# ERROR to fix: "fstcompose" is used in mkgraph_ctc.sh...
fst_binaries = ["arpa2fst", "fsttablecompose", "fstdeterminizestar", "fstminimizeencoded", "fstarcsort", "fstcompile", "fstaddselfloops", "transcripts-to-fsts", "fstconvert"]
fst_binaries = ["arpa2fst", "fsttablecompose", "fstdeterminizestar", "fstminimizeencoded", "fstarcsort", "fstcompile", "fstaddselfloops", "transcripts-to-fsts", "fstconvert", "fstisstochastic", "fstcompose", "fstrmepslocal"]
Collaborator Author

This is not strictly speaking necessary, but it is helpful. I'm keeping it for now.

sort $dir/units.txt > $dir/units_sorted.txt
cmp $dir/units_sorted.txt $dir/units_from_lexicon_sorted.txt
if [ $? ]; then
echo "ERROR: Difference in units.txt and units derived from lexicon!"
Collaborator Author

This seems to be a false positive when it triggers. It requires more inspection.
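
One possible cause, offered as a guess rather than a confirmed diagnosis: `[ $? ]` tests whether the expanded string is non-empty, and `$?` always expands to a non-empty number, so the branch is taken whether or not `cmp` found a difference. A minimal sketch of the two behaviours, using hypothetical helper names (`check_buggy`, `check_fixed`) and throwaway files that are not part of this PR:

```shell
# Hypothetical helpers for illustration only; neither exists in the PR.

# Mirrors the pattern in the diff: the test is on the *string* "$?",
# which is never empty, so this always reports a difference.
check_buggy() {
  cmp -s "$1" "$2"
  if [ $? ]; then
    echo "differ"
  else
    echo "same"
  fi
}

# Branches on the exit status of cmp itself (0 = identical, 1 = different).
check_fixed() {
  if ! cmp -s "$1" "$2"; then
    echo "differ"
  else
    echo "same"
  fi
}

# Demo with two identical files.
tmp=$(mktemp -d)
printf 'a\nb\n' > "$tmp/units_sorted.txt"
printf 'a\nb\n' > "$tmp/units_from_lexicon_sorted.txt"
check_buggy "$tmp/units_sorted.txt" "$tmp/units_from_lexicon_sorted.txt"  # prints "differ"
check_fixed "$tmp/units_sorted.txt" "$tmp/units_from_lexicon_sorted.txt"  # prints "same"
rm -rf "$tmp"
```

If this is indeed the cause, testing the exit status of `cmp` directly, or writing `[ $? -ne 0 ]`, would report the error only when the two files actually differ.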

@@ -383,6 +951,14 @@ def trace_back_stats(r, h, d):
j = j
return insertions, substitutions, deletions


Collaborator Author

Can delete this

@galv galv merged commit c05a637 into main Mar 23, 2023