fstmakecontextfst produces C.fst with duplicated arcs on disambig symbols #3810

Closed
taomanwai opened this issue Jan 5, 2020 · 5 comments · Fixed by #3811

@taomanwai
Contributor

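In this toy example (Chinese monophones, where each phone corresponds to exactly one Chinese character), fstmakecontextfst produces a C.fst with duplicated arcs on the disambiguation symbols.

Why does this happen, and is it a bug?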

@taomanwai taomanwai added the bug label Jan 5, 2020
@danpovey
Contributor

danpovey commented Jan 5, 2020 via email

@taomanwai
Contributor Author

Sorry, I left out the command line:

fstmakecontextfst --context-size=1 --central-position=0 --read-disambig-syms=lang/phones/disambig.int --write-disambig-syms=disambig_ilabels_1_0.int phones.txt 11 ilabels_1_0 > C.fst

PS:
phones.txt:
<eps> 0
SIL 1
SPN 2
地 3
鐵 4
去 5
#0 6
#1 7
#nonterm:entity 8
#nonterm_begin 9
#nonterm_end 10

lang/phones/disambig.int:
6
7
8
9
10

@danpovey
Contributor

danpovey commented Jan 5, 2020

The following lines:

    for (size_t i = 0; i < disambig_in.size(); i++) {
      int32 sym = disambig_in[i];
      loop_fst.AddArc(0, StdArc(sym, sym, TropicalWeight::One(), 0));
    }

need to be removed from fstmakecontextfst.cc, since those symbols are already added from phones.txt. Do you have time to make a PR?
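To illustrate why that loop leads to duplicated arcs, here is a minimal, self-contained sketch. It does not use OpenFst or Kaldi: `Arc`, `BuildLoopArcs` and `CountDuplicateArcs` are hypothetical names made up for this example, and it assumes (based on the comment above) that the loop FST already receives one self-loop per non-epsilon entry of phones.txt before the disambiguation symbols are added a second time.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <tuple>
#include <vector>

// Simplified stand-in for an arc leaving state 0 of a single-state loop FST.
// (Hypothetical type for illustration; the real code uses fst::StdArc.)
struct Arc {
  int ilabel;
  int olabel;
  int nextstate;
};

// Adds one self-loop per non-epsilon symbol id from the phone table, then
// (optionally) one more self-loop per disambiguation symbol, mirroring the
// loop quoted above.
std::vector<Arc> BuildLoopArcs(const std::vector<int> &phone_table_ids,
                               const std::vector<int> &disambig_in,
                               bool add_disambig_again) {
  std::vector<Arc> arcs;
  for (int sym : phone_table_ids) {
    if (sym == 0) continue;  // id 0 is epsilon; no loop arc for it
    arcs.push_back({sym, sym, 0});
  }
  if (add_disambig_again) {
    for (std::size_t i = 0; i < disambig_in.size(); i++) {
      int sym = disambig_in[i];
      assert(sym > 0);  // disambiguation symbols are never epsilon
      arcs.push_back({sym, sym, 0});
    }
  }
  return arcs;
}

// Counts arcs that repeat an earlier (ilabel, olabel, nextstate) triple.
std::size_t CountDuplicateArcs(const std::vector<Arc> &arcs) {
  std::map<std::tuple<int, int, int>, std::size_t> seen;
  std::size_t duplicates = 0;
  for (const Arc &a : arcs) {
    if (++seen[std::make_tuple(a.ilabel, a.olabel, a.nextstate)] > 1)
      ++duplicates;
  }
  return duplicates;
}
```

With the ids from this issue (phones.txt ids 0 to 10, disambig.int ids 6 to 10), calling `BuildLoopArcs` with `add_disambig_again = true` yields 15 arcs, 5 of them duplicates; with `false` it yields 10 arcs and no duplicates, which is the effect of removing the loop.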

@taomanwai
Contributor Author

Sure, one moment, forking...

@taomanwai
Contributor Author

PR created
