Several problems with foma2js.perl / foma2js.py #155

Open
dhdaines opened this issue May 12, 2024 · 1 comment

dhdaines commented May 12, 2024

The JavaScript generated by these scripts, while it works, is not really correct (this is 90% of all JavaScript code in the world, so don't feel bad). It seems that the intention here is to create separate Arrays for transitions, alphabet, and finals, or perhaps put them all in the same Array?

var myNet = new Object;
myNet.t = Array;
myNet.f = Array;
myNet.s = Array;

Regardless, this doesn't do either of those things, because you didn't add the magical new operator. Without new, myNet.t, myNet.f and myNet.s are all just references to the global builtin Array constructor, so anything stored under them ends up as a property on Array itself. This is likely to cause random problems for any other JavaScript code that is loaded with the FST. Also, it makes it impossible to serialize myNet to JSON, since JSON.stringify skips function values. You shouldn't be using an Array for these in the first place, because JSON.stringify only serializes an array's integer-indexed elements and ignores string keys, even if JavaScript, in its infinite wisdom, lets you set them. Instead, I suggest doing this (PR coming soon):

var myNet = new Object;
myNet.t = new Object;
myNet.f = new Object;
myNet.s = new Object;

(yes, you could make them all the same Object since the key names are unique, but I don't see a good reason to do this)
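
To make the difference concrete, here is a minimal sketch of the three cases. The key "0|1" and the value 2 are made up for illustration and are not the real format the scripts emit.

```javascript
// Hypothetical key/value ("0|1" -> 2), for illustration only.

// 1. What the generated code does today: no `new`, so these are the
//    Array constructor itself, not fresh containers.
const broken = {};
broken.t = Array;
broken.f = Array;
const aliased = broken.t === broken.f;      // true: same function object
broken.t["0|1"] = 2;                        // sets a property on global Array
const leaked = Array["0|1"];                // 2
delete Array["0|1"];                        // clean up the global again
const brokenJson = JSON.stringify(broken);  // "{}": function values are skipped

// 2. A real array with string keys: the keys are stored, but
//    JSON.stringify only looks at indexed elements.
const arr = [];
arr["0|1"] = 2;
const arrJson = JSON.stringify(arr);        // "[]"

// 3. Plain objects: string keys survive serialization.
const fixed = { t: {}, f: {}, s: {} };
fixed.t["0|1"] = 2;
const fixedJson = JSON.stringify(fixed);    // '{"t":{"0|1":2},"f":{},"s":{}}'

console.log(aliased, leaked, brokenJson, arrJson, fixedJson);
```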

Also, foma2js.py misses some symbols in the alphabet - I'll just fix this in the PR to come.

Note that foma_apply_down.js actually only needs to know the input symbols in the alphabet, so if you want to save some space, you can omit the output symbols from s.

Note also that maxlen is wrong when there are surrogate pairs. I've fixed this in the pyfoma implementation :) (and in #156 too)
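
A minimal sketch of how this kind of mismatch can arise, assuming maxlen is meant to be measured in the same units JavaScript uses for string indexing (UTF-16 code units). The helper maxLenInCodeUnits is hypothetical and not part of either script.

```javascript
// A character outside the Basic Multilingual Plane is one code point
// but two UTF-16 code units (a surrogate pair) in a JavaScript string.
const symbol = "\u{1F600}";
const codeUnits = symbol.length;        // 2
const codePoints = [...symbol].length;  // 1 (string iteration goes by code point)

// Hypothetical helper: longest symbol length in UTF-16 code units,
// i.e. the unit that slice() and indexing operate on in JavaScript.
function maxLenInCodeUnits(symbols) {
  return symbols.reduce((longest, s) => Math.max(longest, s.length), 0);
}

console.log(codeUnits, codePoints, maxLenInCodeUnits(["a", symbol]));
```

A length counted in code points (which is what Python's len gives for a str) would come out as 1 for this symbol, one short of what JavaScript sees.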

@dhdaines

(note, actually, the optimal solution is not to output JavaScript at all, but just to output JSON that you assign to a JavaScript object, like the pyfoma implementation does)
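
Roughly what that would look like on the JavaScript side, as a sketch. The keys t, f, s and maxlen follow the names used in this issue, but the contents of jsonText are invented and do not reflect the real output format.

```javascript
// Hypothetical JSON text standing in for what a generator script might emit.
const jsonText = '{"t": {"0|a": 1}, "f": {"1": 1}, "s": {"a": 3}, "maxlen": 1}';

// No generated assignment statements: the data is simply parsed into an object.
const myNet = JSON.parse(jsonText);

// Because everything is plain data, it also round-trips cleanly.
const roundTripped = JSON.parse(JSON.stringify(myNet));

console.log(myNet.s.a, roundTripped.t["0|a"]);
```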
