optional replacement of input of partially identical sequences doesn't work #87

Open
mcswell opened this issue Aug 2, 2019 · 1 comment

mcswell commented Aug 2, 2019

Optional rules where the left- and right-hand sides are partially identical sequences don't work, although the corresponding obligatory rule does. Example:
foma[0]: def l {ange};
redefined l: 328 bytes. 5 states, 4 arcs, 1 path.
foma[0]: regex l .o. [{ng} -> {ny}];
484 bytes. 5 states, 4 arcs, 1 path.
foma[1]: lower
anye
foma[1]: regex l .o. [{ng} (->) {ny}];
484 bytes. 5 states, 4 arcs, 1 path.
foma[2]: lower
ange
The first rule, with obligatory replacement, correctly returns 'anye'. The second rule, with optional replacement, should give both 'ange' (unchanged) and 'anye' (changed), but it gives only the unchanged form.
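
To make the expected behaviour concrete, here is a minimal reference model in Python. It is not foma's algorithm: it is a sketch that assumes occurrences are matched left to right without overlapping, and the helper name `optional_replace` is made up for illustration. It enumerates what an optional rule ought to produce, so the foma output can be compared against it.

```python
from itertools import product

def optional_replace(word, old, new):
    """Return every string obtained by replacing any subset of the
    non-overlapping, left-to-right occurrences of `old` with `new`.
    Simplified model only; `old` must be non-empty."""
    # Splitting on `old` leaves one gap per occurrence.
    pieces = word.split(old)
    results = set()
    # Decide independently for each occurrence: keep `old` or use `new`.
    for choices in product((old, new), repeat=len(pieces) - 1):
        out = pieces[0]
        for choice, piece in zip(choices, pieces[1:]):
            out += choice + piece
        results.add(out)
    return results

print(sorted(optional_replace("ange", "ng", "ny")))  # ['ange', 'anye']
```

Under this model the optional rule applied to 'ange' yields two forms, whereas the session above shows only one.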

When I compile the rule on its own with regex [{ng} (->) {ny}]; , I get a rather strange network with 2 states but only 4 arcs. The odd thing (and, I suppose, the reason it doesn't work) is that there is an arc from state 1 back to state 0, but no arc leading into state 1.

The problem seems to happen only when the input and output are sequences, and share a first character (or maybe a sequence of initial characters). For example, the following works correctly:
{ng} (->) {xg}
but the following (as above) does not:
{ng} (->) {ny}
BTW, the reason for wanting to write the rule in this semi-redundant way is that in Indonesian, the digraphs 'ng' and 'ny' represent single phonemes.

rcastromamani commented May 3, 2021

I recently found myself in a similar situation when implementing the following optional replacement rule:

define THReplacement [ {ts} (->) {th} || _ [a|e|i|o|á|é|í|ó] ]; ! thamiri -> tsamiri

The only workaround I could use was the following:

define THReplacement [ {ts} -> {th}, {ts} -> {ts} || _ [a|e|i|o|á|é|í|ó] ];
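
For anyone who wants to check what a workaround like this should produce, here is a rough Python model of the contextual optional rule. It is only a sketch of the expected output set, not of how foma compiles the rule; the function name `optional_ts_to_th` is hypothetical, and the vowel list is copied from the rule's right context.

```python
import re
from itertools import product

VOWELS = "aeioáéíó"  # right context taken from the rule above

def optional_ts_to_th(word):
    """Return all forms where each 'ts' before a listed vowel is
    either kept or rewritten as 'th'."""
    # Start positions of every "ts" immediately followed by a context vowel.
    matches = [m.start() for m in re.finditer(rf"ts(?=[{VOWELS}])", word)]
    results = set()
    for flags in product((False, True), repeat=len(matches)):
        chars = list(word)
        for pos, replace in zip(matches, flags):
            if replace:
                # Same length as "ts", so later positions stay valid.
                chars[pos:pos + 2] = "th"
        results.add("".join(chars))
    return results

print(sorted(optional_ts_to_th("tsamiri")))  # ['thamiri', 'tsamiri']
```

A 'ts' that is not followed by one of the listed vowels is left alone, so a word such as 'tsu' comes back unchanged.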
