Crashes in parsetab due to ply race condition #206

Open
jfly opened this issue Sep 21, 2018 · 2 comments · May be fixed by #207

jfly commented Sep 21, 2018

When multiple Python processes simultaneously run `from flanker.addresslib import address`, some of them can crash in ply code. I can fairly reliably reproduce this crash with the following Dockerfile:

$ docker run $(docker build -q https://raw.githubusercontent.com/jfly/jfly.github.io/master/misc/ply-race/Dockerfile-race-demo)
Symbol 'mailbox_or_url_list' is unreachable
Symbol 'delim' is unreachable
Symbol 'mailbox_or_url' is unreachable
Symbol 'url' is unreachable
Symbol 'mailbox_or_url_list' is unreachable
Symbol 'delim' is unreachable
Symbol 'mailbox_or_url' is unreachable
Symbol 'url' is unreachable
Symbol 'mailbox_or_url_list' is unreachable
Symbol 'delim' is unreachable
Symbol 'mailbox_or_url' is unreachable
Symbol 'url' is unreachable
Symbol 'mailbox_or_url_list' is unreachable
Symbol 'delim' is unreachable
Symbol 'mailbox_or_url' is unreachable
Symbol 'url' is unreachable
Symbol 'mailbox_or_url_list' is unreachable
Symbol 'delim' is unreachable
Symbol 'mailbox_or_url' is unreachable
Symbol 'url' is unreachable
Symbol 'mailbox_or_url_list' is unreachable
Symbol 'delim' is unreachable
Symbol 'mailbox_or_url' is unreachable
Symbol 'mailbox' is unreachable
Symbol 'url' is unreachable
Symbol 'angle_addr' is unreachable
Symbol 'name_addr' is unreachable
Symbol 'phrase' is unreachable
Symbol 'mailbox_or_url_list' is unreachable
Process Process-5:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "importing.py", line 5, in someFunc
    from flanker.addresslib import address
  File "/usr/local/lib/python3.7/site-packages/flanker/addresslib/address.py", line 49, in <module>
    from flanker.addresslib._parser.parser import (Mailbox, Url, mailbox_parser,
  File "/usr/local/lib/python3.7/site-packages/flanker/addresslib/_parser/parser.py", line 161, in <module>
    tabmodule='mailbox_parsetab')
  File "/usr/local/lib/python3.7/site-packages/ply/yacc.py", line 3293, in yacc
    read_signature = lr.read_table(tabmodule)
  File "/usr/local/lib/python3.7/site-packages/ply/yacc.py", line 1987, in read_table
    if parsetab._tabversion != __tabversion__:
AttributeError: module 'flanker.addresslib._parser.mailbox_parsetab' has no attribute '_tabversion'
Symbol 'mailbox_or_url_list' is unreachable
Symbol 'delim' is unreachable
Symbol 'mailbox_or_url' is unreachable
Symbol 'mailbox' is unreachable
Symbol 'addr_spec' is unreachable
Symbol 'angle_addr' is unreachable
Symbol 'delim' is unreachable
Symbol 'mailbox_or_url' is unreachable
Symbol 'mailbox' is unreachable
Symbol 'name_addr' is unreachable
Symbol 'addr_spec' is unreachable
Symbol 'angle_addr' is unreachable
Symbol 'phrase' is unreachable
Symbol 'name_addr' is unreachable
Symbol 'phrase' is unreachable
Symbol 'local_part' is unreachable
Symbol 'local_part' is unreachable
Symbol 'domain' is unreachable
Symbol 'quoted_string' is unreachable
Symbol 'domain_literal' is unreachable
Symbol 'domain' is unreachable
Symbol 'quoted_string_text' is unreachable
Symbol 'domain_literal_text' is unreachable
Symbol 'quoted_string' is unreachable
Symbol 'domain_literal' is unreachable
Symbol 'quoted_string_text' is unreachable
Symbol 'domain_literal_text' is unreachable
Symbol 'mailbox_or_url_list' is unreachable
Symbol 'delim' is unreachable
Symbol 'mailbox_or_url' is unreachable
Symbol 'mailbox' is unreachable
Symbol 'addr_spec' is unreachable
Symbol 'angle_addr' is unreachable
Symbol 'name_addr' is unreachable
Symbol 'phrase' is unreachable
Symbol 'local_part' is unreachable
Symbol 'domain' is unreachable
Symbol 'quoted_string' is unreachable
Symbol 'domain_literal' is unreachable
Symbol 'quoted_string_text' is unreachable
Symbol 'mailbox_or_url_list' is unreachable
Symbol 'delim' is unreachable
Symbol 'mailbox_or_url' is unreachable
Symbol 'mailbox' is unreachable
Symbol 'addr_spec' is unreachable
Symbol 'angle_addr' is unreachable
Symbol 'name_addr' is unreachable
Symbol 'phrase' is unreachable
Symbol 'local_part' is unreachable
Symbol 'domain' is unreachable
Symbol 'quoted_string' is unreachable
Symbol 'domain_literal' is unreachable
Symbol 'quoted_string_text' is unreachable
Symbol 'domain_literal_text' is unreachable
Symbol 'mailbox_or_url_list' is unreachable
Symbol 'delim' is unreachable
Symbol 'mailbox_or_url' is unreachable
Symbol 'mailbox' is unreachable
Symbol 'addr_spec' is unreachable
Symbol 'angle_addr' is unreachable
Symbol 'name_addr' is unreachable
Symbol 'phrase' is unreachable
Symbol 'local_part' is unreachable
Symbol 'domain' is unreachable
Symbol 'quoted_string' is unreachable
Symbol 'domain_literal' is unreachable
Symbol 'quoted_string_text' is unreachable
Symbol 'domain_literal_text' is unreachable
Symbol 'domain_literal_text' is unreachable
Symbol 'mailbox_or_url_list' is unreachable
Symbol 'delim' is unreachable
Symbol 'mailbox_or_url' is unreachable
Symbol 'mailbox' is unreachable
Symbol 'addr_spec' is unreachable
Symbol 'angle_addr' is unreachable
Symbol 'name_addr' is unreachable
Symbol 'phrase' is unreachable
Symbol 'local_part' is unreachable
Symbol 'domain' is unreachable
Symbol 'quoted_string' is unreachable
Symbol 'domain_literal' is unreachable
Symbol 'quoted_string_text' is unreachable
Symbol 'domain_literal_text' is unreachable
Symbol 'mailbox_or_url_list' is unreachable
Symbol 'mailbox_or_url_list' is unreachable
Symbol 'delim' is unreachable
Symbol 'mailbox_or_url_list' is unreachable
Symbol 'delim' is unreachable
Symbol 'delim' is unreachable
Symbol 'mailbox_or_url_list' is unreachable
Symbol 'delim' is unreachable
Symbol 'mailbox_or_url_list' is unreachable
Process Process-6:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "importing.py", line 5, in someFunc
    from flanker.addresslib import address
  File "/usr/local/lib/python3.7/site-packages/flanker/addresslib/address.py", line 49, in <module>
    from flanker.addresslib._parser.parser import (Mailbox, Url, mailbox_parser,
  File "/usr/local/lib/python3.7/site-packages/flanker/addresslib/_parser/parser.py", line 171, in <module>
    tabmodule='url_parsetab')
  File "/usr/local/lib/python3.7/site-packages/ply/yacc.py", line 3293, in yacc
    read_signature = lr.read_table(tabmodule)
  File "/usr/local/lib/python3.7/site-packages/ply/yacc.py", line 1987, in read_table
    if parsetab._tabversion != __tabversion__:
AttributeError: module 'flanker.addresslib._parser.url_parsetab' has no attribute '_tabversion'
Symbol 'delim' is unreachable
Symbol 'mailbox_or_url_list' is unreachable
Symbol 'delim' is unreachable
Process Process-7:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "importing.py", line 5, in someFunc
    from flanker.addresslib import address
  File "/usr/local/lib/python3.7/site-packages/flanker/addresslib/address.py", line 49, in <module>
    from flanker.addresslib._parser.parser import (Mailbox, Url, mailbox_parser,
  File "/usr/local/lib/python3.7/site-packages/flanker/addresslib/_parser/parser.py", line 176, in <module>
    tabmodule='mailbox_or_url_parsetab')
  File "/usr/local/lib/python3.7/site-packages/ply/yacc.py", line 3293, in yacc
    read_signature = lr.read_table(tabmodule)
  File "/usr/local/lib/python3.7/site-packages/ply/yacc.py", line 1984, in read_table
    exec('import %s' % module)
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python3.7/site-packages/flanker/addresslib/_parser/mailbox_or_url_parsetab.py", line 54
    ('quoted_string_text -> quoted_string_text QTEXT','quoted_string_text',2,'p_expression_quoted_string_text','parser.py',83),
                                                                                                                              ^
SyntaxError: unexpected EOF while parsing
Process Process-4:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/usr/local/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "importing.py", line 5, in someFunc
    from flanker.addresslib import address
  File "/usr/local/lib/python3.7/site-packages/flanker/addresslib/address.py", line 49, in <module>
    from flanker.addresslib._parser.parser import (Mailbox, Url, mailbox_parser,
  File "/usr/local/lib/python3.7/site-packages/flanker/addresslib/_parser/parser.py", line 176, in <module>
    tabmodule='mailbox_or_url_parsetab')
  File "/usr/local/lib/python3.7/site-packages/ply/yacc.py", line 3293, in yacc
    read_signature = lr.read_table(tabmodule)
  File "/usr/local/lib/python3.7/site-packages/ply/yacc.py", line 1987, in read_table
    if parsetab._tabversion != __tabversion__:
AttributeError: module 'flanker.addresslib._parser.mailbox_or_url_parsetab' has no attribute '_tabversion'
<module 'flanker.addresslib.address' from '/usr/local/lib/python3.7/site-packages/flanker/addresslib/address.py'>
<module 'flanker.addresslib.address' from '/usr/local/lib/python3.7/site-packages/flanker/addresslib/address.py'>
<module 'flanker.addresslib.address' from '/usr/local/lib/python3.7/site-packages/flanker/addresslib/address.py'>
<module 'flanker.addresslib.address' from '/usr/local/lib/python3.7/site-packages/flanker/addresslib/address.py'>
<module 'flanker.addresslib.address' from '/usr/local/lib/python3.7/site-packages/flanker/addresslib/address.py'>
<module 'flanker.addresslib.address' from '/usr/local/lib/python3.7/site-packages/flanker/addresslib/address.py'>
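
The scenario in the log above (several interpreters importing the same module at the same moment) can be approximated without Docker. The sketch below is not the `importing.py` from the Dockerfile; the helper name `import_concurrently` and the use of `subprocess` are assumptions made for illustration.

```python
import subprocess
import sys


def import_concurrently(module_name, workers=8):
    """Start `workers` fresh interpreters that each import `module_name`.

    All interpreters are launched before any of them is waited on, so
    their imports overlap in time. Returns the list of exit codes;
    a non-zero code means that interpreter failed during the import.
    """
    cmd = [sys.executable, "-c", f"import {module_name}"]
    procs = [
        subprocess.Popen(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        for _ in range(workers)
    ]
    return [proc.wait() for proc in procs]
```

Calling `import_concurrently("flanker.addresslib.address")` in an environment where ply decides to regenerate the parsetab files should produce some non-zero exit codes when the race is hit, though the timing is not guaranteed on any given run.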

This bug has been reported before in #168, and PR #188 attempted to work around the issue. However, the workaround in that PR doesn't help if the installed version of ply differs from the one flanker used to generate its parsetab files. ply will notice that the parsetab files were generated with a different version of ply and decide to regenerate them, causing the same bug. As I understand it, there are two ways flanker could work around this issue:

  1. Update setup.py to pin ply to exactly version 3.10 (this is the version used in #188). This was actually discussed in a comment on #188, but rejected (I don't think the people involved in that discussion were aware of the race condition).
  2. Change the invocations of yacc to pass write_tables=False, as @jlev did in spacedogXYZ@54d5008.
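
Option 2 might look roughly like this. `write_tables` is the keyword named above; the wrapper `without_table_files` is a hypothetical helper, not something flanker or ply provides, and the commented usage only assumes the `tabmodule` names seen in the tracebacks.

```python
import functools


def without_table_files(yacc_func):
    """Return a callable that behaves like `yacc_func` but always
    passes write_tables=False, so no parsetab file is written to disk."""
    return functools.partial(yacc_func, write_tables=False)


# Intended use (needs ply installed, so it is left as a comment here):
#
#   import ply.yacc
#   yacc_no_write = without_table_files(ply.yacc.yacc)
#   parser = yacc_no_write(tabmodule="mailbox_parsetab")
#   # ...plus whatever other arguments the existing call already passes.
```

If no process ever writes a parsetab file, no other process can import a half-written one, which is the failure shown in the tracebacks.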

However, I would argue that neither of these is a great solution, and both just work around a fundamental race condition in ply. I've sent in a PR to ply to try to fix the root bug (dabeaz/ply#184), but I wanted to file this issue with you guys so you're aware of it. Currently, it's pretty unfortunate that someone who does pip install flanker will automatically end up with ply == 3.11 and be exposed to this race condition.
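
For context on what closing the race at the writer could involve in general: the tracebacks above show processes importing a parsetab file that is empty or cut off mid-line, which is what a reader sees when another process is still writing the file. A common remedy is to write to a temporary file and rename it into place. The sketch below shows that generic technique only; it is an assumption for illustration, not a description of what dabeaz/ply#184 changes, and `write_atomically` is a hypothetical name.

```python
import os
import tempfile


def write_atomically(path, text):
    """Write `text` to `path` so readers see the old or the new file,
    never a partially written one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the target directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp_file:
            tmp_file.write(text)
        # os.replace swaps the complete file into place in one step.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```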

jfly added a commit to jfly/flanker that referenced this issue Sep 30, 2018
This fixes mailgun#206.

Before this change, when multiple Python processes are simultaneously
doing a `from flanker.addresslib import address`, it's possible for some
of them to crash in `ply` code.

See dabeaz/ply#184, where
I attempted to work around this issue by changing ply. You can see in this comment: dabeaz/ply#184 (comment) that the author of ply suggests two workarounds for this issue:

1. Remove `ply` as a dependency in setup.py and copy the source code of `ply` into `flanker`.
2. Disable writing parsetab files to disk when invoking `yacc`.

2) seemed like the simpler solution to me, so that's what I've done
here.
jfly linked a pull request Sep 30, 2018 that will close this issue
jfly added a commit to jfly/flanker that referenced this issue Mar 21, 2019
This fixes mailgun#206.

Before this change, when multiple Python processes are simultaneously
doing a `from flanker.addresslib import address`, it's possible for some
of them to crash in `ply` code.

See dabeaz/ply#184, where
I attempted to work around this issue by changing ply. You can see in this comment: dabeaz/ply#184 (comment) that the author of ply suggests two workarounds for this issue:

1. Remove `ply` as a dependency in setup.py and copy the source code of `ply` into `flanker`.
2. Disable writing parsetab files to disk when invoking `yacc`.

2) seemed like the simpler solution to me, so that's what I've done
here.

jfly commented Nov 25, 2019

My PR to ply was rejected: dabeaz/ply#184. How do you feel about implementing one of the 2 workarounds/fixes detailed in my original post?


jfly commented Jul 16, 2024

@thrawn01, I see you've made a change to this repo in the last year. I'd appreciate some input here; we're still dealing with (working around) this issue.
