
Simplify IPCClient and prevent corrupt messages #52445

Merged
merged 7 commits into from
Apr 10, 2019

Conversation

dwoz
Contributor

@dwoz dwoz commented Apr 8, 2019

What does this PR do?

  • Re-raises queued exceptions with their original tracebacks
  • Removes the singleton from IPCClient and prevents corrupt message passing
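The re-raise part of the change can be sketched as follows. This is a minimal illustration of the technique, not Salt's actual implementation; on Python 3 the `with_traceback` method does the job (on Python 2, Salt would need `six.reraise` or the three-argument `raise` form):

```python
import sys

def capture_exception():
    """Run a failing operation and stash (type, value, traceback),
    e.g. to hand the error off through a queue."""
    try:
        raise ValueError("worker failed")
    except ValueError:
        return sys.exc_info()

def reraise(exc_info):
    """Re-raise a stored exception with its original traceback (Python 3)."""
    exc_type, exc_value, exc_tb = exc_info
    raise exc_value.with_traceback(exc_tb)

stored = capture_exception()
try:
    reraise(stored)
except ValueError as exc:
    # The traceback chain still ends at the original raise site,
    # so debugging points at the real failure, not the re-raise.
    assert exc.__traceback__ is not None
```

Without the stored traceback, a bare `raise exc_value` would report the re-raise site as the origin of the failure, which is exactly what this PR avoids for queued exceptions.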

What issues does this PR fix or reference?

  • Fixes many failing and flaky tests

Previous Behavior

We tried to share a single IPCClient across the entire codebase, resulting in send calls stomping on each other and producing corrupted messages.
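The corruption mode can be sketched with a toy length-prefixed framing scheme. This is a simplified stand-in for Salt's actual wire format, but it shows why two senders' partial writes interleaving on one shared connection destroy message boundaries:

```python
import struct

def frame(payload: bytes) -> bytes:
    """Length-prefix a payload: a simplified stand-in for IPC framing."""
    return struct.pack("!I", len(payload)) + payload

def interleaved_send(stream: bytearray, a: bytes, b: bytes) -> None:
    """Simulate two concurrent senders whose partial writes interleave
    on one shared connection (the old singleton behavior)."""
    fa, fb = frame(a), frame(b)
    mid_a, mid_b = len(fa) // 2, len(fb) // 2
    stream += fa[:mid_a] + fb[:mid_b] + fa[mid_a:] + fb[mid_b:]

def read_frames(stream: bytes):
    """Decode length-prefixed frames; once framing is broken, the
    length fields are read from payload bytes and garbage comes out."""
    out, pos = [], 0
    while pos + 4 <= len(stream):
        (length,) = struct.unpack_from("!I", stream, pos)
        out.append(bytes(stream[pos + 4 : pos + 4 + length]))
        pos += 4 + length
    return out

shared = bytearray()
interleaved_send(shared, b"hello", b"world")
# The decoded frames no longer match what either sender sent.
assert read_frames(shared) != [b"hello", b"world"]
```

Giving each caller its own IPCClient (and therefore its own write path) keeps each framed message contiguous on the wire, which is the essence of the fix.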

New Behavior

Allow IPCClient objects to be regular, per-caller instances rather than a shared singleton

Tests written?

No - Fixing existing tests

Full py2 test suite
Full py3 test suite

Commits signed with GPG?

Yes

@thatch45
Contributor

thatch45 commented Apr 8, 2019

You had me at "Removes singleton"

@cachedout
Contributor

Hi @dwoz !

One question that I have about this is that it seems like it means that from this point forward, callers would have to ensure that they are passing in the current loop or we'll end up with the potential for multiple loops per-thread, which seems bad. The old approach tried to prevent this via the singleton implementation but I don't see those same safeguards in place here. If we're going down this route, should we consider disallowing a loop to even be passed in to an IPC client?

@DmitryKuzmenko
Contributor

@cachedout Tornado 5+ now disallows multiple IO loops per thread. Moreover, they removed the ioloop argument from the entire Tornado API.

@cachedout
Contributor

@DmitryKuzmenko Ah, yes. Tornado 5 was going to be my next question. :) That should provide the safety we need in these cases. Will this change ever have to run under < 5 though?

@DmitryKuzmenko
Contributor

@cachedout It should run fine, but let's see. =)

@dwoz
Contributor Author

dwoz commented Apr 9, 2019

@cachedout One thing to note is the use of tornado.ioloop.IOLoop.current here. This should ensure we are re-using an existing loop if one is not passed in explicitly.
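The fallback idiom dwoz is pointing to can be sketched as follows. This is a tornado-free illustration: `IOLoop` here is a hypothetical stand-in for `tornado.ioloop.IOLoop`, whose `current()` returns (creating if necessary) the loop belonging to the calling thread, and `IPCClientSketch` mirrors the constructor shape rather than reproducing Salt's actual class:

```python
import threading

class IOLoop:
    """Minimal stand-in for tornado.ioloop.IOLoop: one loop per thread,
    created lazily on first access via current()."""
    _local = threading.local()

    @classmethod
    def current(cls):
        loop = getattr(cls._local, "loop", None)
        if loop is None:
            loop = cls._local.loop = cls()
        return loop

class IPCClientSketch:
    """Hypothetical constructor mirroring the PR's approach."""
    def __init__(self, socket_path, io_loop=None):
        self.socket_path = socket_path
        # Reuse the thread's existing loop unless one is passed explicitly.
        self.io_loop = io_loop or IOLoop.current()

a = IPCClientSketch("/tmp/ipc.sock")
b = IPCClientSketch("/tmp/ipc.sock")
# Two clients created without an explicit loop share the per-thread loop.
assert a.io_loop is b.io_loop
```

The remaining question in this thread is the other branch: a caller that does pass `io_loop` explicitly bypasses the sharing and can still introduce a second loop on the thread, which Tornado 5+ forbids at the framework level.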

@dwoz
Contributor Author

dwoz commented Apr 10, 2019

Here is a CentOS 7 Python 2 full test run with these changes; there were two failures, which look to be unrelated: https://jenkinsci.saltstack.com/job/salt-centos-7-py2-dwoz/3/

@codecov

codecov bot commented Apr 10, 2019

Codecov Report

Merging #52445 into 2018.3 will decrease coverage by 16.25%.
The diff coverage is 25%.

Impacted file tree graph

@@             Coverage Diff             @@
##           2018.3   #52445       +/-   ##
===========================================
- Coverage   48.78%   32.52%   -16.26%     
===========================================
  Files        1511     1511               
  Lines      256875   256865       -10     
  Branches    55626    53866     -1760     
===========================================
- Hits       125306    83554    -41752     
- Misses     131285   163403    +32118     
- Partials      284     9908     +9624
| Flag | Coverage Δ |
|------|------------|
| #arch | ? |
| #centos7 | 32.52% <25%> (+0.03%) ⬆️ |
| #debian8 | ? |
| #opensuse42 | ? |
| #py2 | 32.52% <25%> (-0.1%) ⬇️ |
| #py3 | ? |
| #pycryptodomex | ? |
| #transport | ? |
| #ubuntu1404 | ? |
| Impacted Files | Coverage Δ |
|----------------|------------|
| salt/transport/ipc.py | 53.27% <25%> (-6.57%) ⬇️ |
| salt/transport/frame.py | 13.63% <0%> (-46.97%) ⬇️ |
| salt/ext/ssl_match_hostname.py | 0% <0%> (-36.18%) ⬇️ |
| salt/client/ssh/wrapper/config.py | 0% <0%> (-35.56%) ⬇️ |
| salt/output/virt_query.py | 12.9% <0%> (-35.49%) ⬇️ |
| salt/states/pagerduty_schedule.py | 10.44% <0%> (-34.33%) ⬇️ |
| salt/states/rbac_solaris.py | 8.42% <0%> (-33.69%) ⬇️ |
| salt/states/smartos.py | 6.34% <0%> (-33.51%) ⬇️ |
| salt/states/zabbix_mediatype.py | 3.43% <0%> (-33.34%) ⬇️ |
| salt/runners/bgp.py | 21.48% <0%> (-33.06%) ⬇️ |
| ... and 1385 more | |

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b6028b9...1bdaf29. Read the comment docs.

@codecov

codecov bot commented Apr 10, 2019

Codecov Report

Merging #52445 into 2018.3 will decrease coverage by 3.29%.
The diff coverage is 46.66%.

Impacted file tree graph

@@            Coverage Diff            @@
##           2018.3   #52445     +/-   ##
=========================================
- Coverage   48.78%   45.48%   -3.3%     
=========================================
  Files        1511     1511             
  Lines      256875   256870      -5     
  Branches    55626    55625      -1     
=========================================
- Hits       125306   116849   -8457     
+ Misses     131285   130529    -756     
- Partials      284     9492   +9208
| Flag | Coverage Δ |
|------|------------|
| #arch | ? |
| #centos7 | 45.48% <46.66%> (+12.99%) ⬆️ |
| #debian8 | ? |
| #opensuse42 | ? |
| #py2 | 32.53% <43.33%> (-0.1%) ⬇️ |
| #py3 | 32.33% <43.33%> (+31.75%) ⬆️ |
| #pycryptodomex | ? |
| #transport | ? |
| #ubuntu1404 | ? |
| Impacted Files | Coverage Δ |
|----------------|------------|
| salt/transport/ipc.py | 55.25% <46.66%> (-4.59%) ⬇️ |
| salt/modules/s3.py | 54.34% <0%> (-28.27%) ⬇️ |
| salt/modules/napalm_snmp.py | 63.15% <0%> (-21.06%) ⬇️ |
| salt/modules/svn.py | 61.62% <0%> (-20.94%) ⬇️ |
| salt/tops/ext_nodes.py | 58.62% <0%> (-20.69%) ⬇️ |
| salt/modules/zfs.py | 63.5% <0%> (-18.25%) ⬇️ |
| salt/states/user.py | 39.84% <0%> (-18.23%) ⬇️ |
| salt/renderers/jinja.py | 71.42% <0%> (-17.86%) ⬇️ |
| salt/modules/telegram.py | 60.78% <0%> (-17.65%) ⬇️ |
| salt/grains/mdadm.py | 73.91% <0%> (-17.4%) ⬇️ |
| ... and 979 more | |

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b6028b9...1bdaf29. Read the comment docs.

@cachedout
Contributor

@dwoz My concern is more about the behavior if one is passed in explicitly. I'm not sure if that's happening anywhere but if it is, could it be an issue?

@s0undt3ch
Collaborator

s0undt3ch commented Apr 10, 2019 via email

@dwoz
Contributor Author

dwoz commented Apr 10, 2019

@cachedout Looking at this again, I see that prior to this change, if a caller passed in an event loop that was not the 'current loop', it would have made a new instance and allowed two loops in the same thread.

https://github.com/saltstack/salt/pull/52445/files#diff-d5f1a701c958ed082895806182d4d7fdL246

So I don't think we're losing anything here.

5 participants