Meqpipeliner hangs when connection is not established #879

Open
bennahugo opened this issue Jan 8, 2018 · 2 comments

Comments

@bennahugo

The MeqTrees pipeliner hangs when it cannot establish a connection to meqserver, which causes any pipeline that calls it to hang indefinitely.

running: docker start -a calibrator_Gjones_subtract_lsm0-140267809901120151517920471
running: /usr/bin/meqtree-pipeliner.py --mt 16 -c /code/tdlconf.profiles [stefcal] ms_sel.ms_read_flags=1 ms_sel.input_column=DATA ms_sel.field_index=0 ms_sel.msname=/home/jenkins/msdir/12A-405.sb7601493.eb10633016.56086.127048738424-corr.ms stefcal_gain.table=/home/jenkins/output/12A-405.sb7601493.eb10633016.56086.127048738424-corr.gain.cp tiggerlsm.lsm_subset=all ms_wfl.write_bitflag=stefcal do_output=CORR_DATA stefcal_gain.enabled=1 stefcal_gain.flag_chisq=0 ms_sel.ms_fill_legacy_flags=1 stefcal_gain.flag_ampl=1 ms_sel.ddid_index=0 ms_sel.tile_size=512 ms_sel.ms_write_flag_policy="'replace set'" ms_rfl.read_flagsets=-stefcal stefcal_gain.reset=1 stefcal_gain.freqint=64 stefcal_gain.implementation=GainDiagPhase stefcal_gain.flag_chisq_threshold=10 ms_rfl.read_legacy_flags=1 stefcal_gain.flag_ampl_low=0.15 ms_sel.ms_corr_sel='2x2' stefcal_gain.flag_ampl_high=2.0 ms_sel.ms_write_flags=1 stefcal_gain.mode=solve-save tiggerlsm.filename=/home/jenkins/output/vla_NGC417_LBand-LSM0.lsm.html stefcal_gain.timeint=20 ms_sel.output_column=CORRECTED_DATA /usr/local/lib/python2.7/dist-packages/Cattery/Calico/calico-stefcal.py =stefcal 
### Starting meqserver
Traceback (most recent call last):
  File "/usr/bin/meqtree-pipeliner.py", line 77, in <module>
    mqs = meqserver.default_mqs(wait_init=10,extra=["-mt",str(options.mt)]+(["-python_memprof"] if options.memprof else []));
  File "/usr/lib/python2.7/dist-packages/Timba/Apps/meqserver.py", line 262, in default_mqs
    mqs = meqserver(extra=extra,**args);
  File "/usr/lib/python2.7/dist-packages/Timba/Apps/meqserver.py", line 94, in __init__
    multiapp_proxy.__init__(self,appid,client_id,spawn=spawn,**kwargs);
  File "/usr/lib/python2.7/dist-packages/Timba/Apps/multiapp_proxy.py", line 211, in __init__
    self.ensure_connection(wait_init);
  File "/usr/lib/python2.7/dist-packages/Timba/Apps/multiapp_proxy.py", line 438, in ensure_connection
    raise RuntimeError,"timeout waiting for connection";
RuntimeError: timeout waiting for connection
@o-smirnov
Contributor

Can you point me at an easily reproducible case? I agree the pipeliner should bomb out with an error rather than hang, but the fact that it can't establish a connection in the first place is indicative of some other problem.
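
Until the pipeliner itself exits on this error, a caller can bound the wait from the outside. The following is only a minimal sketch of that idea, not a fix inside meqtree-pipeliner.py: `run_with_deadline` is a hypothetical helper, the exit code 124 is an arbitrary choice borrowed from GNU `timeout`, and it assumes the caller runs on Python 3.

```python
import subprocess
import sys


def run_with_deadline(cmd, timeout_s):
    """Run cmd and return its exit code, or 124 if it outlives timeout_s seconds.

    Hypothetical helper for illustration; not part of MeqTrees.
    """
    try:
        # subprocess.run kills the direct child and re-raises TimeoutExpired
        # once timeout_s seconds have passed.
        return subprocess.run(cmd, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        print("killed after %ss: %r" % (timeout_s, cmd), file=sys.stderr)
        return 124  # arbitrary "timed out" code, same convention as GNU timeout


# Example call, with a sleeping child standing in for a hung pipeliner:
#   run_with_deadline([sys.executable, "-c", "import time; time.sleep(60)"], 5)
```

Note that this only terminates the direct child process. Anything that child has spawned (a meqserver process, for example) may survive, and killing a `docker start -a` client does not by itself stop the container, so a containerised run would likely need its own cleanup step.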

@bennahugo
Author

bennahugo commented Jan 8, 2018
