Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve subprocess CRDS calls for easier debugging #140

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

mcara
Copy link
Member

@mcara mcara commented Jun 23, 2020

Recently there were a couple of helpdesk issues (INC0155647 and INC0156501) when users reported errors with running CRDS commands using subprocess. One difficulty in understanding the cause of the errors was that suprocess in some notebooks is not set up to output errors. This PR modifies calls to subprocess.check_output()so that messages from CRDS can be read in the notebook.

@mcara mcara requested a review from eteq as a code owner June 23, 2020 18:51
"subprocess.check_output('crds bestrefs --files ua0605*_c0m.fits --sync-references=1 --update-bestrefs', shell=True, stderr=subprocess.DEVNULL)\n",
"stdout = subprocess.check_output(\n",
" 'crds bestrefs --files ua0605*_c0m.fits --sync-references=1 --update-bestrefs',\n",
" shell=True,\n",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to do this without shell=True?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't CRDS written in Python? Can't we call its Python API directly?

Copy link
Member Author

@mcara mcara Jun 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CC: CRDS experts @eslavich @jaytmiller

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pllim May I ask: what is the issue with shell=True?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

@mcara mcara Jun 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just out of academic curiosity, is there benefit in sorting the glob results first in this usage?

I do not see why would it matter here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, let's say if the bestref is usually named in a way that would appear last in listing, and the function do a simple linear search, then sorted(..., reverse=True) would make the search faster? Anyway, I have no idea how it works, so I am just thinking aloud. And of course, it is probably out of topic as far as this PR is concerned.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The order shouldn't matter -- the glob result is a list of datasets, and we have to make the best references determination for each of them.

Incidentally, I don't think shell=True is a security concern in this case. We wouldn't want to use it in web server code (for example) that interpolates user input into a command, but here the user is already in an environment where they can execute arbitrary shell commands so it's no more unsafe than a notebook in general.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not as much as a safety in this case, as to promote best practices. What if someone copy-paste what is here into server code? 😉

And given that it is possible to call Python API directly, there is really no reason to have shell=True at all in the first place.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

p.s. Yeah, just ignore my comment about sorting. Sorry for the noise!

@bhilbert4
Copy link

I don't know if it is relevant or useful here or not, but here's an example of using the CRDS API: https://github.com/spacetelescope/mirage/blob/master/mirage/reference_files/crds_tools.py#L141

@mcara
Copy link
Member Author

mcara commented Jun 23, 2020

I would say that I just want to improve one aspect of existing notebooks to address a specific issue. If there is desire to re-design the notebooks, that could be done too as a separate PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants