-
-
Notifications
You must be signed in to change notification settings - Fork 406
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Xmm proprietary data #2251
Xmm proprietary data #2251
Conversation
Hello @javier-ballester! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2022-01-18 09:53:24 UTC |
Codecov Report
@@ Coverage Diff @@
## main #2251 +/- ##
==========================================
+ Coverage 62.60% 62.68% +0.07%
==========================================
Files 131 131
Lines 16784 16815 +31
==========================================
+ Hits 10508 10540 +32
+ Misses 6276 6275 -1
Continue to review full report at Codecov.
|
Hello Astroquery team, I have updated the code to pass the checks however I do not understand how to resolve the Code scanning results. Could you please explain to me this check a bit more and how to resolve it? Many thanks |
Having a quick look it looks like that messing with the log levels doesn't make codeql satisfied. What about changing those log conditionals to |
Thank you for getting back to me so quickly, I have placed the changing of log levels inside the if prop to make it clearer. I did not quite understand what you meant about including this in the "if verbose". The problem we have about revealing sensitive data is when we call the _download_file method it runs a method inside query.py which if at debug level, reveals the username and password which we do not want. I can't seem to work out a way of solving this problem without changing the log levels around the sensitive parts or changing query.py which do not want to do. Is there another way of solving the CodeQl error? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have several changes to request, but most importantly, if at all possible, this should use the QueryWithLogin
base class instead of rewriting the authorization code.
astroquery/astroquery/query.py
Line 461 in f18f02b
class QueryWithLogin(BaseQuery): |
Does the XMM Newton service support logging in once for a session, rather than sending auth information with each request independently?
# If the user wants to access proprietary data, ask them for there credentials | ||
if prop: | ||
username, password = self._get_username_and_password(credentials_file) | ||
link = f"{link}&AIOUSER={username}&AIOPWD={password}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that this can result in plaintext logging/output of the username and password. Does the server allow these data to be sent via post request rather than through the URL?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See https://github.com/astropy/astroquery/pull/2251/checks?check_run_id=4559276557 - it's the auto-security-checker that is noting that there is plaintext logging of passwords happening
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The server that the xmm_newton archive is currently using does not handle post request. We understand that sending a plaintext username and password over url is not safe. This is the reason we have updated the http requests to be https. Once the functionality of downloading proprietary data has been implemented. The xmm team have planned to improve the aio server and change the way the user can login and access data. However until them we only have url commands.
""" | ||
If prop we change the log level so that it is above 20, this is to stop a log.debug (line 431) in query.py. | ||
This debug reveals the url being sent which in turn reveals the users username and password | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we instead avoid putting the password into the URL? log.debug
is needed for actual debugging in many cases; disabling it here is potentially going to hide errors we need to see in the future. I'm not comfortable overriding it like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For example, _download_file
supports the auth
keyword:
https://github.com/astropy/astroquery/blob/main/astroquery/query.py#L324
so we could do self._download_file(link_with_no_password, auth={'username': username, 'password': password})
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have tried to use the _download_file auth keyword however the only way for the aio server to download the proprietary data is by having the username and password attached to the end of the url. The _download_file which in turn leads to the requests method does not attach the auth keyword to the end of the url so we cannot use this functionality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I see. If that's all the server supports, then we're stuck with this approach.
Are there any example proprietary data sets we can access - something that is permanently protected by username/password, but is meant to be public - so we can have a test of successfully logging in? |
Currently we do not have any example data sets that we can use to test a successful login however I have asked the team if this is something we can implement. I will let you know when they get back to me. Until then are there any other aspects of the code that need changing to make it pull ready? |
@bsipocz if you have any input here, please chime in. The problem is that the XMM Newton server requires plain-text user/pw sending and doesn't accept http auth, so we can't use existing astroquery or astropy infrastructure. I think we just accept this as is, unless you can propose a better idea? |
I have added more unit tests to resolve the patch coverage error however now I am failing "CI / astropy dev with all dependencies with coverage (pull_request)" test and I am not sure why. |
Dear @bsipocz and @keflavich, The XMM team are wondering what else is needed for this pull request to accepted? Many thanks, Javier Ballester |
The failing test is because of the recent pyvo change; I'm restarting it because of their release of 1.2.1 yesterday |
Okay, so are there any other changes from our side required? |
Dear Astroquery team, After talking to various members of the XMM archive we have come up with a way to create a dummy account that can access dummy proprietary data. This will be created to access the development server instead of the operational one to avoid any problems / misunderstandings. However this will take some time to develop. My question is, should we put this pull request on hold until this functionality is implemented and we have added the test, or accept this pull request now and include this test in the next pull request? Many thanks, |
@javier-ballester that's a great solution. We can merge this now and add the additional tests when the dummy account becomes available - please let us know! I'm approving this, but I will wait for @bsipocz to merge. |
- added corresponding tests
c304731
to
0c365f3
Compare
Thanks @javier-ballester. I'll go ahead and merge this now. Please do open a new issue about the dummy testing scheme & efforts to avoid logging private information. |
Dear Astroquery team,
My name is javier-ballester and I am working alongside jespinosaar in developing astroquery functionality for ESA XMM module. In this pull request I have changed the http requests to be https request and have changed the download_data method to allow the downloading of proprietary data.
Looking forward to working alongside your team. Please let me know if there are any comments or improvements that are needed.
Kind regards,
javier-ballester