Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Xmm proprietary data #2251

Merged
merged 17 commits into from
Jan 18, 2022
Merged

Conversation

javier-ballester
Copy link
Contributor

Dear Astroquery team,

My name is javier-ballester and I am working alongside jespinosaar in developing astroquery functionality for ESA XMM module. In this pull request I have changed the http requests to be https request and have changed the download_data method to allow the downloading of proprietary data.

Looking forward to working alongside your team. Please let me know if there are any comments or improvements that are needed.

Kind regards,

javier-ballester

@pep8speaks
Copy link

pep8speaks commented Dec 16, 2021

Hello @javier-ballester! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2022-01-18 09:53:24 UTC

@codecov
Copy link

codecov bot commented Dec 16, 2021

Codecov Report

Merging #2251 (0c365f3) into main (a31ab86) will increase coverage by 0.07%.
The diff coverage is 62.96%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #2251      +/-   ##
==========================================
+ Coverage   62.60%   62.68%   +0.07%     
==========================================
  Files         131      131              
  Lines       16784    16815      +31     
==========================================
+ Hits        10508    10540      +32     
+ Misses       6276     6275       -1     
Impacted Files Coverage Δ
astroquery/esa/xmm_newton/core.py 64.45% <62.96%> (+3.99%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a31ab86...0c365f3. Read the comment docs.

@javier-ballester
Copy link
Contributor Author

Hello Astroquery team, I have updated the code to pass the checks however I do not understand how to resolve the Code scanning results. Could you please explain to me this check a bit more and how to resolve it?

Many thanks

@bsipocz
Copy link
Member

bsipocz commented Dec 17, 2021

Having a quick look it looks like that messing with the log levels doesn't make codeql satisfied. What about changing those log conditionals to if verbose and not proprietary:? I feel it would provide a much clearer intent than setting the loglevels back and forth in the code around the sensitive parts.

@javier-ballester
Copy link
Contributor Author

Thank you for getting back to me so quickly,

I have placed the changing of log levels inside the if prop to make it clearer. I did not quite understand what you meant about including this in the "if verbose". The problem we have about revealing sensitive data is when we call the _download_file method it runs a method inside query.py which if at debug level, reveals the username and password which we do not want. I can't seem to work out a way of solving this problem without changing the log levels around the sensitive parts or changing query.py which do not want to do.

Is there another way of solving the CodeQl error?

Copy link
Contributor

@keflavich keflavich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have several changes to request, but most importantly, if at all possible, this should use the QueryWithLogin base class instead of rewriting the authorization code.

class QueryWithLogin(BaseQuery):

Does the XMM Newton service support logging in once for a session, rather than sending auth information with each request independently?

astroquery/esa/xmm_newton/core.py Outdated Show resolved Hide resolved
# If the user wants to access proprietary data, ask them for there credentials
if prop:
username, password = self._get_username_and_password(credentials_file)
link = f"{link}&AIOUSER={username}&AIOPWD={password}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this can result in plaintext logging/output of the username and password. Does the server allow these data to be sent via post request rather than through the URL?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See https://github.com/astropy/astroquery/pull/2251/checks?check_run_id=4559276557 - it's the auto-security-checker that is noting that there is plaintext logging of passwords happening

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The server that the xmm_newton archive is currently using does not handle post request. We understand that sending a plaintext username and password over url is not safe. This is the reason we have updated the http requests to be https. Once the functionality of downloading proprietary data has been implemented. The xmm team have planned to improve the aio server and change the way the user can login and access data. However until them we only have url commands.

Comment on lines +129 to +131
"""
If prop we change the log level so that it is above 20, this is to stop a log.debug (line 431) in query.py.
This debug reveals the url being sent which in turn reveals the users username and password
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we instead avoid putting the password into the URL? log.debug is needed for actual debugging in many cases; disabling it here is potentially going to hide errors we need to see in the future. I'm not comfortable overriding it like this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, _download_file supports the auth keyword:
https://github.com/astropy/astroquery/blob/main/astroquery/query.py#L324
so we could do self._download_file(link_with_no_password, auth={'username': username, 'password': password})

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have tried to use the _download_file auth keyword however the only way for the aio server to download the proprietary data is by having the username and password attached to the end of the url. The _download_file which in turn leads to the requests method does not attach the auth keyword to the end of the url so we cannot use this functionality.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I see. If that's all the server supports, then we're stuck with this approach.

astroquery/esa/xmm_newton/core.py Outdated Show resolved Hide resolved
astroquery/esa/xmm_newton/core.py Show resolved Hide resolved
@keflavich
Copy link
Contributor

Are there any example proprietary data sets we can access - something that is permanently protected by username/password, but is meant to be public - so we can have a test of successfully logging in?

@javier-ballester
Copy link
Contributor Author

Are there any example proprietary data sets we can access - something that is permanently protected by username/password, but is meant to be public - so we can have a test of successfully logging in?

Currently we do not have any example data sets that we can use to test a successful login however I have asked the team if this is something we can implement. I will let you know when they get back to me.

Until then are there any other aspects of the code that need changing to make it pull ready?

@keflavich
Copy link
Contributor

@bsipocz if you have any input here, please chime in. The problem is that the XMM Newton server requires plain-text user/pw sending and doesn't accept http auth, so we can't use existing astroquery or astropy infrastructure. I think we just accept this as is, unless you can propose a better idea?

@javier-ballester
Copy link
Contributor Author

I have added more unit tests to resolve the patch coverage error however now I am failing "CI / astropy dev with all dependencies with coverage (pull_request)" test and I am not sure why.

@javier-ballester
Copy link
Contributor Author

Dear @bsipocz and @keflavich,

The XMM team are wondering what else is needed for this pull request to accepted?

Many thanks,

Javier Ballester

@keflavich
Copy link
Contributor

The failing test is because of the recent pyvo change; I'm restarting it because of their release of 1.2.1 yesterday

@javier-ballester
Copy link
Contributor Author

The failing test is because of the recent pyvo change; I'm restarting it because of their release of 1.2.1 yesterday

Okay, so are there any other changes from our side required?

@javier-ballester
Copy link
Contributor Author

Are there any example proprietary data sets we can access - something that is permanently protected by username/password, but is meant to be public - so we can have a test of successfully logging in?

Dear Astroquery team,

After talking to various members of the XMM archive we have come up with a way to create a dummy account that can access dummy proprietary data. This will be created to access the development server instead of the operational one to avoid any problems / misunderstandings. However this will take some time to develop. My question is, should we put this pull request on hold until this functionality is implemented and we have added the test, or accept this pull request now and include this test in the next pull request?

Many thanks,

@keflavich
Copy link
Contributor

@javier-ballester that's a great solution. We can merge this now and add the additional tests when the dummy account becomes available - please let us know!

I'm approving this, but I will wait for @bsipocz to merge.

CHANGES.rst Show resolved Hide resolved
@keflavich
Copy link
Contributor

Thanks @javier-ballester. I'll go ahead and merge this now. Please do open a new issue about the dummy testing scheme & efforts to avoid logging private information.

@keflavich keflavich merged commit 5644067 into astropy:main Jan 18, 2022
@bsipocz bsipocz added this to the v0.4.6 milestone Jan 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants