ALMA data download broken #2489

Closed
keflavich opened this issue Aug 11, 2022 · 6 comments · Fixed by #2490 or #2493

Comments

@keflavich
Contributor

keflavich commented Aug 11, 2022

The ALMA data downloader is broken because of an upstream change:

from astroquery.alma import Alma
alma = Alma()
uid = 'uid://A001/X12a3/Xe9'
data_info = alma.get_data_info(uid, expand_tarfiles=True)
alma.retrieve_data_from_uid([uid])

This fails on the second entry in the list because there is now a blank URL in the access_url column:

>>> data_info
<Table length=9>
         ID                                                   access_url                                                                  service_def                         error_message   semantics                           description                           content_type   content_length readable
                                                                                                                                                                                                                                                                                            byte
       object                                                   object                                                                       object                               object        object                               object                                object          int64        bool
-------------------- -------------------------------------------------------------------------------------------- ----------------------------------------------------------- ------------- -------------- --------------------------------------------------------- ----------------- -------------- --------
uid://A001/X12a3/Xe9               https://almascience.nrao.edu/dataPortal/member.uid___A001_X12a3_Xe9.README.txt                                                                           #documentation Download documentation for ./uid://A001/X12a3/Xe9|README.        text/plain           3523     True
uid://A001/X12a3/Xe9   https://almascience.nrao.edu/dataPortal/2017.1.01185.S_uid___A001_X12a3_Xe9_001_of_001.tar                                                                                    #this           Download dataset of type: null, and class: N/A. application/x-tar      556278784     True
uid://A001/X12a3/Xe9                                                                                              DataLink.2017.1.01185.S_uid___A001_X12a3_Xe9_001_of_001.tar                        #this                                                                    text/xml             --       --
uid://A001/X12a3/Xe9    https://almascience.nrao.edu/dataPortal/2017.1.01185.S_uid___A001_X12a3_Xe9_auxiliary.tar                                                                               #auxiliary           Download dataset of type: null, and class: N/A. application/x-tar      135479296     True
uid://A001/X12a3/Xe9                                                                                               DataLink.2017.1.01185.S_uid___A001_X12a3_Xe9_auxiliary.tar                   #auxiliary                                                                    text/xml             --       --
uid://A001/X12a3/Xe9 https://almascience.nrao.edu/dataPortal/2017.1.01185.S_uid___A002_Xd28a9e_X71b8.asdm.sdm.tar                                                                              #progenitor               Download dataset of type: , and class: N/A. application/x-tar     2343569408     True
uid://A001/X12a3/Xe9 https://almascience.nrao.edu/dataPortal/2017.1.01185.S_uid___A002_Xd28a9e_X7b4d.asdm.sdm.tar                                                                              #progenitor               Download dataset of type: , and class: N/A. application/x-tar     2343861248     True
uid://A001/X12a3/Xe9 https://almascience.nrao.edu/dataPortal/2017.1.01185.S_uid___A002_Xd29c1f_X1f74.asdm.sdm.tar                                                                              #progenitor               Download dataset of type: , and class: N/A. application/x-tar     2616798208     True
uid://A001/X12a3/Xe9  https://almascience.nrao.edu/dataPortal/2017.1.01185.S_uid___A002_Xd29c1f_X5cf.asdm.sdm.tar                                                                              #progenitor               Download dataset of type: , and class: N/A. application/x-tar     2680167424     True

The error is:

Traceback (most recent call last):
  File "<ipython-input-13-3e41ffe75afd>", line 1, in <module>
    alma.retrieve_data_from_uid([uid])
  File "/home/adam/repos/astroquery/astroquery/alma/core.py", line 807, in retrieve_data_from_uid
    downloaded_files = self.download_files(file_urls)
  File "/home/adam/repos/astroquery/astroquery/alma/core.py", line 681, in download_files
    check_filename = self._request('HEAD', file_link, auth=auth)
  File "/home/adam/repos/astroquery/astroquery/query.py", line 315, in _request
    response = query.request(self._session,
  File "/home/adam/repos/astroquery/astroquery/query.py", line 69, in request
    return session.request(self.method, self.url, params=self.params,
  File "/home/adam/anaconda3/lib/python3.8/site-packages/requests/sessions.py", line 528, in request
    prep = self.prepare_request(req)
  File "/home/adam/anaconda3/lib/python3.8/site-packages/requests/sessions.py", line 456, in prepare_request
    p.prepare(
  File "/home/adam/anaconda3/lib/python3.8/site-packages/requests/models.py", line 316, in prepare
    self.prepare_url(url, params)
  File "/home/adam/anaconda3/lib/python3.8/site-packages/requests/models.py", line 390, in prepare_url
    raise MissingSchema(error)
MissingSchema: Invalid URL '': No schema supplied. Perhaps you meant http://?

@andamian I think this is a simple fix in astroquery, and I'll propose it, but I hope you can shed some light on the upstream change.
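
For illustration, a minimal sketch of the kind of guard that would avoid the MissingSchema error, assuming the fix is simply to skip rows whose access_url is blank before any request is made. This is not the code from #2490; the helper name `usable_urls` and the plain-dict rows are hypothetical stand-ins for the table returned by get_data_info.

```python
def usable_urls(rows):
    """Return the access_url values that are non-empty strings.

    `rows` is any iterable of mappings with an 'access_url' key; this is a
    hypothetical stand-in for the rows of the data_info table above.
    """
    urls = []
    for row in rows:
        url = row.get("access_url")
        # Skip missing, None, or whitespace-only entries.
        if isinstance(url, str) and url.strip():
            urls.append(url.strip())
    return urls


# Rows modelled on the first three entries of the table above.
rows = [
    {"access_url": "https://almascience.nrao.edu/dataPortal/member.uid___A001_X12a3_Xe9.README.txt"},
    {"access_url": "https://almascience.nrao.edu/dataPortal/2017.1.01185.S_uid___A001_X12a3_Xe9_001_of_001.tar"},
    {"access_url": ""},  # the DataLink service row carries no direct URL
]

print(usable_urls(rows))
```

Filtering up front means the blank DataLink rows never reach the HEAD request that raises in the traceback above.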

@aida-ahmadi

Hi @keflavich et al., one more thing to add in case it's not on your radar: I think expand_tarfiles=True is also not working properly. This particular uid, for example, has many FITS files associated with it on the archive that are not listed in the table returned by astroquery.

@bsipocz
Member

bsipocz commented Aug 18, 2022

Hopefully the PR in #2493 addresses that issue, too. Could you, @keflavich and @at88mph, make sure that this usage example is also covered in the tests? (IMO the PR is almost ready, pending fixes for the VO table related remote test failures.)

@at88mph
Contributor

at88mph commented Aug 18, 2022

It is indeed addressed in #2493. See the test updates at https://github.com/astropy/astroquery/pull/2493/files#diff-c9dadaf5f972d477718623f20c7c01c45185315feacd2897823407293e50cdbeR474 and https://github.com/astropy/astroquery/pull/2493/files#diff-c9dadaf5f972d477718623f20c7c01c45185315feacd2897823407293e50cdbeR512 where the expected number of links is now higher, as it should be with expand_tarfiles=True.

I could update the test_get_data_info_expand_tarfiles function (https://github.com/astropy/astroquery/pull/2493/files#diff-c9dadaf5f972d477718623f20c7c01c45185315feacd2897823407293e50cdbeR512) to include an expand_tarfiles=False and show a lower count. Would that illustrate that feature more?
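
Such a comparison could look roughly like the sketch below. It only assumes an object exposing get_data_info(uid, expand_tarfiles=...) as called at the top of this issue. The helper name `link_counts` and the `FakeAlma` stand-in (there so the snippet runs offline, with made-up row counts) are hypothetical and not part of astroquery or #2493.

```python
def link_counts(client, uid):
    """Return (collapsed, expanded) row counts from get_data_info."""
    collapsed = len(client.get_data_info(uid, expand_tarfiles=False))
    expanded = len(client.get_data_info(uid, expand_tarfiles=True))
    return collapsed, expanded


class FakeAlma:
    """Hypothetical offline stand-in; the row counts are invented."""

    def get_data_info(self, uid, expand_tarfiles=False):
        rows = [f"{uid}/tar{i}" for i in range(3)]
        if expand_tarfiles:
            # Pretend each tarball expands into two member files.
            rows += [f"{r}/file{j}" for r in rows for j in range(2)]
        return rows


collapsed, expanded = link_counts(FakeAlma(), "uid://A001/X12a3/Xe9")
# Expanding tarfiles should list more links than leaving them collapsed.
assert expanded > collapsed
```

With a real Alma() instance in place of FakeAlma the same helper would query the archive, so it would need network access.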

@keflavich
Contributor Author

This is not solved by #2493. I get the same error. I cleared the cache, so it's not just a caching issue.

@keflavich keflavich reopened this Aug 22, 2022
@bsipocz
Member

bsipocz commented Aug 22, 2022

ahh, indeed, I see the issue, too.

@bsipocz bsipocz added the bug label Aug 22, 2022
@keflavich
Contributor Author

#2490 solves this, and I've cleaned it up a bit.
