[BUG]: DFP fetch_example_data.py broken on s3fs=2023.6.0 #1047

Closed
2 tasks done
mdemoret-nv opened this issue Jul 12, 2023 · 1 comment · Fixed by #1053
Assignees
Labels
bug Something isn't working

Comments

@mdemoret-nv
Contributor

Version

23.07

Which installation method(s) does this occur on?

Docker, Conda, Source

Describe the bug.

The upgrade to s3fs=2023.6.0 in PR #1023 solved one problem but caused another: `examples/digital_fingerprinting/fetch_example_data.py` now fails with `PermissionError: Access Denied`.

We need to determine how best to download the files with the most recent version of s3fs.

Minimum reproducible example

Ensure s3fs=2023.6.0 is installed, then run:

`python examples/digital_fingerprinting/fetch_example_data.py all`

Relevant log output

python examples/digital_fingerprinting/fetch_example_data.py all
Downloading DUO_2022-08-01T00_05_06.806Z.json
Traceback (most recent call last):
  File "<CONDA_ENV>/lib/python3.10/site-packages/s3fs/core.py", line 693, in _lsdir
    async for c in self._iterdir(
  File "<CONDA_ENV>/lib/python3.10/site-packages/s3fs/core.py", line 743, in _iterdir
    async for i in it:
  File "<CONDA_ENV>/lib/python3.10/site-packages/aiobotocore/paginate.py", line 30, in __anext__
    response = await self._make_request(current_kwargs)
  File "<CONDA_ENV>/lib/python3.10/site-packages/aiobotocore/client.py", line 371, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (AccessDenied) when calling the ListObjectsV2 operation: Access Denied

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<MORPHEUS_ROOT>/examples/digital_fingerprinting/fetch_example_data.py", line 588, in <module>
    main()
  File "<MORPHEUS_ROOT>/examples/digital_fingerprinting/fetch_example_data.py", line 584, in main
    fetch_dataset(ds)
  File "<MORPHEUS_ROOT>/examples/digital_fingerprinting/fetch_example_data.py", line 554, in fetch_dataset
    fs.get(S3_BASE_PATH + f, train_dir + f)
  File "<CONDA_ENV>/lib/python3.10/site-packages/fsspec/asyn.py", line 121, in wrapper
    return sync(self.loop, func, *args, **kwargs)
  File "<CONDA_ENV>/lib/python3.10/site-packages/fsspec/asyn.py", line 106, in sync
    raise return_result
  File "<CONDA_ENV>/lib/python3.10/site-packages/fsspec/asyn.py", line 61, in _runner
    result[0] = await coro
  File "<CONDA_ENV>/lib/python3.10/site-packages/fsspec/asyn.py", line 579, in _get
    rpaths = [
  File "<CONDA_ENV>/lib/python3.10/site-packages/fsspec/asyn.py", line 580, in <listcomp>
    p for p in rpaths if not (trailing_sep(p) or await self._isdir(p))
  File "<CONDA_ENV>/lib/python3.10/site-packages/s3fs/core.py", line 1380, in _isdir
    return bool(await self._lsdir(path))
  File "<CONDA_ENV>/lib/python3.10/site-packages/s3fs/core.py", line 706, in _lsdir
    raise translate_boto_error(e)
PermissionError: Access Denied

Full env printout

As a workaround, running `mamba install s3fs=2022.8.2` restores a working environment. Its output shows the package changes from the broken env (`-`) to the working env (`+`):

  - aiobotocore     2.5.0  pyhd8ed1ab_0  conda-forge                  
  + aiobotocore     2.4.0  pyhd8ed1ab_0  conda-forge/noarch     Cached
  - boto3         1.26.76  pyhd8ed1ab_0  conda-forge                  
  + boto3         1.24.59  pyhd8ed1ab_0  conda-forge/noarch     Cached
  - botocore      1.29.76  pyhd8ed1ab_0  conda-forge                  
  + botocore      1.27.59  pyhd8ed1ab_0  conda-forge/noarch     Cached
  - fsspec       2023.6.0  pyh1a96a4e_0  conda-forge                  
  + fsspec       2022.8.2  pyhd8ed1ab_0  conda-forge/noarch     Cached
  - s3fs         2023.6.0  pyhd8ed1ab_0  conda-forge                  
  + s3fs         2022.8.2  pyhd8ed1ab_0  conda-forge/noarch     Cached

Other/Misc.

No response

Code of Conduct

  • I agree to follow Morpheus' Code of Conduct
  • I have searched the open bugs and have found no duplicates for this bug report
@mdemoret-nv mdemoret-nv added the bug Something isn't working label Jul 12, 2023
@mdemoret-nv
Contributor Author

In the near term, we might need to note this in the documentation. In the long term, we should switch to using the new RAPIDS CDN to download these files instead of boto3.

@dagardner-nv dagardner-nv self-assigned this Jul 12, 2023
rapids-bot bot pushed a commit that referenced this issue Jul 13, 2023
* Fix bug by using `s3fs.S3FileSystem.get_file` instead of `s3fs.S3FileSystem.get`
* Apply pylint fixes

fixes #1047

Authors:
  - David Gardner (https://github.com/dagardner-nv)

Approvers:
  - Michael Demoret (https://github.com/mdemoret-nv)

URL: #1053
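
For readers landing here, a minimal sketch of the pattern the commit message describes. The traceback above shows `fs.get` calling `_isdir`, which issues a `ListObjectsV2` request that the bucket denies; `get_file` copies one named object and, as far as I can tell, does not need that listing. The function name `fetch_files` and its parameters are hypothetical and are not taken from `fetch_example_data.py`; the actual change is in #1053.

```python
import os


def fetch_files(fs, base_path, filenames, dest_dir):
    """Download each named object individually with fs.get_file.

    Hypothetical helper for illustration only. `fs` is expected to be an
    fsspec-style filesystem such as an s3fs.S3FileSystem instance.
    """
    os.makedirs(dest_dir, exist_ok=True)
    downloaded = []
    for name in filenames:
        local_path = os.path.join(dest_dir, name)
        print(f"Downloading {name}")
        # get_file targets a single object by key, so the directory check
        # (and the bucket listing behind it) performed by fs.get is skipped.
        fs.get_file(base_path + name, local_path)
        downloaded.append(local_path)
    return downloaded
```

With s3fs installed, `fs` would be an `s3fs.S3FileSystem` instance and `base_path` the S3 prefix the script already uses. Passing the filesystem in as an argument is only a convenience for this sketch, so it can be exercised without network access.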
Labels
bug Something isn't working
Projects
Status: Done