Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fsspec: loosen the prefix check to cover both None and False #6246

Merged
merged 2 commits into from
Jul 1, 2021

Conversation

isidentical
Copy link
Contributor

Related #6244

@isidentical
Copy link
Contributor Author

@mnrozhkov @jnissin would you mind installing from this branch and testing whether it works now or not?

pip install "dvc[gs] @ git+https://github.com/isidentical/dvc@fsspec-prefix-based"

@jnissin
Copy link

jnissin commented Jun 29, 2021

I can test, however my local DVC setup is installed using brew. So I am wondering whether there is a way to test the patch through brew? I guess I can also just create a virtualenv and try the pip version 🤔

@jnissin
Copy link

jnissin commented Jun 29, 2021

@isidentical Not working for me:

/opt/anaconda3/envs/dvc-test/bin/dvc push -v  
2021-06-29 15:19:23,882 DEBUG: Check for update is enabled.
2021-06-29 15:19:24,428 DEBUG: Preparing to upload data to 'gs://****'
2021-06-29 15:19:24,428 DEBUG: Preparing to collect status from gs://****
2021-06-29 15:19:24,428 DEBUG: Collecting information from local cache...
2021-06-29 15:19:24,433 DEBUG: Collecting information from remote cache...                                                                                                                                                    
2021-06-29 15:19:24,433 DEBUG: Querying 15 hashes via object_exists
2021-06-29 15:19:41,455 DEBUG: Querying 1 hashes via object_exists                                                                                                                                                            
2021-06-29 15:19:41,612 DEBUG: Matched '2' indexed hashes                                                                                                                                                                     
2021-06-29 15:19:41,711 DEBUG: Estimated remote size: 4096 files                                                                                                                                                              
2021-06-29 15:19:41,712 DEBUG: Querying '7' hashes via traverse                                                                                                                                                               
2021-06-29 15:19:43,605 ERROR: unexpected error - list indices must be integers or slices, not str                                                                                                                            
------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/anaconda3/envs/dvc-test/lib/python3.6/site-packages/dvc/main.py", line 55, in main
    ret = cmd.do_run()
  File "/opt/anaconda3/envs/dvc-test/lib/python3.6/site-packages/dvc/command/base.py", line 50, in do_run
    return self.run()
  File "/opt/anaconda3/envs/dvc-test/lib/python3.6/site-packages/dvc/command/data_sync.py", line 67, in run
    glob=self.args.glob,
  File "/opt/anaconda3/envs/dvc-test/lib/python3.6/site-packages/dvc/repo/__init__.py", line 51, in wrapper
    return f(repo, *args, **kwargs)
  File "/opt/anaconda3/envs/dvc-test/lib/python3.6/site-packages/dvc/repo/push.py", line 44, in push
    pushed += self.cloud.push(objs, jobs, remote=remote)
  File "/opt/anaconda3/envs/dvc-test/lib/python3.6/site-packages/dvc/data_cloud.py", line 83, in push
    show_checksums=show_checksums,
  File "/opt/anaconda3/envs/dvc-test/lib/python3.6/site-packages/dvc/remote/base.py", line 57, in wrapper
    return f(obj, *args, **kwargs)
  File "/opt/anaconda3/envs/dvc-test/lib/python3.6/site-packages/dvc/remote/base.py", line 499, in push
    download=False,
  File "/opt/anaconda3/envs/dvc-test/lib/python3.6/site-packages/dvc/remote/base.py", line 356, in _process
    download=download,
  File "/opt/anaconda3/envs/dvc-test/lib/python3.6/site-packages/dvc/remote/base.py", line 196, in _status
    md5s, jobs=jobs, name=str(self.odb.path_info)
  File "/opt/anaconda3/envs/dvc-test/lib/python3.6/site-packages/dvc/remote/base.py", line 145, in hashes_exist
    return indexed_hashes + self.odb.hashes_exist(list(hashes), **kwargs)
  File "/opt/anaconda3/envs/dvc-test/lib/python3.6/site-packages/dvc/objects/db/base.py", line 468, in hashes_exist
    self.list_hashes_traverse(remote_size, remote_hashes, jobs, name)
  File "/opt/anaconda3/envs/dvc-test/lib/python3.6/site-packages/dvc/objects/db/base.py", line 312, in list_hashes_traverse
    yield from itertools.chain.from_iterable(in_remote)
  File "/opt/anaconda3/envs/dvc-test/lib/python3.6/concurrent/futures/_base.py", line 586, in result_iterator
    yield fs.pop().result()
  File "/opt/anaconda3/envs/dvc-test/lib/python3.6/concurrent/futures/_base.py", line 425, in result
    return self.__get_result()
  File "/opt/anaconda3/envs/dvc-test/lib/python3.6/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "/opt/anaconda3/envs/dvc-test/lib/python3.6/concurrent/futures/thread.py", line 56, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/opt/anaconda3/envs/dvc-test/lib/python3.6/site-packages/dvc/objects/db/base.py", line 304, in list_with_update
    prefix=prefix, progress_callback=pbar.update
  File "/opt/anaconda3/envs/dvc-test/lib/python3.6/site-packages/dvc/objects/db/base.py", line 193, in list_hashes
    for path in self._list_paths(prefix, progress_callback):
  File "/opt/anaconda3/envs/dvc-test/lib/python3.6/site-packages/dvc/objects/db/base.py", line 173, in _list_paths
    for file_info in self.fs.walk_files(path_info, prefix=prefix):
  File "/opt/anaconda3/envs/dvc-test/lib/python3.6/site-packages/dvc/fs/fsspec_wrapper.py", line 109, in walk_files
    for file in self.find(path_info, **kwargs):
  File "/opt/anaconda3/envs/dvc-test/lib/python3.6/site-packages/dvc/fs/fsspec_wrapper.py", line 179, in find
    path, detail=detail, prefix=path_info.parts[-1]
  File "/opt/anaconda3/envs/dvc-test/lib/python3.6/site-packages/fsspec/asyn.py", line 87, in wrapper
    return sync(self.loop, func, *args, **kwargs)
  File "/opt/anaconda3/envs/dvc-test/lib/python3.6/site-packages/fsspec/asyn.py", line 68, in sync
    raise result[0]
  File "/opt/anaconda3/envs/dvc-test/lib/python3.6/site-packages/fsspec/asyn.py", line 24, in _runner
    result[0] = await coro
  File "/opt/anaconda3/envs/dvc-test/lib/python3.6/site-packages/gcsfs/core.py", line 953, in _find
    par = o["name"]
TypeError: list indices must be integers or slices, not str
------------------------------------------------------------
2021-06-29 15:19:43,675 DEBUG: Version info for developers:
DVC version: 2.4.2+f1bf10 
---------------------------------
Platform: Python 3.6.13 on Darwin-19.6.0-x86_64-i386-64bit
Supports: gs, http, https
Cache types: reflink, hardlink, symlink
Cache directory: apfs on /dev/disk1s1
Caches: local
Remotes: gs
Workspace directory: apfs on /dev/disk1s1
Repo: dvc, git

Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
2021-06-29 15:19:43,677 DEBUG: Analytics is enabled.
2021-06-29 15:19:43,761 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', '/var/folders/4p/0n_mm8js23x1vhydv_3yq8b40000gq/T/tmpk6czf8o1']'
2021-06-29 15:19:43,763 DEBUG: Spawned '['daemon', '-q', 'analytics', '/var/folders/4p/0n_mm8js23x1vhydv_3yq8b40000gq/T/tmpk6czf8o1']'

@isidentical isidentical force-pushed the fsspec-prefix-based branch from 9c951df to 04a7139 Compare June 30, 2021 12:54


@pytest.mark.parametrize("remote", full_clouds, indirect=True)
def test_pull_no_00_prefix(tmp_dir, dvc, remote, monkeypatch):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What this does is, it is triggering a prefix based search while that prefix doesn't exist (unlike test_pull_00_prefix, where it is triggering a prefix based search while the prefix exists).

@jnissin
Copy link

jnissin commented Jul 1, 2021

I can confirm that now pushing to GCS is working for me 👍 .

@efiop
Copy link
Contributor

efiop commented Jul 1, 2021

@isidentical Is this ready for review?

@isidentical isidentical marked this pull request as ready for review July 1, 2021 11:58
@isidentical isidentical requested a review from a team as a code owner July 1, 2021 11:58
@isidentical isidentical requested a review from pmrowla July 1, 2021 11:58
@isidentical
Copy link
Contributor Author

@isidentical Is this ready for review?

yes, it is.

@efiop efiop merged commit e3ff6d5 into iterative:master Jul 1, 2021
@efiop
Copy link
Contributor

efiop commented Jul 1, 2021

@isidentical Great catch! 💪

@efiop efiop added the bugfix fixes bug label Jul 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugfix fixes bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants