Check at nc_open if file appears to be in NCZarr/Zarr format. #2658

Merged: 8 commits merged into Unidata:main on Apr 13, 2023

Conversation

DennisHeimbigner
Collaborator

@DennisHeimbigner DennisHeimbigner commented Mar 13, 2023

Charlie Zender notes that nc_open() does not immediately detect when the given path refers to a file that is not in Zarr format. Instead, it fails later, when it tries to read the (meta-)data.

The reason is that the Zarr format is highly decentralized: there is no easily testable magic number or superblock to look for. In effect, the only way to tell whether a directory is Zarr is to read it successfully.

It is possible to heuristically detect that a path refers to an NCZarr/Zarr file by doing a breadth-first search of the file tree starting at the given path. If the search encounters a file whose name starts with ".z", the path is assumed to be a legitimate NCZarr/Zarr file. Of course, this test could be costly; one hopes that in practice it is not.
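
For illustration only, here is a minimal Python sketch of the heuristic described above. It is not the code in this PR (the actual change is in the C library); the function name `looks_like_zarr` and the `max_dirs` cap are hypothetical, the latter added only to show one way of bounding the cost of the search on a local filesystem.

```python
import os
from collections import deque


def looks_like_zarr(root, max_dirs=10_000):
    """Return True if a breadth-first search under `root` finds a file
    whose name starts with ".z" (e.g. .zgroup, .zarray, .zattrs).

    `max_dirs` is a hypothetical safety cap on the number of directories
    visited; it is an assumption of this sketch, not part of the PR.
    """
    if not os.path.isdir(root):
        return False

    queue = deque([root])  # FIFO queue gives breadth-first order
    visited = 0
    while queue and visited < max_dirs:
        current = queue.popleft()
        visited += 1
        try:
            with os.scandir(current) as it:
                entries = list(it)
        except OSError:
            continue  # unreadable directory: skip it

        # Check the files at this level before descending any further.
        for entry in entries:
            if entry.is_file(follow_symlinks=False) and entry.name.startswith(".z"):
                return True

        # Queue subdirectories for the next level of the search.
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                queue.append(entry.path)

    return False
```

Because the search is breadth-first, a marker file near the top of the tree (the common case for a Zarr store) is found after visiting only a few directories; the worst case is a large tree that contains no ".z" file at all.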

In addition to this fix, a corresponding test case was added.

Other Changes

  1. re: PR "Add Cygwin CI and stop installing unwanted plugins" (#2529): an error under Cygwin associated with that PR is fixed in this PR. The fix was to convert all noinst_ references to check_.
  2. Fix run_jsonconvention.sh to be resilient against irrelevant changes to _NCProperties.

re: Issue Unidata#2656

WardF
WardF previously approved these changes Apr 12, 2023
@WardF WardF added this to the 4.9.3 milestone Apr 12, 2023
@WardF WardF self-assigned this Apr 12, 2023
@WardF WardF merged commit b30b4e8 into Unidata:main Apr 13, 2023
@DennisHeimbigner DennisHeimbigner deleted the znotnc.dmh branch April 17, 2023 20:42
Development

Successfully merging this pull request may close these issues.

nc_open(<file schema>) returns NC_NOERR on non-NCZarr stores
2 participants