Skip to content
This repository has been archived by the owner on Jun 21, 2022. It is now read-only.

Avoid misleading error message by return 0 for numbaskets if interpretation is None #420

Merged
merged 4 commits into from
Dec 18, 2019

Conversation

tamasgal
Copy link
Contributor

With reference to #419 (comment) this is a tiny patch which avoids a misleading error message when trying to call e.g. .uncompressedbytes() on a branch with no data (i.e. interpretation is None).

Not sure if you are happy with this, to me it seemed that returning 0 in the numbaskets @property should be enough to also avoid further confusions, if I understood correctly.

Anyways, at least we now have a todo entry, as you wanted @jpivarski 😉

@jpivarski
Copy link
Member

Wow, thank you! This is not the solution I was thinking of (I was thinking of just preventing the error message), but this might be a better choice.

I'd like to merge this on Monday to give myself some time to think of possible consequences of claiming that a branch has zero baskets. (Some branches have interpretation == None because we don't know what its interpretation should be, not because it has no data.)

@tamasgal
Copy link
Contributor Author

I see! You’re the architect ; )

@jpivarski
Copy link
Member

No, we can't do this. If there's a new type we're unfamiliar with, the interpretation will be None and we'll want to investigate that new type by reading the array with interpretation=uproot.debug. However, that wouldn't work with this PR because the interpretation attached to the branch would still be None, the array method would think there's no data, and we'd get no data back. Accepting this PR would make it difficult to debug new types.

This would also happen with another potential solution (something I was thinking about after seeing yours) in which we set the numentries to zero, rather than the numbaskets.

The only way to do this correctly is to put the check immediately before the error message:

https://github.com/scikit-hep/uproot/blob/6e89ce6556fe204ee4cc9e0c0c06f6d78f97503b/uproot/tree.py#L1745-L1746

changed to

else if self.interpretation is not None:

Copy link
Member

@jpivarski jpivarski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requested changes in a comment.

@tamasgal
Copy link
Contributor Author

Yes, actually that was my first approach, however I encountered more errors since self._recoveredbaskets (line 993) is then None and len() fails.

~/Dev/uproot/uproot/tree.py in numbaskets(self)
    990         if self._recoveredbaskets is None:
    991             self._tryrecover()
--> 992         return self._numgoodbaskets + len(self._recoveredbaskets)
    993
    994     def _cachekey(self, interpretation, entrystart, entrystop):

TypeError: object of type 'NoneType' has no len()

I just pushed your change request.

I could not manage to create a small enough file yet to reproduce, but I'll try. I however assume that we already have some samples in the repo, I'll check them first.

@jpivarski
Copy link
Member

Oh, sorry—in that case where self.interpretation is None, _recoveredbaskets should return [] (empty list).

@tamasgal
Copy link
Contributor Author

Makes sense 😉

@tamasgal
Copy link
Contributor Author

tamasgal commented Dec 16, 2019

Alright:

In [1]: import uproot

In [2]: f = uproot.open("KM3NeT_00000044_00006772_n10.root")

In [3]: f[b'KM3NET_TIMESLICE_L0;2'][b'km3net_timeslice_L0'].uncompressedbytes()
Out[3]: 0

I however would like to cover this with a test...

@jpivarski
Copy link
Member

I can wait. There are a few sample files in tests/samples/*.root that require recovery, and many of them are small. They're mentioned in tests/test_issues.py with issue numbers that correspond to GitHub issues. Searching GitHub for "recover" might turn up some useful examples.

They only didn't trigger an error before because we never tried to access one of the structure branches, one of the branches that should have interpretation == None.

@jpivarski
Copy link
Member

I've been deploying new releases of awkward, uproot-methods, and uproot, and in some cases, finding a backlog of bug-fixes that have been waiting a while for a new release (uproot-methods had about 5 of them). In uproot, Oksana's addition of ZSTD as a decompression algorithm is a big enough change to increase the middle digit of the version number. I just deployed new versions of awkward and uproot-methods, and I'd like to include your fix of this error message in the uproot deployment.

If it's likely to be ready today, I'll wait—otherwise, I'll release this now and do a bug-fix release later when you're done. Do you know if I should wait or just deploy now? Thanks!

@tamasgal
Copy link
Contributor Author

Hey Jim! Sorry I had a busy day, I can follow up tomorrow, but feel free to merge if you are happy with the change and I add the tests the next day. I don't want to block the release pipeline...

@tamasgal
Copy link
Contributor Author

OK, test is there. I have however two failing tests, which also fail on upstream/master. I'll follow up with an issue.

@jpivarski jpivarski merged commit 5149b83 into scikit-hep:master Dec 18, 2019
@jpivarski
Copy link
Member

I'm not sure what's happening in your local tests, but it looks fine and passes tests.

Event.root is a large file that we test remotely with HTTP.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants