-
Notifications
You must be signed in to change notification settings - Fork 86
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rejection of invalid layouts. #1333
Comments
We discussed the idea of a per-layout flag that determines whether a layout is in its simplest form. This would add a reasonable amount of code to check something that internally should never happen. A less robust approach would be to tackle #516 at the same time and create a mid-level API that resembles def wrap(layout, behavior):
assert_simplified(layout)
return ak._v2._util.wrap(layout, behavior)
... The issue with this approach is that it only checks for validity at The more I think about it, the less confident I am that we want to go to all this trouble. We don't enforce I appreciate that I've gone round in a few circles here, thanks for coming for the ride! |
At least in the two Content subclasses I checked yesterday, the All in all, there are three ways for a layout to be invalid:
All of this discussion is about (3), which could be caught in O(1) time with O(n) flags where n is the number of nodes in a tree (not very large). However, just as a mid-level user (someone going into the Type (1) errors are always checked by Content subclass constructors, so it's never possible to make one without knowing about the error. This checking is O(1) in time and memory use. So it comes down to how much we care about this: we'd be able to check for type (3) errors without requiring user opt-in at the cost of adding O(n) flags (not a big cost). But the layouts could still have type (2) errors. The current policy is that if an operation relies on some type (2) error not being wrong, it has to check for those errors just before using them: for example, checking This PR is about a change in policy to detect type (3) errors more proactively: to raise an error when constructing the |
(:facepalm:)∞ You're completely right. I don't know what I was looking at yesterday. Yes, this issue is about if/when to check (3). The easiest solution is to check this when constructing Array vs doing this in the layout constructor. Whilst users could then have erroneous layouts, they will nearly always end up converting them to Array at some point and then discover the error. So, this is a wishlist item for when we get around to splitting the structure checking into its own routine. |
I think @jpivarski work means that we can close this! We now have an explicit separation between |
In #1308 there was a conversation about the policy for accepting redundant (unsimplified) layouts. There have been at least two cases (#713, #1308 discussion) where non-simplified layouts produce strange results when passed to high level functions, and until now I've raised these as bugs.
After the discussion with @jpivarski in #1308 I've realised that this is a policy issue, as Awkward functions themselves do/should not produce such layouts. Accepting redundant layouts is transparent in some functions (like
ak.unflatten
andak.packed
), but not those where the high-level function needs to match a particular layout structure. Such functions are made far simpler by the requirement of simplified layouts.The take-away is:
To support external users of the mid layer, I think it would be helpful to catch cases where this invariant fails; bugs in the Awkward code-base will more easily found than those in external users' code (because we will have a greater number of users overall).
The text was updated successfully, but these errors were encountered: