Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API design question: Should Node.AsFoo methods return error? #43

Closed
warpfork opened this issue Feb 14, 2020 · 15 comments
Closed

API design question: Should Node.AsFoo methods return error? #43

warpfork opened this issue Feb 14, 2020 · 15 comments

Comments

@warpfork
Copy link
Collaborator

warpfork commented Feb 14, 2020

The ipld.Node interface currently has a bunch of methods for unboxing its content to a primitive golang type:

type Node interface {
    // {... elided ...}

    IsNull() bool
    AsBool() (bool, error)
    AsInt() (int, error)
    AsFloat() (float64, error)
    AsString() (string, error)
    AsBytes() ([]byte, error)
    AsLink() (Link, error)

    // {... elided ...}
}

These methods return errors.

The question is: should they?

the utility of errors

Returning errors is generally considered better than panicing, as a rule of thumb and a default stance in the golang community at large.

Returning errors is also often faster, when you're doing something where speed matters. Returning an error is just moving some memory around, and later facing a conditional jump instruction: this is about as cheap as it gets (especially if you return an error value, rather than allocating a new value with any specific information). By contrast, panicing and recovering is a pretty complex thing.

Panicking also (note: in the current versions of the go compiler, and to the best of my knowledge) prevents functions from being inlined. This can be a pretty significant obstruction to optimization. And these unboxing functions we have here are certainly small, nearly-trivial functions that we'd like to have inlined!

... but a bunch of that is dubiously applicable

We're returning new error values that include helpful contextual information in almost all cases. Which means we're allocating in the error path. Which means it's already not fast: you'd much rather not hit the error, if you're concerned about the speed. In practice, for these methods, that means: if you care about speed at all, you're absolutely going to want to check the Kind() method and switch on that, rather than blindly calling these AsFoo() methods and checking errors.

Worrying about inlining is vacuous. We're talking about an interface. There's already no inlining.

So that shoots down most of the concrete reasons to prefer error returns, at least as far as I can think of. That leaves us with just the vague "rule of thumb" reason intact.

downsides of returning errors

Returning errors from these methods makes them significantly more irritating to use in many common situations.

You can't chain calls easily when a function has multiple returns.

You can't use the result as an argument to other functions, either. (Mostly. There's some special cases; see next heading.)

Even if you're not worried about ergonomics of function composition, even checking these error values often just plain feels silly in practice in a lot of the code I've written. Very frequently, I statically know the error is impossible, either because I know for sure what shape of data I'm handling, or because I've done a n.Kind() switch just a moment ago anyway. So, if I'm writing answer, _ := node.AsFoo() in the majority of cases... is having this error return really worth it?

other mitigation techniques

'must'

The must package seems to do well here.

E.g.:

package must

func String(x string, e error) string {
    // panic if e is not nil, return x otherwise; you get the idea
}

... and this can then be used as:

AnyOtherFunctionTakingAString(must.String(someNode.AsString()))`

The must pattern can get icky if you have to use it recursively (this is why it's Not Great when it comes to handling a series of "Lookup" operations), but since all the "AsFoo" methods are leafs of the call tree of handling some IPLD data, it tends to work out fine here.

We don't have must.String and must.Int methods at present, but it'd take all of about five minutes to whip them off.

'fluent'

The fluent package also already does away with error returns.

It's clearly okay for this package to do it, even if it's unclear for the core ipld.Node interface, because it's opt-in.

It's got downsides though (namely, it incurs an allocation for boxing into the fluent interface; see the Background section in issue #42 for some other discussion about that). A mitigation to an ergonomic issue that's got unavoidable performance costs is not a very good mitigation.

related APIs

Several other methods already don't return errors for wrong kind, such as Length(). But this is a little different from the AsFoo() methods, because Length can make a clear out-of-bound return (a negative number).

The MapIterator() and ListIterator() methods don't return error, but they did used to return a dummy iterator which would return an error on first use. I recently concluded this was also silly, and elected to change these methods to simply returning nil if they're used on the wrong kind. But this is a little different from the AsFoo methods, because they return interface types, so we can use nil as a clear out-of-bound sentinel value.

The LookupString() and LookupIndex() methods (and etc methods that traverse a piece of recursive data and return another Node) do return error. But this is a little different from the AsFoo methods, because there's more than one error they can return (wrong-kind vs key-not-found vs not-a-struct-field, etc), and those errors can't always be avoided by probing another method (by contrast, the AsFoo method errors can always be avoided by probing Kind first). These methods also have more (and different, Because Recursive) mitigations in the form of the fluent, traversal, proposed `cursor, and other packages.

what's the balance?

Are returning errors from these methods worth it?

Can we make sufficiently user-friendly higher level interfaces that we don't care if the core API is a little rough?

What's the balance?

Feedback welcome.

@warpfork
Copy link
Collaborator Author

warpfork commented Mar 5, 2020

(forgot to note this in the original post) It's worth also comparing our interfaces to stdlib's https://golang.org/pkg/reflect/#Value and methods on it like Value.Int(), all of which panic rather than returning errors. I suspect the same chain of reasoning applied there.

@warpfork
Copy link
Collaborator Author

warpfork commented Mar 5, 2020

I continue to slow-cook on this and continue to fail to come up with good reasons these errors need to be adding complexity to the return signatures of commonly used functions.

I also polled @hannahhoward who's written a lot of consuming code for thoughts on this, though (especially whether it might make any sense to bundle with the already-big set of changes in #49 ) , and feedback there was meh, "not super needing to lose those errors right now personally."

So, while this might still deserve review and doing later: still not slating it for today.

@rvagg
Copy link
Member

rvagg commented Mar 6, 2020

meh, "not super needing to lose those errors right now personally."

I'm getting used to the braindead error checking in Go code too, it just seems to be part of what you expect, with APIs that do non-trivial work not returning errors making me question whether there really are no error conditions in their work.

My preference is for idiomatic APIs as long as it doesn't come at severe performance cost (and even then, surely there's space for an opt-out like you're suggesting for fasttraversal). The error handling in Go is sufficiently annoying and pervasive that it's likely to be something that's dealt with in a future version of Go with sugar. Like ? in Rust, or what Promise (really async/await) did for JavaScript's callback error handling: if (err) return callback(err) is just as common in traditional Node.js code as if err != nil { return err } in Go code.

So, idiomatic and cross fingers that the Go gods follow sensible language design and iron out needless boilerplate at some future date.

@warpfork
Copy link
Collaborator Author

warpfork commented Apr 23, 2020

I'm strolling through some production + not-written-by-me code and gathering some observations that make me update my beliefs on this subject.

Observation: people can and readily will use the AsString, AsBool, etc methods to do destructuring of unknown freshly deserialized data. This seems to feel like a natural thing to do if you have expectations about the data and you just want to drill it apart accordingly.

This contrasts with a lot of code I've been writing lately, in which it's almost always felt natural to do a switch node.Kind() {/*...*/} and so on. Those switch statements have made the error returns from the As{Kind} methods feel foolish, because they're made statically (assuming the implementation of the Node interface is reasonably lawful) redundant. (This is already discussed a bit in the original post.) But... these switch statements over Kind only seem to be called for when writing some logic which actually handles many different cases. While I'm writing core code, it comes up a lot; and it will probably be similar for most user-authored "generic" functions, but outside of these situations, it's less common. Unless you're processing something that's the logical equivalent of a kinded union, most expectation-heavy destructuring code doesn't tend to do this.

This usage pattern would also be a lot different if the user had schemas + codegen, in which case all of the assertions of the expectations would be done in a consistent up-front way by the schema system, and you'd then get no-error accessor methods thanks to the codegen, nicely solving everything. But, when users don't have that (or choose not to use those tools), this destructuring assignment is the next most natural thing -- and as such, ought to work reasonably well.

Updated belief: no-error-return in the signature of the As{Kind} methods will ease users into writing worse code that does not correctly handle errors when they are writing code for the high level user story of expectation-heavy destructuring of serial data. Furthermore, this user story is a very common one, if not outright predominating, and so we should optimize for it. Therefore, we should keep the error returns, as they are, and look for other ways to offer features that make the error handling in aggregate usage more pleasant.

Observation: It's extremely common for users doing this kind of destructuring work to call LookupString("Something") and then AsBool (or such) right in a row.

As already discussed in the original post, the must solution, though it works, doesn't work particularly great for this, because you end up with two 'must' invocations in short order, and it doesn't "chain" very gracefully.

It gets more syntactically irritating as it goes deeper, but even at two nestings, it's already irritating... and it's rarely less than at least two.

Updated belief: something like the fluent package is still desirable.

It would just be really nice to come up with a way to do fluent-like chainability without significant performance (specifically, allocation) overheads. (Maybe more research will be the answer, and there's a way to lay this out in code better than any I've sussed yet.)

Even if we don't find an ideal-performance way to do it, though, we really ought to have a batteries-included good ergonomic option for fluent-style code available in the core library.

@mvdan
Copy link
Contributor

mvdan commented Aug 27, 2020

I personally agree with the earlier point that a Kind method along with methods that panic when used incorrectly is the right way.

It is true that they are perhaps easier to misuse due to the lack of an error return, but it's pretty common for APIs to have invariants which cause panics when broken.

I'm also not a fan of the must functions. If they are used heavily for the API, it means that the errors aren't actually useful, because we just know they'll never happen so we convert them to panics. I think those kinds of APIs only make sense when the input is entirely static; for example, regexp.MustCompile("foo") or the equivalent for text/template.Template.Parse. If those are in init funcs with constant strings as input, a panic is a good tradeoff - if it works in the tests, it will always work.

@warpfork
Copy link
Collaborator Author

Yeah, as I've left this on the slow-cooker in the back of my brain, I've come back around again to preferring fewer errors in return signatures in this scenario. (In other words, I agree with your comment @mvdan, and retract about half the "updated belief" in my own last comment.)

The "expectation-heavy destructuring code" user story is an interesting and important one that I'd like to provide good ergonomics for. But I guess more research into how to provide that best is still in order -- making the Node interface contort towards that goal doesn't seem to help, so we might as well discount it. (And while I'm still feeling optimistic that some sort of functional package is the general form of the answer, so far it's probably fair to say that neither the must nor fluent experiments have quite nailed it on the head yet either. Close, maybe. But definitely not nailing it.)

[if] must [...] are used too heavily [...] means that the errors aren't actually useful

Agree, and well said.

I still think I like the must package existing, because I do use it in tests in a way that feels reasonable. (And I think at least some of these might remain even after we change these methods. Though that might be subject to review again later.)

But it is also somewhat too tempting to use it in non-test circumstances at present, and that is indeed feeling like a strong hint that most of these methods should just as well panic themselves.


I think at this point I'm convinced that we should remove the error returns from these methods. It'll be a pretty big change, though, so we'll want to plan and sequence it carefully.

A couple of other things to think about in the planning:

  • The Lookup*(*) family still returns errors, so it has room to indicate "not found". This still breaks chaining (and in one of the circumstances where it's most desirable). Do we have ideas on how we could make this feel nicer?

  • It would be good to specify what kind of information we expect from wrong-kind errors. Currently, a lot of code I've written opts for very high amounts of information -- stating at least the actual vs appropriate kinds, but also including a type name if there's schemas in play. I figure this information is probably useful in error messages and in development & debugging -- but it has a downside: the amount of extra code needed to deal with getting those type names into the error text is pretty significant; this affects both codegen output sizes, and also sheer irritation if a developer wants to make a new Node implementation by hand (tl;dr there's no way to make a type that can be easily embedded that addresses all the boilerplate functions). We could... just decide not to do this. Maybe? (There are more remarks on this in the source of the mixins packages and in codegen (and even some failed experiments on trying to do this with stack inspection wizardry! eep), though it might require some digging/grepping to find them.)

@mvdan
Copy link
Contributor

mvdan commented Aug 27, 2020

Do we have ideas on how we could make this feel nicer?

If you want chaining and error reporting, the only nice way that I know of (besides panics directly) is to "chain" together a query, working out the result at the end. For example:

// Lookup gets chained, and Do stops at the first operation that errors, returning that error.
bar, err := node.Lookup("foo").Lookup("bar").Do()
  • the amount of extra code needed to deal with getting those type names into the error text is pretty significant

I'm not quite visualizing that problem. Perhaps you can show me some examples.

@rvagg
Copy link
Member

rvagg commented Aug 28, 2020

bar, err := node.Lookup("foo").Lookup("bar").Do()

Boiling down to a similar API that you and I have been bounding around in our discussions about how the ideas in ipld-prime might translate into JS @warpfork, but that's forced more by the async boundary (Promises) than error handling but ends in a similar place.. chaining a thing that isn't the final thing until you execute or "resolve" it.

I'd be fine with an API like this because we could end up building something that roughly mirrors it in JavaScript and that's a little bit neat. The question is - how comfortable is such an API for a typical Go user? Do you see these kinds of APIs in the wild today? It'd be nice to avoid novel API styles as much as possible.

@mvdan
Copy link
Contributor

mvdan commented Aug 28, 2020

I wouldn't say this is common - most people would go with the less abstract "each step returns an error" API, but that doesn't mean it's the best or most idiomatic :)

Here's a somewhat similar example from the standard library: https://golang.org/pkg/bufio/#Scanner

The more obvious way to design the API would be like:

scanner := bufio.NewScanner(os.Stdin)
for {
    line, err := scanner.Text()
    if err != nil { ... }
}

Instead, it uses Scan to tell you if it's done, and keeps the first error it encounters:

scanner := bufio.NewScanner(os.Stdin)
for scanner.Scan() {
    line := scanner.Text()
}
if err := scanner.Err(); err != nil { ... }

This is slightly different from our case here, but it's a similar idea. "Buffer" the errors until the user cares about them, also meaning that they can handle them in one place instead of many.

Instead of chaining with a final "do" method, we could also use a receiver, more similar to what bufio does above. This has a perhaps larger API overhead though, as we require some sort of stateful receiver to be created for every "query" (or whatever we could call it):

query := node.Query()
bar := query.Lookup("foo").Lookup("bar")
baz := bar.Lookup("baz")
if err := query.Err(); err != nil {
    // the first error encountered; if "bar" encounters an error, then "baz" will be a zero value or empty
}

I think the chaining makes sense if you generally want to catch an error for every chained lookup, and the "query receiver" makes sense if it's normal to do many separate batched lookups and checking for any error only at the very end. I imagine it's the first of those two.

@warpfork
Copy link
Collaborator Author

warpfork commented Sep 2, 2020

the amount of extra code needed to deal with getting those type names into the error text is pretty significant

I'm not quite visualizing that problem. Perhaps you can show me some examples.

I opened a new issue for this, and loaded it up with some examples: #73

(It might not be a solvable issue. And we might not "need" to "solve" it; it's just lament about boilerplate. I just wanted to mention it as something that's within ~about one jump of relevance when we discuss these error returns.)

@warpfork
Copy link
Collaborator Author

warpfork commented Sep 5, 2020

A lot of this discussion about "how to make multi-step actions chain syntactically nicely": I did ask for it :) We might want to break that out into other issues, though. It seems like that discussion is all heading in the direction of "how to make a better fluent package"... and that's gonna be a complicated topic, and possibly be worth entertaining more than one competing answer for. Deciding that's our heading, overall, is sufficient detail for this issue.

One last comment I'll leave on that subject, though: One of the big issues I had with the fluent package drafts so far is that it's really dang hard to make these kinds of chains without signing up for a bunch of allocations (for e.g. the new intermediate objects required to track the accumulated steps, until the .Do() step is hit). So, part of it is a user ergonomics question; but another part of it is definitely a performance question. (And I don't think ignoring the performance question by saying "it's optional porcelain; drill through when you want to go fast" has turned out to be a viable cop-out here. Tried that: lesson learned is: at the very least, fully worked examples of how to gracefully opt-out should be part of the design, because if it's an all-or-nothing thing or very high friction to switch style, users will have to rewrite large swatches of program in fast-style or fluent-style, and that does not end with happy. Much better if the fluent style doesn't impose onerous performance costs that make users want to avoid it.)

@mvdan
Copy link
Contributor

mvdan commented Sep 5, 2020

I should note that the compiler has gotten better with interfaces and inlining in the last couple of years, so perhaps the fluent style can actually be made to avoid allocations for simple cases today. I'd certainly want to give that a try.

@mvdan
Copy link
Contributor

mvdan commented Sep 5, 2020

Error values and fluent APIs aside, I still strongly believe that these "as T" methods should simply mirror the reflect.Value API.

@warpfork
Copy link
Collaborator Author

warpfork commented Oct 21, 2020

This one's been on simmer for a while (and still will be for a while more, because it's a big API change, and we're going to be careful about wrangling it), but just to have a clear post on the end of the thread here:

Yes, we're going to change this, and almost all of these methods are going to drop their error return.

In review:

  • it is not useful for the As*() methods to return errors, because in all cases the user should have already considered the Kind() before calling those methods. These methods will panic instead, and we consider that acceptable because it's so very avoidable, and normal operation does consistently avoid it naturally. (This is as discussed already in the issues comments above.)
  • it is not necessary for the Lookup*() methods to return errors either! In the case of "wrong kind" errors, the comments above about acceptability of panics, since one should have already examined the Kind(), apply. In the case of "not found" errors: we have the ipld.Absent sentinel value, and we can use it. (Though ipld.Absent's original purpose was to describe states found when handling data with schemas, it also fits perfectly well here, and simplifies things dramatically: let's use it!)

(And also, I'm increasingly hoping to get to this "soon". Still no promises on a date, though.)

@warpfork
Copy link
Collaborator Author

warpfork commented Dec 4, 2020

Okay, no, we're not going to do this after all. It doesn't fly.

We got right up to the edge of trying to do this, and while writing the draft of what the change notes would be, realized there's still problems here. Only in some scenarios! But they're critical scenarios.

  • Scenario 1: the Node is handling data that's already in memory and processed: in this case, we could get away with no error returns.

    • As previously discussed, it's really only the kind misuse errors that could arise, and those we would actually be comfortable turning into panics because they're always avoidable.
  • Scenario 2: the Node is a front for an ADL of some kind (e.g. perhaps a sharded, multiple-block data structure, such as a HAMT): in this case, we need error returns, because the ADL internal logic might either need to pass up IO errors encountered while getting more data, or return errors from validation rejections on data the ADL internal logic has loaded and attempted to process.

    • We are not comfortable turning these into a panic, because these errors can't be avoided by "programming better".
      • Turning them into panics would therefore be an unacceptable deviation in contract from how Nodes operate for data in memory -- and we really don't want this: making ADLs work transparently is a major goal for IPLD.
      • Alternatively, adding additional methods which probe for errors: is a no go for the same reason -- whatever we do here, it has to be identical to the rules for in-memory nodes; this is actually one place feature-detection would not be a good thing to lean on.
    • There's also an interesting reflection to be made here on the MapIterator() and ListIterator() returning nil rather than error, as remarked on in the "related APIs" heading of the initial error: it's not just the validity of nil as a sentinel that's relevant to those methods: it's also that the only thing that can go wrong with them, even on ADLs, is a kind misuse error. (And it's basically inconceivable to have an ADL that can't figure out what kind it is by its first block, so there's no way there will be a need for a block load operation which could go wrong during these methods.)
    • It's most obvious how this would apply to the Lookup*() family of methods, because those "big map" and "big list" tend to be the most common reason we want ADLs... but we might as well keep applying the logic consistently to all the accessors. (We've also talked about the possibility of other ADLs like encryption wrappers, and those would make sense to apply even on scalar kinds; so let's not discount that.)
  • Scenario 3: (this is less relevant, but we considered it, so I'll note it briefly): lazy-loading or lazy-validating nodes: these would need error returns: for fairly obvious reasons.

    • This is a really low-weight consideration.
      • It's speculative. We don't have any implementations of these things which could give confidence to statements about them.
      • Hypothesis: in many use cases of lazy loading, the best possible ways to serve the goal involve getting the user to state what data they want up-front. The Node interface just isn't good for that (it's for walking around a data graph step by step) -- so we probably shouldn't waste effort and add pain to other usecases trying to shoehorn in lazy-loading support to this interface, when it's likely we'd end up developing other interfaces for it anyway.
      • ... all that said, we don't really want to slam the door on this possibility either.

Scenario 2 (ADLs can return errors that aren't statically avoidable) is what cinches this. We really can't let the desire for ease-of-use be so over-weighted that it would result in inconsistency of contracts between ADLs and regular nodes.

We're just going to have to keep working on more nice facades and assistive code that makes it more pleasant to use this core API, errors and all.

@warpfork warpfork closed this as completed Dec 4, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants