Principles: Error handling #84

Closed
wants to merge 13 commits into from

Conversation

geoffromer
Contributor

No description provided.

#### Examples

If Carbon supports assertions and/or contract checking, failed assertions will
not throw exceptions, even as an optional build mode.
Contributor

I would like to see some statements about what would happen instead -- in particular would there be some sort of function you could register to customize what happens? It seems like there are a lot of different things users might want to do:

  • Write a crash report somewhere (local file system, remote RPC service), including possibly a core dump.
  • Print or log a stack trace.
  • Put up a dialog box or other OS-specific notification UI.
  • Break into the debugger.

There may be limits to what we want to allow too.

  • Allocating memory could be a problem, depending on what the failure was. May need to preallocate space at startup.
  • If the failure handler also triggers a failure, well we'd need a fallback strategy instead of getting stuck in an infinite loop.
  • Some clean up may be out of scope -- like do we want to let you try and flush buffered I/O?
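
To make the options and limits in this comment concrete, here is a minimal sketch of a registrable failure handler. It is written in Rust rather than Carbon, because no Carbon design exists for this, and every name in it (`set_failure_handler`, `run_failure_handler`, `assertion_failed`) is hypothetical. The handler is a plain function pointer, so registering and invoking it needs no heap allocation, and a re-entrancy flag acts as the fallback when the handler itself reports a failure.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Mutex;

/// Hypothetical handler type: a plain function pointer, so storing and
/// calling it does not allocate.
pub type FailureHandler = fn(&str);

static HANDLER: Mutex<Option<FailureHandler>> = Mutex::new(None);
static IN_HANDLER: AtomicBool = AtomicBool::new(false);

/// Registers the function to run when an assertion fails.
pub fn set_failure_handler(handler: FailureHandler) {
    *HANDLER.lock().unwrap() = Some(handler);
}

/// Runs the registered handler. Returns false if no handler ran, either
/// because none is registered or because a failure was reported from
/// inside the handler itself.
pub fn run_failure_handler(message: &str) -> bool {
    // Re-entrancy guard: a failing handler must not call itself forever.
    if IN_HANDLER.swap(true, Ordering::SeqCst) {
        return false;
    }
    // Copy the function pointer out so the lock is not held while it runs.
    let handler = *HANDLER.lock().unwrap();
    let ran = match handler {
        Some(handle) => {
            handle(message);
            true
        }
        None => false,
    };
    IN_HANDLER.store(false, Ordering::SeqCst);
    ran
}

/// Reports the failure and then terminates the process. Control never
/// returns to the caller, so callers cannot detect or handle the failure.
pub fn assertion_failed(message: &str) -> ! {
    run_failure_handler(message);
    std::process::abort()
}
```

A handler registered this way could log, write a crash report, or trap into a debugger. What it cannot do is hand control back to the code that failed.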

Contributor Author

I think most of these details are out of scope for a principles doc, but I've added some brief examples of what this principle doesn't rule out.

geoffromer and others added 2 commits June 18, 2020 14:09
@chandlerc chandlerc added proposal A proposal WIP labels Jun 20, 2020
@googlebot googlebot added the cla: yes PR meets CLA requirements according to bot. label Jun 23, 2020
Co-authored-by: Dmitri Gribenko <gribozavr@gmail.com>
creates control flow paths that are not visible to the reader of the code, and
it is extremely difficult to reason about procedural code when you aren't aware
of all control flow paths. This would make Carbon code harder to understand,
maintain, and debug.

I'm not convinced on this paragraph. My canonical example for when the "invisible control flow" has been helpful for me in the past is writing network code. I can't always verify the data is good before doing any work (doing so might imply doing double the work or waiting to begin processing one packet until the next packet has arrived), so an error can occur fairly deep in the call stack. For certain types of errors, bad data suggests some sort of corruption somewhere. There is frequently no sane way to recover, so the program logic I want to express is "Go back to the code where I created this socket, clean up anything that I created since then, disconnect, and reconnect". Exceptions handle that perfectly and localize the error to the two places that care about it.

You somewhat address this later on in the section "Error propagation must be straightforward". To be convinced of this, I would want to see an alternative strategy that is at least somewhat comparable. I don't expect it to be as nice as "no code", but I want to understand just how much of that I am giving up to get the benefits you describe before I could say I support this paragraph.

Contributor Author

Rust and Swift mark the propagation of an error with a single token at the callsite (postfix ? and prefix try, respectively), which seems about as close to "no code" as you can get while still being code (admittedly, in some cases you may also need parentheses for disambiguation, as with any other unary operator). That's the kind of thing I have in mind when I say propagation should be straightforward.
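
As an illustration of the Rust half of that comparison, here is a minimal sketch built around the network scenario raised earlier in this thread. The packet layout (one length byte, the payload, one checksum byte) and all function names are invented for the example. Each `?` marks a point where an error may leave the function, and the only code that reacts to the error is the outermost caller.

```rust
#[derive(Debug, PartialEq)]
pub enum PacketError {
    Truncated,
    BadChecksum,
}

/// Reads the hypothetical one-byte length prefix.
fn read_length(bytes: &[u8]) -> Result<usize, PacketError> {
    bytes.first().map(|&b| b as usize).ok_or(PacketError::Truncated)
}

/// Returns the payload that follows the length prefix.
fn read_payload(bytes: &[u8]) -> Result<&[u8], PacketError> {
    let len = read_length(bytes)?; // propagates Truncated
    bytes.get(1..1 + len).ok_or(PacketError::Truncated)
}

/// Checks the payload against a simple wrapping-sum checksum.
fn verify(payload: &[u8], checksum: u8) -> Result<(), PacketError> {
    let sum = payload.iter().fold(0u8, |acc, &b| acc.wrapping_add(b));
    if sum == checksum {
        Ok(())
    } else {
        Err(PacketError::BadChecksum)
    }
}

/// Middle of the call stack: every exit path is visible as a `?`.
pub fn handle_packet(bytes: &[u8]) -> Result<usize, PacketError> {
    let payload = read_payload(bytes)?;
    let checksum = *bytes.get(1 + payload.len()).ok_or(PacketError::Truncated)?;
    verify(payload, checksum)?;
    Ok(payload.len())
}

/// Top of the call stack: the single place that reacts to any error.
pub fn on_receive(bytes: &[u8]) -> &'static str {
    match handle_packet(bytes) {
        Ok(_) => "processed",
        Err(_) => "reset connection",
    }
}
```

The intermediate functions carry one extra token per fallible call, which is the cost being weighed against fully invisible propagation.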

Contributor

What might help strengthen the argument is to talk about the experience of a reader in the middle of the propagation, who is less familiar with the code than the author. This is IMO where the readability hit is felt most -- otherwise as David says it can feel like an effective way to separate concerns. But a reader who is trying to understand the behavior of code in the middle and is unaware that control flow doesn't proceed as expected based on the locally visible code can be left completely lost and having to read a much larger amount of code both up and down the call stack to understand what the local behavior is.

Contributor Author

Better?

@chandlerc chandlerc changed the base branch from master to trunk July 2, 2020 03:20
@geoffromer geoffromer changed the title Initial draft of error handling principles. Principles: Error handling Jul 16, 2020
Contributor

@chandlerc left a comment

Definitely like the direction here. Most of my suggestions are just trying to clarify and focus the text, not really change any of the high-level direction.

#### Examples

If Carbon supports assertions and/or contract checking, failed assertions will
not throw exceptions, even as an optional build mode. Assertion failures will
Contributor

I feel like "throw exceptions" suddenly pulls a ton of context from C++ into this document...

Can this be phrased more generically?

Possible approach:

Suggested change
- not throw exceptions, even as an optional build mode. Assertion failures will
+ not allow callers to detect and handle them, perhaps through a mechanism similar to C++ exceptions, even as an optional build mode. Assertion failures will

Contributor Author

Better?

only be presented in ways that don't alter the program state, such as logging,
terminating the program, or trapping into a debugger.

### Memory exhaustion is not recoverable
Contributor

Beyond the caveat you give below (or maybe instead? see my comments below) I think it would be good to pretty clearly call out that the goal is to address the underlying requirements here, just in a different way.

Basically, I think we don't want people to take away from this that Carbon won't be applicable in a sharply memory constrained environment. I think we're pretty committed to having some way to support such uses of Carbon if we want this to be viable in a wide range of environments. Just that the approach isn't expected to be for the default heap allocation mechanism to allow for recoverable failure.

Contributor Author

This is tricky, since we're talking about the standard library here. Is the goal to make sure that we don't prevent those users from using Carbon, or is it to make sure that those use cases get first-class support in the standard library? The former would just mean we need to make sure the language doesn't get in their way, which I think we definitely do want, but the latter would mean we have to provide alternative memory-exhaustion-compatible APIs for everything in the standard library, which I think we definitely don't want. At the level of a principles doc, I don't know how to spell out where between those extremes we intend Carbon to land.

Comment on lines 115 to 118
Carbon will probably provide a low-level way to allocate heap memory that makes
allocation failure recoverable, because doing so appears to have few drawbacks.
However, users may need to build their own libraries on top of it, rather than
relying on the Carbon standard library, if they want to take advantage of it.
Contributor

I feel like both of these statements are pushing a bit far into details and specifics that haven't materialized yet. I think they're more intended to be examples, but as written feel a bit sweeping in scope.

For example, I think we might work to enable parts of the standard library to take advantage of different allocation strategies like this if we can find a clean way to incorporate it into the design. But it is a big "if", and I'm totally down with not overpromising. I just don't want to discourage too sharply either or preclude still open design exploration.

As I mentioned above, maybe we can replace specific caveats with a more general statement around working to explore and find ways of addressing the fundamental requirements of constrained systems programming which don't have as dramatic of an effect on the overall language and API design.

Contributor Author

"Working to explore" those use cases is pretty different from having them be an explicit goal (which you seem to be suggesting above), so I'm not sure what you're looking for here.

Contributor

I think "may need to build their own libraries on top of it" covers this adequately: it does leave open the possibility of a standard library that includes recovery from memory allocation failure.

I do want to avoid over-promising in the other case too, though: saying we may provide heap allocation that allows recovery from allocation failure rather than saying we definitely will.
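
For a sense of what a low-level allocation API with recoverable failure can look like next to a default that is not recoverable, here is a minimal sketch using Rust's existing `Vec::try_reserve_exact`. It only illustrates the shape of such an API and is not a proposal for Carbon. The function name `try_zeroed_buffer` is hypothetical.

```rust
use std::collections::TryReserveError;

/// Allocates a zero-filled buffer, reporting allocation failure to the
/// caller instead of aborting the process.
pub fn try_zeroed_buffer(len: usize) -> Result<Vec<u8>, TryReserveError> {
    let mut buf = Vec::new();
    // The only step that can fail: ask for the memory up front.
    buf.try_reserve_exact(len)?;
    // Capacity is already reserved, so this does not allocate again.
    buf.resize(len, 0);
    Ok(buf)
}
```

Code built on this style has to thread a `Result` through every allocating call. That is why the quoted text expects such users to build their own libraries rather than rely on the standard ones.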

Comment on lines 119 to 122
There probably will not be a way to recover from _stack_ exhaustion, because
there is no known way of doing that without major drawbacks, and users who can't
tolerate crashing due to stack overflow can normally prevent it via static
analysis.
Contributor

I'm a little worried the second half here will be read to indicate that genuinely huge stack sizes will be necessary much like they are in C++.

I think we should (similar to above) actually address the use case for sharply limited stack size, but in a way that doesn't require recovering from arbitrary stack exhaustion.

As a concrete thing, I'd really love if we could allow threads to have very small data stacks by default while allowing them to grow cleanly to quite large when necessary. This would help reduce the address space pressure and other challenges of the current C++ model.

Anyways, mostly I worry we're getting too far into exactly how we will do this in Carbon rather than just the high level principle that the default memory allocation approach won't have recoverable errors on exhaustion.

Contributor Author

> I'm a little worried the second half here will be read to indicate that genuinely huge stack sizes will be necessary much like they are in C++.

Can you say more? The connection between the two isn't obvious to me, because I don't see how recoverable stack-overflow errors can be used to mitigate a limited stack size, except in the very limited sense that they might let you isolate stack-overflow failures to a single computation, rather than the whole process. In other words, it seems like your system has to be designed so that the computations that you need to actually work will fit within the stack size limit, regardless of whether stack overflows are recoverable.

> Anyways, mostly I worry we're getting too far into exactly how we will do this in Carbon rather than just the high level principle that the default memory allocation approach won't have recoverable errors on exhaustion.

At least in the case of stack exhaustion, isn't that pretty much what I've done?

only be presented in ways that don't alter the program state, such as logging,
terminating the program, or trapping into a debugger.

### Memory exhaustion is not recoverable
Contributor

Separate comment -- maybe worth clarifying that this is true for the default memory allocation APIs, but not necessarily all?

Contributor Author

Better?

Comment on lines 188 to 190
Given the ubiquity of this use case, Carbon must provide support for it that can
be used without altering the structure of the code, or making the non-error-case
logic less clear.
Contributor

I think you can make this a bit more squishy without removing the importance of it:

Suggested change
- Given the ubiquity of this use case, Carbon must provide support for it that can
- be used without altering the structure of the code, or making the non-error-case
- logic less clear.
+ Given the ubiquity of this use case, Carbon must provide support for it that can
+ be used with minimal changes to the structure of the code, or making the non-error-case
+ logic less clear.

I think this also avoids a debate over "is it really altering the structure?" by instead focusing on how much structural churn is necessary.

Comment on lines 194 to 196
Carbon will not establish an error hierarchy or other reusable error vocabulary,
and will not prioritize use cases that involve branching based on the properties
of a propagated error.
Contributor

The second part of this doesn't really make sense to me when first reading it. Reading the rest, I think I understand, but maybe to help clarify:

Suggested change
- Carbon will not establish an error hierarchy or other reusable error vocabulary,
- and will not prioritize use cases that involve branching based on the properties
- of a propagated error.
+ Carbon will not establish an error hierarchy or other reusable error vocabulary,
+ and will not prioritize use cases that involve classifying and reacting to any
+ common set of properties of a propagated error.

Contributor Author

Done, except that I've omitted "common set" because I actually mean this to cover any properties.

Comment on lines 231 to 233
potentially-failing operations. For example, if Carbon supports `try`/`catch`
statements, they will always have a single `catch` block, which will be invoked
for any error that escapes the `try` block.
Contributor

This example loses me, I think because it is imagining a fairly specific thing and I just don't have the context.

How essential is it?

Contributor Author

I think it's important to provide a concrete example, because the previous discussion has been pretty abstract. However, if it's losing you, it's not doing that job. Let me try to elaborate, and then hopefully you can suggest how to phrase it more clearly without spending a paragraph on it:

There's a very common language feature that lets you specify a block of code and a set of pattern/handler pairs, and if an exception escapes the block, control is transferred to the handler whose pattern best matches the exception. try/catch in C++, Java, and JavaScript, try/except in Python, and do/catch in Swift are all examples. However, this feature really combines two separate pieces of functionality:

  1. Defining a scope at which exception propagation stops, and control is transferred back to user code
  2. Branching to one of several blocks of code based on the pattern that the exception matches

The primary practical consequence of the passage above is that Carbon will not have #2, but I don't want to just say "Carbon won't have try/catch", because nothing we've said so far has ruled out having #1 on its own, i.e. having a form of try/catch that doesn't incorporate pattern matching.
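
The split between #1 and #2 can be sketched in Rust, which has no dedicated `try`/`catch` statement and so makes the separation easy to see. This is an illustration under invented names (`FetchError`, `fetch`, and both callers are hypothetical), not a description of any Carbon syntax. The first caller only stops propagation and treats every error the same way. The second branches on the error, but does so with the ordinary `match` expression rather than an error-specific construct.

```rust
#[derive(Debug, PartialEq)]
pub enum FetchError {
    NotFound,
    Timeout { after_ms: u32 },
}

/// Stand-in for a fallible operation with deterministic results.
fn fetch(key: &str) -> Result<String, FetchError> {
    match key {
        "greeting" => Ok(String::from("hello")),
        "slow" => Err(FetchError::Timeout { after_ms: 500 }),
        _ => Err(FetchError::NotFound),
    }
}

/// #1 only: propagation stops here, and any error takes the same path.
pub fn fetch_or_default(key: &str) -> String {
    match fetch(key) {
        Ok(value) => value,
        Err(_) => String::from("default"),
    }
}

/// #2 on top of #1: classification uses general-purpose pattern
/// matching, not a feature that exists only for errors.
pub fn fetch_with_diagnosis(key: &str) -> String {
    match fetch(key) {
        Ok(value) => value,
        Err(FetchError::NotFound) => String::from("missing"),
        Err(FetchError::Timeout { after_ms }) => {
            format!("timed out after {} ms", after_ms)
        }
    }
}
```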

Comment on lines 227 to 230
those layers), and Carbon will support those use cases. However, it will do so
as a byproduct of general-purpose programming facilities such as pattern
matching; Carbon will not provide a separate sugar syntax for pattern-matching
error metadata, especially if that syntax can encompass multiple
Contributor

I feel like this is actually an independent principle that would be worth having: the desire to minimize (and potentially avoid) having fundamental language constructs or control flow constructs whose only purpose is error handling, and instead to try to ensure the general facilities of the language are sufficient. WDYT?

Contributor Author

As discussed offline, I regard that as a nice-to-have rather than a requirement, and I can easily imagine wanting to trade it off for priorities like readability, so I'm hesitant to enshrine it as a principle.


<!-- tocstop -->

## Problem
Contributor

I think it'd be useful to have kept the "background" section here and collect all of the links about error handling that you and others have been surveying and referring to?

Contributor Author

I'd sort of rather keep that in the main principles doc (see the "Other resources" section), because I expect that to be what most people read.

Contributor

@jonmeow left a comment

Mostly style comments, although I do wonder about the principle vs design structure. I think others are covering the details better than I would, though.


## Problem

Error-handling is a pervasive aspect of language and library design, and Carbon
Contributor

What if we had a central design for how error handling should work in Carbon? Would there still be a need for a separate principle?

Contributor Author

Yes, I think so. For example, the first principle affects the design of every language feature that can be used incorrectly (hence the example involving pointer dereferencing), and the second affects the design of quite a lot of the standard library. Some of the other principles have narrower applicability, but it's unclear exactly which language features they will apply to.

that might mean that the author of the function forgot to check some condition
before dereferencing, or that the caller incorrectly passed a dangling pointer,
or that some other code released the memory too early, among many other
possibilities. Consequently, the only way to (mostly) reliably recover from a
Contributor

In general, please consider style when you're using parentheses. While I know you lean towards more use, I think things like this "(mostly)" could be better handled with a little rephrasing, like "the most effective way" or ".
https://developers.google.com/style/parentheses

Contributor Author

Done.

Comment on lines 46 to 50
known nor bounded. For example, if a function dereferences a dangling pointer,
that might mean that the author of the function forgot to check some condition
before dereferencing, or that the caller incorrectly passed a dangling pointer,
or that some other code released the memory too early, among many other
possibilities. Consequently, the only way to (mostly) reliably recover from a
Contributor

The repeated ", or" plus other commas (", if", ", that", ", among") in this sentence makes it hard to read. Consider rewording, e.g.:

For example, ... might mean:

  • The author ...
  • The caller ...
  • Some other code ...
  • Or some other possibility.

Contributor Author

Markdown considers bulleted lists to be separate paragraphs (with vertical whitespace above and below), so I'd rather not go that route. How's this?

@geoffromer geoffromer added proposal rfc Proposal with request-for-comment sent out and removed WIP labels Jul 22, 2020
propagate errors across multiple layers of the stack so long as you control
those layers, and Carbon will support those use cases. However, it will do so as
a byproduct of general-purpose programming facilities such as pattern matching;
Carbon will not provide a separate sugar syntax for pattern-matching error

feels like this could use a justification

Contributor Author

Can you be more specific? This is supposed to be a corollary of the general principle, which the previous three paragraphs are supposed to provide justification for.

error-reporting mechanisms to report programming errors. Furthermore, Carbon's
design will not prioritize use cases involving recovery from programming errors.

Recovering from an error generally consists of discarding any state that might
Contributor

Another option is to roll back the damage to the state that was done by the error.

Contributor Author

Better?

early, or any number of other possibilities. Without more information, it's
impossible to know, so the only way to somewhat reliably recover from a
programming error is to discard the entire address space and terminate the
program.
Contributor

I agree with the conclusion, but I don't agree with the reasoning :) I think the reasoning is important, because it determines what is considered a programming error.

Consider an issue that is typically considered a recoverable error -- an operation that opens a file for reading by name determines that there's no such file. "When such an error is detected, the original cause is neither known nor bounded." It can have many causes: the programmer forgot to check whether the file exists, the programmer forgot to call the routine that creates the file, the programmer forgot to handle "out of disk space" error from the routine that creates the file, the programmer that wrote the script that invokes this program creates the file in the wrong directory, etc. "Without more information, it's impossible to know, so the only way to somewhat reliably recover from a programming error is to discard the entire address space and terminate the program."

Given this explanation, is there a distinction between dereferencing a dangling pointer and file not found error?

If you ask me, I'd explain it in terms of preconditions of APIs and language features. Violating a precondition is a programming error that is non-recoverable. If an API or a programming language feature has a requirement that some condition must hold, but it can detect a violation and return control to the caller, then it is not a programming error -- it is regular control flow (which may be expressed using error handling language features if we so desire).

A precondition can be something an API requires (for example, a file must exist, input array must be non-empty, input array must be sorted etc.), or the programming language requires (a pointer to be dereferenced must point to valid memory, addition should not overflow etc.)
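
The same condition can land on either side of that line depending on how the API is designed. A minimal sketch in Rust, using the "input array must be non-empty" case with hypothetical function names: the first function detects the empty input and returns control to the caller, so emptiness is ordinary control flow; the second declares non-emptiness a precondition, so an empty input is a programming error.

```rust
/// Emptiness is part of the contract: the caller gets `None` back and
/// decides what to do.
pub fn max_checked(values: &[i32]) -> Option<i32> {
    values.iter().copied().max()
}

/// Non-emptiness is a precondition: violating it is a programming error.
/// Rust's `assert!` panics here, standing in for whatever non-recoverable
/// response the language chooses.
pub fn max_required(values: &[i32]) -> i32 {
    assert!(
        !values.is_empty(),
        "precondition violated: `values` must be non-empty"
    );
    values.iter().copied().fold(i32::MIN, i32::max)
}
```

Neither design is inherently right. The choice belongs to the API designer.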

"File not found" is usually not a programming error, but it is entirely reasonable to design an API where "file not found" is a non-recoverable error that terminates the program (think of a map reduce batch job). In that case, file being present and readable is a precondition, and violating it is a programming error. So what is a precondition and what is an error that can be handled really depends on the designer of the API or of the language feature.

Contributor Author

> Given this explanation, is there a distinction between dereferencing a dangling pointer and file not found error?

The distinction I would make is that in the case of a file-not-found error, we may not know the cause for certain, but the program can know (at least roughly) what the likely causes are. In the case of a dangling pointer, on the other hand, the program generally can't even know that, at least not with enough specificity to plausibly recover. I've tried to rephrase to make that clearer; does that help?

> If you ask me, I'd explain it in terms of preconditions of APIs and language features. Violating a precondition is a programming error that is non-recoverable.

I agree with that, but it seems to just assert the position that I'm trying to justify here: that programming errors should be considered non-recoverable.

Contributor

> The distinction I would make is that in the case of a file-not-found error, we may not know the cause for certain, but the program can know (at least roughly) what the likely causes are. In the case of a dangling pointer, on the other hand, the program generally can't even know that, at least not with enough specificity to plausibly recover. I've tried to rephrase to make that clearer; does that help?

I still don't see much of a distinction. It is often possible to make an informed guess about why the pointer is dangling -- think about all those times when a report from ASan that such and such pointer is used after free is all one needs to implement a fix even when one can't reproduce the problem locally.

Contributor Author

Right, but the code handling the error, and the programmer writing that code, doesn't have access to that ASan report. Or if they do, they should probably just fix the bug, rather than try to detect and programmatically recover from it. I've tried to make this point more explicit; does that help?

Comment on lines 195 to 201
function that originally raised them. However, this practice tends to be quite
brittle, because it almost inevitably requires relying on implementation
details: if a function's contract gives different meanings to different errors
it emits, it generally can't satisfy that contract by blindly propagating errors
from the functions it calls. Conversely, if it doesn't have such a contract, its
callers normally can't differentiate among the errors it emits without depending
on its implementation details.
Contributor

This seems like an argument against including an easy to use construct to propagate errors, rather than an argument against universal error classification APIs.

Contributor Author

Well, it's an argument against having both convenient error propagation and universal error classification. But as argued above, I think we need convenient error propagation, so classification has to be what we drop.
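
The tension between the two can be shown with a small Rust sketch. All names are hypothetical, and the two-table lookup is invented for the example. With blind propagation, `NotFound` can come from either inner call, so a caller that branches on it depends on implementation details. Giving the errors distinct meanings requires translating them at each call site, which is no longer blind propagation.

```rust
#[derive(Debug, PartialEq)]
pub enum StoreError {
    NotFound,
}

#[derive(Debug, PartialEq)]
pub enum ThemeError {
    NoSuchUser,
    BrokenThemeTable,
}

/// Looks up `key` in a small association list.
fn lookup(table: &[(&str, &str)], key: &str) -> Result<String, StoreError> {
    table
        .iter()
        .find(|&&(k, _)| k == key)
        .map(|&(_, v)| v.to_string())
        .ok_or(StoreError::NotFound)
}

/// Blind propagation: `NotFound` may mean "no such user" or "theme
/// missing", and the caller cannot tell which without knowing the body.
pub fn theme_blind(
    users: &[(&str, &str)],
    themes: &[(&str, &str)],
    user: &str,
) -> Result<String, StoreError> {
    let theme_name = lookup(users, user)?;
    lookup(themes, &theme_name)
}

/// A contract with distinct error meanings: each inner error has to be
/// translated where it occurs.
pub fn theme_for(
    users: &[(&str, &str)],
    themes: &[(&str, &str)],
    user: &str,
) -> Result<String, ThemeError> {
    let theme_name = lookup(users, user).map_err(|_| ThemeError::NoSuchUser)?;
    lookup(themes, &theme_name).map_err(|_| ThemeError::BrokenThemeTable)
}
```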

Comment on lines +225 to +227
operations. For example, if Carbon supports `try`/`catch` statements, they will
always have a single `catch` block, which will be invoked for any error that
escapes the `try` block.
Contributor

I think this is becoming too specific for a principles doc. Allowing only a single catch block and asking users to use a match statement within it to distinguish errors vs. allowing multiple catch blocks and making try-catch-catch-catch resemble match-case-case-case sounds like a purely syntactic choice to me that should be discussed in the actual error handling proposal.

Contributor Author

This is just supposed to be an example application of the principle, and examples are supposed to be specific. And I don't think it's purely syntactic: providing syntactic sugar for a particular pattern is a way of encouraging that pattern, and the point of this principle is we don't want to encourage that pattern.

Contributor

I think "will always" here is too absolute; it sounds like approving this proposal would put this specific hard constraint on future designs, whereas I think your intention is instead that this should be used as guidance only.

Maybe softening this a little would help:

Suggested change
- operations. For example, if Carbon supports `try`/`catch` statements, they will
- always have a single `catch` block, which will be invoked for any error that
- escapes the `try` block.
+ operations. For example, if Carbon supports `try`/`catch` statements, the
+ `catch` statements should not invent a new mechanism for dispatching on the
+ kind of the exception.
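
For readers weighing the two wordings, the following C++ sketch shows roughly what "a single handler, with dispatch done by ordinary language features" could look like. It is only an analogy: Carbon's eventual syntax is undecided, and the names `ParseError`, `UnexpectedEof`, `BadToken`, `ParseDigit`, and `Describe` are hypothetical.

```cpp
#include <string>
#include <variant>

struct UnexpectedEof {};
struct BadToken {
  std::string token;
};

// Hypothetical error type. Its alternatives belong to ParseDigit's own
// contract; they are not a universal error vocabulary.
struct ParseError {
  std::variant<UnexpectedEof, BadToken> detail;
};

// Parses a single leading decimal digit, throwing ParseError otherwise.
int ParseDigit(const std::string& input) {
  if (input.empty()) throw ParseError{UnexpectedEof{}};
  if (input[0] < '0' || input[0] > '9') throw ParseError{BadToken{input}};
  return input[0] - '0';
}

std::string Describe(const std::string& input) {
  try {
    return "ok: " + std::to_string(ParseDigit(input));
  } catch (const ParseError& e) {
    // One handler for everything that escapes the try block. Distinguishing
    // cases uses the same variant inspection as any other code would.
    if (const auto* bad = std::get_if<BadToken>(&e.detail)) {
      return "bad token: " + bad->token;
    }
    return "unexpected end of input";
  }
}
```

Here `Describe("7")` yields `"ok: 7"`, `Describe("x")` yields `"bad token: x"`, and `Describe("")` yields `"unexpected end of input"`. The handler needs no exception-specific dispatch mechanism, which is the property both wordings appear to be aiming for.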

file or an I/O error), which allows us to put a bound on the state that might
have been invalidated.

A _programming error_ is an error caused by incorrect user code, such as failing
Contributor

I suggest deleting this whole paragraph, and probably also the preceding paragraph. My rationale:
(1) Everyone already has an informal understanding of what a programming error is. I don't think anything in this proposal depends on making that understanding more precise, and I also don't think it's possible to be precise.
(2) These two paragraphs both depend on the distinction between cases where it is and where it isn't practical to know what the original cause of an error is. I agree that that distinction makes sense, but I don't think it lines up at all cleanly with things that are and aren't programming errors. Consider "file not found" versus "square root of a negative number": I don't think there's any significant difference between the two in how easy it is to find the original cause.
(3) The point about dereferencing a dangling pointer is well taken, but it's better put below as one of the examples.

Contributor Author

Presumably you'd also recommend deleting "Thus, we expect that supporting recovery from programming errors would provide little or no benefit" from the following paragraph? That would leave this principle without any discussion of the purported benefit of recovering from user error. I think that would be a serious omission: at least for me, the fact that I expect that benefit to be small is a key part of the rationale for this principle. I would be much more reluctant to adopt it if I thought that recovery from programming errors was a generally viable software engineering practice.

> (1) Everyone already has an informal understanding of what a programming error is. I don't think anything in this proposal depends on making that understanding more precise, and I also don't think it's possible to be precise.

The first sentence of this paragraph is this document's only attempt to define "programming error". I don't intend it to make "programming error" precise, but only to make sure the reader and I are on the same page regarding the intuitive meaning of the term. I gather you agree, since you've suggested adding a similar definition on lines 30-31. If you're suggesting I define the term there instead of here, that's fine with me, assuming the style issues can be worked out.

These two paragraphs are primarily concerned not with defining "programming errors", but with explaining why recovering from those errors is unlikely to be practical.

> (2) These two paragraphs both depend on the distinction between cases where it is and where it isn't practical to know what the original cause of an error is. I agree that that distinction makes sense, but I don't think it lines up at all cleanly with things that are and aren't programming errors. Consider "file not found" versus "square root of a negative number": I don't think there's any significant difference between the two in how easy it is to find the original cause.

The issue isn't "how easy it is to find the original cause", it's how feasible it is to anticipate the original cause when writing the code that will eventually handle that error. And in that respect, I think "file not found" is very different from "square root of a negative number": I find it very hard to imagine situations where the programmer can correctly anticipate that a "square root of negative number" error may occur, and correctly understand the cause of that error, but can't more easily just intervene to prevent that error from occurring in the first place.

I've revised to try to make that clearer; does that help?
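
A small C++ sketch may help illustrate the contrast being drawn here. The helper names `ReadFirstLine` and `LargerRoot` are hypothetical, and `LargerRoot` assumes `a > 0`. The first function cannot know in advance whether the file exists, so it has to report that outcome to its caller. The second can see the negative input before calling `std::sqrt`, so it can prevent the error instead of recovering from it.

```cpp
#include <cmath>
#include <fstream>
#include <optional>
#include <string>

// "File not found" depends on the environment at run time. The code cannot
// rule it out beforehand, so it is reported as an ordinary outcome.
std::optional<std::string> ReadFirstLine(const std::string& path) {
  std::ifstream in(path);
  if (!in) return std::nullopt;
  std::string line;
  std::getline(in, line);
  return line;
}

// A negative discriminant is visible to this code before sqrt is called.
// Anyone able to anticipate that case can simply check for it here, rather
// than handling a "square root of a negative number" error afterwards.
// Assumes a > 0, so the larger root uses the positive square root.
std::optional<double> LargerRoot(double a, double b, double c) {
  const double discriminant = b * b - 4 * a * c;
  if (discriminant < 0) return std::nullopt;  // No real roots.
  return (-b + std::sqrt(discriminant)) / (2 * a);
}
```

For example, `LargerRoot(1, -3, 2)` returns `2.0`, while `LargerRoot(1, 0, 1)` returns an empty optional without `std::sqrt` ever seeing a negative argument.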

Comment on lines +30 to +31
error-reporting mechanisms to report programming errors. Furthermore, Carbon's
design will not prioritize use cases involving recovery from programming errors.
Contributor

Suggested change
- error-reporting mechanisms to report programming errors. Furthermore, Carbon's
- design will not prioritize use cases involving recovery from programming errors.
+ error-reporting mechanisms to report programming errors, i.e. errors caused by
+ incorrect user code. Furthermore, Carbon's design will not prioritize use cases
+ involving recovery from programming errors.

Contributor Author

This is what I had originally, but I changed it after @jonmeow pointed out it violated our style guide: https://developers.google.com/style/abbreviations#dont-use

Contributor

As a reminder, you can trivially replace "i.e." with the literal meaning of "that is".

Contributor

I'm emphasizing this here because Matt and Dmitri are suggesting a change in wording, not simply the addition of Latin.

Comment on lines +82 to +83
debugger.

Contributor

Suggested change
- debugger.
+ debugger.
+ Dereferencing a dangling or null pointer will not be reported as a
+ recoverable error. Doing so would impose significant performance
+ overhead. It also wouldn't be useful; the original bug that resulted
+ in a bad pointer could have been anywhere, so the only reliable way
+ to recover from this situation is to discard the entire address space
+ and terminate the program.

Contributor

This goes along with my suggestion of deleting the paragraph saying that the only reliable way to recover from programmer error is to terminate the whole program. I'm not convinced that's true in general, but I do think it's useful to have dereferencing a bad pointer as an explicit example.

Comment on lines 115 to 118
Carbon will probably provide a low-level way to allocate heap memory that makes
allocation failure recoverable, because doing so appears to have few drawbacks.
However, users may need to build their own libraries on top of it, rather than
relying on the Carbon standard library, if they want to take advantage of it.
Contributor

I think "may need to build their own libraries on top of it" covers this adequately: it does leave open the possibility of a standard library that includes recovery from memory allocation failure.

I do want to avoid over-promising in the other case too, though: saying we may provide heap allocation that allows recovery from allocation failure rather than saying we definitely will.
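
As a rough analogy for what "a low-level way to allocate heap memory that makes allocation failure recoverable" plus "their own libraries on top of it" might look like, here is a C++ sketch built on the standard non-throwing `operator new`. The names `TryAllocate`, `AllocateScratch`, `RawBuffer`, `SizedBuffer`, and `OperatorDelete` are hypothetical, and nothing here is meant to predict Carbon's actual API.

```cpp
#include <cstddef>
#include <memory>
#include <new>
#include <utility>

// Frees memory obtained from the non-throwing ::operator new.
struct OperatorDelete {
  void operator()(void* p) const noexcept { ::operator delete(p); }
};
using RawBuffer = std::unique_ptr<void, OperatorDelete>;

// Low-level primitive: returns an empty RawBuffer instead of throwing
// when the allocation fails.
RawBuffer TryAllocate(std::size_t bytes) noexcept {
  return RawBuffer(::operator new(bytes, std::nothrow));
}

struct SizedBuffer {
  RawBuffer data;
  std::size_t bytes = 0;
};

// Library-level policy layered on the primitive: ask for the preferred
// size, and halve the request on failure until it drops below the minimum.
// Returns an empty SizedBuffer if even the smallest attempt fails.
SizedBuffer AllocateScratch(std::size_t preferred, std::size_t minimum) noexcept {
  for (std::size_t n = preferred; n >= minimum && n > 0; n /= 2) {
    if (RawBuffer p = TryAllocate(n)) return SizedBuffer{std::move(p), n};
  }
  return SizedBuffer{};
}
```

A caller would check `buffer.data` before use, for example after `SizedBuffer buffer = AllocateScratch(1 << 20, 4096);`. The recovery policy lives entirely in the user-level `AllocateScratch`, which is the division of labor the quoted paragraph describes. Note that on systems that overcommit memory, a successful return from the allocator does not guarantee the pages can later be backed by physical memory.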


### No universal error categories

Carbon will not establish an error hierarchy or other reusable error vocabulary,
Contributor

I think this conflates two questions.
(1) Does Carbon itself, either in the core language or in the standard library, establish an error hierarchy?
(2) Does Carbon allow/encourage/require users to define their own hierarchy?

The text itself mainly answers question 1, but the argument about brittle code also applies to question 2. I believe it's important to address both.

Contributor Author

I agree this conflates those questions, but I think that conflation is correct: the two questions are aspects of one underlying question, namely whether classifying propagated errors is a programming practice that Carbon will encourage. I've tweaked part of the next paragraph to be less specific to (1); are there other places that you think put too much emphasis on (1), or not enough emphasis on (2)?


### No universal error categories

Carbon will not establish an error hierarchy or other reusable error vocabulary,
Contributor

"or other reusable error vocabulary" seems a bit over-broad to me. Go's error interface seems pretty harmless to me (in particular it doesn't require so many type shenanigans as Rust's Error trait), and your arguments about the downside of hierarchy and classification don't seem to apply to it.

Contributor Author

I phrased this poorly; I didn't intend to exclude things like that. Better?

geoffromer and others added 2 commits August 3, 2020 11:20
geoffromer and others added 2 commits August 3, 2020 11:37
Co-authored-by: josh11b <josh11b@users.noreply.github.com>
Comment on lines +60 to +62
amount of state. But this will almost always be much more difficult, and
probably much more brittle, than simply fixing the anticipated bug or verifying
its absence.
Contributor

This, for me, is not the most important rationale for not supporting recovery from programming errors. I think the biggest motivation is to minimize the risk of a system silently operating in a failure mode (cf. https://en.wikipedia.org/wiki/Systemantics#System_failure).

In my experience, if a system attempts to recover from programming errors, then some of those errors will go unnoticed or will not be prioritized when they're discovered. Eventually, when the system fails, you'll find that the failure involved N different things going wrong in a subtle and hard-to-understand fashion, where any subset of those things going wrong by themselves would not have resulted in a visible system failure. Fixing each of the N bugs in isolation may be relatively easy, but merely understanding the set of circumstances that result in the failure of the supposedly fault-tolerant system may be substantially harder.


Memory exhaustion is not a programming error, and it is sometimes feasible to
write code that can successfully recover from it. However, the available
evidence indicates that very little C++ code actually does so correctly (for
Contributor

Not only very little C++ code; at least in the case of physical memory exhaustion (as compared to virtual memory exhaustion), various current operating systems in their default configuration do not provide a mechanism to recover from memory exhaustion.

the Carbon standard library, if they want to take advantage of it. There
probably will not be a way to recover from _stack_ exhaustion, because there is
no known way of doing that without major drawbacks, and users who can't tolerate
crashing due to stack overflow can normally prevent it using static analysis.
Contributor

Do we have an existence proof of this? Do we expect stack overflow avoidance approaches to be stable across changes to the optimizer or seemingly-minor changes to the code? (In C++, we know they aren't.)

I wonder if it is actually feasible to automatically recover from stack exhaustion in a way that's essentially free when recovery doesn't kick in. I have a totally-unproven idea of how to achieve that, assuming that Carbon doesn't support dynamic stack allocation.

@jonmeow jonmeow added WIP and removed proposal rfc Proposal with request-for-comment sent out labels Oct 22, 2020
@jonmeow jonmeow marked this pull request as draft April 20, 2021 16:21
@jonmeow jonmeow removed the WIP label Apr 20, 2021
@github-actions

We triage inactive PRs and issues in order to make it easier to find active work. If this PR should remain active, please comment or remove the inactive label.
This PR is labeled inactive because the last activity was over 90 days ago. This PR will be closed and archived after 14 additional days without activity.

@github-actions github-actions bot added the inactive Issues and PRs which have been inactive for at least 90 days. label Aug 14, 2021
@github-actions

We triage inactive PRs and issues in order to make it easier to find active work. If this PR should remain active or becomes active again, please reopen it.
This PR was closed and archived because there has been no new activity in the 14 days since the inactive label was added.

@github-actions github-actions bot closed this Aug 29, 2021
@github-actions github-actions bot added the proposal deferred Decision made, proposal deferred label Jul 25, 2022