Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Unify error handling strategies #90

Open
AngusGMorrison opened this issue Apr 2, 2024 · 3 comments
Open

RFC: Unify error handling strategies #90

AngusGMorrison opened this issue Apr 2, 2024 · 3 comments

Comments

@AngusGMorrison
Copy link
Contributor

Lox currently takes several approaches to error handling, which is confusing for authors and users. This proposal aims to establish a consistent approach to error handling throughout the library.

Current situtation

Lox currently employs three different error handling strategies:

  • Crate-level umbrella errors, which are enum types like LoxTimeError, LoxPyError and LoxBodiesError.
  • Module/function-specifc enum errors, like LoxCubicSplineError.
  • Module/function-specific struct errors, like UTCUndefinedError.

Deciding which to implement for a new error scenario leaves too much room for personal preference, and results in an inconsistent experience for library consumers.

List of Lox errors

Since it's useful to understand the state of play, and because I'll be referring back to it later, here are the definitions of all errors currently used in the Lox workspace:

#[derive(Clone, Debug, Error, Eq, PartialEq)]  
pub enum LoxCubicSplineError {  
    #[error("`x` and `y` must have the same length but were {0} and {1}")]  
    DimensionMismatch(usize, usize),  
    #[error("length of `x` and `y` must at least 4 but was {0}")]  
    InsufficientPoints(usize),  
    #[error("linear system could not be solved")]  
    Unsolvable,  
}
#[derive(Error, Debug, PartialEq)]  
pub enum LoxBodiesError {  
    #[error("unknown body `{0}`")]  
    UnknownBody(String),  
}
  #[derive(Clone, Debug, Error, PartialEq)]  
pub enum LoxEopError {  
    #[error("{0}")]  
    Csv(String),  
    #[error("{0}")]  
    Io(String),  
}
#[derive(Clone, Copy, Debug, Error, PartialEq)]  
#[error("sizes of `x`, `y`, `t` and `epochs` must match, but were x: {nx}, y: {ny}, t: {nt}, epochs: {nepochs}")]  
pub struct ArgumentSizeMismatchError {  
    nx: usize,  
    ny: usize,  
    nt: usize,  
    nepochs: usize,  
}
#[derive(Debug, PartialEq)]  
pub enum DafSpkError {  
    // The data type integer value does not match the ones in the spec  
    InvalidSpkSegmentDataType,  
    // The number of DAF components does not match the SPK specification  
    UnexpectedNumberOfComponents,  
    UnableToParse,  
    UnsupportedSpkArrayType { data_type: i32 },  
    // Unable to find the segment for a given center body and target body  
    UnableToFindMatchingSegment,  
    // Unable to find record for a given date  
    UnableToFindMatchingRecord,  
}
  • Note, DafSpkError doesn't actually implement Error.
#[derive(Error, Debug)]  
pub enum LoxPyError {  
    #[error("unknown time scale `{0}`")]  
    InvalidTimeScale(String),  
    #[error("unknown body `{0}`")]  
    InvalidBody(String),  
    #[error("unknown frame `{0}`")]  
    InvalidFrame(String),  
    #[error(transparent)]  
    LoxBodiesError(#[from] LoxBodiesError),  
    #[error(transparent)]  
    LoxTimeError(#[from] LoxTimeError),  
    #[error(transparent)]  
    PyError(#[from] PyErr),  
}
#[derive(Clone, Debug, Default, Error)]  
#[error("`{raw}` cannot be represented as a `TimeDelta`: {detail}")]  
pub struct TimeDeltaError {  
    pub raw: f64,  
    pub detail: String,  
}
#[derive(Clone, Error, Debug)]  
pub enum LoxTimeError {  
    #[error("invalid date `{0}-{1}-{2}`")]  
    InvalidDate(i64, i64, i64),  
    #[error("invalid time `{0}:{1}:{2}`")]  
    InvalidTime(u8, u8, u8),  
    #[error("seconds must be in the range [0.0, 60.0], but was `{0}`")]  
    InvalidSeconds(f64),  
    #[error("subsecond must be in the range [0.0, 1.0), but was `{0}`")]  
    InvalidSubsecond(f64),  
    #[error("day of year cannot be 366 for a non-leap year")]  
    NonLeapYear,  
}
#[derive(Debug, Error)]  
pub enum CoefficientsError {  
    #[error("kernel coefficients with key {key} had size {actual}, expected at least {actual}")]  
    TooFew {  
        key: String,  
        min: usize,  
        actual: usize,  
    },  
    #[error("kernel coefficients with key {key} had size {actual}, expected at most {actual}")]  
    TooMany {  
        key: String,  
        max: usize,  
        actual: usize,  
    },  
    #[error(  
        "barycenter nutation precession coefficients with key {key} had an odd number of \        terms, but an even number is required"    )]    OddNumber { key: String },  
}
#[derive(Clone, Copy, Debug, Error, PartialEq)]  
#[error("UTC is not defined for dates before 1960-01-01")]  
/// UTC is not defined for dates before 1960-01-01. Attempting to create a `UtcDateTime` with such/// a date results in this error.  
pub struct UtcUndefinedError;

The trouble with crate-level error enums

Our crate-level error enums have two, closely related issues.

Irrelevant variants

First, most of their variants are irrelevant to the operations that return them. For example, Subsecond::new returns Result<Subsecond, LoxTimeError>. LoxTimeError is a five-variant enum, but the only variant relevant to Subsecond::new is LoxTimeError::InvalidSubsecond.

To both library users and authors, LoxTimeError implies that an operation may return any of the possible variants, and forces the use of a match statement to distinguish between variants where there may only be one returned variant in practice. Both readability and usability are reduced.

Breaking changes

Second, adding or removing variants from public enums is a breaking change. For example, by removing the NonLeapYear variant from LoxTimeError, we would risk breaking error handling code for unrelated errors, such as InvalidSubsecond. This simple change would require a major version bump.

What good enum errors look like

LoxCubicSplineError is an example of an error enum done well. It has three variants, and all three of these variants may occur in a call to CubicSpline::new. In other words, it is a tight upper and lower bound on the possible error scenarios of a particular operation. Its variants are unlikely to change, and it doesn't attempt to encapsulate the errors of unrelated operations.

Proposed solution

How will Lox users handle errors?

Considering the list of Lox errors, they overwhelmingly relate to invalid inputs. This tells us that Lox errors are highly unlikely to be handled programmatically. If the input is wrong, it requires human attention to fix. Therefore, the universal error strategy will be to log the error and exit.

This tells us several things:

  • Crate-level enum error types provide little additional value to the user, since they are unlikely to ever match on them.
  • To users, Lox errors might as well be Box<dyn Error + Send + Sync>, since the only useful information they contain is the message describing the ways in which the input was invalid.
  • We have a responsibility to make error messages related to invalid inputs as descriptive as possible, since this will likely be the only aspect of our errors that users see.

Why not Box<dyn Error + Send + Sync>?

At runtime, Lox users are unlikely to need any further detail than a dyn Error is able to provide. However, well typed errors offer significant benefits to code authors and readers. Through the type system, they reveal the complete set of error scenarios for a given operation at a glance. For example, reading that Utc::try_from::<Time<Tai>> returns only UtcUndefinedError is more powerfully descriptive than either Box<dyn Error + Sync + Send> or LoxTimeError.

Struct and tuple error types additionally assist library authors in providing meaningful, standardized error messages. Constructing ad hoc errors from string literals is less likely to include the full wealth of data a consumer could use to debug their inputs, and is more prone to copy-pasting.

More granular errors are better for authors and readers, with no impact on consumers

For Lox, the ideal error is the narrowest type that fully represents all error scenarios that may occur during a function call.

  • If only one thing can go wrong, the returned error type should address that thing and that thing only.
  • If multiple things can go wrong, the returned error type should be an enum compsed of only the things that could go wrong.
  • We should avoid creating crate-wide umbrella errors that combine multiple, unrelated errors, since this harms readability and makes the addition of new errors breaking changes.
  • Contributors should be encouraged to use thiserror to define error types specific to the operation they're implementing, as this makes the intent and failure scenarios of their code clearer. thiserror also makes composing errors simple with its #[source], #[from] and #[transparent] annotations.
  • The proliferation of error types is unimportant to library users, who need only the error message. They can choose to propagate any Lox error as a Box<dyn Error + Send + Sync> without degradation of the debugging information available to them.

Examples

Good

#[derive(Clone, Debug, Error, Eq, PartialEq)]  
pub enum LoxCubicSplineError {  
    #[error("`x` and `y` must have the same length but were {0} and {1}")]  
    DimensionMismatch(usize, usize),  
    #[error("length of `x` and `y` must at least 4 but was {0}")]  
    InsufficientPoints(usize),  
    #[error("linear system could not be solved")]  
    Unsolvable,  
}

LoxCubicSplineError is the narrowest type that represents everything that could go wrong when instantiating a CubicSpline. It contains no unnecessary variants and gives the users the debugging information they need to resolve the issue.

#[derive(Clone, Copy, Debug, Error, PartialEq)]  
#[error("sizes of `x`, `y`, `t` and `epochs` must match, but were x: {nx}, y: {ny}, t: {nt}, epochs: {nepochs}")]  
pub struct ArgumentSizeMismatchError {  
    nx: usize,  
    ny: usize,  
    nt: usize,  
    nepochs: usize,  
}

ArgumentSizeMismatchError is a struct error representing the only thing that can go wrong for a given operation, and provides the user with the information they need to find the source of the problem.

Bad

#[derive(Clone, Error, Debug)]  
pub enum LoxTimeError {  
    #[error("invalid date `{0}-{1}-{2}`")]  
    InvalidDate(i64, i64, i64),  
    #[error("invalid time `{0}:{1}:{2}`")]  
    InvalidTime(u8, u8, u8),  
    #[error("seconds must be in the range [0.0, 60.0], but was `{0}`")]  
    InvalidSeconds(f64),  
    #[error("subsecond must be in the range [0.0, 1.0), but was `{0}`")]  
    InvalidSubsecond(f64),  
    #[error("day of year cannot be 366 for a non-leap year")]  
    NonLeapYear,  
}

LoxTimeError encapsulates error variants for many unrelated operations.

#[derive(Clone, Copy, Debug, Error, PartialEq)]  
#[error("UTC is not defined for dates before 1960-01-01")]  
/// UTC is not defined for dates before 1960-01-01. Attempting to create a `UtcDateTime` with such/// a date results in this error.  
pub struct UtcUndefinedError;

UtcUndefinedError does well to represent a single error scenario, but could be improved by including the invalid date in its error message to aid debugging.

Drop the Lox prefix

Finally, the Lox prefix for error types was useful when needed as a disambiguation from, e.g. std::io::Error, but is unnecessary when errors are specific to the operations that return them.

@helgee
Copy link
Member

helgee commented Apr 5, 2024

I agree with the general strategy and the recommendations. I would only add that application developers might want to use a crate like anyhow or color-eyre to display errors to users.

Could you add this as a short dev doc to the repo (e.g. docs/error_handling.md)?

@matzipan
Copy link
Collaborator

matzipan commented Apr 5, 2024

Note, DafSpkError doesn't actually implement Error.

Ooops. Thanks for spotting this. I wonder if there's a linter check that we could enable.

@matzipan
Copy link
Collaborator

matzipan commented Apr 5, 2024

LGTM.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants