-
Notifications
You must be signed in to change notification settings - Fork 86
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Consider improving error handling and StatsError
type
#221
Comments
I do see good reason for all of these and I'd be open to making changes for all but the infallible All other public structs defined in the |
I see your point about consistency. I would personally value the expressiveness ("this call cannot fail") over the consistency ("all constructors return a Result and might need error handling"), but it doesn't matter much tbh. Hmm I'm not sure about renaming the |
Perhaps an Regardless, the overall discussion you bring up on the error type is valid. I'd merge an effort that does any of
|
During these changes, we should also work on removing support for degenerate distributions #102 and returning errors. |
Tried a different approach, even wrote out an example to test using it, but I didn't have a great motivating case, so I just made something up, but I wanted to see what it was like to actually handle those such errors. |
Maybe we should figure out what the errors should be able to convey first. Since @FreezyLemon posted on #242 I've realized I've not done enough of this bigger picture stuff for decisions. I think it's fair to narrow the discussion to errors from distribution construction since that's at a module scope and most of where errors are handled. The value in #258 is to make the errors flat, and delegates identifying what we call "bad parameters" to the caller. This is quite reasonable, as the conditions for errors are statically specified (up to decisions about what constitutes degeneracy), but there is no program state in which parameter conditions change. However it means that users of the library may repeat their work among projects where there's some input-validation cycle if they want useful error messages. The option I have in #247 is pretty dense, and it means all distributions' A few lists to help organize my thoughts: Assumptions
Possible targeted behaviors adjacent to error API
I'm very open to modifying these lists. |
I realize yet another option is to have the distribution error have too many variants for any one distribution, but use shared language This allows a shallow error, small step from unwrap for those writing analyses. And allows for deeper error which can rely more on handling programmatically. But I don't think this is a common idiom, the more I think about this, the more certain there's something easier I'm missing here and I don't think it's needed to replicate |
Most usages of Let's turn // beta.rs
#[derive(Copy, Clone, PartialEq, Debug)]
pub struct Beta {
shape_a: f64,
shape_b: f64,
}
impl Beta {
pub fn new(shape_a: f64, shape_b: f64) -> Result<Beta> {
if shape_a.is_nan()
|| shape_b.is_nan()
|| shape_a.is_infinite() && shape_b.is_infinite()
|| shape_a <= 0.0
|| shape_b <= 0.0
{
return Err(StatsError::BadParams);
};
Ok(Beta { shape_a, shape_b })
}
} into #[derive(Copy, Clone, PartialEq, Debug)]
pub struct Beta {
shape_a: f64,
shape_b: f64,
}
#[derive(Copy, Clone, PartialEq, Eq, Debug)]
pub enum BetaError {
ShapeANan,
ShapeAInfinite,
ShapeALessThanZero,
ShapeBNan,
ShapeBInfinite,
ShapeBLessThanZero,
}
impl std::fmt::Display for BetaError {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self {
BetaError::ShapeANan => write!(f, "Shape A is NaN"),
// ...
}
}
}
impl std::error::Error for BetaError {}
impl Beta {
pub fn new(shape_a: f64, shape_b: f64) -> Result<Beta, BetaError> {
if shape_a.is_nan() {
Err(BetaError::ShapeANan)
} else if shape_a.is_infinite() {
Err(BetaError::ShapeAInfinite)
} else if shape_a <= 0.0 {
Err(BetaError::ShapeALessThanZero)
} else if shape_b.is_nan() {
Err(BetaError::ShapeBNan)
} else if shape_b.is_infinite() {
Err(BetaError::ShapeBInfinite)
} else if shape_b <= 0.0 {
Err(BetaError::ShapeBLessThanZero)
} else {
Ok(Beta { shape_a, shape_b })
}
}
} Maybe this example is too verbose (i.e. maybe This will return very specific errors, and users that don't care about it can still If new methods of creating a distribution are added later, the related error type could be extended or a separate error type, specific to the method of construction, could be added. It will need a fair bit of boiler plate, especially now that it needs to be implemented all at once, but is otherwise very simple to implement. A crate like I might be missing something here, but this is the obvious KISS solution IMHO |
Ah, it's true that the generic part can only come up once the type is correctly initialized. I'm good with only specifying which parameter is invalid, so we can just have one variant per parameter, the one message denoting that one of its (potentially multiple) constraints was violated. impl std::fmt::Display for BetaError {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self {
BetaError::ShapeAInvalid => write!(f, "Shape A must be positive finite"),
BetaError::ShapeBInvalid => write!(f, "Shape B must be positive finite"),
}
}
} It will be a little boilerplatey, but it need not be flexible, sounds like a text editor problem. |
Alright, I'll prepare another draft pr to compare against the rest if you don't mind |
Thanks for this, I'll keep an eye out. |
enum StatsError
is supposed to be anThis indicates that it should not try to be a generic error type for any kind of statistics calculations, but instead only concern itself with the errors produced in
statrs
.With that basic assumption, there are currently some inconsistencies and outdated practices though (API guidelines for reference):
StatsError
implements bothSync
andSend
(this is more of a formality, but is trivial to implement and good future-proofing)Error::description
is deprecated and should not be implementedArgNotNegative
,ArgIntervalExclMin
, etc.), these are leftovers from older versions and should be removed because they're no longer needed bystatrs
.Result<T>
that cannot return an error:Empirical::new
. There's no real reason for this infallible API to return aResult<T, E>
. There might be others.I realize that most of these are breaking changes, but seeing that the crate is pre-1.0, I don't think there's a big problem doing this.
Other things that could be improved:
StatsError
is big: 40 bytes on a Linux x64 target. This is because there are variants which contain 2x&str
(2 x 16 = 32 bytes plus discriminant and padding). Is it really necessary to have strings in the error type? The implementation could be replaced mostly 1:1 with some sort ofArgName
enum, but there might be an even better solution that does not need this argument name wrapping at all.All
new
functions seem to just returnStatsError::BadParams
whenever the params are outside of the defined/allowed range. Is there a good reason for these to be so vague when compared to the more specific errors returned by other functions? After all, the more specific errors already exist, why not return more exact error information? There might even be value in providing multiple error types, to have errors that are more specific to the exact API being used.The text was updated successfully, but these errors were encountered: