Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Structured error reporting and exhaustive parameter validation #234

Merged
merged 21 commits into from
Feb 17, 2023

Conversation

mih
Copy link
Member

@mih mih commented Feb 13, 2023

This is a further improvement of the constraints module (ping #115).

There is a new base class ConstraintError that can be used be
Constraint implementations to communicate validation errors. it
derived from ValueError (the former and current choice for raising
exceptions on validation errors) to keep except clauses in consumer
code valid.

With ConstraintErrors there is a second base class that can be used
for associating constraint violations with a particular context. The
specialization CommandParametrizationError uses this pattern to
communicate constraint violations for a full command parameterization
at once (for each parameter individually, and for the joint parameter
set in addition).

An improved EnsureCommandParameterization constraint uses these
new possibilities to offer more uniform and more detailed information
on parameter validations.

With this change, a caller can now also decide whether to perform an
exhaustive parameter validation, or to fail on first error.

The Constraint base class has received a new convenience method
raise_for() that can be used to raise the new ConstraintError
with minimal effort. Only a message (as before with ValueError),
and the value that was found to be in violation need to be passed.

Thew new approach to exception raising is demo'ed for AltConstraints,
and EnsureChoice. Beyond that, no other constraints have been adjusted
yet.

Closes #198
Closes #199 (ping #141)

bpoldrack
bpoldrack previously approved these changes Feb 13, 2023
Copy link
Member

@bpoldrack bpoldrack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense at looks good to me.

@codecov
Copy link

codecov bot commented Feb 14, 2023

Codecov Report

Base: 51.68% // Head: 51.47% // Decreases project coverage by -0.21% ⚠️

Coverage data is based on head (557521f) compared to base (796a639).
Patch coverage: 44.73% of modified lines in pull request are covered.

❗ Current head 557521f differs from pull request most recent head b6d0ea2. Consider uploading reports for the commit b6d0ea2 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #234      +/-   ##
==========================================
- Coverage   51.68%   51.47%   -0.21%     
==========================================
  Files          42       43       +1     
  Lines        3326     3419      +93     
==========================================
+ Hits         1719     1760      +41     
- Misses       1607     1659      +52     
Impacted Files Coverage Δ
datalad_next/constraints/basic.py 46.84% <0.00%> (ø)
datalad_next/patches/create_sibling_ghlike.py 24.44% <0.00%> (-0.56%) ⬇️
datalad_next/constraints/parameter.py 26.72% <18.18%> (-7.40%) ⬇️
datalad_next/constraints/base.py 54.16% <33.33%> (-1.06%) ⬇️
datalad_next/patches/interface_utils.py 75.00% <50.00%> (-1.20%) ⬇️
datalad_next/constraints/exceptions.py 65.95% <65.95%> (ø)
datalad_next/commands/__init__.py 82.35% <75.00%> (-2.27%) ⬇️
datalad_next/commands/create_sibling_webdav.py 31.00% <100.00%> (ø)
datalad_next/commands/credentials.py 31.00% <100.00%> (ø)
datalad_next/constraints/__init__.py 100.00% <100.00%> (ø)
... and 1 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@mih mih force-pushed the validation branch 4 times, most recently from 79a05c9 to a3a06ab Compare February 16, 2023 21:32
mih added 20 commits February 17, 2023 09:45
Even if they are provided by other parts of datalad_next, this helps to
keep the `contraints` module more self-contained. All constraints
should import outward-facing exceptions (i.e., those raised, and not
caught internally, to communicate particular issues).
This is a further improvement of the `constraints` module (ping datalad#115).

There is a new base class `ConstraintError` that can be used be
`Constraint` implementations to communicate validation errors. it
derived from `ValueError` (the former and current choice for raising
exceptions on validation errors) to keep `except` clauses in consumer
code valid.

With `ConstraintErrors` there is a second base class that can be used
for associating constraint violations with a particular context. The
specialization `CommandParametrizationError` uses this pattern to
communicate constraint violations for a full command parameterization
at once (for each parameter individually, and for the joint parameter
set in addition).

An improved `EnsureCommandParameterization` constraint uses these
new possibilities to offer more uniform and more detailed information
on parameter validations.

With this change, a caller can now also decide whether to perform an
exhaustive parameter validation, or to fail on first error.

The `Constraint` base class has received a new convenience method
`raise_for()` that can be used to raise the new `ConstraintError`
with minimal effort. Only a message (as before with `ValueError`),
and the value that was found to be in violation need to be passed.

Thew new approach to exception raising is demo'ed for `AltConstraints`,
and `EnsureChoice`. Beyond that, no other constraints have been adjusted
yet.

Closes datalad#198
Closes datalad#199 (ping datalad#141)
This extends and formalizes the validation of parameter sets (beyond
independent validation of individual parameters). The most significant
change here is the introduction of `ParameterConstraintContext`,
a dataclass representing such a validation context (see docstring).

Based on this new class, all other plain-text parameter label
identifiers have also been replaced with this class. This makes
reporting more uniform, and also makes the `__all__` context label
obsolete.

A basic test to demo this is included.
This is a new key type to be used in most command parameter
validation code.
Instead of a single monolithic `joint_validation()` method, the tests
are now broken into much smaller pieces. They are still incremental
and ordered, hence coercion and checks can be kept in distinct
implementations as needed. This included the ability to re-use
existing/generic `Constraint` implementations for higher-order
checks, and still have their (generic/context-free) error messages make
sense.

There is not need for the `ParamDictator` utility anymore, because all
args come as keyword args, and only relevant args needs to be
anticipated.

Error message generation no longer requires distributed string
interpolation, but messages are more or less constant -- and the
describe what has violated, which context, and how.

From a CLI point-of-view, this changes results in

```
❯ datalad credentials wrong
[ERROR  ] 1 command parameter constraint violation
action
  is not one of ('query', 'get', 'set', 'remove')

❯ datalad credentials remove
[ERROR  ] 1 command parameter constraint violation
action, name (remove-action requirements)
  no credential name provided

❯ datalad credentials get
[ERROR  ] 1 command parameter constraint violation
action, name, spec (get-action requirements)
  no name or credential property specified
```
Add test to show where that matters.
while at it, also improve the messaging to be more comprehensive
in case an actual closed interval is to be matched.
This shall be the place to implement any querying that would be needed
by consuming APIs for adequate error reporting/rendering.
This is an anticipation of a future implementation change,
where mutually independent checks are executed in parallel.

In such a case, 'immediately' does not make any sense anymore.
Likewise, 'raise-before-joint' is not necessary, because
via the `ParameterConstraintContext` specifications we know
exactly which validations depend on which parameters, and
we can reliably determine which ones can still make sense
to run (or not, because an earlier validation step for a
parameter has already failed, and we cannot assume that the
input assumptions of a later check are valid).

This leaves: 'raise-early', 'raise-at-end'
@mih mih merged commit d1793a3 into datalad:main Feb 17, 2023
@mih mih deleted the validation branch February 17, 2023 13:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants