Roadmap #566

Stranger6667 · 2024-10-04T22:39:41Z

This is a somewhat detailed roadmap for the development of this crate as I'd like to be transparent about what changes the user may anticipate. I will update it roughly once a month with all the main milestones.

The current version is 0.26.0. This plan is not set in stone, and some other features could be implemented outside of these milestones (e.g. regex config support, or moving some dependencies behind features).

❤️ If you'd like to help the development, consider sponsoring me (like Sentry does)

Full JSON Schema support - DONE 🎉

Missing non-optional test cases:

6 failing unevaluatedItems ($ref, $recursiveRef & $dynamicRef support)
vocabulary
2 failing unevaluatedProperties ($ref ordering issue)
1 failing $ref in Draft 2019-09

Some optional tests will be resolved along with the ones above. However, some bignum tests for Draft 4 depend on parsing numbers with arbitrary precision, which I plan to address later.

UPDATE 2024-10-12: Most of the unevaluatedItems keyword logic is implemented and only referencing support is left.
UPDATE 2024-10-20: Vocabulary support is done, now there are only 3 required tests not passing. These are bugs in the current implementation, so I'd consider support of Drafts 2019-09 & 2020-12 as done and will fix those cases later on.
UPDATE 2024-10-24: Finished the last few remaining cases.

Better errors

The main issue with errors is that validate returns an iterator, but I'd rather have validate + iter_errors, where the former returns Result<(), ValidationError> and the latter returns the iterator over all errors.
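A minimal sketch of that split, assuming a toy validator over integer instances. The `Check` enum, the field names, and the exact signatures are illustrative assumptions and not the crate's actual API. The error type owns its data, so it carries no lifetime, and `iter_errors` borrows the validator and yields errors lazily instead of collecting them into a vector.

```rust
/// Owned error: it borrows nothing, so it can outlive both the
/// instance and the validator (hypothetical shape, not the crate's type).
#[derive(Debug, Clone, PartialEq)]
pub struct ValidationError {
    pub keyword: &'static str,
    pub message: String,
}

/// Hypothetical sub-validators for an integer instance.
pub enum Check {
    Minimum(i64),
    Maximum(i64),
    MultipleOf(i64),
}

impl Check {
    /// Returns an error if this single check fails, `None` otherwise.
    fn run(&self, instance: i64) -> Option<ValidationError> {
        match *self {
            Check::Minimum(min) if instance < min => Some(ValidationError {
                keyword: "minimum",
                message: format!("{} is less than the minimum of {}", instance, min),
            }),
            Check::Maximum(max) if instance > max => Some(ValidationError {
                keyword: "maximum",
                message: format!("{} is greater than the maximum of {}", instance, max),
            }),
            Check::MultipleOf(n) if n != 0 && instance % n != 0 => Some(ValidationError {
                keyword: "multipleOf",
                message: format!("{} is not a multiple of {}", instance, n),
            }),
            _ => None,
        }
    }
}

pub struct Validator {
    checks: Vec<Check>,
}

impl Validator {
    pub fn new(checks: Vec<Check>) -> Self {
        Self { checks }
    }

    /// Lazily yields every error. The iterator is bound to the validator
    /// because it walks the validator's own sub-checks.
    pub fn iter_errors(&self, instance: i64) -> impl Iterator<Item = ValidationError> + '_ {
        self.checks.iter().filter_map(move |check| check.run(instance))
    }

    /// Stops at the first error instead of visiting every check.
    pub fn validate(&self, instance: i64) -> Result<(), ValidationError> {
        match self.iter_errors(instance).next() {
            Some(error) => Err(error),
            None => Ok(()),
        }
    }
}
```

With `Validator::new(vec![Check::Minimum(10), Check::MultipleOf(3)])`, calling `validate(4)` returns only the "minimum" error, while `iter_errors(4)` yields both failures.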

As the end goal the errors should:

Support customization of error messages
Be easily usable without lifetime issues

I'd also think about separating schema & instance errors.

It would be nice to rework the error iterator, so it is bound to the validator instance (as it uses its sub-validators). There are some performance issues I'd like to fix, i.e. collecting errors into vectors during validation.

UPDATE 2024-10-26: The older validate version is replaced with validate + iter_errors

Non-blocking resolving

Right now, the resolving of external references happens during the building phase, which makes it relatively straightforward to implement non-blocking resolving (actually, I'd like to retrieve resources in batches rather than sequentially too).
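To illustrate the batching idea only: the sketch below fetches all pending URIs of one batch concurrently rather than one by one. The `Retrieve` trait, `retrieve_batch`, and `InMemory` are hypothetical names, and scoped threads stand in for real non-blocking I/O, which would more likely use an async runtime.

```rust
use std::collections::HashMap;
use std::thread;

/// Hypothetical retrieval interface; name and signature are assumptions.
pub trait Retrieve: Sync {
    fn retrieve(&self, uri: &str) -> Result<String, String>;
}

/// Fetches every URI of one batch concurrently and returns the
/// outcome per URI, so one failing resource does not hide the others.
pub fn retrieve_batch<R: Retrieve>(
    retriever: &R,
    uris: &[&str],
) -> HashMap<String, Result<String, String>> {
    thread::scope(|scope| {
        // Spawn all requests first ...
        let handles: Vec<_> = uris
            .iter()
            .map(|&uri| (uri, scope.spawn(move || retriever.retrieve(uri))))
            .collect();
        // ... then wait for each of them.
        handles
            .into_iter()
            .map(|(uri, handle)| {
                let outcome = handle
                    .join()
                    .unwrap_or_else(|_| Err("retriever panicked".to_string()));
                (uri.to_string(), outcome)
            })
            .collect::<HashMap<_, _>>()
    })
}

/// In-memory retriever used only to exercise the sketch.
pub struct InMemory(pub HashMap<String, String>);

impl Retrieve for InMemory {
    fn retrieve(&self, uri: &str) -> Result<String, String> {
        self.0
            .get(uri)
            .cloned()
            .ok_or_else(|| format!("unknown resource: {}", uri))
    }
}
```

Since references are discovered during the building phase, a builder could gather all external URIs found in the documents loaded so far, pass them to something like `retrieve_batch`, and repeat until no new references appear.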

Along with this change, I'd like to expose custom resolvers to Python bindings + hopefully with async too, but not 100% sure about this.

Output formats

Adopt naming from the "next" draft + implement a hierarchical style. At this point, I want to separate all the annotation storage from the main graph, as it implies overhead for is_valid and validate.

Rework Python bindings

For a long time, I've wanted to implement generic JSON input, and I'll try to work on it at this stage. In the Python bindings, the overhead of data (de)serialization while building a validator or validating is sometimes up to 80%, so generic input should speed things up considerably.
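One way to picture generic input is a trait that keyword checks are written against, so values owned by a host runtime can be inspected in place instead of being converted first. This is a sketch under assumptions: `JsonLike`, `Value`, and `HostObject` are made-up names, and the real design may look quite different.

```rust
/// Hypothetical read-only view over any JSON-like value.
pub trait JsonLike {
    fn as_string(&self) -> Option<&str>;
    fn as_integer(&self) -> Option<i64>;
    fn array_len(&self) -> Option<usize>;
}

/// An owned tree, similar to what a JSON parser produces.
pub enum Value {
    Null,
    Integer(i64),
    String(String),
    Array(Vec<Value>),
}

impl JsonLike for Value {
    fn as_string(&self) -> Option<&str> {
        match self {
            Value::String(text) => Some(text.as_str()),
            _ => None,
        }
    }
    fn as_integer(&self) -> Option<i64> {
        match self {
            Value::Integer(number) => Some(*number),
            _ => None,
        }
    }
    fn array_len(&self) -> Option<usize> {
        match self {
            Value::Array(items) => Some(items.len()),
            _ => None,
        }
    }
}

/// Stand-in for data that lives in a host language runtime and is
/// only borrowed here, never copied into a `Value`.
pub enum HostObject<'a> {
    Int(i64),
    Text(&'a str),
    List(&'a [HostObject<'a>]),
}

impl JsonLike for HostObject<'_> {
    fn as_string(&self) -> Option<&str> {
        match self {
            HostObject::Text(text) => Some(*text),
            _ => None,
        }
    }
    fn as_integer(&self) -> Option<i64> {
        match self {
            HostObject::Int(number) => Some(*number),
            _ => None,
        }
    }
    fn array_len(&self) -> Option<usize> {
        match self {
            HostObject::List(items) => Some(items.len()),
            _ => None,
        }
    }
}

/// `maxLength` written once against the trait; non-strings pass.
pub fn check_max_length<J: JsonLike>(instance: &J, limit: usize) -> bool {
    match instance.as_string() {
        Some(text) => text.chars().count() <= limit,
        None => true,
    }
}

/// `maxItems` written once against the trait; non-arrays pass.
pub fn check_max_items<J: JsonLike>(instance: &J, limit: usize) -> bool {
    match instance.array_len() {
        Some(len) => len <= limit,
        None => true,
    }
}
```

The same `check_max_length` call accepts `&Value::String(..)` and `&HostObject::Text(..)`, which is the property that would remove the conversion step for bindings.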

Arbitrary precision also goes here. I'd like to have it behind a flag, so it could be disabled by default but enabled in the Python bindings.

WASM

It should "just work", maybe except for filesystem resolving, but network stuff could work via web_sys (I've tried it in css-inline).

Run tests on WASM as a part of CI
Demo website with WASM

Performance, performance, performance

I have tons of ideas about it as my main use case is to speed up the generation of instances that match a schema (in hypothesis-jsonschema & schemathesis), hence I am going to focus on the is_valid performance because the exact errors don't matter in this case.
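A rough illustration of why is_valid can be cheaper than producing errors, using a toy set of string checks. All names here are hypothetical. The boolean path stops at the first failing check and never formats a message, which fits a generate-and-filter loop where only the yes/no answer matters.

```rust
/// Hypothetical keyword checks over a string instance.
pub enum Check {
    MinLength(usize),
    MaxLength(usize),
    Prefix(String),
}

impl Check {
    /// Fast path: a plain boolean, no error values, no allocation.
    fn is_valid(&self, instance: &str) -> bool {
        match self {
            Check::MinLength(limit) => instance.chars().count() >= *limit,
            Check::MaxLength(limit) => instance.chars().count() <= *limit,
            Check::Prefix(prefix) => instance.starts_with(prefix.as_str()),
        }
    }

    /// Slow path: allocates and formats a message on failure.
    fn validate(&self, instance: &str) -> Result<(), String> {
        if self.is_valid(instance) {
            return Ok(());
        }
        Err(match self {
            Check::MinLength(limit) => format!("{:?} is shorter than {}", instance, limit),
            Check::MaxLength(limit) => format!("{:?} is longer than {}", instance, limit),
            Check::Prefix(prefix) => format!("{:?} does not start with {:?}", instance, prefix),
        })
    }
}

/// Short-circuits on the first failing check.
pub fn is_valid(checks: &[Check], instance: &str) -> bool {
    checks.iter().all(|check| check.is_valid(instance))
}

/// Visits every check and collects all messages.
pub fn collect_errors(checks: &[Check], instance: &str) -> Vec<String> {
    checks
        .iter()
        .filter_map(|check| check.validate(instance).err())
        .collect()
}

/// Generate-and-filter: keep only the candidates that match.
pub fn keep_matching<'a>(checks: &[Check], candidates: &[&'a str]) -> Vec<&'a str> {
    candidates
        .iter()
        .copied()
        .filter(|candidate| is_valid(checks, candidate))
        .collect()
}
```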


jpmckinney · 2024-12-22T00:14:31Z

> Along with this change, I'd like to expose custom resolvers to Python bindings + hopefully with async too, but not 100% sure about this.

I presently use python-jsonschema, where I use custom resolvers for a few reasons:

1. The default resolver follows file:// URLs. If an application dereferences and renders a user-provided schema, this can be used to read any JSON file on the filesystem to which the web process has access. I therefore use a custom resolver that only resolves HTTP and HTTPS URLs (this is vulnerable to server-side request forgery, but I'm less worried about that).
2. I work with JSON Schema where users are allowed to "patch" the default schema, using JSON Merge Patch (RFC 7386). I use python-jsonschema's registry to make references to the default schema resolve to the patched schema.

It would be really nice to be able to control (1).
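For illustration, the restriction in (1) could look roughly like the wrapper below, which refuses any URI whose scheme is not on an allowlist before delegating to an inner retriever. `SchemeAllowlist` and `scheme_of` are hypothetical names, not part of any existing binding, and the scheme parsing is a simplified reading of the URI grammar.

```rust
/// Extracts the lowercased scheme of an absolute URI, or `None` if the
/// text before the first ':' is not a syntactically valid scheme.
pub fn scheme_of(uri: &str) -> Option<String> {
    let (scheme, _rest) = uri.split_once(':')?;
    let mut chars = scheme.chars();
    let first = chars.next()?;
    if !first.is_ascii_alphabetic() {
        return None;
    }
    if !chars.all(|c| c.is_ascii_alphanumeric() || matches!(c, '+' | '-' | '.')) {
        return None;
    }
    Some(scheme.to_ascii_lowercase())
}

/// Hypothetical wrapper: only forwards URIs with an allowed scheme.
pub struct SchemeAllowlist<F> {
    allowed: Vec<String>,
    inner: F,
}

impl<F: Fn(&str) -> Result<String, String>> SchemeAllowlist<F> {
    pub fn new(allowed: &[&str], inner: F) -> Self {
        Self {
            allowed: allowed.iter().map(|s| s.to_ascii_lowercase()).collect(),
            inner,
        }
    }

    pub fn retrieve(&self, uri: &str) -> Result<String, String> {
        match scheme_of(uri) {
            Some(scheme) if self.allowed.contains(&scheme) => (self.inner)(uri),
            Some(scheme) => Err(format!("scheme '{}' is not allowed: {}", scheme, uri)),
            None => Err(format!("not an absolute URI: {}", uri)),
        }
    }
}
```

With an allowlist of "http" and "https", a reference such as file:///etc/secrets.json is rejected before any I/O happens. As noted in (1), this does nothing against server-side request forgery; that would need host-level checks in the inner retriever.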

For (2), would a workaround be to provide only dereferenced schema to jsonschema_rs?
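For reference, the patching step in (2) follows the MergePatch algorithm from the RFC. The sketch below applies it to a small hand-rolled JSON type; the `Json` enum and the `obj` helper are assumptions made only to keep the example self-contained. The patched result is what would then be handed to the validator or registered in place of the default schema.

```rust
use std::collections::BTreeMap;

/// Minimal JSON value, defined here only for the sketch.
#[derive(Debug, Clone, PartialEq)]
pub enum Json {
    Null,
    Bool(bool),
    Number(f64),
    String(String),
    Array(Vec<Json>),
    Object(BTreeMap<String, Json>),
}

/// Convenience constructor for objects (hypothetical helper).
pub fn obj(pairs: Vec<(&str, Json)>) -> Json {
    Json::Object(
        pairs
            .into_iter()
            .map(|(name, value)| (name.to_string(), value))
            .collect(),
    )
}

/// JSON Merge Patch: an object patch is merged member by member, a
/// `null` member deletes the key, and any other patch replaces the target.
pub fn merge_patch(target: Json, patch: Json) -> Json {
    let patch_members = match patch {
        Json::Object(members) => members,
        other => return other,
    };
    // A non-object target is treated as an empty object.
    let mut members = match target {
        Json::Object(members) => members,
        _ => BTreeMap::new(),
    };
    for (name, value) in patch_members {
        if matches!(value, Json::Null) {
            members.remove(&name);
        } else {
            // A missing member behaves like a non-object target.
            let current = members.remove(&name).unwrap_or(Json::Null);
            members.insert(name, merge_patch(current, value));
        }
    }
    Json::Object(members)
}
```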

Stranger6667 · 2024-12-22T13:41:34Z

Hey @jpmckinney

Thank you for bringing this up!

It's good to know this exact use case. I think that exposing resolvers should not be an issue and could be implemented similarly to how custom format validators work right now.
I am not 100% sure about a workaround (if you could provide an example, it would help), however, as this project practically followed the same registry-based design for references, I think we can expose the same functionality to Python bindings and it should be enough for your use case. It also seems like the feature is quite related to Validating individual definitions #432 & Support validating Open API sub-schemas #452

Stranger6667 pinned this issue Oct 4, 2024

Stranger6667 mentioned this issue Oct 4, 2024

[Question] Plans for this library #455

Closed
