Retool how validators register and are called: everything's a tag! #89

thockin · 2025-01-04T17:25:51Z

Looking at how to not fail silently when tags are used wrong. I started adding more lint rules for things like "you can't use +optional on a type". It struck me that this is something that should be handled more first-class, rather than ex post facto. We will inevitably forget to add the lint rules for some tags.

We have the tag docs struct which includes a "contexts" list, which indicates when a tag is usable. I tried automating that check as a lint rule and it's not ideal as-is. It also highlighted that most of our tag docs are just WRONG (cargo culted) and that the context definitions we have are not really great.

Examples:

optional/required should only be on types
eachVal/eachKey claim they can be used on list values (they can't today)
listMapKey claims it can be used on list values (I don't think that makes sense?)
subfield claims it can be used on list values (I don't think it can)
unionDiscriminator/unionMember claim they can be used on list values (I don't think that makes sense?)

Lastly, each ExtractValidations() call doesn't have enough information to DTRT even if we fixed the above.

This PR started with expanding the contexts to distinguish struct-fields from list/map keys/values. It evolved into a different way of approaching tags.

Rather than each plugin's ExtractValidations() being responsible for checking that it is used in an appropriate context (and getting new args to indicate the extra information), I tried to do something a little more "scaffolding" centric. It changes the init sequence to register validators which include the tag string and contexts (basically the docs struct, but more fine-grained), but also having generic code look up the tags, ensure correct context, and then call a method to produce the specific FunctionGen objects.

Basically, centralize the tag-processing. Now generation will fail with errors like:

failed to generate validations: k8s.io/code-generator/cmd/validation-gen/output_tests/optional.T1: tag "k8s:optional" cannot be specified on type definitions

or

failed to generate validations: field k8s.io/code-generator/cmd/validation-gen/output_tests/options.T1.S2: tag "k8s:ifOptionDisabled": takes exactly 1 argument

...but the optional plugin didn't have to do the work. Some plugins still need to do work, like if they only apply to strings, but that seems OK to me.

This PR includes all my commits, even when later commits undo or change part of it, so readers can see the thought process. In particular, this centralized tag handling is insufficient for things like union which needs to run on the TYPE, but the tags are on FIELDs.

Now there are "tag validators" and "type validators" (which run AFTER child fields). Union has tag handlers that are used to validate context (e.g. "+union only applies to struct fields") and accumulate info, and a type handler which uses the accumulated info.

I wanted to move registry to a different package, but to avoid a circular dep, it has to become 3 packages, and it didn't seem worthwhile.

thockin · 2025-01-06T01:10:16Z

For giggles I added some WIP commits to make +eachVal a "regular" tag. To do that, I had to do listMapKey and listType, too.

This probably could have been simpler, and list-map could be treated "specially" (with info passed to validators), but I wanted to see if I could do it functional-style.

It sorta works. There are some issues around pointerness, still, and it's not as understandable as I'd like, but I think you can get the idea. Basically there's a family of list-oriented validators which work together to collect info about keys and then extract old values when we can.

The big remaining problem is that tags are discovered in sort-order, which means that +eachVal is run before +listMapKey. I see two ways to solve it:

internally -- collect a list of funcs to call from the new FieldValidator, which runs after all tags.
let tags declare a sort order for discover (they already do for execution). E.g. "schematic" vs "content" (words TBD).

If we like the approach, we can see how subfield would work, too.

thockin · 2025-01-06T17:20:55Z

For more giggles, I made +k8s:subfield(ls)=+k8s:eachVal2=+k8s:format=ip-sloppy work - that's running on each value of a subfield of list type.

I can't yet reverse that, until I convert subfield to a regular tag.

staging/src/k8s.io/code-generator/cmd/validation-gen/validators/registry.go

staging/src/k8s.io/code-generator/cmd/validation-gen/validators/options.go

staging/src/k8s.io/code-generator/cmd/validation-gen/validators/validators.go

jpbetz · 2025-01-13T19:16:01Z

I've caught up on the commits. I'll put some comments in the code, but would like to start with some high level thoughts:

The improvements to how validators are registered and how context scopes are checked all looks great.

Separating out TypeValidators looks fine. Unions are the only tag we have (that I'm aware of) that needs to inspect multiple tags.

Re: Making each* normal tags- The idea of using our tag discovery in other generators (like openapi, defaulting, maybe even conversion) that we talked about last week is something I'd like to circle back on here. Currently, we have a nice separation between the discovery pass and the emit pass. But the childNode data structure that the passes share is specialized to validation. I like how making each* normal tags moves us in the direction of having less validation specific information in childNode. So I'm happy with this PR in the sense that it moves us in the right general direction.

I do wonder if we're doing to end up needing a third pass to make this all clean? Currently we have:

(gengo universe) -> discovery pass -> (childNode) -> validation emission pass -> (validation code)

Unless we add more passes and intermediate data structures, any validation specific code in childNode that we want to remove only has one places to go: the emission pass. Should we consider having additional passes? Maybe something more like:

(gengo universe) -> discovery pass -> (childNode) -> validation prep pass -> (validationChildNode) -> validation emission pass -> (validation code)

This would allow discovery pass to be fully generalized to use with other generators, while still giving us a data structure that we can use to fully represent the validation code that we'd like to emit.

I don't think we should hold this PR up on this idea. But this seems like a good place to discuss the longer term goals.

thockin · 2025-01-13T19:37:20Z

ACK the sentiment! I think the intermediate that you are looking for is ultimately going to be the "iptables" version of this. Since discussion on Friday, I have been thinking a lot about it, and I think I see how to get there, but I want to focus on delivering SOMETHING before retooling it all again. I think this "regularity" of tags helps.

I have spent some time hacking on weekend, cleaning up the output tests for tags (found some bugs!!) and when I am done witht htat I will rip out all the now unused generator code, but in discovery and emit passes. I think the result will make it plain how we can proceed.

Cleanup and tests will be in subsequent commits.

Upon trying to clean up the now obsolete builti-in code, I realized I had totally missed the non-tag iteration of lists (e.g. calling the type-validation func for each item). Fixing that was deeper than I expected. After this: - the eachVal/eachKey validators expose a helper function for use by main generation code (they are special). - discover() looks for the case of aliases to maps/slices and adds type-validation calls to "for each" the key/val types' validation funcs. - discoverStruct() does the same for map/slice fields - discoverAlias() is gone, it had been reduced to ~5 LOC and a comment. - all built-in iteration logic is gone, in favor of the eachVal and eachKey tag logic - safe.NewListMap is gone - hasValidations() is now a global rather than only available to generation-phase (with hazard documented) - the last built-in tag is gone - Some edge cases we just don't support (because it's hard and we don't need them): - alias to pointer - pointer to alias of slide/map/pointer I went through all the test differences and made sure they made sense. In hindsight, these issues should have been obvious in earlier commits. There's still some room for cleanup after this.

jpbetz · 2025-02-07T21:08:56Z

LGTM

I'm all caught up. This is excellent progress. Let's iterate on the dev branch from here

caozhuozi · 2025-02-08T03:31:24Z

Hi, sry to borther! I came across this PR and I'm interested in this topic. Is there any KEP that provides background information on it?

yongruilin · 2025-02-08T03:52:42Z

Hi, sry to borther! I came across this PR and I'm interested in this topic. Is there any KEP that provides background information on it?

Hi @caozhuozi , you can check on kubernetes/enhancements#5074 . Thanks for your interests.

caozhuozi · 2025-02-08T04:18:34Z

@yongruilin Thank you!!

thockin assigned jpbetz, aaron-prindle and yongruilin Jan 4, 2025

thockin changed the title ~~Retool how validators register and are called~~ DNM: Retool how validators register and are called Jan 4, 2025

thockin force-pushed the validation-gen_tag-descriptors branch from 5165747 to ca520e4 Compare January 6, 2025 01:02

thockin force-pushed the validation-gen_tag-descriptors branch from ca520e4 to 4e424a7 Compare January 6, 2025 17:19

thockin force-pushed the validation-gen_tag-descriptors branch 2 times, most recently from 2d63642 to 0cc2ffa Compare January 8, 2025 08:01

yongruilin reviewed Jan 8, 2025

View reviewed changes

thockin force-pushed the validation-gen_tag-descriptors branch 17 times, most recently from 188a158 to cb73101 Compare January 13, 2025 18:29

thockin added 24 commits February 7, 2025 12:17

Make +eachVal a regular tag

38a7290

Cleanup and tests will be in subsequent commits.

Add tag tests for +eachVal for lists

43c1369

Regenerate

78d9ba0

Add tag tests for +eachVal for maps

645b233

Regenerate

36b33be

Add tag tests for +eachVal for list-maps

adfa4f4

Regenerate

13d7ea9

Make +eachKey a regular tag

0af2f09

Cleanup and tests will be in subsequent commits.

Add tag tests for +eachKey for maps

eee05be

Regenerate

ed4df73

Make +subfield a regular tag

00af5b3

Cleanup and tests will be in subsequent commits.

Add tag tests for +subfield

05130d9

Regenerate

6390280

Clean up old eachKey code

7e8f2b5

Regenerate

8a87a9b

Clean up old eachVal code

a7d8fea

Regenerate

9b5dd66

Clean up old subfield code

248a975

Test scheme: update args in the wrong order

6bf363a

Get rid of forEachField, simpler

6ca9e0e

Clean up tag name consts

19216b2

If WrapperFunction has no args, just call the func

2a39408

Fix lint to use proper +default tag

45c00a6

thockin force-pushed the validation-gen_tag-descriptors branch from 34faa1e to 45c00a6 Compare February 7, 2025 20:17

jpbetz merged commit b1f7440 into jpbetz:validation-gen Feb 7, 2025
1 check failed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Retool how validators register and are called: everything's a tag! #89

Retool how validators register and are called: everything's a tag! #89

thockin commented Jan 4, 2025 •

edited

Loading

thockin commented Jan 6, 2025

thockin commented Jan 6, 2025 •

edited

Loading

jpbetz commented Jan 13, 2025 •

edited

Loading

thockin commented Jan 13, 2025

jpbetz commented Feb 7, 2025

caozhuozi commented Feb 8, 2025

yongruilin commented Feb 8, 2025

caozhuozi commented Feb 8, 2025

Retool how validators register and are called: everything's a tag! #89

Retool how validators register and are called: everything's a tag! #89

Conversation

thockin commented Jan 4, 2025 • edited Loading

thockin commented Jan 6, 2025

thockin commented Jan 6, 2025 • edited Loading

jpbetz commented Jan 13, 2025 • edited Loading

thockin commented Jan 13, 2025

jpbetz commented Feb 7, 2025

caozhuozi commented Feb 8, 2025

yongruilin commented Feb 8, 2025

caozhuozi commented Feb 8, 2025

thockin commented Jan 4, 2025 •

edited

Loading

thockin commented Jan 6, 2025 •

edited

Loading

jpbetz commented Jan 13, 2025 •

edited

Loading