-
Notifications
You must be signed in to change notification settings - Fork 908
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Strict Metaschema Validation #1101
Conversation
d4cdf05
to
0b93060
Compare
Thanks for this @holmanb. I'm just starting the review amd will have some significant feedback tomorrow. In your suggested commit message could you capture the intent behind the strict_metaschema being a testing time validation of config module definitions rather than runtime validation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@holmanb thanks for putting this together. A couple of minor requests inline and a general question for you about moving toward pytest to avoid all the stdout/error context manager work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few inline questions/comments, but overall looking good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think do need to include property descriptions in docs. We might be able to drop the individual module documentation from the command-line, but we need to fixup the argparse setup for the devel schema subcommand.
b4443e7
to
89ebf39
Compare
1ab87a1
to
3779ff2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @holmanb any of these remaining comments are nits and I can +1 once you make any changes you think are relevant. Thank you for the restructure, I agree it is easier to read/maintain here and functionally works will with no degradation of existing behavior that I can see.
Approve pending any comments you decide to address and a passing CI run
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Combined diff of my musings for your thoughts. I think we want to minimally make the changes to simplify the get_json_validator return value to a 2-tuple instead of 3-tuple. But other diffs are included in case you find them reasonable.
diff --git a/cloudinit/config/schema.py b/cloudinit/config/schema.py
index 61f8cd9d..0c46cbdc 100644
--- a/cloudinit/config/schema.py
+++ b/cloudinit/config/schema.py
@@ -100,40 +100,48 @@ def get_jsonschema_validator():
# assignment-operation pylint warning which appears because this
# code path isn't written against the latest jsonschema).
types['string'] = (str, bytes) # pylint: disable=E1137
+ # disable bottom-level keys in meta-schema
+ strict_metaschema = deepcopymeta_schema=Draft4Validator.META_SCHEMA)
+ strict_metaschema["additionalProperties"] = False
cloudinitValidator = create(
- meta_schema=Draft4Validator.META_SCHEMA,
+ meta_schema=strict_metaschema,
validators=Draft4Validator.VALIDATORS,
version="draft4",
default_types=types)
- return (cloudinitValidator, FormatChecker, None)
+ return (cloudinitValidator, FormatChecker)
-def validate_cloudconfig_metaschema(schema: dict, throw=True):
+def validate_cloudconfig_metaschema(
+ validator, schema: dict, throw: bool = True
+):
"""Validate provided schema meets the metaschema definition. Return strict
Validator and FormatChecker for use in validation
+ @param validator: Draft4Validator instance used to validate the schema
@param schema: schema to validate
@param throw: Sometimes the validator and checker are required, even if
- the schema is invalid. Toggle for whether to raise SchemaErrors
-
- @returns: Tuple(jsonschema.Validator, jsonschema.FormatChecker, err)
+ the schema is invalid. Toggle for whether to raise
+ SchemaValidationError orlog warnings.
@raises: ImportError when jsonschema is not present
- @raises: jsonschema.exceptions.SchemaError if the schema is invalid
+ @raises: SchemaValidationError when the schema is invalid
"""
from jsonschema.exceptions import SchemaError
- (cloudinitValidator, FormatChecker, _) = get_jsonschema_validator()
try:
-
- # disable bottom-level keys
- cloudinitValidator.META_SCHEMA["additionalProperties"] = False
- cloudinitValidator.check_schema(schema)
- except SchemaError as e:
+ validator.check_schema(schema)
+ except SchemaError as err:
+ # Raise SchemaValidationError so we don't have jsonschema imports
+ # at call sites.
if throw:
- raise e
- return (cloudinitValidator, FormatChecker, e)
- return (cloudinitValidator, FormatChecker, None)
+ raise SchemaValidationError(
+ errors=(('.'.join([str(p) for p in err.path]), err.message),)
+ )
+ logging.warning(
+ "Meta-schema validation failed, attempting to validate config"
+ " anyway: %s",
+ err
+ )
def validate_cloudconfig_schema(
@@ -154,20 +162,14 @@ def validate_cloudconfig_schema(
against the provided schema.
"""
try:
- (cloudinitValidator, FormatChecker, err) = (
- get_jsonschema_validator()
- if not strict_metaschema
- else validate_cloudconfig_metaschema(schema, throw=False)
- )
+ (cloudinitValidator, FormatChecker) = get_jsonschema_validator()
+ if strict_metaschema:
+ validate_cloudconfig_metaschema(
+ cloudinitValidator, schema, throw=False
+ )
except ImportError:
logging.debug("Ignoring schema validation. jsonschema is not present")
return
- if err:
- logging.warning(
- "Meta-schema validation failed, attempting to validate config"
- " anyway: %s",
- err,
- )
validator = cloudinitValidator(schema, format_checker=FormatChecker())
errors = ()
for error in sorted(validator.iter_errors(config), key=lambda e: e.path):
dfabeaa
to
c93fde0
Compare
Thanks for the help on this. I wasn't sure about the best approach to the failure paths with this one. Besides one item, I think these are all improvements.
The problem with this is that get_jsonschema_validator() conditionally produces validator/formatchecker differently based on jsonschema version. The change proposed here only applies to the pre-3.0 jsonschema branch. I think we want both branches to be capable of strict validation. Assuming I don't see anything else in need of change I plan on applying the other proposals you made, in addition to a modified version of this one. Thanks! |
4aea9ce
to
7dfe838
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for all the discussion and work here @holmanb +1. LGTM
Proposed Commit Message
Test Steps
Many of the files were only touched for function name/arg/schema/meta updates, however schema.py, test_schema.py, test_cli.py had the most substantial changes.
Checklist: