-
-
Notifications
You must be signed in to change notification settings - Fork 773
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: redo validation by content type #760
Conversation
e7938e5
to
2a50660
Compare
connexion/decorators/validation.py
Outdated
if error: | ||
return error | ||
|
||
elif self.strict_validation: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is a global setting it means that if any endpoints accept non handled content types (e.g. binary uploads) then strict validation would have to be disabled for all endpoints.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah - I'm still thinking through the implications. The strict
flag seems pretty nebulous. AFAICT it mostly checks for extra form fields, and errors if those exist.
So far this diff makes connexion much more strict, which I personally think is a good thing, but others might disagree (and it may cause breakage for people that have been relying on connexion's past flexibility).
I think personally as a user I'd prefer to explicitly have a content handler for every content type that I accept. Part of my goal is to make it straightforward to register new content handlers.
We could also provide more 'batteries included' beyond just json and formdata. For example, I could add a default octet stream content handler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh right I see. If it is straight forward to add a custom content handler that would be ideal for my case in #592 - especially if the custom handler/validator can process as a stream.
bd6d8a6
to
de1a3df
Compare
c7dfabf
to
d8429d6
Compare
29342ff
to
f87b84f
Compare
hey @jmcs - I'd love your thoughts on this diff/demo. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor comments along the way.
connexion/lifecycle.py
Outdated
@@ -21,6 +20,7 @@ def __init__(self, | |||
self.json_getter = json_getter | |||
self.files = files | |||
self.context = context if context is not None else {} | |||
self.content_type = self.headers.get("Content-Type", "") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If that's a derived thingy, perhaps a property and returning None
if a content-type not set?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
from ..types import coerce_type | ||
from ..utils import is_null | ||
|
||
logger = logging.getLogger('connexion.serialization.deserializers') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't __name__
be sufficient here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure why, but all of the other getLogger calls in connexion are hard-coded
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you don't mind, I might go through everything and clean it up for ya. Not sure when, but perhaps I can find some time today's evening or tomorrow.
cdc2091
to
650abaa
Compare
connexion/content_types.py
Outdated
) | ||
|
||
|
||
KNOWN_CONTENT_TYPES = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a nit, you can define all constant lista as tuples. That way you limit the ammount of used memory + load time is a bit faster. Tuples are cached by python runtime and once defined they are kept as they were without a change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might skip creating KNOWN_CONTENT_TYPES
and use ContentHandler.__subclasses__()
instead. (see my comment on connexion/decorators/validation.py)
Any updates on the status of this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice direction. Lets finish it!
|
||
class JSONContentHandler(ContentHandler): | ||
name = "application/json" | ||
regex = re.compile(r'^application\/json.*|^.*\+json$') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is_json_mimetype
checks for application/json
or application/.*\+json
but this checks for application/json
or .*+json
. They should probably both check for the same thing. So, maybe is_json_mimetype
should do the equivalent of this regex. It makes sense to be even more lenient and treat anything that ends in +json
as json.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are several places that said validation can occur for text/plain
as well as application/json
- does this need to check for that too?
connexion/content_types.py
Outdated
) | ||
|
||
|
||
KNOWN_CONTENT_TYPES = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might skip creating KNOWN_CONTENT_TYPES
and use ContentHandler.__subclasses__()
instead. (see my comment on connexion/decorators/validation.py)
connexion/decorators/validation.py
Outdated
de(self.validator, | ||
self.schema, | ||
self.strict_validation, | ||
self.is_null_value_valid) for de in KNOWN_CONTENT_TYPES |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self.is_null_value_valid) for de in KNOWN_CONTENT_TYPES | |
self.is_null_value_valid) for de in ContentHandler.__subclasses__() |
that way all you have to do to register a new one is subclass ContentHandler and make sure to import your ContentHandler subclass before initializing Connexion (or more accurately before whatever point it is that creates the RequestBodyValidator).
Actually, I think I would turn this into a classmethod on ContentHandler
so that it can look for grandchildren not just children classes as described here: https://stackoverflow.com/a/3862957
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh. I see register_content_handler
below for registering additional content types. I would still like to see a way to replace default content handlers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TIL!
connexion/decorators/validation.py
Outdated
if len(matches) > 1: | ||
logger.warning("Content could be handled by multiple validators") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want them to be able to vote? ie more than just a regex? That way someone could register a handler that should handle (for example) json with a higher priority than the internal json validator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing that the regex can't handle is None
- what if a content handler wants to handle content_type = None (more likely to be used for serialization than deserialization/validation, assuming the same ContentHandlers are used for both processes).
- query params | ||
- form params | ||
- body params | ||
- path params ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the ?
here?
connexion/content_types.py
Outdated
|
||
logger = logging.getLogger('connexion.serialization.deserializers') | ||
logger = logging.getLogger('connexion.content_types') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't we just do logging.getLogger(__name__)
here (and everywhere else)? That's what I do in my other projects and it prevents pointless work when refactoring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, I think I was planning to stick to repo convention for now, and then doing a single PR to fix them all. Happy to do it incrementally though.
tests/api/test_parameters.py
Outdated
#time_start = 1010 | ||
#resp = app_client.get( | ||
# '/v1.0/nullable-parameters?time_start={}'.format(time_start)) | ||
#assert json.loads(resp.data.decode('utf-8', 'replace')) == time_start |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you comment this out? I see you did that with the empty_object_body test, but it looks like this was testing something different with parameters. Are you planning to adjust parameter validation to use ContentHandlers as well (that doesn't make complete sense to me)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure, but I think it was a mistake.
request comes in -> connexionrequest (framework generic) | ||
API metrics recorded | ||
API security applied | ||
|
||
arrays are deserialized (spec dependent) | ||
- query params | ||
- form params | ||
- path params | ||
|
||
body is deserialized (spec dependent, content dependent) | ||
|
||
validation is applied (spec dependent, content dependent) | ||
- query params | ||
- form params | ||
- body params | ||
- path params ? | ||
|
||
parameter_to_arg (handler, what to call it) | ||
|
||
json serializer is applied | ||
response is validated if json |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the plan, not what it currently does, right?
Mapping this out (current -> planned)
decorators.metrics
-> API metricsdecorators.security
-> API security (this is also spec dependent for oauth scopes and security schemes)decorators.uri_parsing
-> query & path params deserialiation- part of validation? -> form params
- part of validation -> body deserialization
decorators.validation
-> request validation: query, form, body, pathdecorators.parameter
-> parameter_to_arg (I would call this an argument mapper)- part of api -> response serializer (on content type handlers? Need a way to register custom like request validators)
- part of validation -> response validation (on content type handlers? Need a way to register custom like request validators)
Is that what you're thinking?
connexion/content_types.py
Outdated
def deserialize(self, request): | ||
return request.body | ||
|
||
def validate(self, request): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def validate(self, request): | |
def validate_request(self, request): |
Make the ContentHandlers useful in more contexts.
ie, we could add a validate_response_body()
as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good idea
650abaa
to
171cbb3
Compare
Hi @dtkav, |
Very basic question. So, in case if application/xml content type is used, is there any way of parsing xml at all. I am getting below error. |
Hey, |
This is a WIP, I'd like to make this a bit more pluggable, so that it is straightforward to add content-type validation handlers.
Related to #655 #755
Changes proposed in this pull request:
all_json
requirement that caused any non-json content types to preclude validating json