HTTP content-type isn't using charset #689

Cap-JaTu · 2023-10-12T14:24:01Z

In a typical API schema, content types don't carry charsets. For application/json, the charset is assumed to be UTF-8 per RFC 8259. For text/xml, the charset is relevant. In neither case does the API schema specify multiple contents for different charsets.

Thus, using the HTTP response charset for API-schema verification doesn't make sense. For responses verified with an assumed charset, there is no need to fail. For responses in varying charsets, verifying the data is impossible, as the response data is not decoded into the Unicode that Python uses internally. Again, the charset has no relevance and can be ignored.

Ref.: https://datatracker.ietf.org/doc/html/rfc2046#section-4.1.2
Ref.: https://datatracker.ietf.org/doc/html/rfc8259#section-8.1
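
To illustrate the point about matching, here is a minimal sketch of looking up a schema content entry by media type alone, with the charset parameter split off and ignored for the lookup. It uses only the Python standard library; the helper names (split_content_type, find_media_type_schema) and the schema_content dictionary are hypothetical and are not part of openapi-core.

```python
from email.message import Message


def split_content_type(header_value):
    """Split a Content-Type header into (media_type, parameters).

    Hypothetical helper for illustration; not openapi-core's API.
    """
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_type() returns the lowercased "type/subtype" part only.
    media_type = msg.get_content_type()
    # get_params() returns [(media_type, ''), (name, value), ...];
    # skip the first entry, which is the media type itself.
    params = {k.lower(): v for k, v in msg.get_params(failobj=[])[1:]}
    return media_type, params


def find_media_type_schema(header_value, content):
    """Look up the schema entry by media type, ignoring charset."""
    media_type, _params = split_content_type(header_value)
    return content.get(media_type)


# Assumed shape of an API schema "content" mapping, for illustration only.
schema_content = {"application/json": {"schema": {"type": "object"}}}

entry = find_media_type_schema("application/json; charset=utf-8", schema_content)
```

With this approach a response carrying "; charset=utf-8" still resolves to the application/json entry, and the charset stays available separately for decoding the body.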

p1c2u · 2023-10-12T14:43:09Z

Hi @Cap-JaTu

Do you have a specific use case where you need to drop the charset? It looks similar to #378.

The unreleased version (current master) has charset handling for deserializing data (see #678).

> For those response verifications with assumed charset, there is no need to fail.

This one was fixed by the change mentioned above. Please feel free to test it.

> For those responses which are in varying charsets, verifying data is impossible as there is no decoding the response data into Unicode used by Python internally.

This one is handled by the library's media type deserialization process.

Cap-JaTu · 2023-10-12T14:53:38Z

The easy stuff: obviously, I have no idea about the product's roadmap. There isn't a single word of documentation on charset handling.

As for a use case, I could go to ChatGPT and ask it to restate my PR description in different words. That would be fruitless, rude even. Instead, as a person living in the real world, where HTTP responses do have a charset specifier in them, I'd simply love for the charset not to be part of the validation pass/fail test.

p1c2u · 2023-10-12T16:33:45Z

As I mentioned, the only place charsets are considered in validation is the media type object. I believe you ran into the charset issue that was solved recently. Please consider testing the unreleased version (or wait for the alpha release) and let me know whether it solves your issue.

Cap-JaTu · 2023-10-17T09:59:27Z

In openapi-core/openapi_core/deserializing/media_types/deserializers.py, line 31 in df1f1e1:

    return self.deserializer_callable(value, **self.parameters)

Error:
TypeError: __init__() got an unexpected keyword argument 'charset'

Obviously, the JSON library doesn't understand the parameter:
{'charset': 'utf-8'}

I'd like to put emphasis on this fact: in the real world, HTTP responses have Content-Types with "; charset=" definitions. For a reason really unknown to me, this library ignores this.
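
The failure above can be reproduced with the standard json module alone: json.loads forwards unknown keyword arguments to the JSONDecoder constructor, which rejects charset. Below is a minimal sketch of that behaviour, plus one possible way to use the charset for decoding bytes instead of passing it on. The function names are hypothetical, and this is not necessarily how the library fixed it.

```python
import json


def naive_deserialize(body, parameters):
    """Mimics forwarding all media type parameters to the deserializer.

    Hypothetical stand-in for the failing call; raises TypeError when
    parameters contains 'charset'.
    """
    return json.loads(body, **parameters)


def deserialize_json(body, parameters):
    """Use the charset parameter only to decode bytes, then parse JSON.

    Falls back to UTF-8, the default for application/json (RFC 8259).
    """
    charset = parameters.get("charset", "utf-8")
    text = body.decode(charset) if isinstance(body, bytes) else body
    return json.loads(text)


try:
    naive_deserialize(b'{"a": 1}', {"charset": "utf-8"})
except TypeError as exc:
    error_message = str(exc)  # mentions the unexpected 'charset' argument

result = deserialize_json(b'{"a": 1}', {"charset": "utf-8"})
```

Here the charset is consumed before the JSON parser is called, so the parser never sees a keyword it does not understand.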

p1c2u · 2023-10-17T14:41:04Z

@Cap-JaTu thanks for the report. I will fix this.

> I'd like to put emphasis on the fact: In real world HTTP-responses have Content-Types with "; charset=" definitions. For reason really unknown to me, this library ignores this.

I'm in the process of re-implementing deserialization. Do you know of other places where the charset is ignored?

p1c2u · 2023-10-31T15:34:41Z

It was fixed with #699, hence closing.

HTTP content-type isn't using charset

78bb820

p1c2u closed this Oct 31, 2023
