Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

deprecate OrderedMultiDict #2968

Closed
davidism opened this issue Oct 27, 2024 · 6 comments · Fixed by #2975
Closed

deprecate OrderedMultiDict #2968

davidism opened this issue Oct 27, 2024 · 6 comments · Fixed by #2975
Assignees
Milestone

Comments

@davidism
Copy link
Member

OrderedMultiDict is not used by Werkzeug. It has a complex implementation, and is documented to be slow. dict maintains key insertion order in all supported versions of Python. MultiDict already maintains the insertion order of values within a key. OrderedMultiDict maintains the insertion order of each item, so [(a, 1), (b, 2), (a, 3)] comes out the same instead of [(a, 1), (a, 3), (b, 2)]. It's not clear there's a use case for this, searching sourcegraph shows some use of OrderedMultiDict but it's not clear if the use is actually required vs using MultiDict. Other libraries such as boltons also provide implementations. Mark this as deprecated and see if anything comes up.

@davidism davidism added this to the 3.1.0 milestone Oct 27, 2024
@davidism
Copy link
Member Author

https://sourcegraph.com/search?q=context:global+lang:Python+werkzeug+OrderedMultiDict+-file:venv+-file:site-packages+-file:werkzeug+-repo:pallets/werkzeug+&patternType=keyword&sm=0 shows ~24 repos using OrderedMultiDict. Browsing through them, most uses do not seem necessary.

I did see a couple cases where knowing the exact order that request.args or request.form was parsed in was important. But in that case, it's more straightforward to compare the exact data sent rather than the parsed version:

  • request.query_string is the exact value from the URL that gets parsed into request.args
  • request.get_data() called before accessing request.form returns the exact request body body

@davidism
Copy link
Member Author

davidism commented Oct 28, 2024

@ThiefMaster Indico uses it in a few places: https://github.com/search?q=repo%3Aindico%2Findico%20orderedmultidict&type=code, thoughts? It's not clear that it's needed.

@ThiefMaster
Copy link
Member

ThiefMaster commented Oct 28, 2024

I think parameter_storage_class = ImmutableOrderedMultiDict (on the Request subclass) is the most interesting one, because I added that one due to how crappy the PayPal webhook validation is (they require you to send back data you received in the exact same order).

IIRC there was either something in the docs on on some SO post recommending this, so there may be more people using it when they have a paypal integration in their flask app.

request.get_data() called before accessing request.form returns the exact request body body

FWIW depending on the architecture of an application (especially when plugins are involved) it may be hard to ensure that gets called before anything else accesses the form data.


The other two places where it's used are most likely leftovers from before dict was guaranteed to be ordered by insertion order, so if the standard MultiDict already does that now as well it's enough.

@davidism
Copy link
Member Author

davidism commented Oct 28, 2024

PayPal IPN was one use case I saw, seems like a weird API design. The other use case I found seemed to be trying to check that a URL was canonical based not only on the query args but their exact order in the URL.

I know it's not great to have to change it, but doing something like the following would probably be enough for the PayPal case, if an app can't guarantee that the data is cached first. The query args case could probably just look at query_string or parse_qsl(query_string) instead.

class StoredImmutableMultiDict(ImmutableMultiDict):
    def __init__(self, mapping):
        assert isinstance(mapping, list)
        self.original_items: list[tuple[str, str]] = mapping
        super().__init__(mapping)

parameter_storage_class = StoredImmutableMultiDict

Then you can access form.original_items if needed. Even seems preferable to OrderedMultiDict, which is slower and uses even more memory.

I did see that Werkzeug's documentation for parameter_storage_class mentions ImmutableOrdereMultiDict as an option, so that's probably where the advice came from.

@ThiefMaster
Copy link
Member

FWIW I don't think paypal has any multi-value keys, so probably a normal MultiDict is enough nowadays...

@davidism
Copy link
Member Author

davidism commented Oct 28, 2024

Oh yeah, that level of ordering only matters if there are actually multiple values per key, and they're submitted with keys interleaved. Seems exceedingly unlikely.

@davidism davidism self-assigned this Oct 29, 2024
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 13, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants