Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ChainMap has wrong or unintuitive type after #6042 #8430

Open
kaste opened this issue Jul 28, 2022 · 2 comments
Open

ChainMap has wrong or unintuitive type after #6042 #8430

kaste opened this issue Jul 28, 2022 · 2 comments

Comments

@kaste
Copy link
Contributor

kaste commented Jul 28, 2022

#6042/#6044 gives me new mypy complaints on my code base.

ChainMap only ever writes to the first mapping in maps. That's my understanding. For example ChainMap({}, defaults) is a typical use-case where the user gets a dict-like and cannot mutate the defaults. That's the typical stacking property of the ChainMap, you only ever mutate the top, most recent mapping.

From my understanding:

def __init__(self):
def __init__(self, __map: MutableMapping):
def __init__(self, __map: MutableMapping, *maps: ImmutableMapping):

I don't think we can have correct hints for the maps member, and new_child of course expects a mutable mapping because iirc it is exactly the next top most thing on the stack.

Originally posted by @kaste in #6042 (comment)

(I started this as a comment on the closed issue but closed issues don't move and I got initial supportive feedback on my comment from @hauntsaninja.)

@kaste
Copy link
Contributor Author

kaste commented Jul 28, 2022

@hauntsaninja wrote

That sounds reasonable to me. I would be in favour of:

class ChainMap(MutableMapping[_KT, _VT], Generic[_KT, _VT]):
     maps: list[Mapping[_KT, _VT]]
     def __init__(self) -> None: ...
     def __init__(self, __map: MutableMapping[_KT, _VT], *maps: Mapping[_KT, _VT]) -> None: ...

Also just to call something out:

My understanding is that the typeshed policy is to err on the side of false negatives ... Here is an example in a ChainMap subclass ...

While I'm not sure we've stated this out loud, historically, typeshed has very consistently preferred false positives for subclassers of classes over false positives for other users of classes.

ssbarnea added a commit to ssbarnea/python that referenced this issue Jan 21, 2023
Fixes problems reported by mypy.

Related: python/typeshed#8430
pawamoy pushed a commit to mkdocstrings/python that referenced this issue Jan 23, 2023
Fixes problems reported by mypy.

Related typeshed issue: python/typeshed#8430
PR #53: #53
pawamoy added a commit to mkdocstrings/python that referenced this issue Oct 9, 2023
This was done in commit 3cbe472
to avoid a type warning that the second and next arguments to ChainMap
are not mutable.
Related Mypy issue: python/typeshed#8430.

When we use `!relative` in MkDocs configuration, the config
then contains a MkDocs path placeholder instance, which itself contains
a reference to the MkDocs config, and all its plugins,
including mkdocstrings, and all its objects, etc.

Upon deepcopying this huge object tree, deepcopy fails on
mkdocstrings private attribute `_inv_futures`, which contains
`Future` and therefore `thread.RLock` objects, which are not serializable.
This failed with a `TypeError: cannot pickle '_thread.RLock` objects`.

So in this commit we stop deep-copying everything
just to avoid a Mypy warning, and instead we add a type-ignore comment.
If Mypy fixes this someday, we'll simply get a new warning that the
comment is unused.
@sanderr
Copy link
Contributor

sanderr commented Jan 23, 2025

I just discovered collections.ChainMap, and I have to say I was surprised that it requires all member mappings to be mutable. I like the proposed __init__ typing, and I'd like to add that perhaps it's not so bad that maps would be typed as (immutable) Mapping: if I understand correctly end users aren't expected to modify inner mappings directly anyway (chainmap.maps[1]["key"] = "value"), so from their perspective it's only right. The fact that the stdlib implementation itself "violates" it is unfortunate, but as I see it it's more an implementation detail than a true violation: it knows that the first element is mutable and acts on it. Whether it does that by storing a separate reference or by accessing it through maps is irrelevant.

I believe that the most troublesome part with this suggestion is that the ChainMap documentation states that maps may be modified by the end user (e.g. maps[0] = {}). If we type it as suggested above, that would allow the user to set an immutable mapping at maps[0], that the stdlib implementation would then happily modify. As long as maps is intended to be mutable directly by the end user, I don't think the suggested change is feasible (except with python/typing#548 as stated in the original thread), however much I'd like to see it.

I was intrigued by the idea of an immutable variant, so I gave it a shot. It does require (safe) type ignores if built on top of the stdlib ChainMap, but I think that's by far preferable over the alternative. I suspect it also takes a performance hit, considering that this implements the Python ABC, while ChainMap (I'm assuming, haven't checked) is written in C. That said, I'll put it here in case anyone is interested.

I made two variations: one that is in fact still mutable but protects its member mappings from mutation. And one that is immutable itself. The benefit of the latter is that it can confidently implement the maps and parents properties, while the meanings of those get (partially) lost for the former. I did not do extensive testing on either of these.

import collections
from collections.abc import Iterator, Mapping, MutableMapping, Sequence
from typing import Self


class OverlayChainMap[K, V](MutableMapping[K, V]):
    """
    ChainMap implementation that protects its member mappings from mutation, while still allowing mutations on the chain map
    itself.

    See collections.ChainMap for more details. Does not implement the `maps` and `parents` properties,
    because they lose much of their meaning due to the overlay mapping sitting in between the user-supplied ones.
    """
    def __init__(self, *maps: Mapping[K, V]) -> None:
        # construct backing ChainMap with owned, mutable dict as first map
        self._chainmap: collections.ChainMap[K, V] = collections.ChainMap(
            {},
            # This is a type violation due to a limitation of the ChainMap constructor (python/typeshed#8430).
            # This workaround builds on the (documented) knowledge that ChainMap only ever modifies the first mapping,
            # and the fact that the constructed chain map is never exposed to the end user.
            # Therefore the type violation is a safe, if imperfect, workaround to the typing limitation.
            *maps,  # type: ignore
        )

    def new_child(self, m: Mapping[K, V] | None = None, **kwargs: V) -> Self:
        return self.__class__(
            *self._chainmap.new_child(
                m,  # type: ignore
            ).new_child({}, **kwargs).maps
        )

    def __getitem__(self, key: K) -> V:
        return self._chainmap[key]

    def __setitem__(self, key: K, value: V) -> None:
        self._chainmap[key] = value

    def __delitem__(self, key: K) -> None:
        del self._chainmap[key]

    def __iter__(self) -> Iterator[K]:
        return iter(self._chainmap)

    def __len__(self) -> int:
        return len(self._chainmap)


class ImmutableChainMap[K, V](Mapping[K, V]):
    """
    Immutable ChainMap implementation. See collections.ChainMap for more details.
    """
    def __init__(self, *maps: Mapping[K, V]) -> None:
        # construct backing ChainMap with owned, mutable dict as first map
        self._chainmap: collections.ChainMap[K, V] = collections.ChainMap(
            # This is a type violation due to a limitation of the ChainMap constructor (python/typeshed#8430).
            # This workaround builds on the fact that we own the backing chain map, we never mutate it, and we only
            # expose it or its attributes as immutable properties.
            *maps  # type: ignore
        )

    def new_child(self, m: Mapping[K, V] | None = None) -> Self:
        return self.__class__(
            *self._chainmap.new_child(
                m,  # type: ignore
            ).maps
        )

    @property
    def maps(self) -> Sequence[Mapping[K, V]]:
        return self._chainmap.maps

    @property
    def parents(self) -> Self:
        return self.__class__(*self._chainmap.parents.maps)

    def __getitem__(self, key: K) -> V:
        return self._chainmap[key]

    def __iter__(self) -> Iterator[K]:
        return iter(self._chainmap)

    def __len__(self) -> int:
        return len(self._chainmap)



d1: MutableMapping[str, int] = {"one": 1, "shared": 1}
d1_copy: Mapping[str, int] = {**d1}
d2: MutableMapping[str, int] = {"two": 2, "shared": 2}
d2_copy: Mapping[str, int] = {**d2}

copied: Mapping[str, int] = {**d1, **d2}
chained: MutableMapping[str, int] = collections.ChainMap(d2, d1)
overlayed: MutableMapping[str, int] = OverlayChainMap(d2, d1)
immutably_chained: Mapping[str, int] = ImmutableChainMap(d2, d1)

assert (
    list(chained.items())
    == list(overlayed.items())
    == list(immutably_chained.items())
    == list(copied.items())
    == [("one", 1), ("shared", 2), ("two", 2)]
)

overlayed["three"] = 3  # adds a new item
overlayed["one"] = 42  # changes the value in overlayed, not in d1
overlayed["two"] = 42  # changes the value in overlayed, not in d2
assert list(overlayed.items()) == [("one", 42), ("shared", 2), ("two", 42), ("three", 3)]
assert d1 == d1_copy
assert d2 == d2_copy
chained["three"] = 3  # adds a new item
chained["one"] = 42  # changes the value in chained, not in d1
chained["two"] = 42  # changes the value in both chained and d2
assert list(chained.items()) == [("one", 42), ("shared", 2), ("two", 42), ("three", 3)]
assert d1 == d1_copy
assert d2 != d2_copy

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants