Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC and WIP: Add MypyTypeInferenceProvider #831

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Commits on Dec 8, 2022

  1. RFC and WIP: Add MypyTypeInferenceProvider

    This change is RFC and WIP (please read whole change message).
    
    Add `MypyTypeInferenceProvider` as an alternative for
    `TypeInferenceProvider`. The provider infers types using mypy as
    library. The only requirement for the usage is to have the latest mypy
    installed. Types inferred are mypy types, since mypy type system is well
    designed, to avoid the conversion, and also to keep it simple. For
    compatibility and extensibility reasons, these types are stored in
    separate field `MypyType.mypy_type`.
    
    Let's assume we have the following code in the file `x.py` which we want
    to inspect:
    ```python
    x = [42]
    
    s = set()
    
    from enum import Enum
    
    class E(Enum):
        f = "f"
    
    e = E.f
    ```
    
    Then to get play with mypy types one should use the code like:
    ```python
    import libcst as cst
    
    from libcst.metadata import MypyTypeInferenceProvider
    
    filename = "x.py"
    module = cst.parse_module(open(filename).read())
    cache = MypyTypeInferenceProvider.gen_cache(".", [filename])[filename]
    wrapper = cst.MetadataWrapper(
        module,
        cache={MypyTypeInferenceProvider: cache},
    )
    
    mypy_type = wrapper.resolve(MypyTypeInferenceProvider)
    x_name_node = wrapper.module.body[0].body[0].targets[0].target
    set_call_node = wrapper.module.body[1].body[0].value
    e_name_node = wrapper.module.body[-1].body[0].targets[0].target
    
    print(mypy_type[x_name_node])
     # prints: builtins.list[builtins.int]
    
    print(mypy_type[x_name_node].fullname)
     # prints: builtins.list[builtins.int]
    
    print(mypy_type[x_name_node].mypy_type.type.fullname)
     # prints: builtins.list
    
    print(mypy_type[x_name_node].mypy_type.args)
     # prints: (builtins.int,)
    
    print(mypy_type[x_name_node].mypy_type.type.bases[0].type.fullname)
     # prints: typing.MutableSequence
    
    print(mypy_type[set_call_node])
     # prints: builtins.set
    
    print("issuperset" in mypy_type[set_call_node].mypy_type.names)
     # prints: True
    
    print(mypy_type[set_call_node.func])
     # prints: typing.Type[builtins.set]
    
    print(mypy_type[e_name_node].mypy_type.type.is_enum)
     # prints: True
    ```
    
    Why?
    
    1. `TypeInferenceProvider` requires pyre (`pyre-check` on PyPI) to be
       installed. mypy is more popular than pyre. If the organization uses
       mypy already (which is almost always the case), it may be difficult
       to assure colleagues (including security team) that "we need yet
       another type checker".
    2. Even though it is possible to run pyre without watchman installation,
       this is not advertised. watchman installation is not always possible
       because of system requirements, or because of the security
       requirements like "we install only our favorite GNU/Linux
       distribution packages".
    3. `TypeInferenceProvider` usage requires `pyre start` command to be run
       before the execution, and `pyre stop` - after the execution. This may
       be inconvenient, especially for the cases when pyre was not used
       before.
    4. Types produced by pyre in `TypeInferenceProvider` are just strings.
       For example, it's not easily possible to infer that some variable is
       enum instance. `MypyTypeInferenceProvider` makes it easy, see the
       code above.
    
    Drawbacks:
    
    1. Speed. mypy is slower than pyre, so is `MypyTypeInferenceProvider`
       comparing to `TypeInferenceProvider`.
       How to partially solve this:
       1. Implement AST tree caching in mypy. It may be difficult, however
          this will lead to speed improvements for all the projects that use
          this functionality.
       2. Implement inferred types caching inside LibCST. As far as I know,
          no caching at all is implemented inside LibCST, which is the
          prerequisite for inferred types caching, so the task is big.
       3. Implement LibCST CST to mypy AST. I am not sure if this possible
          at all. Even if it is possible, the task is huge.
    2. Two providers are doing similar things in LibCST will be present,
       this can potentially lead to the situation when there is a need
       install two type checkers to get all codemods from the library
       running.
       Alternatives considered:
       1. Put `MypyTypeInferenceProvider` inside separate library (say,
           LibCST-mypy or `libcst-mypy` on PyPI). This will explicitly
           separate `MypyTypeInferenceProvider` from the rest of LibCST.
          Drawbacks:
          1. The need to maintain separate library.
          2. Limited fame (people need to know that the library exists).
          3. Since some codemods cannot be implemented easily without the
             library, for example, `if-elif-else` to `match` converter
    	 (it needs powerful type inference), they are doomed to not be
    	 shipped with LibCST.
       2. Implement base class for inferred type, which inherits from `str`
          (to keep the compatibility with the existing codebase) and
          the mechanism for dynamically selecting `TypeInferenceProvider`
          type checker (mypy or pyre; user can do this via environmental
          variable). If the code inside LibCST requires just shallow type
          information (so, just `str` is enough), then the code can run with
          any type checker. The remaining code (such as `if-elif-else` to
          `match` converter) will still require mypy.
    
    Misc:
    
    Code does not lint in my env, by some reason `pyre check` cannot find
    `mypy` library.
    Roman Inflianskas committed Dec 8, 2022
    Configuration menu
    Copy the full SHA
    78f4b1d View commit details
    Browse the repository at this point in the history