Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Community Sprint] Add missing type hints and TorchScript support 🚀 #5657

Closed
rusty1s opened this issue Oct 12, 2022 · 7 comments
Closed

Comments

@rusty1s
Copy link
Member

rusty1s commented Oct 12, 2022

🚀 The feature, motivation and pitch

We are kicking off our very first community sprint!

The community sprint resolves around adding missing type hints and TorchScript support for various functions across PyG, aiming to improve and clean-up our core codebase. Each individual contribution is designed to only take around 30 minutes to two hours to complete.

The sprint begins Wednesday October 12th with a kick off meeting at 8am PST. The community sprint will last 2 weeks and we will have another live hangouts when the sprint has completed. If you are interested in helping out, please also join our PyG slack channel #community-sprint-type-hints for more information. You can assign yourself to the type hints you are planning to work on here.

🚀 Add missing type hints and TorchScript support

Type hints are currently used inconsistently in the torch-geometric repository, and it would be nice to make them a complete, consistent thing across all datasets, models and utilities. Adding type hint support in models also helps us to improve our TorchScript coverage across layers and models provided in nn.*.

Example

Take a look at the current implementation of contains_isolated_nodes:

def contains_isolated_nodes(edge_index, num_nodes=None):
    num_nodes = maybe_num_nodes(edge_index, num_nodes)
    edge_index, _ = remove_self_loops(edge_index)
    return torch.unique(edge_index.view(-1)).numel() < num_nodes

Adding type hints support to the function signature helps us to better understand its input and output, improving code readability:

def contains_isolated_nodes(
   edge_index: Tensor,
   num_nodes: Optional[int] = None,
) -> bool:
   ...

Importantly, it also lets us use it as part of a TorchScript model. Without it, all arguments that miss type hints are expected to be PyTorch tensors (which is clearly not the case for the num_nodes argument). Without it, torch.jit.script compilation will fail:

import torch

from torch_geometric.utils import contains_isolated_nodes

contains_isolated_nodes = torch.jit.script(contains_isolated_nodes)

contains_isolated_nodes(torch.tensor([[0, 1, 0], [1, 0, 0]])
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
  File ".../pytorch_geometric/torch_geometric/utils/isolated.py", line 29, in contains_isolated_nodes
    num_nodes = maybe_num_nodes(edge_index, num_nodes)
                ~~~~~~~~~~~~~~~ <--- HERE
    edge_index, _ = remove_self_loops(edge_index)
    return torch.unique(edge_index.view(-1)).numel() < num_nodes
RuntimeError: Cannot input a tensor of dimension other than 0 as a scalar argument

Guide to contributing

See here for a basic example to follow.

  1. Ensure you have read our contributing guidelines.
  2. Claim the functionality/model you want to improve here.
  3. Implement the changes as in [Type Hints] utils.contains_isolated_nodes #5603. At best, ensure in a test that the model/function is convertable to a TorchScript program and that it results in the same output. It is okay to add type hint support for multiple functions/models within a single PR as long as you have assigned yourself to each of them, and the number of file changes stays at a reasonable number to ease reviewing (e.g., not more than 10 touched files).
  4. Open a PR to the PyG repository and name it: "[Type Hints] {model_name/function_name}". In addition, ensure that documentation is rendered properly (CI will build the documentation automatically for you). Afterwards, add your PR number to the "Improved type hint support" line in CHANGELOG.md.

Tips for making your PR

  • If you are unfamiliar with how type hints work, you can read the Python library documentation on them, but it is probably even easier to just look at another PR that added them.
  • The types will usually be obvious, e.g., Tensor, bool, int, float. Wrap them within Optional[*] whenever the argument can be None.
  • Specialized PyG type hints (e.g., Adj) are defined in typing.py.
  • In some rare cases, type hints are challenging to add, e.g., whenever a model/function supports a Union of different types (which may be the case for functions/models that also support SparseTensor, e.g., edge_index: Union[Tensor, SparseTensor]). In that case, TorchScript support can be achieved via the torch.jit._overload decorator. See here for an example.
  • The corresponding tests of PyG models and functions can be found in the test/ directory. For example, tests for torch_geometric/utils/isolated.py can be found in test/utils/test_isolated.py. You can run individual test files via pytest test/utils/test_isolated.py. It is only necessary to test TorchScript support for the nn.* and utils.* packages. No changes necessary for datasets.* and transforms.* packages.
  • Ensure that the TorchScript variant compiles and achieves the same output:
    from torch_geometric.testing import is_full_test
    
    ...
    
    edge_index = torch.tensor([[0, 1, 2, 0], [1, 0, 2, 0]])
    assert contains_isolated_nodes(edge_index)
      
    if is_full_test():
        jit = torch.jit.script(contains_isolated_nodes)
        assert jit(edge_index)
    Note that we generally gate TorchScript tests behind is_full_test() which guarantees that TorchScript tests are only run nightly. You can enable full tests locally via the FULL_TEST=1 environment variable, e.g., FULL_TEST=1 pytest test/utils/test_isolated.py. It is only necessary to ensure TorchScript support for the nn.* and utils.* packages. No changes necessary for datasets.* and transforms.* packages.

Functions/Models to update

This list may be incomplete. If you still find a function without missing type hints/TorchScript tests, please let us know or add them on your own.

Alternatives

No response

Additional context

No response

@EdisonLeeeee
Copy link
Contributor

Looks great! I'm more than happy to take transforms.* :)

@rusty1s
Copy link
Member Author

rusty1s commented Oct 12, 2022

@EdisonLeeeee Sounds great. Please add yourself to the Google Sheet doc, but please leave room for other people to participate as well :)

@venomouscyanide
Copy link
Contributor

@rusty1s is nn.InnerProductEncoder supposed to be nn.InnerProductDecoder ? Not sure where nn.InnerProductEncoder is present in the codebase.

@rusty1s
Copy link
Member Author

rusty1s commented Oct 17, 2022

@venomouscyanide This is indeed a typo, good catch. Fixed :)

@EdisonLeeeee
Copy link
Contributor

EdisonLeeeee commented Oct 17, 2022

@rusty1s Some of the classes in transforms.* is missing (partially) type hints, e.g., transforms.Center and transforms.AddSelfLoops, which are not mentioned in this issue. Shall we also add type hints for them?

@rusty1s
Copy link
Member Author

rusty1s commented Oct 17, 2022

Yeah, if you have found some transforms not included here, please add them to the list :)

@rusty1s
Copy link
Member Author

rusty1s commented Oct 29, 2022

We reached 100% coverage of the community sprint! 🎉 This is amazing to see! I like to deeply thank everyone who participated in this sprint. Thanks for improving our code base and for contributing to PyG!

@rusty1s rusty1s closed this as completed Oct 29, 2022
rusty1s added a commit that referenced this issue Dec 4, 2023
Addresses
#3429 (reply in thread).
Related to #5657.

> Package maintainers who wish to support type checking of their code
MUST add a marker file named py.typed to their package supporting
typing. This marker applies recursively: if a top-level package includes
it, all its sub-packages MUST support type checking as well.

https://peps.python.org/pep-0561/#packaging-type-information

---------

Co-authored-by: rusty1s <matthias.fey@tu-dortmund.de>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants