Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

First PR for WholeGraph 23.06 refactor and add RAPIDS CI scripts #16

Merged
merged 10 commits into from
May 26, 2023

Conversation

dongxuy04
Copy link
Contributor

Refactor for WholeGraph, some updates:

  • More WholeMemory types.
  • WholeMemory Embedding.
  • WholeMemory Embedding Cache support.
  • WholeMemory Embedding Sparse Optimizer.
  • DLPack export for WholeMemory.
  • Torch C10 API removed.
  • Packaging for conda.
  • RAPIDS CI scripts.

@BradReesWork BradReesWork added breaking Introduces a breaking change improvement Improves an existing functionality labels May 10, 2023
@BradReesWork
Copy link
Member

@dongxuy04 you need to add a conda environment yml file under "conda/environments"

@dongxuy04
Copy link
Contributor Author

@dongxuy04 you need to add a conda environment yml file under "conda/environments"

@BradReesWork added conda environment yml files generated by rapids-dependency-file-generator.

@seunghwak
Copy link

seunghwak commented May 10, 2023

Does WholeGraph provide C-API? or C++-API? All the public headers under the cpp/include directory has either .h or .cuh extensions and should I assume this is to support C users?

@dongxuy04
Copy link
Contributor Author

Does WholeGraph provide C-API? or C++-API? All the public headers under the cpp/include directory has either .h or .cuh extensions and should I assume this is to support C users?

Yes, for host API part, .h files provide C-API instead of C++-API, the initial consideration is not to introduce ABI issues. For device headers, .cuh aims to provide developing interface on WholeMemory, so it expose C++ device API to access WholeMemory.

@BradReesWork
Copy link
Member

@dongxuy04 can you rename build_components.sh to just build.sh to match other repos?

@dongxuy04
Copy link
Contributor Author

@dongxuy04 can you rename build_components.sh to just build.sh to match other repos?

@BradReesWork I have renamed build_components.sh to build.sh, and updated scripts called this script.

@MatthiasKohl
Copy link

@BradReesWork this PR has ~50K touched lines, is there something in particular you want me to review?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For another time: what is our strategy regarding GNN ops? @teju85 @BradReesWork
cugraph-ops always has been told to stay closed source. This is mostly duplicating the work we have in cugraph-ops. Further functionalities are duplicating stuff we have in cugraph-pyg and cugraph-dgl.
As discussed already two times offline, I thought the whole point of this refactoring is to make WholeGraph a storage-only solution.

@MatthiasKohl
Copy link

For every file and kernel under cpp/src/graph_ops, can we get a description of:

  1. what is the goal? what model this operation would be used for?
  2. was cugraph-ops considered? if yes, why was it not possible to use cugraph-ops/what is missing from cugraph-ops?

@MatthiasKohl
Copy link

tagging @BradReesWork since you asked for scope of duplication of operations and @dongxuy04 for visibility for the above comment

@teju85
Copy link
Member

teju85 commented May 12, 2023

I understand that this PR only focuses on refactor of WG into a more RAPIDS-like workflow. But I too still want to mention the cugraph-ops integration into WG and urge to put this on our immediate list of things todo for WG.

I agree with both Max's/Matthias' requests. cugraphops should already provide all (if not most) of those operations. So, it's not worth the duplication. I have filed issue #17 for this.

@dongxuy04
Copy link
Contributor Author

For every file and kernel under cpp/src/graph_ops, can we get a description of:

  1. what is the goal? what model this operation would be used for?
  2. was cugraph-ops considered? if yes, why was it not possible to use cugraph-ops/what is missing from cugraph-ops?

These files are legacy implementations left over from previous versions of WholeGraph code. Agreed that we need to remove these duplication and actually this is in processing. The goal might be to get most of these files replaced by cuGraph ops.

Here are the goal of the files under cpp/src/graph_ops.

  • append_unique*: AppendUnique op, it is an optimizing OP for ids between sampling of two layers. The action is append sampled neighbor nodes to center nodes, and the do unique while keep center nodes unchanged. May need to keep as it is not related to cuGraphOps.
  • csr_add_self_loop*: add self loop to sampled csr sub graph. Not sure if there is this kind of ops. If yes, we may remove that.
  • edge_weight_softmax*, gspmm_csr_weighted*, spadd_gat_csr*: ops related to GAT model, currently GAT module from cuGraphOps is integrating, will be removed after confirmed.
  • spmm_csr_no_weight*: Ops related to GraphSAGE op and GCN op, currently GraphSAGE module from cuGraphOps is integrated, will be removed later.

@dongxuy04
Copy link
Contributor Author

@MatthiasKohl @stadlmax @teju85 @BradReesWork Agreed that it doesn't worth duplication. I have verified that similar ops in cuGraphOps works and removed the duplicated ops under cpp/src/graph_ops.

@BradReesWork
Copy link
Member

the "cmake" folder should be under the cpp directory

@BradReesWork
Copy link
Member

the top level folder for python code should be called "python" and then all the moudules under that

.gitignore Outdated
Debug
build/

## Eclipse IDE
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you should keep the IDA and Jupyter ignores since many developers use those tools

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.gitignore updated.

.gitlab-ci.yml Outdated
@@ -0,0 +1,103 @@
# This file is a template, and might need editing before it works on your project.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is GitLab CI file stil needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing this out. It is not needed and deleted.

README.md Outdated
### Examples

Checkout [GNN example](docs/GNNExample.md) for more details.
# rapids-wholegraph
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should have a minimal introduction

@@ -0,0 +1,199 @@
#!/bin/bash
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

./build.sh without any arguments does nothing. Is that the intended function?

@@ -0,0 +1,199 @@
#!/bin/bash

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

./build.sh pylibwholegraph
Building for ALL supported GPU architectures...
env: ‘setup.py’: No such file or directory

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added ${REPODIR} to setup.py and fixed this issue

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

…using package; 2. add short README.md; 3. fix build.sh to find setup.py
@dongxuy04
Copy link
Contributor Author

the "cmake" folder should be under the cpp directory

@dongxuy04 dongxuy04 closed this May 26, 2023
@dongxuy04 dongxuy04 reopened this May 26, 2023
@BradReesWork BradReesWork changed the base branch from branch-23.06 to refactoring May 26, 2023 13:48
@BradReesWork BradReesWork merged commit e8350ba into rapidsai:refactoring May 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking Introduces a breaking change improvement Improves an existing functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants