Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix inconsistent graph properties between the SG and the MG API #3757

Conversation

jnke2016
Copy link
Contributor

@jnke2016 jnke2016 commented Jul 28, 2023

Several graph methods are failing, some being an effect of migrating away from cython.cu renumbering.
This PR fixes couple graph methods and fixes the inconsistency in results returned by the SG and MG API

closes #3740
closes #3766

@jnke2016 jnke2016 requested a review from a team as a code owner July 28, 2023 17:35
@BradReesWork BradReesWork added bug Something isn't working non-breaking Non-breaking change labels Aug 1, 2023
@BradReesWork BradReesWork added this to the 23.08 milestone Aug 1, 2023
if self.properties.renumbered:
edgelist_df = self.renumber_map.unrenumber(
edgelist_df, simpleGraphImpl.srcCol
)
edgelist_df = self.renumber_map.unrenumber(
edgelist_df, simpleGraphImpl.dstCol
)

# FIXME: revisit this approach
print("the mapping = ", self.renumber_map.internal_to_external_col_names)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you remove this debug print?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure I was going to remove it

Comment on lines 680 to 681
# FIXME: instead of hardcoded value, it should be 'simpleGraphImpl.srcCol'
# but there is no way to retrieve it with the current API
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the NumberMap instance is constructed with the src and dst column names (self.src_col_names, self.dst_col_names), can you use those?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right I didn't catch that those were also captured in NumberMap. Thanks

Copy link
Contributor Author

@jnke2016 jnke2016 Aug 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually no. Now I remember why I added the FIXME stating that we can't retrieve those column names. self.src_col_names and self.dst_col_names are the column names passed by the user as input and are not the internal referenced ones in simpleGraphImpl.srcCol and simpleGraphImpl.dstCol. With our current API, those ones are not retrievable in NumberMap.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I think I can capture that in NumberMap from simpleGraph


# FIXME: instead of hardcoded value, it should be 'simpleGraphImpl.srcCol'
# but there is no way to retrieve it with the current API
if column_name in [self.renumbered_src_col_name, "src"]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we had "src" and "dst" defined somewhere so that we always used a reference rather than a string that we could type wrong

Copy link
Contributor Author

@jnke2016 jnke2016 Aug 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes we had it defined in simpleGraph but I didn't see that it was also stored in NumberMap (That's why I added a fixme so that I can come and revisit it) until @rlratzel pointed that out. It should be fixed in the next commit

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually no, those are not reference in NumberMap. Now I remember why I added the FIXME stating that we can't retrieve those column names. self.src_col_names and self.dst_col_names are the column names passed by the user as input and are not the internal referenced ones in simpleGraphImpl.srcCol and simpleGraphImpl.dstCol. With our current API, those ones are not retrievable in NumberMap.

@@ -118,7 +118,7 @@ def calc_betweenness_centrality(
)

M = G.to_pandas_edgelist().rename(
columns={"src": "0", "dst": "1", "weights": "weight"}
columns={"src": "0", "dst": "1", "wgt": edge_attr}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is another place where I would like to see a references. We have switched from "wt" to "weight" to "weights", and now to "wgt". by using a reference we would only change the column names in one place

Copy link
Contributor Author

@jnke2016 jnke2016 Aug 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When using the datasets API to get a graph with graph_file.get_graph, this implicitly has the column names hardcoded to src, dst, and wgt. For instance the karate datasets in the yaml file

@rlratzel rlratzel added the DO NOT MERGE Hold off on merging; see PR for details label Aug 3, 2023
@rlratzel
Copy link
Contributor

rlratzel commented Aug 3, 2023

added the DO NOT MERGE label since @jnke2016 is adding another test.

@rlratzel rlratzel removed the DO NOT MERGE Hold off on merging; see PR for details label Aug 3, 2023
@BradReesWork
Copy link
Member

/merge

@rapids-bot rapids-bot bot merged commit 62ecea2 into rapidsai:branch-23.08 Aug 4, 2023
54 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG]: Broken graph methods Inconsistent num edges and vertices between the SG and MG API
3 participants