Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[API] create a graph with additional edge properties #2521

Merged
merged 11 commits into from
Aug 26, 2022

Conversation

seunghwak
Copy link
Contributor

@seunghwak seunghwak commented Aug 9, 2022

Partially address #2479

We currently support only edge weight as edge properties. This PR's goal is to update cuGraph to support additional edge properties (currently only edge ID and type, eventually any arithmetic types or thrust tuple of arithmetic types).

This PR defines an API. There will be a separate PR for implementation.

@seunghwak seunghwak requested a review from a team as a code owner August 9, 2022 17:27
@seunghwak seunghwak self-assigned this Aug 9, 2022
@seunghwak seunghwak added feature request New feature or request 2 - In Progress non-breaking Non-breaking change labels Aug 9, 2022
@seunghwak seunghwak added this to the 22.10 milestone Aug 9, 2022
@seunghwak seunghwak requested a review from ChuckHastings August 9, 2022 20:00
@seunghwak seunghwak changed the title [WIP] create a grpah with additional edge properties [API] create a grpah with additional edge properties Aug 9, 2022
@seunghwak
Copy link
Contributor Author

rerun tests

@codecov-commenter
Copy link

codecov-commenter commented Aug 16, 2022

Codecov Report

❗ No coverage uploaded for pull request base (branch-22.10@cc05758). Click here to learn what that means.
The diff coverage is n/a.

@@               Coverage Diff               @@
##             branch-22.10    #2521   +/-   ##
===============================================
  Coverage                ?   61.11%           
===============================================
  Files                   ?      106           
  Lines                   ?     5634           
  Branches                ?        0           
===============================================
  Hits                    ?     3443           
  Misses                  ?     2191           
  Partials                ?        0           

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@seunghwak seunghwak requested review from naimnv and jnke2016 August 23, 2022 15:54
@ChuckHastings ChuckHastings changed the title [API] create a grpah with additional edge properties [API] create a graph with additional edge properties Aug 23, 2022
Copy link
Contributor

@naimnv naimnv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good and understandable to me.

Comment on lines +88 to +117
auto view() const
{
using const_value_iterator = decltype(get_dataframe_buffer_cbegin(buffers_[0]));

std::vector<const_value_iterator> edge_partition_value_firsts(buffers_.size());
std::vector<edge_type> edge_partition_edge_counts(buffers_.size());
for (size_t i = 0; i < edge_partition_value_firsts.size(); ++i) {
edge_partition_value_firsts[i] = get_dataframe_buffer_cbegin(buffers_[i]);
edge_partition_edge_counts[i] = size_dataframe_buffer(buffers_[i]);
}

return detail::edge_property_view_t<edge_type, const_value_iterator>(
edge_partition_value_firsts, edge_partition_edge_counts);
}

auto mutable_view()
{
using value_iterator = decltype(get_dataframe_buffer_begin(buffers_[0]));

std::vector<value_iterator> edge_partition_value_firsts(buffers_.size());
std::vector<edge_type> edge_partition_edge_counts(buffers_.size());
for (size_t i = 0; i < edge_partition_value_firsts.size(); ++i) {
edge_partition_value_firsts[i] = get_dataframe_buffer_begin(buffers_[i]);
edge_partition_edge_counts[i] = size_dataframe_buffer(buffers_[i]);
}

return detail::edge_property_view_t<edge_type, value_iterator>(edge_partition_value_firsts,
edge_partition_edge_counts);
}

Copy link
Contributor

@naimnv naimnv Aug 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@seunghwak Not sure if it's a good idea if we merge this two functions into one and use flag to return mutable or immutable view?

auto mutable_or_immutable_view(bool mutable){

    using value_iterator = mutable? decltype(get_dataframe_buffer_begin(buffers_[0])):decltype(get_dataframe_buffer_cbegin(buffers_[0]));

    std::vector<value_iterator> edge_partition_value_firsts(buffers_.size());
    std::vector<edge_type> edge_partition_edge_counts(buffers_.size());
    for (size_t i = 0; i < edge_partition_value_firsts.size(); ++i) {
      edge_partition_value_firsts[i] = mutable?get_dataframe_buffer_begin(buffers_[i]): get_dataframe_buffer_cbegin(buffers_[i]);
      edge_partition_edge_counts[i]  = size_dataframe_buffer(buffers_[i]);
    }
return detail::edge_property_view_t<edge_type, value_iterator>(
      edge_partition_value_firsts, edge_partition_edge_counts);
}
auto view() const{
    return mutable_or_immutable_view(false);
}
auto mutable_view(){
    return mutable_or_immutable_view(true);
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This naming somewhat comes with a sort of RAPIDS convention (or I'd better say cuDF's naming scheme).

https://github.com/rapidsai/cudf/blob/branch-22.10/cpp/include/cudf/column/column_device_view.cuh#L928
and this convention somewhat leaked to cuCollection as well (https://github.com/NVIDIA/cuCollections/blob/dev/include/cuco/static_map.cuh#L772 not surprisingly as cuDF & cuCollection developers have an overlap).

I'm more inclined to stick with this convention unless there is a very strong reason to not follow this.

Copy link
Contributor

@naimnv naimnv Aug 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was rather thinking if we can have one common function with boolean flag to indicate constant view or mutable view. And depending on the boolean value the common function would return constant or mutable view.

`auto view() const{
return common_function(false);
}

auto mutable_view(){
return common_function(true);
}`

And that common function needs not to be exposed outside. The objective is to get rid of nearly duplicate code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the code above won't compile unless mutable is a non-type template parameter (so can be evaluated in compile time).

using value_iterator = mutable? decltype(get_dataframe_buffer_begin(buffers_[0])):decltype(get_dataframe_buffer_cbegin(buffers_[0]));
 edge_partition_value_firsts[i] = mutable?get_dataframe_buffer_begin(buffers_[i]): get_dataframe_buffer_cbegin(buffers_[i]);
some = A ? B : C;  // B and C should have the same type to compile.

Need to use std::conditional_t and if constexpr(mutalbe) but in this case, mutable can't be a run time variable.

If I create

template <bool mutable>
auto mutable_or_immutable_view() {
  ...
}

I guess the benefit in the binary size is gone. Not sure avoiding code duplication is worth the added additional complexity.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't think of that. Seems like we better merge as it is.

namespace detail {

template <typename edge_t, typename ValueIterator>
class edge_partition_edge_property_device_view_t {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would edge_property_partition_device_view_t read better?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, there is edge_partition_endpoint_property_device_view_t as well.

This name implies a device view object for edge properties in an edge partition (and edge_partition_endpoint_property_device_view_t means a device view object for edge endpoint (source/destination) properties in an edge partition).

Copy link
Contributor Author

@seunghwak seunghwak Aug 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And there will be vertex_partition_vertex_property_device_view_t in the future as well (to support vertex masking)... These names should be considered in this context.

edge_partition or vertex_partition comes first to emphasize that these work on a vertex partition or an edge partition.

"edge_property_partition" or "edge_endpoint_partition" somewhat sounds like we are partitioning edge properties or edge endpoints instead of edges... Edge partitioning precedes and edge properties or endpoint properties just follow this edge partition.

Based on this assumption, let me know if you have suggestions for better names.

Copy link
Contributor

@naimnv naimnv Aug 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According the what you described, it's probably best to adhere with this naming convention.

@ChuckHastings
Copy link
Collaborator

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 1f83c6b into rapidsai:branch-22.10 Aug 26, 2022
@seunghwak seunghwak deleted the fea_edge_property branch October 20, 2022 18:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request New feature or request non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants