Implementation of closeness_centrality #593

Merged
merged 11 commits into from
Mar 10, 2023
3 changes: 3 additions & 0 deletions docs/source/api.rst
@@ -48,6 +48,7 @@ Centrality
:toctree: apiref

retworkx.betweenness_centrality
retworkx.closeness_centrality

.. _traversal:

@@ -289,6 +290,7 @@ the functions from the explicitly typed based on the data type.
retworkx.digraph_spring_layout
retworkx.digraph_num_shortest_paths_unweighted
retworkx.digraph_betweenness_centrality
retworkx.digraph_closeness_centrality
retworkx.digraph_unweighted_average_shortest_path_length
retworkx.digraph_bfs_search
retworkx.digraph_dijkstra_search
@@ -336,6 +338,7 @@ typed API based on the data type.
retworkx.graph_spring_layout
retworkx.graph_num_shortest_paths_unweighted
retworkx.graph_betweenness_centrality
retworkx.graph_closeness_centrality
retworkx.graph_unweighted_average_shortest_path_length
retworkx.graph_bfs_search
retworkx.graph_dijkstra_search
29 changes: 29 additions & 0 deletions releasenotes/notes/closeness-centrality-459c5c7e35cb2e63.yaml
@@ -0,0 +1,29 @@
---
features:
  - |
    Added a new function, :func:`~retworkx.closeness_centrality`, to compute
    the closeness centrality of all nodes in a :class:`~retworkx.PyGraph` or
    :class:`~retworkx.PyDiGraph` object.

    The closeness centrality of a node :math:`u` is the reciprocal of the
    average shortest path distance to :math:`u` over all :math:`n-1` reachable
    nodes.

    .. math::

        C(u) = \frac{n - 1}{\sum_{v=1}^{n-1} d(v, u)},

    where :math:`d(v, u)` is the shortest-path distance from :math:`v` to
    :math:`u`, and :math:`n` is the number of nodes that can reach :math:`u`,
    counting :math:`u` itself.

    Wasserman and Faust propose an improved formula for graphs with more than
Review comment (Member):
Do you have a link to the paper for this? If so, it'd be good to include a link to it (also in the docstrings).

    one connected component. The result is "a ratio of the fraction of actors
    in the group who are reachable, to the average distance" from the reachable
    actors.
Review comment (Member):
This paragraph looks identical to what's in the NetworkX documentation for this option: https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.centrality.closeness_centrality.html

It might be better to put this in your own words instead of copying exactly what NetworkX has in the documentation, or to provide a link to the original text as a comment to cite the original source of this text (same with the docstrings).


    .. math::

        C_{WF}(u) = \frac{n-1}{N-1} \frac{n - 1}{\sum_{v=1}^{n-1} d(v, u)},

    where :math:`N` is the number of nodes in the graph. By default, this is
    enabled.
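
As a quick check of the two formulas in this release note, the arithmetic can be sketched in plain Python. The helper `closeness_from_distances` is hypothetical and not part of retworkx; it takes the distances from the other nodes that can reach `u` (so its length is `n - 1`) and the total node count `N`. Returning 0 for a node that nothing reaches is an assumption made here for the sketch.

```python
from fractions import Fraction


def closeness_from_distances(distances, total_nodes, wf_improved=True):
    """Closeness of one node from the distances of the other nodes that reach it.

    Hypothetical helper for illustration; not part of retworkx.
    """
    reached = len(distances)  # this is n - 1 in the formulas above
    if reached == 0:
        return Fraction(0)  # assumption: no other node reaches u
    score = Fraction(reached, sum(distances))  # C(u)
    if wf_improved:
        score *= Fraction(reached, total_nodes - 1)  # times (n - 1) / (N - 1)
    return score


# End node of a 4-node path inside a 7-node graph: distances 1, 2, 3.
plain = closeness_from_distances([1, 2, 3], total_nodes=7, wf_improved=False)
scaled = closeness_from_distances([1, 2, 3], total_nodes=7, wf_improved=True)
print(plain, scaled)  # 1/2 1/4
```

The scaled value is smaller because only 3 of the 6 other nodes can reach the end node, which is the point of the Wasserman and Faust correction for disconnected graphs.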
81 changes: 75 additions & 6 deletions retworkx-core/src/centrality.rs
@@ -14,14 +14,11 @@ use std::collections::VecDeque;
use std::sync::RwLock;

use hashbrown::HashMap;
use petgraph::algo::dijkstra;
use petgraph::graph::NodeIndex;
use petgraph::visit::{
    GraphBase, GraphProp, IntoEdges, IntoEdgesDirected, IntoNeighborsDirected, IntoNodeIdentifiers,
    NodeCount, NodeIndexable, Reversed, Visitable,
};
use rayon::prelude::*;

@@ -297,3 +294,75 @@ where
sigma,
}
}

/// Compute the closeness centrality of each node in the graph.
///
/// The closeness centrality of a node `u` is the reciprocal of the average
/// shortest path distance to `u` over all `n-1` reachable nodes.
///
/// Wasserman and Faust propose an improved formula for graphs with more than
/// one connected component. The result is "a ratio of the fraction of actors
/// in the group who are reachable, to the average distance" from the reachable
/// actors. You can enable this by setting `wf_improved` to `true`.
///
/// Arguments:
///
/// * `graph` - The graph object to run the algorithm on
/// * `wf_improved` - If `true`, scale by the fraction of nodes reachable.
///
/// # Example
/// ```rust
/// use retworkx_core::petgraph;
/// use retworkx_core::centrality::closeness_centrality;
///
/// // Calculate the closeness centrality of Graph
/// let g = petgraph::graph::UnGraph::<i32, ()>::from_edges(&[
/// (0, 4), (1, 2), (2, 3), (3, 4), (1, 4)
/// ]);
/// let output = closeness_centrality(&g, true);
/// assert_eq!(
/// vec![Some(1./2.), Some(2./3.), Some(4./7.), Some(2./3.), Some(4./5.)],
/// output
/// );
///
/// // Calculate the closeness centrality of DiGraph
/// let dg = petgraph::graph::DiGraph::<i32, ()>::from_edges(&[
/// (0, 4), (1, 2), (2, 3), (3, 4), (1, 4)
/// ]);
/// let output = closeness_centrality(&dg, true);
/// assert_eq!(
/// vec![Some(0.), Some(0.), Some(1./4.), Some(1./3.), Some(4./5.)],
/// output
/// );
/// ```
pub fn closeness_centrality<G>(graph: G, wf_improved: bool) -> Vec<Option<f64>>
where
    G: NodeIndexable
        + IntoNodeIdentifiers
        + GraphBase
        + IntoEdges
        + Visitable
        + NodeCount
        + IntoEdgesDirected,
    G::NodeId: std::hash::Hash + Eq,
{
    // One slot per possible node index; slots that are never visited stay `None`.
    let max_index = graph.node_bound();
    let mut closeness: Vec<Option<f64>> = vec![None; max_index];
    for node_s in graph.node_identifiers() {
        let is = graph.to_index(node_s);
        // Unit-weight Dijkstra on the reversed graph: distances from every
        // node that can reach `node_s` (the map includes `node_s` at distance 0).
        let map = dijkstra(Reversed(&graph), node_s, None, |_| 1);
        let reachable_nodes_count = map.len();
        let dists_sum: usize = map.into_iter().map(|(_, v)| v).sum();
        // No other node reaches `node_s`.
        if reachable_nodes_count == 1 {
            closeness[is] = Some(0.0);
            continue;
        }
        closeness[is] = Some((reachable_nodes_count - 1) as f64 / dists_sum as f64);
        if wf_improved {
            // Scale by the fraction of the other nodes that reach `node_s`.
            let node_count = graph.node_count();
            closeness[is] = closeness[is]
                .map(|c| c * (reachable_nodes_count - 1) as f64 / (node_count - 1) as f64);
        }
    }
    closeness
}
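
For readers who prefer Python, the loop above can be mirrored with the standard library alone. This is a sketch under stated assumptions, not the retworkx implementation: `closeness_centrality_sketch` is a hypothetical name, nodes are assumed to be the contiguous indices `0..num_nodes`, and a breadth-first search stands in for the unit-weight `dijkstra` call (the two give the same distances when every edge weighs 1). Exact fractions are used instead of `f64` so the results can be compared without rounding.

```python
from collections import deque
from fractions import Fraction


def closeness_centrality_sketch(num_nodes, edges, directed=True, wf_improved=True):
    """Pure-Python mirror of the Rust loop (hypothetical helper, not the retworkx API)."""
    # Incoming adjacency: preds[v] lists the nodes with an edge into v.
    preds = [[] for _ in range(num_nodes)]
    for src, dst in edges:
        preds[dst].append(src)
        if not directed:
            preds[src].append(dst)

    result = []
    for start in range(num_nodes):
        # BFS over reversed edges: distance from every node that can reach `start`.
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for pred in preds[node]:
                if pred not in dist:
                    dist[pred] = dist[node] + 1
                    queue.append(pred)
        reachable = len(dist)  # includes `start` itself
        if reachable == 1:
            result.append(Fraction(0))
            continue
        score = Fraction(reachable - 1, sum(dist.values()))
        if wf_improved:
            score *= Fraction(reachable - 1, num_nodes - 1)
        result.append(score)
    return result


edges = [(0, 4), (1, 2), (2, 3), (3, 4), (1, 4)]
directed_scores = closeness_centrality_sketch(5, edges, directed=True)
undirected_scores = closeness_centrality_sketch(5, edges, directed=False)
```

On the edge list from the doc example, the sketch yields the same fractions that the Rust doctest asserts for the directed and undirected cases.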
100 changes: 100 additions & 0 deletions retworkx-core/tests/centrality.rs
@@ -0,0 +1,100 @@
// Licensed under the Apache License, Version 2.0 (the "License"); you may
// not use this file except in compliance with the License. You may obtain
// a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
// WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
// License for the specific language governing permissions and limitations
// under the License.

use petgraph::visit::Reversed;
use retworkx_core::centrality::closeness_centrality;
use retworkx_core::petgraph::graph::{DiGraph, UnGraph};

#[test]
fn test_simple() {
    // `from_edges` also creates node 0, which has no edges.
    let g = UnGraph::<i32, ()>::from_edges(&[(1, 2), (2, 3), (3, 4), (1, 4)]);
    let c = closeness_centrality(&g, true);
    assert_eq!(
        vec![
            Some(0.0),
            Some(0.5625),
            Some(0.5625),
            Some(0.5625),
            Some(0.5625)
        ],
        c
    );
}

#[test]
fn test_wf_improved() {
    // Two components: the path 0-1-2-3 and the path 4-5-6.
    let g = UnGraph::<i32, ()>::from_edges(&[(0, 1), (1, 2), (2, 3), (4, 5), (5, 6)]);
    // With the Wasserman and Faust scaling.
    let c = closeness_centrality(&g, true);
    assert_eq!(
        vec![
            Some(1.0 / 4.0),
            Some(3.0 / 8.0),
            Some(3.0 / 8.0),
            Some(1.0 / 4.0),
            Some(2.0 / 9.0),
            Some(1.0 / 3.0),
            Some(2.0 / 9.0)
        ],
        c
    );
    // Without the scaling (despite its name, `cwf` holds the unscaled values).
    let cwf = closeness_centrality(&g, false);
    assert_eq!(
        vec![
            Some(1.0 / 2.0),
            Some(3.0 / 4.0),
            Some(3.0 / 4.0),
            Some(1.0 / 2.0),
            Some(2.0 / 3.0),
            Some(1.0),
            Some(2.0 / 3.0)
        ],
        cwf
    );
}

#[test]
fn test_digraph() {
    let g = DiGraph::<i32, ()>::from_edges(&[(0, 1), (1, 2)]);
    let c = closeness_centrality(&g, true);
    assert_eq!(vec![Some(0.), Some(1. / 2.), Some(2. / 3.)], c);

    // Reversing the edges mirrors the scores.
    let cr = closeness_centrality(Reversed(&g), true);
    assert_eq!(vec![Some(2. / 3.), Some(1. / 2.), Some(0.)], cr);
}

#[test]
fn test_k5() {
    let g = UnGraph::<i32, ()>::from_edges(&[
        (0, 1),
        (0, 2),
        (0, 3),
        (0, 4),
        (1, 2),
        (1, 3),
        (1, 4),
        (2, 3),
        (2, 4),
        (3, 4),
    ]);
    let c = closeness_centrality(&g, true);
    assert_eq!(
        vec![Some(1.0), Some(1.0), Some(1.0), Some(1.0), Some(1.0)],
        c
    );
}

#[test]
fn test_path() {
    let g = UnGraph::<i32, ()>::from_edges(&[(0, 1), (1, 2)]);
    let c = closeness_centrality(&g, true);
    assert_eq!(vec![Some(2.0 / 3.0), Some(1.0), Some(2.0 / 3.0)], c);
}
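
The expected values in `test_simple` can be checked by hand. As a worked sketch: the edge list uses indices 1 to 4, so the graph has five nodes (N = 5) with node 0 isolated, and each node on the 4-cycle is reached by n - 1 = 3 other nodes at distances 1, 1 and 2.

```latex
C(u) = \frac{3}{1 + 1 + 2} = \frac{3}{4},
\qquad
C_{WF}(u) = \frac{3}{5 - 1} \cdot \frac{3}{4} = \frac{9}{16} = 0.5625
```

Node 0 is reached by no other node, so the function returns `Some(0.0)` for it.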
47 changes: 47 additions & 0 deletions retworkx/__init__.py
@@ -1557,6 +1557,53 @@ def _graph_betweenness_centrality(graph, normalized=True, endpoints=False, paral
)


@functools.singledispatch
def closeness_centrality(graph, wf_improved=True):
    r"""Returns the closeness centrality of each node in the graph.

    The closeness centrality of a node :math:`u` is the reciprocal of the
    average shortest path distance to :math:`u` over all :math:`n-1` reachable
    nodes.

    .. math::

        C(u) = \frac{n - 1}{\sum_{v=1}^{n-1} d(v, u)},

    where :math:`d(v, u)` is the shortest-path distance from :math:`v` to
    :math:`u`, and :math:`n` is the number of nodes that can reach :math:`u`,
    counting :math:`u` itself.

    Wasserman and Faust propose an improved formula for graphs with more than
    one connected component. The result is "a ratio of the fraction of actors
    in the group who are reachable, to the average distance" from the reachable
    actors.

    .. math::

        C_{WF}(u) = \frac{n-1}{N-1} \frac{n - 1}{\sum_{v=1}^{n-1} d(v, u)},

    where :math:`N` is the number of nodes in the graph.

    :param graph: The input graph. Can either be a
        :class:`~retworkx.PyGraph` or :class:`~retworkx.PyDiGraph`.
    :param bool wf_improved: This is optional; the default is True. If True,
        scale by the fraction of nodes reachable.

    :returns: A read-only dict-like object mapping each node index to its
        closeness centrality.
    :rtype: CentralityMapping
    """
    raise TypeError("Invalid input type %s for graph" % type(graph))


@closeness_centrality.register(PyDiGraph)
def _digraph_closeness_centrality(graph, wf_improved=True):
    return digraph_closeness_centrality(graph, wf_improved=wf_improved)


@closeness_centrality.register(PyGraph)
def _graph_closeness_centrality(graph, wf_improved=True):
    return graph_closeness_centrality(graph, wf_improved=wf_improved)


@functools.singledispatch
def vf2_mapping(
first,
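
The dispatch pattern used for `closeness_centrality` in this file can be tried in isolation. The sketch below assumes nothing about retworkx itself: `FakeGraph` and `FakeDiGraph` are hypothetical stand-ins for `PyGraph` and `PyDiGraph`, and the registered functions return a tag instead of calling the compiled `graph_closeness_centrality` and `digraph_closeness_centrality` functions.

```python
import functools


class FakeGraph:
    """Hypothetical stand-in for retworkx.PyGraph."""


class FakeDiGraph:
    """Hypothetical stand-in for retworkx.PyDiGraph."""


@functools.singledispatch
def closeness_centrality(graph, wf_improved=True):
    # Fallback: reached only when no implementation is registered for type(graph).
    raise TypeError("Invalid input type %s for graph" % type(graph))


@closeness_centrality.register(FakeDiGraph)
def _digraph_closeness_centrality(graph, wf_improved=True):
    # The real code would call the typed digraph function here.
    return ("digraph", wf_improved)


@closeness_centrality.register(FakeGraph)
def _graph_closeness_centrality(graph, wf_improved=True):
    # The real code would call the typed graph function here.
    return ("graph", wf_improved)


print(closeness_centrality(FakeGraph()))  # ('graph', True)
print(closeness_centrality(FakeDiGraph(), wf_improved=False))  # ('digraph', False)
```

`functools.singledispatch` picks the implementation from the type of the first positional argument only, so `wf_improved` is simply forwarded by each registered function.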
89 changes: 89 additions & 0 deletions src/centrality.rs
@@ -132,3 +132,92 @@ pub fn digraph_betweenness_centrality(
.collect(),
}
}

/// Compute the closeness centrality of all nodes in a PyGraph.
///
/// The closeness centrality of a node :math:`u` is the reciprocal of the
/// average shortest path distance to :math:`u` over all :math:`n-1` reachable
/// nodes.
///
/// .. math::
///
///     C(u) = \frac{n - 1}{\sum_{v=1}^{n-1} d(v, u)},
///
/// where :math:`d(v, u)` is the shortest-path distance between :math:`v` and
/// :math:`u`, and :math:`n` is the number of nodes that can reach :math:`u`,
/// counting :math:`u` itself.
///
/// Wasserman and Faust propose an improved formula for graphs with more than
/// one connected component. The result is "a ratio of the fraction of actors
/// in the group who are reachable, to the average distance" from the reachable
/// actors.
///
/// .. math::
///
///     C_{WF}(u) = \frac{n-1}{N-1} \frac{n - 1}{\sum_{v=1}^{n-1} d(v, u)},
///
/// where :math:`N` is the number of nodes in the graph.
///
/// :param PyGraph graph: The input graph
/// :param bool wf_improved: If True, scale by the fraction of nodes reachable.
///
/// :returns: a read-only dict-like object whose keys are the node indices and
///     whose values are the closeness centrality scores of those nodes.
/// :rtype: CentralityMapping
#[pyfunction(wf_improved = "true")]
#[pyo3(text_signature = "(graph, /, wf_improved=True)")]
pub fn graph_closeness_centrality(graph: &graph::PyGraph, wf_improved: bool) -> CentralityMapping {
    let closeness = centrality::closeness_centrality(&graph.graph, wf_improved);
    CentralityMapping {
        centralities: closeness
            .into_iter()
            .enumerate()
            .filter_map(|(i, v)| v.map(|x| (i, x)))
            .collect(),
    }
}

/// Compute the closeness centrality of all nodes in a PyDiGraph.
///
/// The closeness centrality of a node :math:`u` is the reciprocal of the
/// average shortest path distance to :math:`u` over all :math:`n-1` reachable
/// nodes.
///
/// .. math::
///
///     C(u) = \frac{n - 1}{\sum_{v=1}^{n-1} d(v, u)},
///
/// where :math:`d(v, u)` is the shortest-path distance from :math:`v` to
/// :math:`u`, and :math:`n` is the number of nodes that can reach :math:`u`,
/// counting :math:`u` itself.
///
/// Wasserman and Faust propose an improved formula for graphs with more than
/// one connected component. The result is "a ratio of the fraction of actors
/// in the group who are reachable, to the average distance" from the reachable
/// actors.
///
/// .. math::
///
///     C_{WF}(u) = \frac{n-1}{N-1} \frac{n - 1}{\sum_{v=1}^{n-1} d(v, u)},
///
/// where :math:`N` is the number of nodes in the graph.
///
/// :param PyDiGraph graph: The input digraph
/// :param bool wf_improved: If True, scale by the fraction of nodes reachable.
///
/// :returns: a read-only dict-like object whose keys are the node indices and
///     whose values are the closeness centrality scores of those nodes.
/// :rtype: CentralityMapping
#[pyfunction(wf_improved = "true")]
#[pyo3(text_signature = "(graph, /, wf_improved=True)")]
pub fn digraph_closeness_centrality(
    graph: &digraph::PyDiGraph,
    wf_improved: bool,
) -> CentralityMapping {
    let closeness = centrality::closeness_centrality(&graph.graph, wf_improved);
    CentralityMapping {
        centralities: closeness
            .into_iter()
            .enumerate()
            .filter_map(|(i, v)| v.map(|x| (i, x)))
            .collect(),
    }
}
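
Both wrappers convert the `Vec<Option<f64>>` from the core function into an index-to-score mapping by dropping the `None` slots. A Python sketch of that `enumerate` plus `filter_map` step follows; `to_centrality_mapping` is a hypothetical helper, and the reading that `None` marks an index with no live node (for example after a node removal) is an assumption based on the `node_bound()` sizing in the core function.

```python
def to_centrality_mapping(closeness):
    """Keep only populated slots, keyed by node index (hypothetical helper)."""
    return {index: score for index, score in enumerate(closeness) if score is not None}


# Assumed scenario: index 1 has no live node, so its slot is None.
scores = [0.5, None, 1.0]
mapping = to_centrality_mapping(scores)
print(mapping)  # {0: 0.5, 2: 1.0}
```

The check is `is not None` rather than truthiness so that a legitimate score of `0.0` is kept, matching `Option::map` in the Rust code.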