gtool now supports 2 graph file formats.
- common graph edge list format
- matrix market format
Common graph edge list format starts with serveral commented lines:
# This is a comment. # You can put vertex number or edge number here. # Or you can put anything you want.
They are followed by \( e \) lines, and each line u v
represents an
edge \( u→ v\). For example,
1 2 1 3 2 1
represent 3 edge \( 1→ 2, 1→ 3, 2→ 1 \).
The whole example of common graph edge list format is:
# This is a comment. # You can put vertex number or edge number here. # Or you can put anything you want. 1 2 1 3 2 1
The Martrix Market Format is a little bit more complicated. See Their Site for details.
[Updated]
Use these methods to build static graphs:
build_graph_from_file
to generate graph from a filebuild_graph_from_edgelist
to generate graph from an edge-list [rarely used]
Use StreamBuilder
to build a stream graph.
#include <filesystem>
#include <gtool/graph.h>
#include <gtool/builder.h>
using Node = int;
// replace with your graph file path
std::filesystem::path graph_file_path{"./dataset/soc-Slashdot0811.txt"};
if (!exists(graph_file_path)) {
std::cerr << "Error: graph file not found!" << std::endl;
return -1;
}
auto graph = gtool::build_graph_from_file<Node>(graph_file_path.string());
Or if you want to build a stream graph:
#include <gtool/stream_graph.h>
#include <gtool/stream_builder.h>
gtool::StreamBuilder<Node> builder{graph_file_path.string()};
gtool::StreamGraph<Node> graph = builder.build_csr();
There are serveral build options. For static graph,
directed
indicates whether the graph inputed is directed. If for any edge \( u→ v \), there exists a reversed edge \( v→ u\), then you should set directed to true. The value is set to false by default.symmetrize
indicates if you want to add a reversed edge for every edge in the file. Notice that if the graph is already indirected (you setdirected
to true), reversed edges will not be added.
For stream graph,
stream_ratio
indicates how many edges you want to make be the streaming edges, which will be add by the processing of streaming. The parameter is a proportion between 0 and 1.
Examples:
// Input graph is a directed graph and it will be made undirected
// (add reverse edges).
gtool::Builder<Node> builder{graph_file_path.string(), false, true};
// Make 10% of edges in the input file to be streaming edges,
// which means they won't be added into the graph at the first
// place.
gtool::StreamBuilder<Node> s_builder{graph_file_path.string(), 0.1};
There are actually some utilities for stream graphs. They’re
Streamer
and DStreamer
.
A Streamer receives a graph reference and an edge list. It provides methods to split the edge list into batches with given size. You can use it to stream batches of additions to the stream graph. For example,
gtool::StreamBuilder<Node> builder{graph_file_path.string()};
gtool::StreamGraph<Node> graph = builder.build_csr();
// Use edges extracted from the graph by StreamBuilder as graph updates
gtool::Streamer<Node, Node, decltype(builder.delta())> streamer(graph, builder.delta());
std::cout << "Edge number: " << graph.get_edge_number() << std::endl;
// The batch size is set to 1'000 by default.
std::cout << "Max batch size: " << streamer.max_batch_size() << std::endl;
// If you want to make 100 batches and need a proper batch size
int new_batch_size{};
std::cout << "Suggested batch size: " << (new_batch_size = streamer.suggest_max_batch_size(100)) << std::endl;
streamer.max_batch_size(new_batch_size);
// You must ensure at least one batch exists before start streaming.
while (streamer.has_more_batches()) {
std::cout << "Now stream a batch of edges into graph...";
streamer.stream_next_batch();
std::cout << "Edge number: " << graph.get_edge_number() << std::endl;
// rewrite the previous state with new state
streamer.update_graph();
}
A DStreamer receives a graph reference and an edge list. It provides methods to split the edge list into batches with given size. In contrast with Streamer, it streams the batches as deletions to the stream graph.
Check the header files for details.