Sp-ization Algorithm #1423

Marsella8 · 2024-06-25T22:30:50Z

Description of changes:

Added basic sp-ization algorithm and associated tests

Related Issues:

Linked Issues:

Issue #

Issues closed by this PR:

Closes #

This change is

codecov · 2024-06-25T22:46:33Z

Codecov Report

Attention: Patch coverage is 97.78096% with 31 lines in your changes missing coverage. Please review.

Project coverage is 65.36%. Comparing base (8f9082f) to head (8ba896a).

Files	Patch %	Lines
...s/graph/serial_parallel/serial_parallel_metrics.cc	81.25%	12 Missing ⚠️
.../utils/graph/serial_parallel/digraph_generation.cc	88.40%	8 Missing ⚠️
...ls/graph/serial_parallel/serial_parallel_splits.cc	60.00%	4 Missing ⚠️
...tils/graph/digraph/algorithms/is_2_terminal_dag.cc	0.00%	3 Missing ⚠️
.../utils/graph/digraph/algorithms/get_descendants.cc	88.88%	2 Missing ⚠️
...h/serial_parallel/serial_parallel_decomposition.cc	94.59%	2 Missing ⚠️

Additional details and impacted files

@@                Coverage Diff                @@
##           repo-refactor    #1423      +/-   ##
=================================================
+ Coverage          61.44%   65.36%   +3.91%     
=================================================
  Files                479      495      +16     
  Lines              13055    14431    +1376     
  Branches             410      446      +36     
=================================================
+ Hits                8022     9433    +1411     
+ Misses              5033     4998      -35

Flag	Coverage Δ
unittests	`65.36% <97.78%> (+3.91%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files	Coverage Δ
.../utils/include/utils/containers/unordered_set_of.h	`100.00% <ø> (ø)`
...ils/graph/serial_parallel/serial_parallel_splits.h	`100.00% <ø> (ø)`
lib/utils/src/utils/graph/algorithms.cc	`19.44% <ø> (+1.85%)`	⬆️
lib/utils/src/utils/graph/digraph/algorithms.cc	`100.00% <100.00%> (ø)`
...h/algorithms/get_longest_path_lengths_from_root.cc	`100.00% <100.00%> (ø)`
...aph/digraph/algorithms/get_topological_ordering.cc	`100.00% <ø> (ø)`
...hms/get_topological_ordering_from_starting_node.cc	`100.00% <100.00%> (ø)`
...s/src/utils/graph/digraph/algorithms/is_acyclic.cc	`100.00% <100.00%> (+29.41%)`	⬆️
...s/graph/digraph/algorithms/transitive_reduction.cc	`91.30% <100.00%> (-0.37%)`	⬇️
...raph/serial_parallel/normalize_sp_decomposition.cc	`100.00% <100.00%> (ø)`
... and 18 more

... and 13 files with indirect coverage changes

lockshaw

Reviewed 4 of 5 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @Marsella8)

lib/utils/include/utils/graph/simple_sp_ization.h line 10 at r2 (raw file):

namespace FlexFlow {

DiGraph simple_sp_ization(DiGraph g);

Suggestion:

DiGraph simple_sp_ization(DiGraphView const &g);

lib/utils/src/graph/simple_sp_ization.cc line 12 at r2 (raw file):

namespace FlexFlow {

DiGraph simple_sp_ization(DiGraph g) {

Maybe instead just output the serial-parallel decomposition of the resulting graph as produced by the existing serial-parallel algorithm? It seems like what you really want for this is a serial-parallel structure, not actually a graph, and this also solves the problem of what to do with these random barrier nodes that otherwise get inserted.

lib/utils/src/graph/simple_sp_ization.cc line 14 at r2 (raw file):

DiGraph simple_sp_ization(DiGraph g) {

  std::unordered_set<Node> sources = get_sources(g);

Break this into two functions: one that finds the distances of all of the nodes from one the sources, and one that actually inserts the barrier nodes

lib/utils/src/graph/simple_sp_ization.cc line 44 at r2 (raw file):

    if (layers.count(layer + 1)) { // Skip if last layer
      std::vector<Node> successors = add_nodes(sp, layers.at(layer + 1).size());
      Node barrier_node = add_nodes(sp, 1)[0];

Suggestion:

    Node barrier_node = sp.add_node();

lib/utils/src/graph/simple_sp_ization.cc line 45 at r2 (raw file):

      std::vector<Node> successors = add_nodes(sp, layers.at(layer + 1).size());
      Node barrier_node = add_nodes(sp, 1)[0];
      for (auto const &p : predecessors) {

Suggestion:

      for (Node const &p : predecessors) {

lib/utils/src/graph/simple_sp_ization.cc line 51 at r2 (raw file):

        sp.add_edge({barrier_node, s});
      }
      predecessors = successors;

Try to not re-assign variables as much as possible, it makes the code much less readable when I have to track all the state changes through the loop. Here you could just have

std::vector<Node> predecessors = add_nodes(sp, layers.at(current_layer));
std::vector<Node> successors = add_nodes(sp, layers.at(current_layer + 1));

At the top of the loop head, which avoids all this reassignment. Even better, make layers a vector and then iterate through pairs of elements (and if a function that produces a vector of pairs of elements is not already in utils/containers, add it) (though this may all be obviated by the comment at the top of this function)

lib/utils/test/src/test_simple_sp_ization.cc line 0 at r2 (raw file):
Add this file to the list of tests in the utils/test/CMakeLists.txt

lib/utils/test/src/test_simple_sp_ization.cc line 13 at r2 (raw file):

  TEST_CASE("Simple SP-ization: sanity checks") {
    /*
    graph TD;

Ideally prefer graphviz/dot syntax over mermaid (we just use mermaid in readmes because github auto-renders it in those cases). Here, the equivalent graphviz would be

digraph {
    n1 -> n2;
    n1 -> n3;
    n2 -> n4;
    n2 -> n5;
    n3 -> n4;
    n3 -> n5;
    n4 -> n6;
    n5 -> n6;
}

Also, isn't this graph serial-parallel as long as you treat all-to-all connections as sequential splits (which we do)?

lib/utils/test/src/test_simple_sp_ization.cc line 50 at r2 (raw file):

    std::unordered_set<DirectedEdge> sp_edges = get_edges(sp);
    CHECK(sp_edges.size() == 3 + 4 + 3); // 3 for first layer, 4 for second, ...

Add a check that the resulting graph is sp-decomposable?

lib/utils/test/src/test_simple_sp_ization.cc line 53 at r2 (raw file):

  }

  TEST_CASE("Simple SP-ization: integrity checks") {

Why two separate test cases here for the same input graph? Probably better to just move the checks into the previous test case?

Marsella8

For both get_longest_path_lengths and the simple_sp_ization the tests are failing due to the directed graph flipping the edges issue (I've checked and constructing the graphs in reverse makes the tests pass).

Reviewable status: 0 of 8 files reviewed, 10 unresolved discussions (waiting on @lockshaw)

lib/utils/src/graph/simple_sp_ization.cc line 12 at r2 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Maybe instead just output the serial-parallel decomposition of the resulting graph as produced by the existing serial-parallel algorithm? It seems like what you really want for this is a serial-parallel structure, not actually a graph, and this also solves the problem of what to do with these random barrier nodes that otherwise get inserted.

Yes, sorry for that. Currently returns a SerialParallelDecomposition (which is a Serial Node that contains a vector of Parallel Nodes, each representing a layer).

lib/utils/src/graph/simple_sp_ization.cc line 14 at r2 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Break this into two functions: one that finds the distances of all of the nodes from one the sources, and one that actually inserts the barrier nodes

Done.

lib/utils/src/graph/simple_sp_ization.cc line 44 at r2 (raw file):

    if (layers.count(layer + 1)) { // Skip if last layer
      std::vector<Node> successors = add_nodes(sp, layers.at(layer + 1).size());
      Node barrier_node = add_nodes(sp, 1)[0];

Done.

lib/utils/src/graph/simple_sp_ization.cc line 45 at r2 (raw file):

      std::vector<Node> successors = add_nodes(sp, layers.at(layer + 1).size());
      Node barrier_node = add_nodes(sp, 1)[0];
      for (auto const &p : predecessors) {

Done.

lib/utils/src/graph/simple_sp_ization.cc line 51 at r2 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Try to not re-assign variables as much as possible, it makes the code much less readable when I have to track all the state changes through the loop. Here you could just have
std::vector<Node> predecessors = add_nodes(sp, layers.at(current_layer));
std::vector<Node> successors = add_nodes(sp, layers.at(current_layer + 1));
At the top of the loop head, which avoids all this reassignment. Even better, make layers a vector and then iterate through pairs of elements (and if a function that produces a vector of pairs of elements is not already in utils/containers, add it) (though this may all be obviated by the comment at the top of this function)

Done.

lib/utils/test/src/test_simple_sp_ization.cc line 13 at r2 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Ideally prefer graphviz/dot syntax over mermaid (we just use mermaid in readmes because github auto-renders it in those cases). Here, the equivalent graphviz would be
digraph {
    n1 -> n2;
    n1 -> n3;
    n2 -> n4;
    n2 -> n5;
    n3 -> n4;
    n3 -> n5;
    n4 -> n6;
    n5 -> n6;
}
Also, isn't this graph serial-parallel as long as you treat all-to-all connections as sequential splits (which we do)?

Changed the graph to non-sp

lib/utils/test/src/test_simple_sp_ization.cc line 50 at r2 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Add a check that the resulting graph is sp-decomposable?

Done.

lib/utils/test/src/test_simple_sp_ization.cc line 53 at r2 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Why two separate test cases here for the same input graph? Probably better to just move the checks into the previous test case?

Done.

lib/utils/test/src/test_simple_sp_ization.cc line at r2 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Add this file to the list of tests in the utils/test/CMakeLists.txt

Done.

Marsella8

Reviewable status: 0 of 8 files reviewed, 10 unresolved discussions (waiting on @lockshaw)

lib/utils/include/utils/graph/simple_sp_ization.h line 10 at r2 (raw file):

namespace FlexFlow {

DiGraph simple_sp_ization(DiGraph g);

Done.

lockshaw

Update to include #1449 and then re-request review

Reviewed 1 of 7 files at r8.
Reviewable status: 1 of 12 files reviewed, 22 unresolved discussions (waiting on @Marsella8)

lib/utils/src/graph/sp_ization.cc line 15 at r5 (raw file):

Previously, Marsella8 wrote…

changed to is_2_terminal_sp_compliant

How about is_two_terminal_dag, since none of these properties actually have to do with SP

…support

Marsella8

Done

Reviewable status: 1 of 54 files reviewed, 22 unresolved discussions (waiting on @lockshaw)

lib/utils/src/graph/sp_ization.cc line 15 at r5 (raw file):

Previously, lockshaw (Colin Unger) wrote…

How about is_two_terminal_dag, since none of these properties actually have to do with SP

Done.

lockshaw

Reviewed 52 of 53 files at r11, 2 of 2 files at r13, all commit messages.
Reviewable status: all files reviewed, 58 unresolved discussions (waiting on @Marsella8)

lib/utils/include/utils/containers.h line 26 at r11 (raw file):

#include <map>
#include <numeric>
#include <set>

Why?

lib/utils/include/utils/containers/as_unordered_set.h line 9 at r11 (raw file):

template <typename C, typename T = typename C::value_type>
std::unordered_set<T> as_unordered_set(C const &c) {

Use unordered_set_of (yes the naming is a bit irregular, but let's not try to fix that in this PR)

lib/utils/include/utils/containers/invert_map.h line 19 at r11 (raw file):

  }
  return m_inv;
}

Unless there's a need for the extra polymorphism it's probably simpler to just have it defined for unordered_map

Suggestion:

template <typename K, typename V>
auto invert_map(std::unordered_map<K, V> const &m) {
  std::unordered_map<V, std::unordered_set<K>> result;
  for (auto const &[key, value] : m) {
    result[value].insert(key);
  }
  return result;
}

lib/utils/include/utils/graph/digraph/algorithms/get_longest_path_lengths_from_root.h line 9 at r11 (raw file):

namespace FlexFlow {

std::unordered_map<Node, float> get_weighted_longest_path_lengths_from_root(

Maybe add a quick doxygen comment to explain what these functions do? Up to you, only do it if you can come up with a good prose explanation/docstring

lib/utils/include/utils/graph/digraph/algorithms/get_topological_ordering.h line 11 at r11 (raw file):

std::vector<Node>
    get_topological_ordering_from_starting_node(DiGraphView const &g,

Move to its own header file

lib/utils/include/utils/graph/serial_parallel/graph_generation.h line 19 at r11 (raw file):

                                   DataflowGraphView const &g2);

std::unordered_map<Node, Node> parallel_extend(DiGraph &g,

Why were these added?

lib/utils/include/utils/graph/serial_parallel/parallel_composition.h line 8 at r11 (raw file):

namespace FlexFlow {

SerialParallelDecomposition parallel_composition(

Probably can just be part of serial_parallel_decomposition.h as it feels more like a primitive operation

lib/utils/include/utils/graph/serial_parallel/serial_composition.h line 8 at r11 (raw file):

namespace FlexFlow {

SerialParallelDecomposition serial_composition(

Probably should just be part of serial_parallel_decomposition.h as this feels like a rather primitive operation

lib/utils/include/utils/graph/serial_parallel/serial_parallel_decomposition.variant.toml line 8 at r11 (raw file):

  "fmt",
]
explicit_constructors = false

Use explicit_constructors unless you have a strong reason--it's a bit more verbose but the code is usually a lot clearer and less bug prone

lib/utils/include/utils/graph/serial_parallel/serial_parallel_isempty.h line 8 at r11 (raw file):

namespace FlexFlow {

bool isempty(SerialParallelDecomposition const &sp);

Should probably be moved to serial_parallel_decomposition.h, this feels like a rather primitive operation

Suggestion:

bool is_empty(SerialParallelDecomposition const &sp);

lib/utils/include/utils/graph/serial_parallel/serial_parallel_metrics.h line 11 at r11 (raw file):

std::unordered_map<Node, size_t>
    node_frequency_counter(SerialParallelDecomposition const &sp);

Might be more clear if named get_node_num_occurences_map, or get_node_frequency_map?

lib/utils/include/utils/graph/serial_parallel/serial_parallel_metrics.h line 13 at r11 (raw file):

    node_frequency_counter(SerialParallelDecomposition const &sp);

size_t num_nodes(SerialParallelDecomposition const &sp);

Move to serial_parallel_decomposition.h

lib/utils/include/utils/graph/serial_parallel/serial_parallel_normalize.h line 0 at r11 (raw file):
Rename to normalize_sp_decomposition.h

lib/utils/include/utils/graph/serial_parallel/serial_parallel_normalize.h line 13 at r11 (raw file):

 * @details This function performs the following semantic substitutions:
 * - Deletes every empty SerialSplit and ParallelSplit item:
 *   S(P(S()), Node(1), Node(2)) -> S(Node(1), Node(2))

Suggestion:

 * - Deletes every empty SerialSplit and ParallelSplit item, e.g., 
 *   <tt>S(P(S()), Node(1), Node(2)) -> S(Node(1), Node(2))</tt>

lib/utils/include/utils/graph/serial_parallel/serial_parallel_normalize.h line 16 at r11 (raw file):

 *
 * - Replaces SerialSplit and ParallelSplit of size 1 with their content:
 *   S(S(Node(1)), P(Node(2))) -> S(Node(1), Node(2))

Suggestion:

 * - Replaces SerialSplit and ParallelSplit of size 1 with their content, e.g.,
 *   <tt>S(S(Node(1)), P(Node(2))) -> S(Node(1), Node(2))</tt>)

lib/utils/include/utils/graph/serial_parallel/serial_parallel_normalize.h line 19 at r11 (raw file):

 *
 */
SerialParallelDecomposition normalize(SerialParallelDecomposition const &sp);

Suggestion:

normalize_sp_decomposition

lib/utils/include/utils/graph/serial_parallel/serial_parallel_splits.h line 15 at r11 (raw file):

struct SerialSplit {
public:
  SerialSplit() = delete;

I'm not a big fan of default-constructibility, why should this have a default constructor?

lib/utils/include/utils/graph/serial_parallel/sp_ization/critical_path_preserving_sp_ization.h line 10 at r11 (raw file):

namespace FlexFlow {

SerialParallelDecomposition

A couple brief doxygen comments would be rather nice here if you can come up with some

lib/utils/src/utils/graph/digraph/algorithms/get_longest_path_lengths_from_root.cc line 53 at r11 (raw file):

}

std::unordered_map<Node, float> get_weighted_longest_path_lengths_from_root(

Probably better to just merge the checked and unchecked versions here, unless you have a strong counterargument

lib/utils/src/utils/graph/digraph/algorithms/is_acyclic.cc line 33 at r13 (raw file):

}

bool is_acyclic(DiGraphView const &g) {

This should either be simplified or additional explanation added. Why are there three states? What invariants are expected to hold? Collapsing dfs_is_acyclic down to a lambda could also help. Also a better/clearer name for dfs_is_acylic would likely help.

lib/utils/src/utils/graph/serial_parallel/parallel_composition.cc line 8 at r13 (raw file):

SerialParallelDecomposition parallel_composition(
    std::unordered_set<SerialParallelDecomposition> const &sp_compositions) {

What if sp_compositions is empty?

lib/utils/src/utils/graph/serial_parallel/parallel_composition.cc line 14 at r13 (raw file):

  ParallelSplit composition({});
  for (SerialParallelDecomposition const &sp_comp : sp_compositions) {
    if (sp_comp.has<ParallelSplit>()) {

Might be cleaner with overload?

lib/utils/src/utils/graph/serial_parallel/serial_composition.cc line 8 at r13 (raw file):

SerialParallelDecomposition serial_composition(
    std::vector<SerialParallelDecomposition> const &sp_compositions) {

What if sp_compositions is empty?

lib/utils/src/utils/graph/serial_parallel/serial_composition.cc line 14 at r13 (raw file):

  SerialSplit composition{};
  for (SerialParallelDecomposition const &sp_comp : sp_compositions) {
    if (sp_comp.has<SerialSplit>()) {

Might be cleaner with overload

lib/utils/src/utils/graph/serial_parallel/serial_parallel_isempty.cc line 9 at r13 (raw file):

namespace FlexFlow {

bool isempty(SerialParallelDecomposition const &sp) {

Likely simpler to just add overloads for SerialSplit, ParallelSplit, etc. and then you can also avoid all the widening

Also probably cleaner with overload

lib/utils/src/utils/graph/serial_parallel/serial_parallel_metrics.cc line 14 at r13 (raw file):

std::unordered_map<Node, size_t>
    node_frequency_counter(SerialParallelDecomposition const &sp) {

This function needs simplifying

lib/utils/src/utils/graph/serial_parallel/serial_parallel_metrics.cc line 67 at r13 (raw file):

float critical_path_cost(SerialParallelDecomposition const &sp,
                         std::unordered_map<Node, float> const &cost_map) {
  if (sp.has<Node>()) {

overload and visit would probably clean up this code

lib/utils/src/utils/graph/serial_parallel/serial_parallel_metrics.cc line 81 at r13 (raw file):

                             [&](std::variant<SerialSplit, Node> const &child) {
                               return critical_path_cost(
                                   widen<SerialParallelDecomposition>(child),

Avoid the widen calls, preferably add critical_path_cost overloads for ParallelSplit and SerialSplit

Code quote:

widen<SerialParallelDecomposition>(

lib/utils/src/utils/graph/serial_parallel/serial_parallel_normalize.cc line 27 at r13 (raw file):

}

SerialParallelDecomposition normalize(SerialParallelDecomposition const &sp) {

Why is this function so complicated? Is there not a cleaner way of doing this? Is it due to the alternating Serial-Parallel invariant? At the very least it seems like you could transform sp back into a binary decomposition tree and then use get_final_ast to re-flatten it which I think should handle normalization?

lib/utils/src/utils/graph/serial_parallel/sp_ization/critical_path_preserving_sp_ization.cc line 0 at r13 (raw file):
A bunch of these functions could use some simplifying, in addition to ideally some docstrings as their names are rather technical (e.g., critical_path_preserving_sp_ization_unchecked_with_coalescing is a bit hard to understand unless you're Pietro 😛)

lib/utils/src/utils/graph/serial_parallel/sp_ization/work_preserving_sp_ization.cc line 0 at r13 (raw file):
Try to simplify these functions--right now they're unnecessarily hard to read.

Also, while I appreciate the effort to use functions from utils/containers, they're not always the most readable version of the code--often they are, but this does need to be determined on a case-by-case basis

lib/utils/src/utils/graph/serial_parallel/sp_ization/work_preserving_sp_ization.cc line 33 at r13 (raw file):

namespace FlexFlow {

std::vector<std::unordered_set<Node>> naive_layer_split(DiGraphView const &g) {

Try not to overload layer--maybe tier or stratum or something derived from the timeline analogy?

Code quote:

std::vector<std::unordered_set<Node>> naive_layer_split(DiGraphView const &g) {

lib/utils/src/utils/graph/serial_parallel/sp_ization/work_preserving_sp_ization.cc line 33 at r13 (raw file):

namespace FlexFlow {

std::vector<std::unordered_set<Node>> naive_layer_split(DiGraphView const &g) {

get_strata_assuming_unit_cost?

Code quote:

naive_layer_split(

lib/utils/src/utils/graph/serial_parallel/sp_ization/work_preserving_sp_ization.cc line 43 at r13 (raw file):

  }
  return layer_to_nodes;
}

Suggestion:

std::vector<std::unordered_set<Node>> naive_layer_split(DiGraphView const &g) {
  std::vector<std::unordered_set<Node>> result;
  
  for (auto const &[node, depth] : get_longest_path_lengths_from_root(g)) {
    result[depth].insert(node);
  }
  
  return result;
}

lib/utils/src/utils/graph/serial_parallel/sp_ization/work_preserving_sp_ization.cc line 46 at r13 (raw file):

static SerialParallelDecomposition
    naive_layer_merge(std::vector<std::unordered_set<Node>> layer_to_node) {

Usually we use a_to_b for maps, it's a bit weird to have it for a vector?

Code quote:

layer_to_node

lib/utils/src/utils/graph/serial_parallel/sp_ization/work_preserving_sp_ization.cc line 55 at r13 (raw file):

    sp.children.push_back(layer);
  }
  return normalize(SerialParallelDecomposition(sp));

Suggestion:

  return serial_composition(transform(layer_to_node,
                                      [](std::unordered_set<Node> const &stratum_nodes) {
                                        return ParallelSplit{stratum_nodes};  
                                      });

lib/utils/test/CMakeLists.txt line 6 at r13 (raw file):

  SRC_PATTERNS
    src/utils/*.cc
    # src/*.cc

Suggestion:

    src/utils/*.cc

lib/utils/test/src/test_algorithms.cc line 248 at r13 (raw file):

  }

  TEST_CASE("get_longest_path_lengths_from_root - linear graph") {

Move these tests of get_longest_path_lengths_from_root to lib/utils/test/src/utils/graph/digraph/algorithms/get_longest_path_lengths_from_root.cc

lib/utils/test/src/utils/graph/digraph/algorithms/is_acyclic.cc line 26 at r11 (raw file):

    CHECK(result == correct);
  }

Apply to all subcases below

Suggestion:

  TEST_CASE("is_acyclic") {
    DiGraph g = DiGraph::create<AdjacencyDiGraph>();

    SUBCASE("empty graph") {
      bool correct = true;
      bool result = is_acyclic(g);

      CHECK(result == correct);
    }

    SUBCASE("is_acyclic - single node") {
      add_nodes(g, 1);

      bool correct = true;
      bool result = is_acyclic(g);
      
      CHECK(result == correct);
    }
  }

lib/utils/test/src/utils/graph/serial_parallel/serial_parallel_isempty.cc line 8 at r11 (raw file):

TEST_SUITE(FF_TEST_SUITE) {
  TEST_CASE("isempty function") {

Suggestion:

  TEST_CASE("is_empty(SerialParallelDecomposition)") {

lib/utils/test/src/utils/graph/serial_parallel/serial_parallel_normalize.cc line 8 at r11 (raw file):

TEST_SUITE(FF_TEST_SUITE) {
  TEST_CASE("normalize function") {

Suggestion:

  TEST_CASE("normalize_sp_decomposition") {

lib/utils/test/src/utils/graph/serial_parallel/serial_parallel_normalize.cc line 9 at r11 (raw file):

TEST_SUITE(FF_TEST_SUITE) {
  TEST_CASE("normalize function") {
    Node n1{1};

Personally I prefer this syntax, but this isn't a huge deal, just would help with style consistency

Suggestion:

    Node n1 = Node{1};

lib/utils/test/src/utils/graph/serial_parallel/serial_parallel_normalize.cc line 16 at r11 (raw file):

      SerialParallelDecomposition sp{n1};
      SerialParallelDecomposition expected = sp;
      CHECK(expected == normalize(sp));

Apply this pattern to all of the SUBCASEs below as well

Suggestion:

      SerialParallelDecomposition input = {n1};
      SerialParallelDecomposition result = normalize(sp);
      SerialParallelDecomposition correct = input;
      CHECK(result == correct);

lib/utils/test/src/utils/graph/serial_parallel/serial_parallel_splits.cc line 12 at r13 (raw file):

    // The following definitions are equivalent
    ParallelSplit p1{n[0],

Prefer bounds-checked accesses

Suggestion:

  ParallelSplit p1{n.at(0),

lib/utils/test/src/utils/graph/serial_parallel/serial_parallel_splits.cc line 21 at r13 (raw file):

                     SerialSplit{n[1], n[2]},
                     SerialSplit{n[3], ParallelSplit{n[0], n[5], n[4]}}};
    ParallelSplit p4{SerialSplit{n[3], ParallelSplit{n[5], n[4], n[0]}},

Rather than using a bunch of complicated test cases each time, try to have small/minimal test cases for each of the properties, and then if you want add a big end-to-end test case at the end. For example, it might be better to have a test case checking that ParallelSplit{n.at(0), n.at(1)} == ParallelSplit{n.at(1), n.at(0)} instead all of these big splits where it's difficult to tell what's getting tested

lib/utils/test/src/utils/graph/serial_parallel/serial_parallel_splits.cc line 53 at r13 (raw file):

  }

  TEST_CASE("ParallelSplit::operator== - nested SP") {

How are these nested test cases any different than the test case above?

Code quote:

- nested SP")

lib/utils/test/src/utils/graph/serial_parallel/sp_ization/critical_path_preserving_sp_ization.cc line 13 at r13 (raw file):

TEST_SUITE(FF_TEST_SUITE) {

  TEST_CASE("critical_path_preserving_sp_ization") {

Add some simple test cases in addition to composite ones

lib/utils/test/src/utils/graph/serial_parallel/sp_ization/critical_path_preserving_sp_ization.cc line 20 at r13 (raw file):

      add_edges(g,
                {
                    DirectedEdge{n[0], n[1]},

Prefer bounds-checked accessors

Suggestion:

n.at(0),

lib/utils/test/src/utils/graph/serial_parallel/sp_ization/critical_path_preserving_sp_ization.cc line 33 at r13 (raw file):

          {n[0], 3}, {n[1], 2}, {n[2], 1}, {n[3], 1}, {n[4], 1}, {n[5], 5}};

      CHECK(work_cost(g, cost_map) == 13);

Why are there two work_cost checks?

lib/utils/test/src/utils/graph/serial_parallel/sp_ization/critical_path_preserving_sp_ization.cc line 49 at r13 (raw file):

        CHECK(expected == result);
      }
      SUBCASE("work cost") {

Try to avoid testing a bunch of different functions all at once--it seems like an easy way to get additional coverage with less work, but the test cases become rather hard to read as the cases aren't chosen for that function, they're just big giant things intended to test everything all at once. For example, work_cost doesn't take in a function, it takes in an sp-decomposition--but here it's all tied up with a graph. Focus on testing one property of one function at a time

lib/utils/test/src/utils/graph/serial_parallel/sp_ization/critical_path_preserving_sp_ization.cc line 92 at r13 (raw file):

      }
      SUBCASE("work cost") {
        float expected = 16;

Suggestion:

        float correct = 16;

lib/utils/test/src/utils/graph/serial_parallel/sp_ization/work_preserving_sp_ization.cc line 15 at r13 (raw file):

  TEST_CASE("work_preserving_") {

    SUBCASE("Sample Graph #1") {

See comments about having smaller test cases and testing one property of one function at a time--right now I can't tell what these test cases are actually trying to check

lib/utils/test/src/utils/graph/serial_parallel/sp_ization/benchmarking/distributions.h line 4 at r13 (raw file):

#define _FLEXFLOW_DISTRIBUTIONS_H

#include "utils/graph/node/graph.h"

Suggestion:

#include "utils/graph/node/node.dtg.h"

lib/utils/test/src/utils/graph/serial_parallel/sp_ization/benchmarking/distributions.h line 29 at r13 (raw file):

struct Bernoulli {
  float p;
  Bernoulli(float p = 0.5) : p(p) {}

Move implementations to .cc file

lib/utils/test/src/utils/graph/serial_parallel/sp_ization/benchmarking/sample_graphs.h line 29 at r13 (raw file):

  std::vector<DirectedEdge> edges = {
      DirectedEdge(inputs[0], sep[1]), DirectedEdge(inputs[0], id[1]),

Prefer bounds-checked variants

Suggestion:

inputs.at(0)

lib/utils/test/src/utils/graph/serial_parallel/sp_ization/benchmarking/sample_graphs.h line 243 at r13 (raw file):

  std::vector<DirectedEdge> edges = {
      DirectedEdge(n[0], n[1]),   DirectedEdge(n[0], n[2]),

Prefer bracket initialization for consistency

Suggestion:

      DirectedEdge{n[0], n[1]}

lib/utils/test/src/utils/graph/serial_parallel/sp_ization/benchmarking/sample_graphs.h line 264 at r13 (raw file):

  // From the TASO paper, pg 57
  DiGraph g = DiGraph::create<AdjacencyDiGraph>();
  Node root = get_only(add_nodes(g, 1));

Suggestion:

  Node root = g.add_node();

lib/utils/test/src/utils/graph/serial_parallel/sp_ization/benchmarking/sample_graphs.h line 270 at r13 (raw file):

  std::vector<Node> avg = add_nodes(g, 3);
  std::vector<Node> add = add_nodes(g, 5);
  Node concat = get_only(add_nodes(g, 1));

Suggestion:

 Node concat = g.add_node();

lib/utils/test/src/utils/graph/serial_parallel/sp_ization/benchmarking/sample_graphs.h line 289 at r13 (raw file):

  add_edges(g, edges);

  for (auto const &a : add) {

Suggestion:

  for (Node const &a : add) {

lib/utils/test/src/utils/graph/serial_parallel/sp_ization/benchmarking/sample_graphs.h line 306 at r13 (raw file):

    }
  }
  auto sinks = get_sinks(g);

Prefer actual type names, i.e., std::unordered_set<Node>

Code quote:

  auto sinks = get_sinks(g);

lib/utils/test/src/utils/graph/serial_parallel/sp_ization/benchmarking/sp_ization_benchmarking.cc line 1 at r13 (raw file):

// #include "distributions.h"

What's happening with all of this commented-out code?

Marsella8

Reviewable status: 16 of 68 files reviewed, 58 unresolved discussions (waiting on @lockshaw)

lib/utils/include/utils/containers/invert_map.h line 19 at r11 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Unless there's a need for the extra polymorphism it's probably simpler to just have it defined for unordered_map

changed

lib/utils/include/utils/graph/digraph/algorithms/get_longest_path_lengths_from_root.h line 9 at r11 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Maybe add a quick doxygen comment to explain what these functions do? Up to you, only do it if you can come up with a good prose explanation/docstring

done

lib/utils/include/utils/graph/serial_parallel/graph_generation.h line 19 at r11 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Why were these added?

Needed them for building the sample graphs for the benchmarking

lib/utils/include/utils/graph/serial_parallel/serial_parallel_metrics.h line 11 at r11 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Might be more clear if named get_node_num_occurences_map, or get_node_frequency_map?

changed to get_node_frequency_map.

lib/utils/include/utils/graph/serial_parallel/serial_parallel_metrics.h line 13 at r11 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Move to serial_parallel_decomposition.h

Done. for get_nodes is it relevant that duplicate nodes are not considered? I use duplicate nodes fairly extensively within Sp-ization to track which nodes are duplicate of each other

lib/utils/include/utils/graph/serial_parallel/serial_parallel_splits.h line 15 at r11 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I'm not a big fan of default-constructibility, why should this have a default constructor?

Makes it less annoying to deal with unordered_maps (see https://stackoverflow.com/questions/695645/why-does-the-c-map-type-argument-require-an-empty-constructor-when-using?noredirect=1&lq=1), prefer the [] syntax over things like emplace and find. Semantically, I'd say for SerialSplit and ParallelSplit it makes sense.

lib/utils/include/utils/graph/serial_parallel/sp_ization/critical_path_preserving_sp_ization.h line 10 at r11 (raw file):

Previously, lockshaw (Colin Unger) wrote…

A couple brief doxygen comments would be rather nice here if you can come up with some

I've added probably too much information and not enough at the same time, let me know how deep I should go into explaining it (general principle behind it, just what it outputs, how it differs from the others, do I need to explain what SPization is etc...)

lib/utils/src/utils/graph/digraph/algorithms/get_longest_path_lengths_from_root.cc line 53 at r11 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Probably better to just merge the checked and unchecked versions here, unless you have a strong counterargument

done.

lib/utils/src/utils/graph/digraph/algorithms/is_acyclic.cc line 33 at r13 (raw file):

Previously, lockshaw (Colin Unger) wrote…

This should either be simplified or additional explanation added. Why are there three states? What invariants are expected to hold? Collapsing dfs_is_acyclic down to a lambda could also help. Also a better/clearer name for dfs_is_acylic would likely help.

cleaned up and added some comments / correction. I can also change the implementation to be simpler but with quadratic runtime (current one is something like O(E+V) )

lib/utils/src/utils/graph/serial_parallel/serial_parallel_metrics.cc line 14 at r13 (raw file):

Previously, lockshaw (Colin Unger) wrote…

This function needs simplifying

broken up with overloads

lib/utils/src/utils/graph/serial_parallel/serial_parallel_metrics.cc line 67 at r13 (raw file):

Previously, lockshaw (Colin Unger) wrote…

overload and visit would probably clean up this code

cleanup by overloading, let me know if you want me to do it with directly withoverload.

lib/utils/src/utils/graph/serial_parallel/serial_parallel_metrics.cc line 81 at r13 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Avoid the widen calls, preferably add critical_path_cost overloads for ParallelSplit and SerialSplit

not sure I can avoid widen without making the code far more verbose (messy because we are pulling fomr the children, which are variants of Node and Parallel/SerialSplit

lib/utils/src/utils/graph/serial_parallel/sp_ization/critical_path_preserving_sp_ization.cc line at r13 (raw file):

Previously, lockshaw (Colin Unger) wrote…

A bunch of these functions could use some simplifying, in addition to ideally some docstrings as their names are rather technical (e.g., critical_path_preserving_sp_ization_unchecked_with_coalescing is a bit hard to understand unless you're Pietro 😛)

added docstrings / comments, changed variable names and simplified the code a bit, let me know if it need further changes.

lib/utils/src/utils/graph/serial_parallel/sp_ization/work_preserving_sp_ization.cc line 33 at r13 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Try not to overload layer--maybe tier or stratum or something derived from the timeline analogy?

I'll use stratum from now on to refer to all the work preserving syncing stuff (prefer it over tier, which gives me a hierarchical impression, and makes sense since we are stratifying things).

lib/utils/src/utils/graph/serial_parallel/sp_ization/work_preserving_sp_ization.cc line 33 at r13 (raw file):

Previously, lockshaw (Colin Unger) wrote…

get_strata_assuming_unit_cost?

Done.

lib/utils/src/utils/graph/serial_parallel/sp_ization/work_preserving_sp_ization.cc line 43 at r13 (raw file):

  }
  return layer_to_nodes;
}

done (not sure what I was thinking)

lib/utils/src/utils/graph/serial_parallel/sp_ization/work_preserving_sp_ization.cc line 46 at r13 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Usually we use a_to_b for maps, it's a bit weird to have it for a vector?

changed

lib/utils/src/utils/graph/serial_parallel/sp_ization/work_preserving_sp_ization.cc line 55 at r13 (raw file):

    sp.children.push_back(layer);
  }
  return normalize(SerialParallelDecomposition(sp));

Done (added extra initializers to serial_parallel_splits.cc to avoid ugly casting patterns)

lib/utils/src/utils/graph/serial_parallel/sp_ization/work_preserving_sp_ization.cc line at r13 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Try to simplify these functions--right now they're unnecessarily hard to read.

Also, while I appreciate the effort to use functions from utils/containers, they're not always the most readable version of the code--often they are, but this does need to be determined on a case-by-case basis

simplified / made things clearer, should be more understandable

lib/utils/test/src/test_algorithms.cc line 248 at r13 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Move these tests of get_longest_path_lengths_from_root to lib/utils/test/src/utils/graph/digraph/algorithms/get_longest_path_lengths_from_root.cc

Done.

lib/utils/test/src/utils/graph/digraph/algorithms/is_acyclic.cc line 26 at r11 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Apply to all subcases below

done. putCHECK(is_acyclic(g)) instead of result = ...; expected=... felt unnecessarily verbose

lib/utils/test/src/utils/graph/serial_parallel/serial_parallel_splits.cc line 12 at r13 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Prefer bounds-checked accesses

Done.

lib/utils/test/src/utils/graph/serial_parallel/serial_parallel_splits.cc line 21 at r13 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Rather than using a bunch of complicated test cases each time, try to have small/minimal test cases for each of the properties, and then if you want add a big end-to-end test case at the end. For example, it might be better to have a test case checking that ParallelSplit{n.at(0), n.at(1)} == ParallelSplit{n.at(1), n.at(0)} instead all of these big splits where it's difficult to tell what's getting tested

changed the test cases (they were overly complicated and unnecessary)

lib/utils/test/src/utils/graph/serial_parallel/serial_parallel_splits.cc line 53 at r13 (raw file):

Previously, lockshaw (Colin Unger) wrote…

How are these nested test cases any different than the test case above?

Done.

lib/utils/test/src/utils/graph/serial_parallel/sp_ization/critical_path_preserving_sp_ization.cc line 20 at r13 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Prefer bounds-checked accessors

Done.

lib/utils/test/src/utils/graph/serial_parallel/sp_ization/critical_path_preserving_sp_ization.cc line 33 at r13 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Why are there two work_cost checks?

I check work_cost, critical_path_cost both for the original graph (to show what the metrics are for the graph) and for the sp decomposition. For the work invariant algorithms, the work_cost remains the same, while the critical_path_cost (generally) increases. Viceversa for the critical path cost invariant algorithm.

lib/utils/test/src/utils/graph/serial_parallel/sp_ization/critical_path_preserving_sp_ization.cc line 92 at r13 (raw file):

      }
      SUBCASE("work cost") {
        float expected = 16;

Done.

lib/utils/include/utils/containers.h line 26 at r11 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Why?

now removed

lib/utils/include/utils/containers/as_unordered_set.h line 9 at r11 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Use unordered_set_of (yes the naming is a bit irregular, but let's not try to fix that in this PR)

removed as_unordered_set

lib/utils/include/utils/graph/digraph/algorithms/get_topological_ordering.h line 11 at r11 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Move to its own header file

Done.

lib/utils/include/utils/graph/serial_parallel/parallel_composition.h line 8 at r11 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Probably can just be part of serial_parallel_decomposition.h as it feels more like a primitive operation

Done.

lib/utils/include/utils/graph/serial_parallel/serial_composition.h line 8 at r11 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Probably should just be part of serial_parallel_decomposition.h as this feels like a rather primitive operation

Done.

lib/utils/include/utils/graph/serial_parallel/serial_parallel_decomposition.variant.toml line 8 at r11 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Use explicit_constructors unless you have a strong reason--it's a bit more verbose but the code is usually a lot clearer and less bug prone

removed

lib/utils/include/utils/graph/serial_parallel/serial_parallel_isempty.h line 8 at r11 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Should probably be moved to serial_parallel_decomposition.h, this feels like a rather primitive operation

Done.

lib/utils/src/utils/graph/serial_parallel/parallel_composition.cc line 8 at r13 (raw file):

Previously, lockshaw (Colin Unger) wrote…

What if sp_compositions is empty?

Returns an empty ParallelSplit, wrapped around aSerialParallelDecomposition (can make it more explicit if you want to).

lib/utils/src/utils/graph/serial_parallel/parallel_composition.cc line 14 at r13 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Might be cleaner with overload?

cleaned it up, not sure overloading would make it cleaner.

lib/utils/src/utils/graph/serial_parallel/serial_composition.cc line 8 at r13 (raw file):

Previously, lockshaw (Colin Unger) wrote…

What if sp_compositions is empty?

Returns an empty SerialSplit, wrapped around aSerialParallelDecomposition (can make it more explicit if you want to).

lib/utils/src/utils/graph/serial_parallel/serial_composition.cc line 14 at r13 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Might be cleaner with overload

cleaned it up, not sure overloading would make it cleaner.

lib/utils/src/utils/graph/serial_parallel/serial_parallel_isempty.cc line 9 at r13 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Likely simpler to just add overloads for SerialSplit, ParallelSplit, etc. and then you can also avoid all the widening

Also probably cleaner with overload

Changed, now cleaner (couldn't avoid the widening since the children are variant<Node, Serial/Parallel>).

lib/utils/src/utils/graph/serial_parallel/serial_parallel_normalize.cc line 27 at r13 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Why is this function so complicated? Is there not a cleaner way of doing this? Is it due to the alternating Serial-Parallel invariant? At the very least it seems like you could transform sp back into a binary decomposition tree and then use get_final_ast to re-flatten it which I think should handle normalization?

I've simplified it a fair bit, let me know if this is sufficient.

template <typename SplitType>
SerialParallelDecomposition normalize_sp_decomposition(SplitType const &split) {
  auto normalized_children = transform(filter_empty(split.children), [](auto const &child) {
    return normalize_sp_decomposition(widen<SerialParallelDecomposition>(child));
  });

  if (normalized_children.size() == 1) {
    return get_only(normalized_children);
  }

  if (std::is_same_v<SplitType, SerialSplit>) {
    return serial_composition(as_vector(normalized_children));
  } else if (std::is_same_v<SplitType, ParallelSplit>) {
    return parallel_composition(unordered_set_of(normalized_children));
  }
}

We can also add this in place of overloads for SerialSplit, ParallelSplit, though it's a bit ugly imo.

lib/utils/test/CMakeLists.txt line 6 at r13 (raw file):

  SRC_PATTERNS
    src/utils/*.cc
    # src/*.cc

What do you wanna do with these tests? Low priority, I can make a PR down the line to sort them out.

lib/utils/test/src/utils/graph/serial_parallel/serial_parallel_isempty.cc line 8 at r11 (raw file):

TEST_SUITE(FF_TEST_SUITE) {
  TEST_CASE("isempty function") {

Done.

lib/utils/test/src/utils/graph/serial_parallel/serial_parallel_normalize.cc line 8 at r11 (raw file):

TEST_SUITE(FF_TEST_SUITE) {
  TEST_CASE("normalize function") {

Done.

lib/utils/test/src/utils/graph/serial_parallel/serial_parallel_normalize.cc line 16 at r11 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Apply this pattern to all of the SUBCASEs below as well

Done.

lib/utils/test/src/utils/graph/serial_parallel/sp_ization/benchmarking/sp_ization_benchmarking.cc line 1 at r13 (raw file):

Previously, lockshaw (Colin Unger) wrote…

What's happening with all of this commented-out code?

moved the benchmarking stuff to bin/.

lib/utils/include/utils/graph/serial_parallel/serial_parallel_normalize.h line 13 at r11 (raw file):

 * @details This function performs the following semantic substitutions:
 * - Deletes every empty SerialSplit and ParallelSplit item:
 *   S(P(S()), Node(1), Node(2)) -> S(Node(1), Node(2))

Done.

lib/utils/include/utils/graph/serial_parallel/serial_parallel_normalize.h line 16 at r11 (raw file):

 *
 * - Replaces SerialSplit and ParallelSplit of size 1 with their content:
 *   S(S(Node(1)), P(Node(2))) -> S(Node(1), Node(2))

Done.

lib/utils/include/utils/graph/serial_parallel/serial_parallel_normalize.h line 19 at r11 (raw file):

 *
 */
SerialParallelDecomposition normalize(SerialParallelDecomposition const &sp);

Done.

lib/utils/include/utils/graph/serial_parallel/serial_parallel_normalize.h line at r11 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Rename to normalize_sp_decomposition.h

Done.

lib/utils/test/src/utils/graph/serial_parallel/sp_ization/benchmarking/distributions.h line 4 at r13 (raw file):

#define _FLEXFLOW_DISTRIBUTIONS_H

#include "utils/graph/node/graph.h"

Done.

lib/utils/test/src/utils/graph/serial_parallel/sp_ization/benchmarking/distributions.h line 29 at r13 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Move implementations to .cc file

Done.

lib/utils/test/src/utils/graph/serial_parallel/sp_ization/benchmarking/sample_graphs.h line 29 at r13 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Prefer bounds-checked variants

Done.

lib/utils/test/src/utils/graph/serial_parallel/sp_ization/benchmarking/sample_graphs.h line 243 at r13 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Prefer bracket initialization for consistency

any reason to prefer {} over () other than avoiding narrowing?

lib/utils/test/src/utils/graph/serial_parallel/sp_ization/benchmarking/sample_graphs.h line 264 at r13 (raw file):

  // From the TASO paper, pg 57
  DiGraph g = DiGraph::create<AdjacencyDiGraph>();
  Node root = get_only(add_nodes(g, 1));

Done.

lib/utils/test/src/utils/graph/serial_parallel/sp_ization/benchmarking/sample_graphs.h line 270 at r13 (raw file):

  std::vector<Node> avg = add_nodes(g, 3);
  std::vector<Node> add = add_nodes(g, 5);
  Node concat = get_only(add_nodes(g, 1));

Done.

lib/utils/test/src/utils/graph/serial_parallel/sp_ization/benchmarking/sample_graphs.h line 289 at r13 (raw file):

  add_edges(g, edges);

  for (auto const &a : add) {

Done.

lib/utils/test/src/utils/graph/serial_parallel/sp_ization/benchmarking/sample_graphs.h line 306 at r13 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Prefer actual type names, i.e., std::unordered_set<Node>

Done.

…m SerialParallelDecomposition

lockshaw and others added 14 commits June 3, 2024 21:04

Start on pcg builder

510c2d9

Add tests and some implementation for pcg builder

7b55ed1

Add pcg tests, make dtgen constructors explicit to fix bug

c379efd

Add remainder of PCG tests

35fa653

Merge remote-tracking branch 'origin/repo-refactor' into pcg-builder

865a28e

Fix build issues in local-execution

f379539

Format

2dbb3b9

Address Reyna comments, add topological_order function for PCG

4050c99

Pre multidigraph refactor

42c1968

Removing visitable from sp code

3be816f

Add open dataflow graph, start to replace pcg dataflow graph

6d68324

Start refactoring substitutions

64a3403

Add utility functions to support pattern matching

7d4c7be

Spization Algorithm

3279be0

Marsella8 requested a review from lockshaw June 25, 2024 22:30

Small fix

b0b5a8d

Merge branch 'repo-refactor' into spization

f0ca00a

lockshaw requested changes Jun 26, 2024

View reviewed changes

lockshaw added the repo-refactor Topics related to the repo and search refactors label Jun 26, 2024

lockshaw and others added 8 commits June 26, 2024 13:34

Pre-refactor inputs

9ab9eb2

Merge remote-tracking branch 'origin/repo-refactor' into dataflow-graph

7ae7c65

Fix proj url

f9b129e

implementation + tests for get_longest_path_lengths

c8d5177

minor fix

18752c8

Simple sp-ization refactor

c53a9d2

Naive Sp-ization algo refactor

0b27de9

Minor mix

3873d65

Marsella8 commented Jun 27, 2024

View reviewed changes

lockshaw and others added 8 commits July 24, 2024 17:05

Fix local-execution build

3791e86

Add implementations and tests for applying series/parallel reductions

267b72d

Additional SP-BenchMarking

71108b3

Formatting

f6a371a

Refactoring

0d8720d

Format

bb2769a

Clean up sp decomposition interface and tests

39cb7b3

Format

ce0234d

lockshaw requested changes Jul 27, 2024

View reviewed changes

lockshaw and others added 4 commits July 30, 2024 23:08

Add comments for top-level substitutions functions, add proj doxygen …

3dc3ec6

…support

Fix proj invocation in CI

f4dfd66

merge with dataflow-graph

e83651c

Merge with repo-refactor

4ce79c2

Marsella8 marked this pull request as draft August 2, 2024 22:25

Pietro Max Marsella added 3 commits August 3, 2024 18:50

Sp-ization reorganization

6552c47

Merge remote-tracking branch 'flexflow/repo-refactor' into spization

02e82b6

Refactoring + bringing up-to-date with DataFlow-graph interface

8d69da4

Marsella8 marked this pull request as ready for review August 6, 2024 05:24

Marsella8 commented Aug 6, 2024

View reviewed changes

lockshaw and others added 4 commits August 5, 2024 22:26

Merge branch 'repo-refactor' into spization

f77cdbb

minor fix

3222976

Merge branch 'spization' of github.com:Marsella8/FlexFlow into spization

e9fea70

Additional is_acyclic testing

06706e9

lockshaw requested changes Aug 6, 2024

View reviewed changes

Fixes

8782d4f

Marsella8 commented Aug 9, 2024

View reviewed changes

Pietro Max Marsella added 3 commits August 14, 2024 13:58

Additional SP-ization metric + Transitive Reduction Fix + DiGraph fro…

c2b4870

…m SerialParallelDecomposition

Benchmarking updates

41cff89

formatting

8ba896a

lockshaw marked this pull request as draft October 4, 2024 22:47

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Sp-ization Algorithm #1423

Sp-ization Algorithm #1423

Marsella8 commented Jun 25, 2024 •

edited by wmdi

Loading

codecov bot commented Jun 25, 2024 •

edited

Loading

lockshaw left a comment

Marsella8 left a comment

Marsella8 left a comment

lockshaw left a comment

Marsella8 left a comment

lockshaw left a comment

Marsella8 left a comment

Sp-ization Algorithm #1423

Are you sure you want to change the base?

Sp-ization Algorithm #1423

Conversation

Marsella8 commented Jun 25, 2024 • edited by wmdi Loading

codecov bot commented Jun 25, 2024 • edited Loading

Codecov Report

lockshaw left a comment

Choose a reason for hiding this comment

Marsella8 left a comment

Choose a reason for hiding this comment

Marsella8 left a comment

Choose a reason for hiding this comment

lockshaw left a comment

Choose a reason for hiding this comment

Marsella8 left a comment

Choose a reason for hiding this comment

lockshaw left a comment

Choose a reason for hiding this comment

Marsella8 left a comment

Choose a reason for hiding this comment

Marsella8 commented Jun 25, 2024 •

edited by wmdi

Loading

codecov bot commented Jun 25, 2024 •

edited

Loading