[Performance Bug Report] Specific input cause time differences in CallGraph.java #489

SomeoneAlice · 2021-08-15T15:07:32Z

We are working on the Algorithmic Complexity Denial-of-Service problem and detected a performance bug from your code.

We didn’t create a pull request because we're not sure whether this bug can be triggered from the user interface. We also do not understand the functionality of this code snippet as you do. Thanks for your understanding.

Outcome

In CallGraph.java the method findCycles would have a time difference under specific condition.

Reasons

The findCycles function would take longer time in specific condition. When finding the cycles, the function would traverse the nodes again if any change is made. It results that the runtime is not just related to the input size, but also related to the input graph structure. For an acyclic graph, the algorithm complexity would be O(n), while for a chain graph of the same size, the algorithm complexity can increase to O(n^2). With massive input, the time consumption differences can be considerable.

Repeatability

A simplified test case with two inputs is provided here.

import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.regex.Pattern;

public final class CallGraph {
  private static final Pattern MID_SPLIT_PTN = Pattern.compile("::");
  private final Set<Node> nodes = new HashSet<>();
  private final Set<Node> startingNodes = new HashSet<>();

  public static String methodId(String name, String desc) {
    return name + "::" + desc;
  }

  public static String[] method(String methodId) {
    if (methodId.contains("::")) {
      return MID_SPLIT_PTN.split(methodId);
    }
    return new String[0];
  }

  public void addEdge(String fromId, String toId) {
    Node fromNode = null;
    Node toNode = null;
    for (Node n : nodes) {
      if (n.id.equals(fromId)) {
        fromNode = n;
      }
      if (n.id.equals(toId)) {
        toNode = n;
      }
      if (fromNode != null && toNode != null) break;
    }
    if (fromNode == null) {
      fromNode = new Node(fromId);
      nodes.add(fromNode);
    }
    if (toNode == null) {
      toNode = new Node(toId);
      nodes.add(toNode);
    }
    Edge e = new Edge(fromNode, toNode);
    fromNode.addOutgoing(e);
    toNode.addIncoming(e);
  }

  private Set<Node> findCycles() {
    if (nodes.size() < 2) return Collections.emptySet();

    Map<String, Node> checkingNodes = new HashMap<>();
    for (Node n : nodes) {
      Node newN = checkingNodes.get(n.id);
      if (newN == null) {
        newN = new Node(n.id);
        checkingNodes.put(n.id, newN);
      }
      for (Edge e : n.incoming) {
        Node fromN = checkingNodes.get(e.from.id);
        if (fromN == null) {
          fromN = new Node(e.from.id);
          checkingNodes.put(e.from.id, fromN);
        }
        Edge ee = new Edge(fromN, newN);
        newN.addIncoming(ee);
        fromN.addOutgoing(ee);
      }
      for (Edge e : n.outgoing) {
        Node toN = checkingNodes.get(e.to.id);
        if (toN == null) {
          toN = new Node(e.to.id);
          checkingNodes.put(e.to.id, toN);
        }
        Edge ee = new Edge(newN, toN);
        newN.addOutgoing(ee);
        toN.addIncoming(ee);
      }
    }

    boolean changesMade = false;
    Set<Node> sortedNodes = new HashSet<>(checkingNodes.values());
    do {
      changesMade = false;

      Iterator<Node> iter = sortedNodes.iterator();
      while (iter.hasNext()) {
        Node n = iter.next();
        if ((n.incoming.isEmpty() && !startingNodes.contains(n)) || n.outgoing.isEmpty()) {
          changesMade = true;
          for (Edge e : new HashSet<>(n.incoming)) {
            e.delete();
          }
          for (Edge e : new HashSet<>(n.outgoing)) {
            e.delete();
          }
          iter.remove();
        }
      }
    } while (changesMade);
    return sortedNodes;
  }

  public static class Node {
    private final String id;
    private final Set<Edge> incoming = new HashSet<>();
    private final Set<Edge> outgoing = new HashSet<>();

    public Node(String id) {
      this.id = id;
    }

    public void addIncoming(Edge e) {
      incoming.add(e);
    }

    public void addOutgoing(Edge e) {
      outgoing.add(e);
    }

    public void removeIncoming(Edge e) {
      incoming.remove(e);
    }

    public void removeOutgoing(Edge e) {
      outgoing.remove(e);
    }

    @Override
    public boolean equals(Object obj) {
      if (obj == null) {
        return false;
      }
      if (getClass() != obj.getClass()) {
        return false;
      }
      Node other = (Node) obj;
      return Objects.equals(id, other.id);
    }

    @Override
    public int hashCode() {
      int hash = 7;
      hash = 11 * hash + (id != null ? id.hashCode() : 0);
      return hash;
    }

    @Override
    public String toString() {
      StringBuilder sb = new StringBuilder();
      sb.append("Node{id='").append(id).append("'}");
      sb.append("\n");
      sb.append("incomming:\n");
      sb.append("=============================\n");
      for (Edge e : incoming) {
        sb.append(e.from.id).append("\n");
      }
      sb.append("=============================\n");
      sb.append("outgoing:\n");
      for (Edge e : outgoing) {
        sb.append(e.to.id).append("\n");
      }
      sb.append("=============================\n");

      return sb.toString();
    }
  }

  public static class Edge {
    private final Node from;
    private final Node to;

    public Edge(Node from, Node to) {
      this.from = from;
      this.to = to;
    }

    public void delete() {
      from.removeOutgoing(this);
      to.removeIncoming(this);
    }

    @Override
    @SuppressWarnings("ReferenceEquality")
    public boolean equals(Object obj) {
      if (obj == null) {
        return false;
      }
      if (getClass() != obj.getClass()) {
        return false;
      }
      Edge other = (Edge) obj;
      if (!Objects.equals(from, other.from)) {
        return false;
      }
      return Objects.equals(to, other.to);
    }

    @Override
    public int hashCode() {
      int hash = 5;
      hash = 37 * hash + (from != null ? from.hashCode() : 0);
      hash = 37 * hash + (to != null ? to.hashCode() : 0);
      return hash;
    }
  }
  
  public static void main(String[] args) {
	  CallGraph graph1 = new CallGraph();
	  graph1.addEdge("1", "2");
	  graph1.addEdge("2", "3");
	  graph1.addEdge("3", "4");
	  graph1.addEdge("4", "5");
	  graph1.addEdge("5", "6");
	  graph1.addEdge("6", "7");
	  graph1.addEdge("7", "8");
	  graph1.addEdge("8", "9");
	  graph1.addEdge("9", "10");
	  graph1.addEdge("10", "11");
	  graph1.addEdge("11", "12");
	  graph1.addEdge("12", "1");
	  graph1.findCycles();

	  CallGraph graph2 = new CallGraph();
	  graph2.addEdge("1", "2");
	  graph2.addEdge("2", "3");
	  graph2.addEdge("3", "4");
	  graph2.addEdge("4", "5");
	  graph2.addEdge("5", "6");
	  graph2.addEdge("6", "7");
	  graph2.addEdge("7", "8");
	  graph2.addEdge("8", "9");
	  graph2.addEdge("9", "10");
	  graph2.addEdge("10", "11");
	  graph2.addEdge("11", "12");
	  graph2.findCycles();
  }
}

The text was updated successfully, but these errors were encountered:

jbachorik · 2021-10-03T17:26:42Z

Hi @SomeoneAlice, thanks for the report!

Do you have a recommendation for a better performing algorithm?

Currently, there is very little chance of getting 'massive' input as the cycle detection is isolated to on BTrace script so the input will be limited by the number of methods one usually puts into one such script.

jbachorik · 2021-10-11T18:14:35Z

Hi @SomeoneAlice - do you want to proceed here? If not I will close this issue within this week.

SomeoneAlice · 2021-10-18T16:47:28Z

Hi, the worst-case condition can be easily avoided by checking the node on the other side while deleting an edge, the idea is similar to topological sort.

Specifically, the below code can be applyed to replace the original line 191-210.

// first identify nodes to remove
Set<Node> q;
Iterator<Node> iter = sortedNodes.iterator();
while (iter.hasNext()) {
    Node n = iter.next();
    if ((n.incoming.isEmpty() && !startingNodes.contains(n)) || n.outgoing.isEmpty()) {
        q.add(n);
    }
}

// then iteratively update q
while (!q.isEmpty()) {
    Node node = q.iterator().next();
    q.remove(node);
    sortedNodes.remove(node);
    for (Edge e : new HashSet<>(node.incoming)) {
        e.delete();
        if (e.from.incoming.isEmpty() && !startingNodes.contains(e.from)) {
            queue.add(e.from);
        }
    }
    for (Edge e : new HashSet<>(node.outgoing)) {
        e.delete();
        if (e.to.outgoing.isEmpty()) {
            queue.add(e.to);
        }
    }
}

jbachorik · 2021-11-08T20:14:16Z

Hi @SomeoneAlice , thanks for the contribution.
Your proposal has been integrated.

jbachorik closed this as completed in c56607b Nov 8, 2021

jbachorik added this to the 2.2.1 milestone Nov 8, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Performance Bug Report] Specific input cause time differences in CallGraph.java #489

[Performance Bug Report] Specific input cause time differences in CallGraph.java #489

SomeoneAlice commented Aug 15, 2021

jbachorik commented Oct 3, 2021

jbachorik commented Oct 11, 2021

SomeoneAlice commented Oct 18, 2021

jbachorik commented Nov 8, 2021

[Performance Bug Report] Specific input cause time differences in CallGraph.java #489

[Performance Bug Report] Specific input cause time differences in CallGraph.java #489

Comments

SomeoneAlice commented Aug 15, 2021

Outcome

Reasons

Repeatability

jbachorik commented Oct 3, 2021

jbachorik commented Oct 11, 2021

SomeoneAlice commented Oct 18, 2021

jbachorik commented Nov 8, 2021