Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Performance Bug Report] Specific input cause time differences in CallGraph.java #489

Closed
SomeoneAlice opened this issue Aug 15, 2021 · 4 comments
Milestone

Comments

@SomeoneAlice
Copy link

We are working on the Algorithmic Complexity Denial-of-Service problem and detected a performance bug from your code.

We didn’t create a pull request because we're not sure whether this bug can be triggered from the user interface. We also do not understand the functionality of this code snippet as you do. Thanks for your understanding.

Outcome

In CallGraph.java the method findCycles would have a time difference under specific condition.

Reasons

The findCycles function would take longer time in specific condition. When finding the cycles, the function would traverse the nodes again if any change is made. It results that the runtime is not just related to the input size, but also related to the input graph structure. For an acyclic graph, the algorithm complexity would be O(n), while for a chain graph of the same size, the algorithm complexity can increase to O(n^2). With massive input, the time consumption differences can be considerable.

Repeatability

A simplified test case with two inputs is provided here.

import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.regex.Pattern;

public final class CallGraph {
  private static final Pattern MID_SPLIT_PTN = Pattern.compile("::");
  private final Set<Node> nodes = new HashSet<>();
  private final Set<Node> startingNodes = new HashSet<>();

  public static String methodId(String name, String desc) {
    return name + "::" + desc;
  }

  public static String[] method(String methodId) {
    if (methodId.contains("::")) {
      return MID_SPLIT_PTN.split(methodId);
    }
    return new String[0];
  }

  public void addEdge(String fromId, String toId) {
    Node fromNode = null;
    Node toNode = null;
    for (Node n : nodes) {
      if (n.id.equals(fromId)) {
        fromNode = n;
      }
      if (n.id.equals(toId)) {
        toNode = n;
      }
      if (fromNode != null && toNode != null) break;
    }
    if (fromNode == null) {
      fromNode = new Node(fromId);
      nodes.add(fromNode);
    }
    if (toNode == null) {
      toNode = new Node(toId);
      nodes.add(toNode);
    }
    Edge e = new Edge(fromNode, toNode);
    fromNode.addOutgoing(e);
    toNode.addIncoming(e);
  }

  private Set<Node> findCycles() {
    if (nodes.size() < 2) return Collections.emptySet();

    Map<String, Node> checkingNodes = new HashMap<>();
    for (Node n : nodes) {
      Node newN = checkingNodes.get(n.id);
      if (newN == null) {
        newN = new Node(n.id);
        checkingNodes.put(n.id, newN);
      }
      for (Edge e : n.incoming) {
        Node fromN = checkingNodes.get(e.from.id);
        if (fromN == null) {
          fromN = new Node(e.from.id);
          checkingNodes.put(e.from.id, fromN);
        }
        Edge ee = new Edge(fromN, newN);
        newN.addIncoming(ee);
        fromN.addOutgoing(ee);
      }
      for (Edge e : n.outgoing) {
        Node toN = checkingNodes.get(e.to.id);
        if (toN == null) {
          toN = new Node(e.to.id);
          checkingNodes.put(e.to.id, toN);
        }
        Edge ee = new Edge(newN, toN);
        newN.addOutgoing(ee);
        toN.addIncoming(ee);
      }
    }

    boolean changesMade = false;
    Set<Node> sortedNodes = new HashSet<>(checkingNodes.values());
    do {
      changesMade = false;

      Iterator<Node> iter = sortedNodes.iterator();
      while (iter.hasNext()) {
        Node n = iter.next();
        if ((n.incoming.isEmpty() && !startingNodes.contains(n)) || n.outgoing.isEmpty()) {
          changesMade = true;
          for (Edge e : new HashSet<>(n.incoming)) {
            e.delete();
          }
          for (Edge e : new HashSet<>(n.outgoing)) {
            e.delete();
          }
          iter.remove();
        }
      }
    } while (changesMade);
    return sortedNodes;
  }

  public static class Node {
    private final String id;
    private final Set<Edge> incoming = new HashSet<>();
    private final Set<Edge> outgoing = new HashSet<>();

    public Node(String id) {
      this.id = id;
    }

    public void addIncoming(Edge e) {
      incoming.add(e);
    }

    public void addOutgoing(Edge e) {
      outgoing.add(e);
    }

    public void removeIncoming(Edge e) {
      incoming.remove(e);
    }

    public void removeOutgoing(Edge e) {
      outgoing.remove(e);
    }

    @Override
    public boolean equals(Object obj) {
      if (obj == null) {
        return false;
      }
      if (getClass() != obj.getClass()) {
        return false;
      }
      Node other = (Node) obj;
      return Objects.equals(id, other.id);
    }

    @Override
    public int hashCode() {
      int hash = 7;
      hash = 11 * hash + (id != null ? id.hashCode() : 0);
      return hash;
    }

    @Override
    public String toString() {
      StringBuilder sb = new StringBuilder();
      sb.append("Node{id='").append(id).append("'}");
      sb.append("\n");
      sb.append("incomming:\n");
      sb.append("=============================\n");
      for (Edge e : incoming) {
        sb.append(e.from.id).append("\n");
      }
      sb.append("=============================\n");
      sb.append("outgoing:\n");
      for (Edge e : outgoing) {
        sb.append(e.to.id).append("\n");
      }
      sb.append("=============================\n");

      return sb.toString();
    }
  }

  public static class Edge {
    private final Node from;
    private final Node to;

    public Edge(Node from, Node to) {
      this.from = from;
      this.to = to;
    }

    public void delete() {
      from.removeOutgoing(this);
      to.removeIncoming(this);
    }

    @Override
    @SuppressWarnings("ReferenceEquality")
    public boolean equals(Object obj) {
      if (obj == null) {
        return false;
      }
      if (getClass() != obj.getClass()) {
        return false;
      }
      Edge other = (Edge) obj;
      if (!Objects.equals(from, other.from)) {
        return false;
      }
      return Objects.equals(to, other.to);
    }

    @Override
    public int hashCode() {
      int hash = 5;
      hash = 37 * hash + (from != null ? from.hashCode() : 0);
      hash = 37 * hash + (to != null ? to.hashCode() : 0);
      return hash;
    }
  }
  
  public static void main(String[] args) {
	  CallGraph graph1 = new CallGraph();
	  graph1.addEdge("1", "2");
	  graph1.addEdge("2", "3");
	  graph1.addEdge("3", "4");
	  graph1.addEdge("4", "5");
	  graph1.addEdge("5", "6");
	  graph1.addEdge("6", "7");
	  graph1.addEdge("7", "8");
	  graph1.addEdge("8", "9");
	  graph1.addEdge("9", "10");
	  graph1.addEdge("10", "11");
	  graph1.addEdge("11", "12");
	  graph1.addEdge("12", "1");
	  graph1.findCycles();

	  CallGraph graph2 = new CallGraph();
	  graph2.addEdge("1", "2");
	  graph2.addEdge("2", "3");
	  graph2.addEdge("3", "4");
	  graph2.addEdge("4", "5");
	  graph2.addEdge("5", "6");
	  graph2.addEdge("6", "7");
	  graph2.addEdge("7", "8");
	  graph2.addEdge("8", "9");
	  graph2.addEdge("9", "10");
	  graph2.addEdge("10", "11");
	  graph2.addEdge("11", "12");
	  graph2.findCycles();
  }
}
@jbachorik
Copy link
Collaborator

Hi @SomeoneAlice, thanks for the report!

Do you have a recommendation for a better performing algorithm?

Currently, there is very little chance of getting 'massive' input as the cycle detection is isolated to on BTrace script so the input will be limited by the number of methods one usually puts into one such script.

@jbachorik
Copy link
Collaborator

Hi @SomeoneAlice - do you want to proceed here? If not I will close this issue within this week.

@SomeoneAlice
Copy link
Author

Hi, the worst-case condition can be easily avoided by checking the node on the other side while deleting an edge, the idea is similar to topological sort.

Specifically, the below code can be applyed to replace the original line 191-210.

// first identify nodes to remove
Set<Node> q;
Iterator<Node> iter = sortedNodes.iterator();
while (iter.hasNext()) {
    Node n = iter.next();
    if ((n.incoming.isEmpty() && !startingNodes.contains(n)) || n.outgoing.isEmpty()) {
        q.add(n);
    }
}

// then iteratively update q
while (!q.isEmpty()) {
    Node node = q.iterator().next();
    q.remove(node);
    sortedNodes.remove(node);
    for (Edge e : new HashSet<>(node.incoming)) {
        e.delete();
        if (e.from.incoming.isEmpty() && !startingNodes.contains(e.from)) {
            queue.add(e.from);
        }
    }
    for (Edge e : new HashSet<>(node.outgoing)) {
        e.delete();
        if (e.to.outgoing.isEmpty()) {
            queue.add(e.to);
        }
    }
}

@jbachorik
Copy link
Collaborator

Hi @SomeoneAlice , thanks for the contribution.
Your proposal has been integrated.

@jbachorik jbachorik added this to the 2.2.1 milestone Nov 8, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants