Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure discovered descriptor graphs are acyclic #1451

Merged
merged 1 commit into from
Jun 8, 2018

Conversation

sormuras
Copy link
Member

@sormuras sormuras commented Jun 5, 2018

Overview

Prior this commit only a null-check was performed by the DefaultLauncher against the root test descriptor returned by an engine. Other properties where not checked, especially if the graph of that root descriptor contained cycles it could lead to stack overflow errors when traversing
the graph for execution.

This commit addresses this issue by explicitly checking the returned graph of the root test descriptor for cycles. If a cycle is detected a PreconditionViolationException is thrown.

Fixes: #1447

@codecov
Copy link

codecov bot commented Jun 5, 2018

Codecov Report

Merging #1451 into master will decrease coverage by <.01%.
The diff coverage is 90.9%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master    #1451      +/-   ##
============================================
- Coverage     91.77%   91.77%   -0.01%     
- Complexity     3202     3208       +6     
============================================
  Files           295      296       +1     
  Lines          7710     7729      +19     
  Branches        654      658       +4     
============================================
+ Hits           7076     7093      +17     
- Misses          472      475       +3     
+ Partials        162      161       -1
Impacted Files Coverage Δ Complexity Δ
.../junit/platform/launcher/core/DefaultLauncher.java 94.73% <100%> (-0.07%) 25 <0> (-1)
.../launcher/core/EngineDiscoveryResultValidator.java 90% <90%> (ø) 8 <8> (?)
...jupiter/engine/execution/ExtensionValuesStore.java 88.57% <0%> (-1.43%) 21% <0%> (-1%)
...java/org/junit/platform/engine/TestDescriptor.java 100% <0%> (+4%) 17% <0%> (ø) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2b4ed35...9f662e5. Read the comment docs.

Copy link
Contributor

@jbduncan jbduncan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking pretty good to me! I love that you caught out that cycles may happen, so well done on that!

Just three minor comments which I think should be addressed, then as an external observer I think this PR would be good to go. See below. :)

static boolean isAcyclic(TestDescriptor root) {
Set<UniqueId> visited = new HashSet<>();
visited.add(root.getUniqueId());
Stack<TestDescriptor> stack = new Stack<>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nit: Could be Deque<TestDescriptor> stack = new ArrayDeque<> to increase performance slightly. :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any link to a proof? Are we talking 10%, 1% or 0.001% here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any link to a proof?

Yep! In the Java 8 API docs for ArrayDeque, the class javadocs say,

This class is likely to be faster than Stack when used as a stack, and faster than LinkedList when used as a queue.

Are we talking 10%, 1% or 0.001% here?

I don't know, actually! And I don't actually know how to use JMH or another benchmarking tool to show how much... My bad. :P

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hehe, okay. Taking the likely as solid hint. (-:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you were discussing Stack vs Array, let me add my one cent. Stack is actually a subclass of Vector which is synchronized and in essence is part of the 'legacy' classes preexisting the java 2 introduced collections framework. In Vector's javadoc we can read:

As of the Java 2 platform v1.2, this class was retrofitted to implement the List interface, making it a member of the Java Collections Framework. Unlike the new collection implementations, Vector is synchronized. If a thread-safe implementation is not needed, it is recommended to use ArrayList in place of Vector.

Also in Stack's javadoc we can see another hint that Stack is legacy, reading:

A more complete and consistent set of LIFO stack operations is provided by the Deque interface and its implementations, which should be used in preference to this class.

On the other hand when we want a stack it is natural to use Stack... why would someone want to use a Deque..? It makes no sense... 😳

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gaganis I know, it's very weird and the complete opposite of intuitive. This is one of the many drawbacks of Java's policy on maintaining backwards compatibility as much as possible.

I don't know why they originally implemented Stack by extending Vector all those years ago, or why they made it a concrete class instead of an interface which would have allowed alternative stack implementations. However, as I think you've noticed, due to those historical decisions and how Stack consequently uses synchronized internally, Stack does perform not as well as ArrayDeque as a stack in single-thread situations.

As for why we're only provided with a Deque interface and not also a super-interface that provides just push() and pop(), I've no idea as to why it was done that way. Seems a bit silly to me, personally.

...Not sure what else to say. :)

@@ -69,6 +70,27 @@
return testEngines;
}

/**
* @return {@code true} if the graph does _not_ contain a cycle; else {@code false}.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if the _not_ should be replaced with <em>not</em>?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point.

}
}
return true;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sormuras I'm not clear on what algorithm isAcyclic() uses to check for cycles. Can you place a comment towards the top of the body so that it's clear, so something like the following?

static boolean isAcyclic() {
    // <name-of-algorithm-here>
    . . .
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, never invented a name for some lines of code before. stack-based-traversing-with-bread-crumb-checks?

Copy link
Contributor

@jbduncan jbduncan Jun 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha!

I was hoping that your algorithm had an official name like Kahn's algorithm for topological sorting or Tarjan's algorithm for finding strongly-connected components, as I understand from these StackOverflow answers that those algorithms can be used to find cycles in graphs.

If alternatively your algorithm is home-grown and it was inspired by an existing graph traversal algorithm like depth-first search or breadth-first search, then it would help clarify things (for me, at least) if that was noted down somewhere, along with what change(s) were made to the traversal algorithm to allow cycles to be detected.

But maybe I'm asking for way too much here, in which case I'd be happy to slowly read through the code in my own time later today to figure out what it's doing. :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, we don't actually have a directed graph. Rather, we have a tree.

So we should probably change the title of the issue and PR to avoid confusion. 😉

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And that's a breadth-first traversal of the tree of test descriptors.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, but I'd still refer to it as a tree in this context. 😇

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And that's a breadth-first traversal of the tree of test descriptors.

@sbrannen Cool, thank you for the clarification/confirmation! 😀👍

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah cool, rereading the code with the knowledge that it uses breadth-first search, I understand what it's doing now: it's using a stack and continuous calls to stack.pop().getChildren() to iterate over the entire tree in breath-first order, and it keeps track of visited nodes to ensure there are no cycles. Got it! 👍

I'd personally love it if a minor comment were placed at the top of the body like // based on breath-first search, but I won't insist. 🙂

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, if @sormuras describes it as "breath-first [sic]", I would find that totally amusing...

Take a deep breath first... and then dive deep! 🤣

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha! facepalm

This is what I get for typing things with my phone. :P

@sormuras sormuras force-pushed the issues-1447-ensure-acyclic-graphs branch from 16dedb9 to 2ba57d1 Compare June 6, 2018 03:17
Copy link
Member

@marcphilipp marcphilipp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for noticing and improving, @sormuras!

return false; // id already known: cycle detected!
}
if (child.isContainer()) {
stack.add(child);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using pop() and add() will reverse the iteration order. Shouldn't hurt, but might make debugging harder.

Copy link
Contributor

@jbduncan jbduncan Jun 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marcphillip Your saying that has prompted me to think more, and I think you're almost right; AFAICT, it will actually cause the tree to be traversed in depth-first order rather than as well as reversing the iteration order. As you said, it wouldn't hurt (except for debugging), but I think swapping the stack out for a Queue<> queue = new ArrayDeque<>() should fix things.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You‘re right, it‘s indeed depth-first. However since we‘re visiting all descriptors anyway it doesn’t really matter. 😉

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Queue. So be it.

@@ -132,6 +155,9 @@ private Root discoverRoot(LauncherDiscoveryRequest discoveryRequest, String phas
() -> String.format(
"The discover() method for TestEngine with ID '%s' must return a non-null root TestDescriptor.",
testEngine.getId()));
Preconditions.condition(isAcyclic(engineRoot),
() -> String.format("The discover() method for TestEngine with ID '%s' returned a cyclic graph.",
testEngine.getId()));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should extract these checks into a separate EngineDiscoveryResultValidator class or sth. similar.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move also the not-null-check?

Prior this commit only a null-check was performed by the DefaultLauncher
against the root test descriptor returned by an engine. Other properties
where not checked, especially if the graph of that root descriptor
contained cycles it could lead to stack overflow errors when traversing
the graph for execution.

This commit addresses this issue by explicitly checking the returned
graph of the root test descriptor for cycles. If a cycle is detected a
PreconditionViolationException is thrown.

Fixes: #1447
@sormuras sormuras force-pushed the issues-1447-ensure-acyclic-graphs branch from 2ba57d1 to 9f662e5 Compare June 7, 2018 08:19
@sormuras sormuras merged commit 27ea0d4 into master Jun 8, 2018
@sormuras sormuras deleted the issues-1447-ensure-acyclic-graphs branch June 8, 2018 10:56
@ghost ghost removed the status: in progress label Jun 8, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants