Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better traces #19

Merged
merged 2 commits into from
May 31, 2023
Merged

Better traces #19

merged 2 commits into from
May 31, 2023

Conversation

bartoszmodelski
Copy link
Contributor

@bartoszmodelski bartoszmodelski commented Apr 7, 2023

Currently, dscheck has rudimentary trace printing for failed assertions. This PR improves it on a few fronts.

Vocab

Let's start being a little more careful with words:

  • execution sequence, history, interleaving - a particular execution
  • trace - an equivalence class of equal execution sequences
  • maximal [execution sequence | trace | ..] - sequence reaching the terminal state

I used to use trace and execution sequence interchangeably before. Turns out in the dpor literature they are not and it's a helpful distinction. I'll add that into readme at some point.

Interleavings

Improve the way we print interleavings and add an option to print all maximal sequences visited. With that is's easier to understand what's actually being checked and why the tested program fails.

Process 0: start 
Process 0: fetch_and_add 1
Process 1: start 
Process 1: fetch_and_add 1
Process 1: fetch_and_add 2
----
Process 0: start 
Process 1: start 
Process 1: fetch_and_add 1
Process 0: fetch_and_add 1
Process 1: fetch_and_add 2
...

->

sequence 1
----------------------------------------
P0			P1			
----------------------------------------
start 
fetch_and_add a
			start 
			fetch_and_add a
			fetch_and_add b
----------------------------------------

sequence 2
----------------------------------------
P0			P1			
----------------------------------------
start 
			start 
			fetch_and_add a
fetch_and_add a
			fetch_and_add b
----------------------------------------

explored 2 maximal interleavings and 9 states

Traces

DPOR methods aim to enumerate exactly one execution sequence per trace. Less than that and we're skipping an interleaving that may happen in the real-world. More than that is less bad: we're exploring redundant interleavings but we can still validate programs, just slower. While DPORs are complicated, it's fairly easy to tell whether two execution sequences belong to the same trace. There's a nice formalism, which boils down to the rule, that we can swap any two adjacent independent operations in the trace. If a number of such swaps takes us from one sequence to the other, they belong to the same trace. In our case, independent operations are ones that:

  • belong to different threads (we cannot order two instructions that are program-ordered)
  • work on disjoint locations (otherwise the swap produces a new final state).

For example:
set a 1, set b 2, set b 2, set a 1 are equal
set a 1, set a 2, set a 2, set a 1 are naturally not

Therefore, we can take all execution sequences that DSCheck explores, dedupe the equivalent ones, and we have precisely the list of interesting interleavings that have been explored. These consist of pairs (proc id, variable) as that's all the info we need to dedupe. Sample output: (2,c),(1,c),(2,b),(1,b),(2,d),(0,b). With that its super easy to see:

  • Whether future changes are unsafe - if any trace disappears, something's off.
  • How inefficient we are - with execution sequences / traces ratio. E.g. the 12k execution sequences on mscott (n=4) are contained in just 16 traces. Further, we can easily track down the redundant interleavings.

Reference: Godefroid's thesis, chap 3.

Deduplication

I tried a few methods of deduplication:

Bubble sort

We know, which pairs of operations can be swapped. Let's put some ordering (e.g. lower proc first) and keep swapping until no more swaps available. This doesn't work since the order is partial and operations cannot jump over clusters, e.g.:
(1,a),(2,b),(0,b) - (2,b),(0,b),(1,a)

The strings are clearly not equal, yet we cannot do any more swaps as per our order.

Topological sort

This work well on human-created tests, which tend to have lots of operations and few variables. Such tests have a lot of relationships, which forced traces into some normal form. No longer true with my machine generated ones, which may have a lot of independent accesses, which topsort would leave as they were.

Compare partial orders directly

Every execution sequence is a total order induced by partial order (trace). E.g. the PO for (1,a),(2,b),(0,b) is ((2,b) < (0,b)). As long as sequence induces the same PO, it belong to the same trace. Instead of trying to bring execution sequences into some normal forms as before, let's just compare the partial orders directly. We'll still keep the some execution sequence for each PO since they are far easier to read for a human.

Testing

With above support for traces, we need to ensure that we can easily generate traces of all tests and compare them across different versions. This PR adds ./gen_traces.sh that dumps traces, which will then be tracked in git. I tried to integrate that with dune (as is done with rules for the new pretty printing) but it does not work too well with alcotest. Further, the tests with traces on are significantly slower, so I think it's fine to keep them separate until we find something better.

Also added two new tests with conditionals.

@bartoszmodelski bartoszmodelski requested a review from a team April 7, 2023 11:24
@bartoszmodelski bartoszmodelski mentioned this pull request Apr 7, 2023
@bartoszmodelski bartoszmodelski changed the title Better traces printing Better traces Apr 7, 2023
src/trace_tracker.ml Outdated Show resolved Hide resolved
@@ -0,0 +1,87 @@
module Op = struct
type t = { proc : int; variable : int; step : int }
let is_dependent t1 t2 = t1.proc == t2.proc || t1.variable == t2.variable
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it significantly lower the number of optimal traces to consider that two parallel get t, get t commute?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, potentially. Feels like there's many real-world program where that would be significant. It's quite a rabbit hole on its own, likewise we don't have to commute writes with the same value or failing cases.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be explored in a follow up, this PR is already useful as it is :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that sounds good. I think this part is best left as is, for it to match the implementation. Otherwise while checking new ideas it might collapse together sequences that should be explored separately under our naive dependency relation (so our algo is skipping traces, albeit immaterial ones).

As far as I understand the lit, a dpor algorithm treats the dependency relationship as complete blackbox. We should be able to easily parametrise our exploration over whatever binary relationship as long as it conforms with Godfroid 3.1. That should make testing & adding the optims like the one proposed above more straightforward.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For inspiration here, the TruST paper uses a nice trick where all operations are broken down into get and set, such that you would only need to consider how those two commutes (a successful CAS is a get;set while a failing one is just a single get)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Do you know if they are also breaking down fetch and add, incr like that or do they do something special?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I think it's the same situation, fad and exchange always succeed and translate to a get;set. To be clear, they don't spell out in the paper that they do this, but their algorithm only works if you break the Atomic operations into get and set since that's the only operations they support (the translation is implicit since it's supposed to be enforced by the chosen memory model).

@bartoszmodelski
Copy link
Contributor Author

While doing automated tests I found cases where above does not deduplicate all the equivalent traces. I'll push a fix tomorrow.

@bartoszmodelski
Copy link
Contributor Author

Fixed, added a paragraph about the approach to deduplication.

@bartoszmodelski bartoszmodelski mentioned this pull request Apr 17, 2023
@bartoszmodelski bartoszmodelski force-pushed the better-traces-output branch 2 times, most recently from 2e85720 to 0a2a617 Compare April 24, 2023 10:43
Copy link
Collaborator

@lyrm lyrm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is an essential feature and I did not see anything wrong. I put a few comments, mostly to ask clarification about how things work.

Also, does the algorithm that determines which trace is a particular execution sequence (the comparison of partial orders) is explained in a paper / thesis ?

Comment on lines 36 to 37
Hashtbl.remove steps proc;
Hashtbl.add steps proc (v + 1));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(a detail) This is the Hashtbl.replace function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed this to use replace (and simplified the surroundings).


let tag_with_deps (t : t) : Key.t =
let next_dep op t = List.find_opt (Op.is_dependent op) t in
let rec f t =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming thing, the hard part of programming :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

haha yes, renamed this to attach_deps

Comment on lines 391 to 392
Printf.fprintf chan "\nexplored %d maximal interleavings and %d states\n"
!num_interleavings !num_states);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is uncleared to me what a maximal interleavings and a state are. Could you give some additional explanations or examples ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the word maximal, since usually we refer to interleavings meaning maximal interleavings anyway (and there's no need to be extra rigorous here). As for the state it's a bit harder - the key here is that dscheck sees a multicore program as a graph, and then state is just a state. Hopefully the new readme will make that clearer.

Comment on lines 49 to 51
val trace :
?interleavings:out_channel -> ?record_traces:bool -> (unit -> unit) -> unit
(** start the simulation trace *)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need some additional documentation about the optional arguments.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added notes about the new arguments.

Comment on lines 14 to 16
if Atomic.get c = 0 then (
seen_b := Atomic.get b;
if !seen_b = 0 then Atomic.set ok true))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are these conditional tests (test_conditional1 and test_conditional2) interesting ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed to conditional_nested and conditional_ssb (also highlighted it's "borrowed" from the paper)

create trace tracker, add two tests with conditionals and dump traces
for all tests
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants