
Refactor Sabre to an explicitly stateful router #11977

Merged
3 commits merged into main from sabre/refactor-structs on Apr 11, 2024

Conversation

jakelishman
Member

Summary

This commit is an overdue tidy up of the Sabre code, which had been through a few growth spurts with the addition of the combined layout-and-routing pass in Rust, and the support for control-flow operations and directives. I have a few things I'd like to try with Sabre, and the code was getting quite unwieldy to modify and extend.

This refactors the Sabre routing internals, encapsulating a "routing target" into a single view object that is used to define the hardware target, and the stateful components of the routing algorithm into a formal RoutingState object. The functions that build up the routing algorithm then become stateful instance methods, avoiding needing to pass many things through several internal function calls.
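
As a rough sketch of the shape this describes (field contents and names here are assumptions for illustration, not necessarily the exact ones in the PR):

// Hypothetical outline only; the real types live under crates/accelerate/src/sabre/
// and carry more fields than shown here.
pub struct RoutingTarget {
    // Immutable view of the hardware: coupling graph, distance matrix, neighbour table, ...
}

pub struct RoutingState<'a> {
    target: &'a RoutingTarget,
    // Mutable per-trial data: layout, front layer, extended set, scratch space, RNG, ...
}

// The functions that make up the routing algorithm become `impl RoutingState` methods,
// so the per-trial data no longer has to be threaded through every internal call.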

In addition to the non-trivial lines-of-code savings, this also made it clearer to me (while doing the refactor) that routing-state methods were not all really at similar levels of abstraction, meaning that things like the escape-valve mechanism took up oversized space in the description of the main algorithm, and the control-flow block handling was not as neatly split from the rest of the logic as it could have been. This reorganises some of the methods to make the important components of the algorithms clearer; the top level of the algorithm now fits on one screen.

Lastly, this moves both layout and routing into a unified sabre module, mostly just to simplify all the use statements and to put logically grouped code in the same place.
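
Concretely, the consolidated module ends up shaped roughly like this (only crates/accelerate/src/sabre/mod.rs is named elsewhere in this PR; the submodule names below are assumptions):

// crates/accelerate/src/sabre/mod.rs (sketch; submodule names assumed)
pub mod layout; // the combined layout-and-routing pass
pub mod route;  // the routing algorithm and its state

// Callers then need a single path, e.g. `use crate::sabre::route::...;`,
// instead of separate sabre_layout / sabre_swap imports.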

Details and comments

This is a refactor, so I've deliberately kept it RNG compatible, despite it needing a now-useless extra pRNG object in the combined layout+routing to ensure we use the same seed.

@jakelishman jakelishman added the Changelog: None (Do not include in changelog), Rust (This PR or issue is related to Rust code in the repository), and mod: transpiler (Issues and PRs related to Transpiler) labels Mar 8, 2024
@jakelishman jakelishman added this to the 1.1.0 milestone Mar 8, 2024
@jakelishman jakelishman requested a review from a team as a code owner March 8, 2024 23:38
@qiskit-bot
Collaborator

One or more of the following people are requested to review this:

@coveralls

coveralls commented Mar 8, 2024

Pull Request Test Coverage Report for Build 8598851157

Details

  • 473 of 485 (97.53%) changed or added relevant lines in 11 files are covered.
  • 4 unchanged lines in 1 file lost coverage.
  • Overall coverage decreased (-0.003%) to 89.377%

Changes missing coverage (covered lines / changed or added lines / %):
  • crates/accelerate/src/sabre/mod.rs — 40 / 52 (76.92%)

Files with coverage reduction (new missed lines / %):
  • crates/qasm2/src/lex.rs — 4 (92.88%)

Totals (coverage status):
  • Change from base Build 8598608314: -0.003%
  • Covered Lines: 60163
  • Relevant Lines: 67314

💛 - Coveralls

@@ -10,7 +10,6 @@
// copyright notice, and modified files need to carry a notice indicating
// that they have been altered from the originals.

use ahash;
Member Author

Clippy started complaining at me that use ahash; is unnecessary, but I don't like the "crates are implicitly defined" behaviour, so I made the roots explicit.

Comment on lines -108 to -128
/// Populate a `Vec` of nodes that would be routable if the given swap was applied to a layout.
/// This mutates `routable` to avoid heap allocations in the main logic loop.
pub fn routable_after(
    &self,
    routable: &mut Vec<NodeIndex>,
    swap: &[PhysicalQubit; 2],
    coupling: &DiGraph<(), ()>,
) {
    let [a, b] = *swap;
    if let Some((node, c)) = self.qubits[a.index()] {
        if coupling.contains_edge(NodeIndex::new(b.index()), NodeIndex::new(c.index())) {
            routable.push(node);
        }
    }
    if let Some((node, c)) = self.qubits[b.index()] {
        if coupling.contains_edge(NodeIndex::new(a.index()), NodeIndex::new(c.index())) {
            routable.push(node);
        }
    }
}

Member Author

I realised that this is a really awkward place to put this method, since it takes in a Vec that it mutates. It's clearer to expose the underlying qubits structure, but in an immutable form - I think when I first wrote this struct, I didn't realise I could write a getter method turning the Vec into a &[], so I erred in favour of the access control and ended up with this method.
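
The getter being described is visible in the new code further down (routable_node_on_qubit calls self.front_layer.qubits()); in isolation it is just this, as a sketch with the surrounding struct simplified:

// Sketch only: imports and the rest of FrontLayer omitted.
pub struct FrontLayer {
    qubits: Vec<Option<(NodeIndex, PhysicalQubit)>>,
}

impl FrontLayer {
    /// Expose the per-qubit tracking immutably, rather than mutating a caller's Vec.
    pub fn qubits(&self) -> &[Option<(NodeIndex, PhysicalQubit)>] {
        &self.qubits
    }
}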

Comment on lines 132 to +135

// This is purely for RNG compatibility during a refactor.
let routing_seed = Pcg64Mcg::seed_from_u64(seed).next_u64();

Member Author

Further down, we call the now-public swap_map_trial function, rather than indirecting ourselves through the maybe-parallel dispatcher. This extra line ensures that we use the same seed as the parent commit, despite skipping the dispatcher.
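
That is, the maybe-parallel dispatcher would previously have drawn this trial's seed from an outer pRNG seeded by the user seed, so the single extra draw here reproduces it. Roughly (a sketch of the relationship, not the dispatcher's exact code):

use rand::{RngCore, SeedableRng};
use rand_pcg::Pcg64Mcg;

// One trial's seed, as the dispatcher would have derived it: seed the outer PRNG
// from the user-provided seed and take the first u64.
fn per_trial_seed(user_seed: u64) -> u64 {
    Pcg64Mcg::seed_from_u64(user_seed).next_u64()
}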

Member Author

None of the logic of this file has changed (it used to be sabre_swap/mod.rs), but it's been fairly heavily reorganised to put the function splits in (certainly to me) more natural places, and to encapsulate all the scratch space and tracking structs into a single State object. We were practically passing the entire state to every function already, so while they were technically pure functions before and are technically instance methods now, the pragmatics of the statefulness are the same - except our argument lists are no longer 10 elements long.
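
As a hedged illustration of that last point (argument and type names approximated from the surrounding diffs, not copied exactly):

// Before: a "pure" helper that nonetheless received most of the state explicitly.
//
//     fn choose_best_swap(
//         layout: &NLayout,
//         front_layer: &FrontLayer,
//         extended_set: &ExtendedSet,
//         neighbors: &NeighborTable,
//         dist: &ArrayView2<f64>,
//         qubits_decay: &[f64],
//         heuristic: &Heuristic,
//         rng: &mut Pcg64Mcg,
//         swap_scratch: &mut Vec<[PhysicalQubit; 2]>,
//     ) -> [PhysicalQubit; 2]
//
// After: the same data lives on the state object, so the helper shrinks to
//
//     fn choose_best_swap(&mut self) -> [PhysicalQubit; 2]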

Contributor

I really appreciate the walkthrough of the restructuring you’ve done with the Sabre module, in particular shifting from sabre_swap/mod.rs to a more consolidated state management approach within the RoutingState object. It’s evident that this cleanup streamlined the argument lists and made the organization more intuitive. I am curious about how this refactor could also help pave the way for us to more effectively experiment with and modify our heuristic strategies. How do you envision this impacting our ability to integrate and refine new heuristic algorithms?

Member Author

So, since this is just a refactor, technically there's nothing in here that makes things possible that weren't before. In practice, though, the main reason I went for this is that I had thoughts and ideas of additional heuristics to try, and differences in the tracking state I wanted to make for all sorts of things, and it was getting exceptionally tedious trying to modify the previous form with its giant argument lists and awkward function splits.

For a couple of concrete examples of stuff I've got in the works:

  • I've got a PR to follow this one that makes all the heuristic components configurable dynamically at runtime from Python space, so you can adjust the relative weights of the (basic, lookahead, decay) components, change the size of the extended set, change the decay reset interval, etc and it doesn't seem to affect runtime at all - if anything, the additional bits round the edge have a positive impact!
  • I have rough ideas for calculating the extended set as a delta on the previous set, which would remove a lot of the reasons for needing it to be fairly small in size for performance. That would need new tracking data, which in the previous form would need threading through everywhere.
  • I want to experiment with mapping the heuristic scores to some configurable probability distribution, then randomly choosing based on that distribution, rather than only picking from the swaps with the best score. That would also need new persistent allocations for the algorithm.

Contributor

Absolutely, Jake. Modifying the previous form was quite cumbersome, and I like how this change makes experimentation with the heuristic components more flexible and adjustments easier to consider. I agree that incrementally computing the extended set and experimenting with various probability distributions are good avenues to explore. While these changes will likely necessitate new tracking data, the refactor does a great job of building a cohesive structure that facilitates further experimentation.

Comment on lines +103 to +112
/// Return the node, if any, that is on this qubit and is routable with the current layout.
#[inline]
fn routable_node_on_qubit(&self, qubit: PhysicalQubit) -> Option<NodeIndex> {
    self.front_layer.qubits()[qubit.index()].and_then(|(node, other)| {
        self.target
            .coupling
            .contains_edge(NodeIndex::new(qubit.index()), NodeIndex::new(other.index()))
            .then_some(node)
    })
}
Member Author

This moved from FrontLayer, and became a function on a single qubit instead of on both of a swap. This also evaluates whether a node is routable after we applied the swap (since we were always going to do that anyway), rather than the awkward behaviour of the old FrontLayer::routable_after, which mutated an input Vec with nodes that would be routable after the given swap was made.
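
A hypothetical call site for the per-qubit form (the surrounding method and the `newly_routable` variable are assumptions, not the PR's exact code):

// After applying a swap, check each of the two physical qubits it touched for a
// gate that has just become routable. Names here are illustrative only.
self.apply_swap(swap);
for qubit in swap {
    if let Some(node) = self.routable_node_on_qubit(qubit) {
        newly_routable.push(node);
    }
}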

Comment on lines +152 to +155
let block_results = blocks
    .iter()
    .map(|block| self.route_control_flow_block(block))
    .collect::<Vec<_>>();
Member Author

Previously, the logic of route_control_flow_block was inlined here inside a for loop, with only the "swap epilogue" part encapsulated into a separate call, so it had ended up really nested, and it felt like the function calls didn't represent atomic operations.

Comment on lines +193 to +238
/// Inner worker to route a control-flow block. Since control-flow blocks are routed to
/// restore the layout at the end of themselves, and the recursive calls spawn their own
/// tracking states, this does not affect our own state.
fn route_control_flow_block(&self, block: &SabreDAG) -> BlockResult {
    let (result, mut block_final_layout) =
        swap_map_trial(self.target, block, self.heuristic, &self.layout, self.seed);
    // For now, we always append a swap circuit that gets the inner block back to the
    // parent's layout.
    let swap_epilogue = {
        // Map physical location in the final layout from the inner routing to the current
        // location in the outer routing.
        let mapping: HashMap<NodeIndex, NodeIndex> = block_final_layout
            .iter_physical()
            .map(|(p, v)| {
                (
                    NodeIndex::new(p.index()),
                    NodeIndex::new(v.to_phys(&self.layout).index()),
                )
            })
            .collect();

        let swaps = token_swapper(
            &self.target.coupling,
            mapping,
            Some(SWAP_EPILOGUE_TRIALS),
            Some(self.seed),
            None,
        )
        .unwrap();

        // Convert physical swaps to virtual swaps
        swaps
            .into_iter()
            .map(|(l, r)| {
                let p_l = PhysicalQubit::new(l.index().try_into().unwrap());
                let p_r = PhysicalQubit::new(r.index().try_into().unwrap());
                block_final_layout.swap_physical(p_l, p_r);
                [p_l, p_r]
            })
            .collect()
    };
    BlockResult {
        result,
        swap_epilogue,
    }
}
Member Author

Since this is now a separate call, I inlined gen_swap_epilogue - it's the majority of the logic already, so further abstraction wouldn't have helped much.

Comment on lines +285 to +334
/// Add swaps to the current set that greedily bring the nearest node together. This is a
/// "release valve" mechanism; it ignores all the Sabre heuristics and forces progress, so we
/// can't get permanently stuck.
fn force_enable_closest_node(
    &mut self,
    current_swaps: &mut Vec<[PhysicalQubit; 2]>,
) -> NodeIndex {
    let (&closest_node, &qubits) = {
        let dist = &self.target.distance;
        self.front_layer
            .iter()
            .min_by(|(_, qubits_a), (_, qubits_b)| {
                dist[[qubits_a[0].index(), qubits_a[1].index()]]
                    .partial_cmp(&dist[[qubits_b[0].index(), qubits_b[1].index()]])
                    .unwrap_or(Ordering::Equal)
            })
            .unwrap()
    };
    let shortest_path = {
        let mut shortest_paths: DictMap<NodeIndex, Vec<NodeIndex>> = DictMap::new();
        (dijkstra(
            &self.target.coupling,
            NodeIndex::new(qubits[0].index()),
            Some(NodeIndex::new(qubits[1].index())),
            |_| Ok(1.),
            Some(&mut shortest_paths),
        ) as PyResult<Vec<Option<f64>>>)
        .unwrap();
        shortest_paths
            .get(&NodeIndex::new(qubits[1].index()))
            .unwrap()
            .iter()
            .map(|n| PhysicalQubit::new(n.index() as u32))
            .collect::<Vec<_>>()
    };
    // Insert greedy swaps along that shortest path, splitting them between moving the left side
    // and moving the right side to minimise the depth. One side needs to move up to the split
    // point and the other can stop one short because the gate will be routable then.
    let split: usize = shortest_path.len() / 2;
    current_swaps.reserve(shortest_path.len() - 2);
    for i in 0..split {
        current_swaps.push([shortest_path[i], shortest_path[i + 1]]);
    }
    for i in 0..split - 1 {
        let end = shortest_path.len() - 1 - i;
        current_swaps.push([shortest_path[end], shortest_path[end - 1]]);
    }
    current_swaps.iter().for_each(|&swap| self.apply_swap(swap));
    closest_node
}
Member Author

Previously, this release valve was three separate bits of logic in the main logic loop - closest_operation, swaps_to_route and then the application of those swaps. Those three were all super linked, so I pulled them into one single function.

Comment on lines +349 to +369
for swap in obtain_swaps(&self.front_layer, self.target.neighbors) {
    let score = match self.heuristic {
        Heuristic::Basic => self.front_layer.score(swap, dist),
        Heuristic::Lookahead => {
            self.front_layer.score(swap, dist)
                + EXTENDED_SET_WEIGHT * self.extended_set.score(swap, dist)
        }
        Heuristic::Decay => {
            self.qubits_decay[swap[0].index()].max(self.qubits_decay[swap[1].index()])
                * (absolute_score
                    + self.front_layer.score(swap, dist)
                    + EXTENDED_SET_WEIGHT * self.extended_set.score(swap, dist))
        }
    };
    if score < min_score - BEST_EPSILON {
        min_score = score;
        self.swap_scratch.clear();
        self.swap_scratch.push(swap);
    } else if (score - min_score).abs() < BEST_EPSILON {
        self.swap_scratch.push(swap);
    }
Contributor

I've been diving into the swap selection logic in the routing algorithm and something caught my attention regarding how we handle scoring ties. From my experience testing Sabre on various circuits, these ties seem to be a common occurrence. It seems to me that the order of iteration of obtain_swaps is deterministic, so we're consistently picking the first optimal swap we come across. If this is true, then we might be inadvertently biasing our selection towards certain swaps, potentially missing out on exploring other equally good options that could prove beneficial, particularly in more complex or larger circuits. Currently, it does not seem like we have a tie-resolution mechanism. We could consider introducing some randomness when dealing with ties, which could give a broader view of the optimal paths across different runs.

Member Author

  • "ties seem to be a common occurrence": yep, they are.
  • "the order of iteration of obtain_swaps is deterministic": yes, 100% - we need this for determinism of the algorithm for a fixed seed.
  • "we're consistently picking the first optimal swap we come across": this is where you're going a bit wrong, I think. We don't pick the first optimal one; we track our current "optimal score" (up to some slightly fuzzy epsilon), and whenever we see a swap that's as good as the best, we push it into a Vec (same as a Python list, for all intents and purposes) (line 368). If we find a new best, we reset that tracking (lines 365 and 366). Once we've examined all swaps, we use the pRNG to randomly choose between all swaps that were the best (line 371, just below this diff) - that's where the random seed comes in, and in almost all circuits, that's the only use of randomness.

So our "tie-resolution mechanism" is just "pick randomly from all swaps that were within epsilon of the best seen", and in the preset pass managers, we run Sabre in parallel with quite a few different random seeds to try and get a view of the optimal paths. Was there some other part you were thinking of?

Fwiw, I'm also interested in trialling a "best swap" chooser where we assign all swaps a probability of being chosen (for some loss function that I've not had any particularly bright ideas about) and then choose from that, rather than only looking at the best swaps.
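
A hedged sketch of what such a chooser could look like, purely to illustrate the idea (nothing like this is in the PR, and the softmax-style weighting below is just one possible loss function):

use rand::distributions::{Distribution, WeightedIndex};
use rand::Rng;

/// Sample a swap with probability proportional to exp(-(score - best) / temperature),
/// instead of restricting the choice to the epsilon-best set.
fn sample_swap<R: Rng>(
    candidates: &[([u32; 2], f64)], // (swap, score); lower scores are better
    temperature: f64,
    rng: &mut R,
) -> [u32; 2] {
    let best = candidates
        .iter()
        .map(|(_, score)| *score)
        .fold(f64::INFINITY, f64::min);
    let weights: Vec<f64> = candidates
        .iter()
        .map(|(_, score)| (-(score - best) / temperature).exp())
        .collect();
    let dist = WeightedIndex::new(&weights).expect("weights are finite and positive");
    candidates[dist.sample(rng)].0
}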

Contributor

Oh yes, that is correct - thank you for the clarification. And no, I wasn't thinking of another specific part of the algorithm for tie-resolution; your explanation cleared up my misunderstanding about how ties are handled within the current mechanism. I also found the concept of experimenting with a probabilistic "best swap" chooser quite promising. I had dabbled with ideas for a loss function in this context before, but my focus shifted elsewhere. It seems like a worthwhile direction for further exploration, and I will try to revisit those preliminary thoughts.

@henryzou50
Contributor

This refactor looks very well thought out and enhances the readability of the Rust side of the Sabre module. I'm curious about your perspective on how this refactor aligns with the broader architecture of the Sabre module. Do you feel that these changes are moving us closer to an ideal architecture for Sabre routing? Are there opportunities to extend this rethinking to other parts of the module, potentially simplifying further aspects of the system?

@jakelishman
Member Author

Do you feel that these changes are moving us closer to an ideal architecture for Sabre routing?

Overall, from a user perspective and from an internal logic perspective, this refactor doesn't really change all that much. It (imo) makes the Sabre code rather easier to work with, but it doesn't really fundamentally change anything. I'm not really sure what an "ideal architecture" would be, but in practical terms, the Rust-space components are very very fast, and we're primarily bottlenecked on the Python-space reconstruction of the DAG, which should get wildly faster once/if we move the DAG entirely down to Rust space.

Are there opportunities to extend this rethinking to other parts of the module, potentially simplifying further aspects of the system?

This touched pretty much all of our Rust-space Sabre components, so I wouldn't say there's much more I'd want to do to it in terms of re-organisation - I think what we have is largely fine. Just for one data point: I've been writing some code around Sabre using this PR as the base, and finding it an awful lot easier to make larger-scale changes to the new form of the code than the old form.

Contributor

@henryzou50 henryzou50 left a comment

LGTM

@henryzou50 henryzou50 added this pull request to the merge queue Apr 11, 2024
Merged via the queue into Qiskit:main with commit 3af3cf5 Apr 11, 2024
12 checks passed
@jakelishman jakelishman deleted the sabre/refactor-structs branch April 11, 2024 23:19
jakelishman added a commit to jakelishman/qiskit-terra that referenced this pull request Apr 12, 2024
The recent Sabre refactor (Qiskitgh-11977, 3af3cf5) inadvertently switched
the order that physical qubits were considered when modifying the front
layer after a swap insertion. This could occasionally have an impact in
the swap-chooser and extended-set population, if a swap enabled two
separate gates at once.
@jakelishman
Member Author

ASV flagged an accidental change to the RNG compatibility of this commit, which #12172 fixes.

github-merge-queue bot pushed a commit that referenced this pull request Apr 12, 2024
The recent Sabre refactor (gh-11977, 3af3cf5) inadvertently switched
the order that physical qubits were considered when modifying the front
layer after a swap insertion. This could occasionally have an impact in
the swap-chooser and extended-set population, if a swap enabled two
separate gates at once.