Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support onion message pathfinding #1724

Draft
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

valentinewallace
Copy link
Contributor

@valentinewallace valentinewallace commented Sep 15, 2022

Take 2, Github wouldn't let me reopen #1669 🤔

We could reuse router::get_route, but it's better to start from scratch
because get_route includes a lot of payment-specific computations that impact
performance.

  • check for OM feature bit
  • finish tests

Partially addresses #1607

Based on #1731

Known followups:

  • scoring
  • benchmarking (will need to update the netgraph file to support the OM feature bit)

@valentinewallace
Copy link
Contributor Author

valentinewallace commented Sep 15, 2022

I think I addressed all the feedback from #1669. Much cleaner!

If there are rough conceptual ACKs, I can add more testing, docs, feature bit checks, and open the first commit in a separate PR.

@valentinewallace valentinewallace force-pushed the 2022-09-om-pathfinding branch 2 times, most recently from bd24563 to a019106 Compare September 15, 2022 21:47
@codecov-commenter
Copy link

codecov-commenter commented Sep 15, 2022

Codecov Report

Patch coverage: 89.72% and project coverage change: +3.59 🎉

Comparison is base (8311581) 87.21% compared to head (d773b04) 90.80%.

❗ Current head d773b04 differs from pull request most recent head c9fafc3. Consider uploading reports for the commit c9fafc3 to get more accurate results

📣 This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1724      +/-   ##
==========================================
+ Coverage   87.21%   90.80%   +3.59%     
==========================================
  Files         100       88      -12     
  Lines       44516    47645    +3129     
  Branches    44516    47645    +3129     
==========================================
+ Hits        38825    43266    +4441     
+ Misses       5691     4379    -1312     
Impacted Files Coverage Δ
lightning/src/routing/onion_message.rs 87.58% <87.58%> (ø)
lightning/src/routing/test_utils.rs 97.73% <97.22%> (+1.67%) ⬆️
lightning/src/routing/router.rs 91.59% <100.00%> (-0.64%) ⬇️
lightning-block-sync/src/rest.rs 62.06% <0.00%> (-30.79%) ⬇️
lightning/src/chain/mod.rs 68.18% <0.00%> (-16.03%) ⬇️
lightning-block-sync/src/rpc.rs 74.80% <0.00%> (-15.02%) ⬇️
lightning/src/util/wakers.rs 89.26% <0.00%> (-9.16%) ⬇️
lightning/src/util/invoice.rs 91.66% <0.00%> (-8.34%) ⬇️
lightning/src/util/macro_logger.rs 87.14% <0.00%> (-4.35%) ⬇️
lightning/src/util/ser_macros.rs 89.13% <0.00%> (-2.46%) ⬇️
... and 98 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@tnull tnull self-requested a review September 16, 2022 07:48
Copy link
Collaborator

@TheBlueMatt TheBlueMatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At first glance it looks like it'll work. We definitely need some ability to score nodes, if only a "skip nodes" set, as we'll want the ability to, say, fetch two non-overlapping paths.

@valentinewallace
Copy link
Contributor Author

valentinewallace commented Sep 20, 2022

We definitely need some ability to score nodes, if only a "skip nodes" set, as we'll want the ability to, say, fetch two non-overlapping paths.

Sounds good, will likely make the module private and do it in follow-up if that works

@TheBlueMatt
Copy link
Collaborator

Can be rebased now, I believe.

@valentinewallace valentinewallace force-pushed the 2022-09-om-pathfinding branch 2 times, most recently from a571f8c to 980bbc1 Compare September 23, 2022 21:09
impl PartialOrd for PathBuildingHop {
fn partial_cmp(&self, other: &Self) -> Option<cmp::Ordering> {
// We need a min-heap, whereas `BinaryHeap`s are a max-heap, so compare the costs in reverse.
other.cost.partial_cmp(&self.cost)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming the cost is equal, can someone grind lower node-id to attract more message flows? After staring at rust documentation for 5 minutes, I'm not sure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a TODO for this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you confirm it's possible by looking at the rust doc?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I misread your comment. After testing in the playground, I don't think this is the case. https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=3912286ccee3e036de27c9e811b40f83 It seems like if the cost is equal, then whatever node was pushed first will be the one popped first, regardless of node id.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps fine to keep a todo anyway because:

  • the way nodes sorted before insertion into heap might play a role then
  • heap implementation may change

Copy link
Contributor

@tnull tnull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass, looks good! Haven't looked at tests yet.

@valentinewallace valentinewallace force-pushed the 2022-09-om-pathfinding branch 3 times, most recently from 5280490 to fdaadac Compare September 28, 2022 16:54
Copy link
Contributor

@tnull tnull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly focused on tests this time.

Copy link
Collaborator

@TheBlueMatt TheBlueMatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically LGTM, one real comment.

//! between channel peers are likely to be prioritized over those between non-channel peers.
//!
//! [Onion message]: crate::onion_message
//! [offers]: <https://github.com/lightning/bolts/pull/798>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh, maybe add a TODO to point to the offers module once it lands?

//! requests from [offers]. It differs from payment pathfinding in that channel liquidity, fees, and
//! knobs such as `htlc_maximum_msat` do not factor into path selection -- onion messages require a
//! peer connection and nothing more. However, we still use the network graph because onion messages
//! between channel peers are likely to be prioritized over those between non-channel peers.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/prioritized/nodes will refuse to forward unless they're already connected/ (ie there is no way to get a message through at all otherwise).

/// Finds a route from us to the given `destination` node.
/// If we have private channels, it may be useful to fill in `first_hops` with the results from
/// [`ChannelManager::list_usable_channels`]. If `first_hops` is not filled in, connected peers
/// without public channels will not be forwarded over.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not true, we outright will fail to find a route if its not filled in. I think this is fine, but we need to update the docs. More generally, I think these docs really need some kind of general discussion of OM pathfinding principles - something about how we should try to find a path, and try to forward over it, but most nodes currently dont support OMs, so path finding is likely to fail. In that case, users should consider implementing something to directly connect to the destination node if possible, noting that this compromises sender privacy and this should be disabled in the future.

}
if let Some(node_info) = network_nodes.get(&node_id) {
// Only consider the network graph if first_hops does not override it.
if valid_first_hops.contains(&node_id) || node_id == our_node_id {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we got back to ourself, we definitely dont want to add more channels to the frontier set.

{ continue; }
} else { continue; }
for scid in &node_info.channels {
if let Some(chan_info) = network_channels.get(&scid) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting question - would this be faster to actually just push the SCID onto the heap? We'd have to handle the first-hops even more explicitly, but that's okay. It would avoid a full round of channel-gets at the frontier when we find the target, which may be worth quite a bit.

Copy link
Contributor Author

@valentinewallace valentinewallace Mar 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm getting somewhat comparable times, with scid-heap narrowly taking the lead.

inserting node_ids into heap:

test routing::onion_message::benches::generate_simple_routes                 ... bench:  42,412,441 ns/iter (+/- 26,718,082)
test routing::onion_message::benches::generate_simple_routes                 ... bench:  40,244,529 ns/iter (+/- 27,297,264)
test routing::onion_message::benches::generate_simple_routes                 ... bench:  38,865,025 ns/iter (+/- 27,147,061)
test routing::onion_message::benches::generate_simple_routes                 ... bench:  43,582,987 ns/iter (+/- 23,421,789)

inserting scids into heap:

test routing::onion_message::benches::generate_simple_routes                 ... bench:  38,450,779 ns/iter (+/- 29,381,214)
test routing::onion_message::benches::generate_simple_routes                 ... bench:  37,930,562 ns/iter (+/- 27,023,358)
test routing::onion_message::benches::generate_simple_routes                 ... bench:  41,410,850 ns/iter (+/- 27,262,014)
test routing::onion_message::benches::generate_simple_routes                 ... bench:  39,008,820 ns/iter (+/- 26,009,606)

I pushed a WIP commit (d9acdf1) with the scid-heap version if you want to test it out and/or check that I implemented it the way you expected. (The commit needs a bit of cleanup..)

Copy link

@ariard ariard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice if we can this back, figured out gonna be needed for #1848 and beyond. Will review soon.

Comment on lines +15 to +16
//! peer connection and nothing more. However, we still use the network graph because onion messages
//! between channel peers are likely to be prioritized over those between non-channel peers.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So while I think relying on the existence of channels between peers to flow onion messages align well the incentives between bandwidth consumed and onion forwards accepted, I don't know if this is a direction we would like for the long-term in term of privacy. Indeed, overlapping both offers flows and HTLC flows, if we re-use the same heuristics than for classic pathfinding we might have overlap and show more "data trails" to any payment observers. Especially onion messages routing might invalidate some privacy-preserving enhancing we have implemented with HTLCs such as #1286. The lack of timing padding in onion routing might cancel the CLTV delta offset. Of course, this is a subject we can think more and improve with time. Just to be aware of about the privacy regression we might have.

@valentinewallace
Copy link
Contributor Author

Nice if we can this back, figured out gonna be needed for #1848 and beyond. Will review soon.

Thanks, I don't know if I'll be prioritizing the follow-ups for a while (open to someone taking them over), but since this seems close(?) I'll try to get back to it in the new year

Copy link

@ariard ariard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Beyond offer transport, I think onion messages pathfinding could be useful to two more use-cases: bitcoin transaction relay and jamming credentials. For the former, I believe the set of heuristics is going to be really-specific, as you might avoid centralised hops to improve fault-tolerance. For the latter, the use-case requirements are more likely to be similar to offers, however you might do synthetic traffic to mask the economic volume.

Just to say, I don't think we'll need more than one onion messages pathfinding algorithms for other use-cases, however I believe they might need each to plug a CustomHeuristicsOnionScore similar to scoring::Score, or combine those CustomHeuristicsOnionScore. Thinking aloud for jamming.

let dest_node_id = NodeId::from_pubkey(destination);
if our_node_id == dest_node_id { return Err(Error::InvalidDestination) }

// Add our start and first-hops to `frontier`.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can define what is meant by "frontier". Understood as "intermediate node". Or maybe it's just us the start.

let start = NodeId::from_pubkey(&our_node_pubkey);
let mut valid_first_hops = HashSet::new();
let mut frontier = BinaryHeap::new();
frontier.push(PathBuildingHop { cost: 0, node_id: start, parent_node_id: start });
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can define the reasoning behind PathBuildingHop::cost. I don't know if the onion bandwidth we're expecting from hops on the network is scaled on the channel capacity. The most rule of thumb metric I would guess.

@ariard
Copy link

ariard commented Jan 20, 2023

@naumenkogs if you're bored, I'm pretty sure onion message pathfinding will be a must to play with a bunch of things, notably jamming mitigations :)

Used incoming commit(s) to test onion message pathfinding
We *could* reuse router::get_route, but it's better to start from scratch
because get_route includes a lot of payment-specific computations that impact
performance.
As judged by whether the channel is disabled, similar to payment routing
Don't allow finding a path to ourselves and reject first hops that contain our
node id as the counterparty node id.
@valentinewallace valentinewallace force-pushed the 2022-09-om-pathfinding branch from d773b04 to b07e707 Compare March 1, 2023 16:11
@valentinewallace valentinewallace force-pushed the 2022-09-om-pathfinding branch 2 times, most recently from 28318ac to d9acdf1 Compare March 1, 2023 16:16
@valentinewallace valentinewallace force-pushed the 2022-09-om-pathfinding branch from d9acdf1 to c9fafc3 Compare March 1, 2023 16:23
@moneyball moneyball added this to the 0.0.119 milestone Oct 26, 2023
@valentinewallace
Copy link
Contributor Author

I see @moneyball added this to the 119 milestone.

But I don't see the rush on this, because most cases will use direct-connect? It seems like this will only be useful once OMs are more widely supported across the network. @TheBlueMatt @jkczyz Let me know if I'm good to remove this from the milestone for now.

@moneyball
Copy link
Contributor

Yes I'll remove it. I had added it to 119 due to a misunderstanding of how it could work in the short-term.

@moneyball moneyball removed this from the 0.0.119 milestone Nov 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants