Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement multi-part payments (MPP) sending support #3809

Merged
merged 20 commits into from
Jul 15, 2020
Merged

Implement multi-part payments (MPP) sending support #3809

merged 20 commits into from
Jul 15, 2020

Conversation

cdecker
Copy link
Member

@cdecker cdecker commented Jul 3, 2020

  • We have two splitter phases: a presplit modifier which splits large amounts
    before even starting a first attempt, and an adaptive_splitter which reacts
    to new information about edges and failed payments (mainly capacity
    estimates and disabled channels)
  • The presplit modifier will split anything larger than our threshold (10ksat
    at the moment) so that's pretty much the opposite of what you're
    doing. This is derived from a recent round of probing I did (see results
    below).
  • When splitting we fuzz the amount by +/- 10% uniformly around the target
    split amount (10ksat in the presplit case, or 1/2 previous attempt in the
    adaptive splitter).
  • We track estimates and new information we learn about channels and nodes as
    channel_hints in the root of the payment tree
  • We iteratively ask for new routes using getroute for each attempt, this is
    a bit more costly than doing it in batch with Yen's algorithm, but we can
    provide up to date info on the channel state, and reduce our in-memory
    state-management during a payment.

About the capacity probes I did: I implemented a simple binary search on the
payment amount and then iterated through all nodes and tested the maximum
single-part payment that will reach the destination. After testing about 35%
of the network the histogram of successful capacities looks like this:

probed-channel-capacities

The 80th percentile is marked in black, and the 50th percentile marked in
red. With 10ksat for the pre-splitter we end up with parts that are ~1USD in
size, and that will succeed in 80% of cases without requiring further
splits. Our goal here is to be rather aggressive, not optimize for fees, but
for speed. At 10ksat the proportional fee still dominates the overall cost, so
we're confident that adding a couple more parts is not going to be too costly.

Notice that these measurements use my usual worst-case assumption of any
sender-receiver pair being equally likely, which we pretty much know is too
pessimistic. Not surprisingly the bottlenecks on the path are clustered around
the 100ksat and 500ksat sizes, which are the most popular total capacity sizes
in the network today.

We also plan to keep measuring and adapting these constants as we learn new
things and the network evolves.

Depends-On: #3826

@cdecker cdecker added this to the v0.9.0 milestone Jul 3, 2020
@cdecker cdecker self-assigned this Jul 3, 2020
@cdecker cdecker added the Optech Make Me Famous! Look! Look! Look! COOL NEW FEATURE! label Jul 3, 2020
@cdecker cdecker force-pushed the paymod-03 branch 5 times, most recently from 8ebcef8 to de74b45 Compare July 6, 2020 16:47
@cdecker cdecker force-pushed the paymod-03 branch 3 times, most recently from 8ac4e2e to b606f46 Compare July 7, 2020 19:49
@cdecker cdecker changed the base branch from paymod-03 to master July 8, 2020 18:14
@cdecker cdecker changed the title WIP: Implement multi-part payments (MPP) sending support Implement multi-part payments (MPP) sending support Jul 13, 2020
@cdecker cdecker changed the base branch from master to paymod-04 July 13, 2020 18:32
@cdecker cdecker marked this pull request as ready for review July 13, 2020 18:38
@cdecker cdecker requested review from rustyrussell and niftynei and removed request for rustyrussell July 13, 2020 18:38
Copy link
Contributor

@rustyrussell rustyrussell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack 7f78707

Just minor: please invert the disable-mpp flag...

plugins/libplugin-pay.c Show resolved Hide resolved
doc/lightningd-config.5.md Show resolved Hide resolved
doc/lightningd-config.5.md Outdated Show resolved Hide resolved
With MPP we require that the sum of parts is equal to the `total_msat` amount
declared in the onion. Since that can't be changed once the first part arrives
we need a way to disable amount fuzzing for MPP.
This is necessary in the next commit to override the total_msat that is being
delivered to the destination.
With the `presplit`-modifier we actually skip execution of the root altogether
which results in the root not having a result at all. Instead we should use
the result returned by `payment_collect_result`.
Changelog-Added: The MPP presplit modifier splits large payments into 10k satoshi parts to maximize chances of performing the payment and to obfuscate the overall amount being sent.
Specifically if we split, there is no result, but we shouldn't add a failure
message.
If the parent is a split we have new payment parameters, and want to perform a
number of attempts with those.
This modifier splits a payment that has been attempted a number of times (by a
modifier earlier in the mod chain) and has failed consistently. It splits the
amount roughly in half, with a but if random fuzz, and then starts a new round
of attempts for the two smaller amounts.
Changelog-Added: The adaptive multi-part payment modifier will split payments that are failing due to their size into smaller parts, and re-attempted.
Several tests are not well-suited for mpp, so I added a CLI option to opt-out
of the MPP support at startup time.
We abort on the root since that is the coordination point for all parts of the payment.
Some tests were failing because they were running into the presplit modifier
and then surprised that the payment got split.
If one part sets the root to be aborted, there is little point in continuing
to wait for the remainder, return to the caller immediately.
I found it rather useful to trace how a payment is getting retried in the logs.
This happens to be an edge case with the way we use `sendonion` in
MPP. `sendonion` does not attempt to recover the route even if we supply the
shared secrets (it'd require us to map forwarding channels to the nodes etc),
so `failnode` will always be unset, unless it is the first hop, which gets
stored. This is not a problem if it weren't for the fact that we don't store
the partial route, consisting solely of the channel leading to the first hop,
therefore the assertion that either both are NULL or both aren't fails on the
first hop.

This went unnoticed since with MPP we have more concurrent payments in flight,
increasing the chances of a exhausted first hop considerably.
These mostly deal with exact HTLC counts, and fixed number of attempts to
conclusion, so the randomization that MPP adds is not desirable.
This exercises something that is simply not possible without MPP, i.e., the
bundling of multiple paths to get sufficient capacity to perform the payment.
`listconfigs` calls were setting the description twice and was using the
pointer to the boolean value as the boolean value, resulting in always
returning `true`.
@cdecker cdecker changed the base branch from paymod-04 to master July 14, 2020 15:59
@cdecker
Copy link
Member Author

cdecker commented Jul 15, 2020

Ack 7f78707

Just minor: please invert the disable-mpp flag...

Re-applying after fixups

ACK 4e9054b

@cdecker cdecker merged commit de096ee into master Jul 15, 2020
@rustyrussell rustyrussell deleted the mpp branch August 15, 2022 00:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Optech Make Me Famous! Look! Look! Look! COOL NEW FEATURE!
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants