Retransmission with HTLC meta-issue #172

Closed
rustyrussell opened this issue May 16, 2017 · 0 comments
@rustyrussell

Background

When nodes reconnect or restart, they need some way to learn what the other end received so they can re-sync state. In the normal (or shutting-down) state, this naturally follows the batching done by commitment_signed/revoke_and_ack messages.

The c-lightning prototype used a scheme where init messages contained a single counter: the total number of commitment_signed and revoke_and_ack messages it had received. On disconnect, it would also forget any updates for which it had not received commitment_signed.

In Milan, @adiabat argued for simply retransmitting and discarding duplicates, rather than using an explicit ack number. More recently, @pm47 asked to avoid the compulsory discard and to require an exact retransmission of previous messages; @rustyrussell instead asked for a strict superset. But further consideration has raised issues with these approaches.

The Uncertain Signature Problem

  • A sends: update1 commitment_signed disconnect
  • B is in one of three possible states:
      • B1: received nothing. Still at previous commit.
      • B2: received update1. Has one update pending.
      • B3: received update1 and commitment_signed. Has sent revoke_and_ack.

Now, when A reconnects, it does an exact retransmission:

  • A sends: update1 commitment_signed
  • B1 is fine. B2 is fine if it ignores the duplicate. B3 either considers the commitment_signed to have changed nothing (currently illegal) or, if it ignores that, finds the signature bad: it expects A to be using the next_per_commitment_point it sent in revoke_and_ack.
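
The three cases can be replayed in a small model. This is only a sketch under assumptions: `PeerB`, `point_index` and `allow_empty_commit` are illustrative names that do not come from any implementation, and the signature check is reduced to comparing which per-commitment point A signed with against the one B expects next.

```python
from dataclasses import dataclass, field


@dataclass
class PeerB:
    """Toy model of B; all names here are hypothetical."""
    commit_index: int = 0                 # commitments B has accepted so far
    committed: set = field(default_factory=set)
    pending: set = field(default_factory=set)
    allow_empty_commit: bool = False      # the "currently illegal" case

    def receive_update(self, update_id):
        # A duplicate is any update B already holds, committed or pending.
        if update_id in self.committed or update_id in self.pending:
            return "duplicate ignored"
        self.pending.add(update_id)
        return "queued"

    def receive_commitment_signed(self, point_index):
        # point_index stands in for the per-commitment point A signed with.
        if not self.pending and not self.allow_empty_commit:
            return "error: commitment_signed changes nothing"
        if point_index != self.commit_index + 1:
            return "error: bad signature"
        self.committed |= self.pending
        self.pending.clear()
        self.commit_index += 1
        return "revoke_and_ack"


def replay_exact_retransmission(b):
    """A resends exactly: update1, then commitment_signed for commitment 1."""
    return [b.receive_update("update1"), b.receive_commitment_signed(1)]


b1 = PeerB()                                        # received nothing
b2 = PeerB(pending={"update1"})                     # received update1 only
b3 = PeerB(commit_index=1, committed={"update1"})   # already revoked
for name, b in (("B1", b1), ("B2", b2), ("B3", b3)):
    print(name, replay_exact_retransmission(b))
```

In this model B1 and B2 both end by replying with revoke_and_ack, while B3 rejects the commitment_signed as empty; if `allow_empty_commit` is set, B3 instead fails the signature check, because A signed with the old point.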

There is also the case where A adds another change (e.g. a fee change, or another update).

Possible solutions:

  • Insist on an exact retransmission and allow an empty commit: as a special case, an "empty" commitment uses the previous per-commitment-point, and the recipient replies with the same revoke_and_ack as before.
  • Allow a changed retransmission (it must be a superset!). If the signature check fails, try checking the commitment signature using the previous per_commitment_point; if that succeeds, reply with the same revoke_and_ack as before.
  • Send the explicit counter of updates + revocations so we don't encounter this situation.
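
The third option can be sketched as follows. This is a minimal illustration, not a wire format: `SentCounters` and `messages_to_retransmit` are hypothetical names, and the sketch assumes each side reports how many commitment_signed and revoke_and_ack messages it has received, so that at most the last message of each kind can be missing.

```python
from dataclasses import dataclass


@dataclass
class SentCounters:
    """What this node knows it has sent (hypothetical bookkeeping)."""
    commitments_sent: int
    revocations_sent: int


def messages_to_retransmit(local, peer_commitments_received, peer_revocations_received):
    """Compare our sent counts with the peer's reported received counts."""
    out = []
    checks = (
        ("revoke_and_ack", local.revocations_sent, peer_revocations_received),
        ("commitment_signed", local.commitments_sent, peer_commitments_received),
    )
    for name, sent, received in checks:
        gap = sent - received
        if gap == 1:
            # Only the most recent message of this kind was lost.
            out.append(name)
        elif gap != 0:
            # Any other difference means the two sides cannot be re-synced.
            raise ValueError(f"{name}: sent {sent}, peer reports {received}")
    return out


# A sent one commitment_signed before the disconnect.
a = SentCounters(commitments_sent=1, revocations_sent=0)
print(messages_to_retransmit(a, 0, 0))  # B1 or B2: commitment_signed was lost
print(messages_to_retransmit(a, 1, 0))  # B3: nothing of A's was lost
```

With the counter, A never has to guess which of the three states B is in: B1 and B2 report zero commitments received, and B3 reports one. The order in which the two message kinds are listed here is arbitrary.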

The Persistence Problem

It's important that an optimal implementation only be required to remember state at the minimal number of points, as a robust implementation will need to synchronously write to disk(s). A node must remember when it receives revoke_and_ack (to create penalty transactions later), and when it sends commitment_signed (as it is committed to the HTLCs at that point, so it must remember them), so these are the minimal "sync" points possible.

Thus, requiring a node to persistently remember updates it has sent but not yet committed to is a poor idea. Some of this can be reconstructed: we have to remember incoming HTLCs, and fulfills/fails destined for the reconnecting peer, anyway, so we can just re-send them. But we would not normally remember fee changes we have not committed to: requiring this to be recorded on sending update_fee adds a disk sync. Nor would we normally remember the order in which we sent the updates, which is imperative for the update_add_htlc id fields to match.
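
To make the two sync points concrete, here is a toy model. It is a sketch under assumptions: the `Node` class, its `disk` snapshot and `sync_count` are invented for illustration, and a real node would write to durable storage rather than copy a dictionary.

```python
import copy


class Node:
    """Toy node that persists state only at the two minimal sync points."""

    def __init__(self):
        self.state = {"committed_updates": [], "revocations": []}
        self.uncommitted = []                  # held in memory only
        self.disk = copy.deepcopy(self.state)  # stand-in for durable storage
        self.sync_count = 0

    def _sync(self):
        # Stand-in for a synchronous write to disk.
        self.disk = copy.deepcopy(self.state)
        self.sync_count += 1

    def send_update(self, update):
        # No disk write: uncommitted updates are lost on restart.
        self.uncommitted.append(update)

    def send_commitment_signed(self):
        # Sync point 1: we are now committed to these updates.
        self.state["committed_updates"].extend(self.uncommitted)
        self.uncommitted = []
        self._sync()

    def receive_revoke_and_ack(self, secret):
        # Sync point 2: needed to create penalty transactions later.
        self.state["revocations"].append(secret)
        self._sync()

    def restart(self):
        # Only what reached disk survives.
        self.state = copy.deepcopy(self.disk)
        self.uncommitted = []


node = Node()
node.send_update("update_add_htlc id=0")
node.send_commitment_signed()
node.send_update("update_fee")   # sent but never committed
node.restart()
print(node.state["committed_updates"], node.uncommitted, node.sync_count)
```

After the restart the committed HTLC is still known, but the uncommitted update_fee is gone, and only one disk sync was paid for. Remembering the fee change would require a second sync in `send_update`.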

ECLAIR seems to require remembering the state and not rolling back. c-lightning (the old, pre-Milan daemon) used reconstruction on reconnect/restart, but assumed the other side would roll back and used a total counter, and thus didn't have an issue if order or fees changed. lnd goes even further, and doesn't even remember the id across reconnections: HTLCs are implicitly renumbered from 0 at that point. I don't find this 8-byte on-disk saving convincing: once HTLCs are no longer in the commitment transaction the ID can indeed be forgotten, but so can the amount and routing information: only the cltv and RIPEMD of the payment hash need to be remembered for creating the penalty transaction.

rustyrussell added a commit to rustyrussell/lightning-rfc that referenced this issue May 16, 2017
1. Tell the node when to broadcast the funding tx (we didn't do this!).
2. Allow timeouts generally if no progress is made (originally this
   was just when waiting for funding_locked, but it applies generally).
3. Use `funding_signed` as the commitment point: before this, we forget,
   after this, we remember.  If lost, we'll timeout.
4. The core of the retransmission requirements now only applies to
   the normal and shutdown states, and will be revised separately
   depending on lightning#172

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
rustyrussell added a commit that referenced this issue May 18, 2017
1. Tell the node when to broadcast the funding tx (we didn't do this!).
2. Allow timeouts generally if no progress is made (originally this
   was just when waiting for funding_locked, but it applies generally).
3. Use `funding_signed` as the commitment point: before this, we forget,
   after this, we remember.  If lost, we'll timeout.
4. The core of the retransmission requirements now only applies to
   the normal and shutdown states, and will be revised separately
   depending on #172

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
rustyrussell added a commit to rustyrussell/lightning-rfc that referenced this issue May 24, 2017
This adds a message for each channel reconnect (after we've
sent/received `funding_signed`, i.e. when we remember the channel),
which says exactly how many `commitment_signed` and `revoke_and_ack`
we've received.  Really, we could use one bit for each (they could
only be missing the last one), but better to be clear.

This leaves the "rollback if didn't get commitment_signed"
requirement, but avoids any need to handle update duplicates or wonder
what update number a `commitment_signed` applies to after reconnect.

Closes: lightning#172
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
rustyrussell added a commit to rustyrussell/lightning-rfc that referenced this issue May 25, 2017
rustyrussell added a commit to rustyrussell/lightning-rfc that referenced this issue May 25, 2017
rustyrussell added a commit to rustyrussell/lightning-rfc that referenced this issue May 25, 2017
@cdecker cdecker added this to the v1.0 milestone Jun 14, 2017
rustyrussell added a commit to rustyrussell/lightning-rfc that referenced this issue Jun 27, 2017
This adds a message for each channel reconnect (after we've
sent/received `funding_signed`, i.e. when we remember the channel),
which says exactly how many `commitment_signed` and `revoke_and_ack`
we've received.  Really, we could use one bit for each (they could
only be missing the last one), but better to be clear.

This leaves the "rollback if didn't get commitment_signed"
requirement, but avoids any need to handle update duplicates or wonder
what update number a `commitment_signed` applies to after reconnect.

Many thanks to pm47 and roasbeef especially for constructive feedback
which made this far better than I originally had.

Closes: lightning#172
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

BOLT 2: require shutdown rexmit, clarify which commitment_signed/revoke_and_ack

Roasbeef points out that we need to retransmit shutdown, and I also
clarified that commitments_received doesn't count previous re-transmissions:
the simplest way to specify this is to use the commitment number.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

FIXUP: require funding_locked retransmission.

But make the rules as loose as possible.

fix revocations_received definition.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

fixup: don't require exact retransmission if last commit not received: you can just start from prev.

Reported-by: pm47
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

fixup! This time for sure!  Use next index for both reestablish numbers.

So much confusion!

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

fixup! Simplified retransmission rules for funding_locked.

Suggested-by: pm47
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
rustyrussell added a commit to rustyrussell/lightning-rfc that referenced this issue Jun 27, 2017
rustyrussell added a commit to rustyrussell/lightning-rfc that referenced this issue Jun 27, 2017