Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preparing for onion messages #1957

Merged
merged 9 commits into from
Nov 9, 2021
Merged

Preparing for onion messages #1957

merged 9 commits into from
Nov 9, 2021

Conversation

thomash-acinq
Copy link
Member

@thomash-acinq thomash-acinq commented Sep 17, 2021

Basic support for relaying onion messages
lightning/bolts#759

Copy link
Member

@t-bast t-bast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple high-level comments, overall it's going in the right direction.

I'll be working on route blinding this week, this will probably overlap with some of the changes you made, once it's ready you should be able to rebase on top of my branch. I'm not sure yet where the TLVs and codecs should live, I need to dive back into the spec proposal to figure this out.

Copy link
Member

@pm47 pm47 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please include a link to the rfc PR in the description so it's easier for reviewers: lightning/bolts#759.

@@ -320,6 +320,8 @@ object ReplyChannelRange {

case class GossipTimestampFilter(chainHash: ByteVector32, firstTimestamp: Long, timestampRange: Long, tlvStream: TlvStream[GossipTimestampFilterTlv] = TlvStream.empty) extends RoutingMessage with HasChainHash

case class OnionMessage(blindingKey: PublicKey, onionRoutingPacket: OnionRoutingPacket, tlvStream: TlvStream[OnionMessageTlv] = TlvStream.empty) extends LightningMessage
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have traits that map BOLT 1 message types:

Suggested change
case class OnionMessage(blindingKey: PublicKey, onionRoutingPacket: OnionRoutingPacket, tlvStream: TlvStream[OnionMessageTlv] = TlvStream.empty) extends LightningMessage
case class OnionMessage(blindingKey: PublicKey, onionRoutingPacket: OnionRoutingPacket, tlvStream: TlvStream[OnionMessageTlv] = TlvStream.empty) extends RoutingMessage

You probably have done this on purpose, in order not to have the message filtered by the PeerConnection? Actually I think it's an issue with the spec itself: onion messages should not be grouped with routing messages.

eclair-core/src/main/scala/fr/acinq/eclair/io/Peer.scala Outdated Show resolved Hide resolved
eclair-core/src/main/scala/fr/acinq/eclair/io/Peer.scala Outdated Show resolved Hide resolved
eclair-core/src/main/scala/fr/acinq/eclair/io/Peer.scala Outdated Show resolved Hide resolved
@codecov-commenter
Copy link

codecov-commenter commented Sep 21, 2021

Codecov Report

Merging #1957 (58857c3) into master (b45dd00) will decrease coverage by 0.01%.
The diff coverage is 88.46%.

@@            Coverage Diff             @@
##           master    #1957      +/-   ##
==========================================
- Coverage   87.57%   87.56%   -0.02%     
==========================================
  Files         162      165       +3     
  Lines       12607    12696      +89     
  Branches      524      556      +32     
==========================================
+ Hits        11041    11117      +76     
- Misses       1566     1579      +13     
Impacted Files Coverage Δ
...q/eclair/wire/protocol/LightningMessageTypes.scala 96.77% <ø> (ø)
...a/fr/acinq/eclair/wire/protocol/OnionRouting.scala 50.00% <33.33%> (-10.00%) ⬇️
.../scala/fr/acinq/eclair/message/OnionMessages.scala 76.74% <76.74%> (ø)
...core/src/main/scala/fr/acinq/eclair/Features.scala 99.07% <100.00%> (+0.02%) ⬆️
...src/main/scala/fr/acinq/eclair/crypto/Sphinx.scala 100.00% <100.00%> (ø)
.../scala/fr/acinq/eclair/payment/PaymentPacket.scala 91.42% <100.00%> (ø)
...lair/wire/protocol/EncryptedRecipientDataTlv.scala 100.00% <100.00%> (ø)
.../eclair/wire/protocol/LightningMessageCodecs.scala 100.00% <100.00%> (ø)
...a/fr/acinq/eclair/wire/protocol/MessageOnion.scala 100.00% <100.00%> (ø)
...r/acinq/eclair/wire/protocol/OnionMessageTlv.scala 100.00% <100.00%> (ø)
... and 5 more

@thomash-acinq thomash-acinq force-pushed the onion-messages branch 3 times, most recently from 8ed1323 to b38e5f5 Compare September 27, 2021 14:28
@thomash-acinq thomash-acinq force-pushed the onion-messages branch 4 times, most recently from 1c6a594 to 301b29f Compare October 19, 2021 16:21
@thomash-acinq thomash-acinq requested a review from t-bast October 19, 2021 16:27
@thomash-acinq thomash-acinq marked this pull request as ready for review October 20, 2021 15:44
@thomash-acinq thomash-acinq force-pushed the onion-messages branch 9 times, most recently from 461261a to 1eabee0 Compare October 25, 2021 10:01
@thomash-acinq
Copy link
Member Author

Tested with c-lightning. I was able to send a message eclair -> c-lightning -> eclair.

Comment on lines 76 to 77
peer ! Peer.Connect(nodeId, None)
peer forward s
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need the connection to be established to be able to forward the onion message.
In case the peer didn't already exist or wasn't connected, this code does not work as the SendOnionMessage will be forwarded before the connection is established. @t-bast Do you have a good way to solve it? Ideally we wouldn't need to create a peer at all but currently it's the peer that creates the connection.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good question, it raises a few important design questions.

In the case where we forward an onion message to a node we don't have channels with, what should our behavior be?
I think a good behavior would be to start a Peer actor, initiate the connection, forward the onion message, and automatically stop the actor after a few minutes (I think we should avoid keeping that connection alive indefinitely since we have no channels with that node). We should also ensure that we don't exchange gossip messages with that peer. What do you think of that proposal? I'm also curious to have @pm47 opinion.

If we go down that road, I see (at least) two options:

  • reuse the existing Peer actor and introduce:
    • a hook to have it send a message once the connection is established (e.g. by modifying the Peer.Connect case class to include a new sendOnceConnected: Option[LightningMessage] or something more clever)
    • a hook to have it stop itself after a while, unless a channel is being opened
    • a hook to filter out gossip traffic (this one shouldn't be too hard)
  • create a new EphemeralPeer that will support a subset of what the Peer does (copy-paste some of the code, but maybe opportunities to simplify the design since it handles less things)
    • it gives us more freedom to implement the onion messages part, but it may create the issue of upgrading an EphemeralPeer to a Peer: if while the EphemeralPeer connection is alive we open a channel (or they do) we really need to convert to a "full" Peer and that may not be trivial at all...
  • In both cases, we probably need the "stop" timer to reset every time we receive/forward a new onion message to that peer, so that we keep connections alive when they're actually regularly used for onion messages

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In c-lightning we don't opportunistically connect like this for forwards (we do if we're the originator). Long term I think that's the right thing, but it might be nice in the short term if nodes were try autoconnect like this even for forwarding (with some limit!).

It's overkill for the first implementation though, IMHO.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's not too hard we'd like to have it sooner rather than later, because we'd like to ensure that Phoenix users don't have to dox their IP address when they use offers (if they aren't using Tor)...we'd rather have them send onion messages through our node and have our node connect on-the-fly. Ideally if Phoenix could make our node connect to the next-to-last node in the onion route (instead of the final recipient) it would be even better for privacy!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But you're right that we can split some of the connection handling to a separate PR to avoid delaying and get a first version of onion messages out! I'll be reviewing it this week.

Copy link
Member

@t-bast t-bast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple of high-level comments about how code is organized in the codebase, as it's not necessarily intuitive where new features like this one should live.

I'll dive into the actual business logic once we've sorted out these high-level things and design questions.

Comment on lines 76 to 77
peer ! Peer.Connect(nodeId, None)
peer forward s
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good question, it raises a few important design questions.

In the case where we forward an onion message to a node we don't have channels with, what should our behavior be?
I think a good behavior would be to start a Peer actor, initiate the connection, forward the onion message, and automatically stop the actor after a few minutes (I think we should avoid keeping that connection alive indefinitely since we have no channels with that node). We should also ensure that we don't exchange gossip messages with that peer. What do you think of that proposal? I'm also curious to have @pm47 opinion.

If we go down that road, I see (at least) two options:

  • reuse the existing Peer actor and introduce:
    • a hook to have it send a message once the connection is established (e.g. by modifying the Peer.Connect case class to include a new sendOnceConnected: Option[LightningMessage] or something more clever)
    • a hook to have it stop itself after a while, unless a channel is being opened
    • a hook to filter out gossip traffic (this one shouldn't be too hard)
  • create a new EphemeralPeer that will support a subset of what the Peer does (copy-paste some of the code, but maybe opportunities to simplify the design since it handles less things)
    • it gives us more freedom to implement the onion messages part, but it may create the issue of upgrading an EphemeralPeer to a Peer: if while the EphemeralPeer connection is alive we open a channel (or they do) we really need to convert to a "full" Peer and that may not be trivial at all...
  • In both cases, we probably need the "stop" timer to reset every time we receive/forward a new onion message to that peer, so that we keep connections alive when they're actually regularly used for onion messages

@thomash-acinq
Copy link
Member Author

@t-bast I was able to rebase it on top of your refactoring

Copy link
Member

@t-bast t-bast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good stuff, the logic looks good to me 👍

We are missing the following tests though:

  • We should create a MessageOnionCodecsSpec.scala and test serialization of each individual tlv, I've always found these extremely useful to quickly find codec bugs. In particular, it should contain tests for the fact that the onion message codec allows onions of arbitrary length.
  • We should extend PeerSpec to test the newly added logic of relaying onion messages once connected

I've made annoying comments about multiple fields to rename / file organization to match PaymentOnion.scala, but since these are annoying I've also prepared a commit that applies them so you don't have to go through the pain of this refactoring in #2057 (but I didn't have time to update all the tests and related code so it's not building...).

Apart from that, LGTM we should be able to merge that soon.
I'll discuss what final values we should set for each tlv field with rusty today and we can amend the PR with the latest values then.

eclair-core/src/main/scala/fr/acinq/eclair/io/Peer.scala Outdated Show resolved Hide resolved
eclair-core/src/main/scala/fr/acinq/eclair/io/Peer.scala Outdated Show resolved Hide resolved
eclair-core/src/main/scala/fr/acinq/eclair/io/Peer.scala Outdated Show resolved Hide resolved
Comment on lines 104 to 110
case Event(_ : PeerConnection.ConnectionResult.Failure, d: DisconnectedData) =>
if (d.channels.isEmpty) {
// we don't have any channel with this peer and we can't connect to it so we just drop it
stopPeer()
} else {
stay() using d.copy(messageToRelay = None)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This addition looks unrelated to onion messages, isn't it?
Why weren't we handling that event here before?
To be honest the Peer is subtle and I'm not sure I'm able to properly review it, it would be nice if @pm47 could review the changes to this file (no need to review the rest of the PR, it's been covered).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We didn't handle this before because we would never initiate a connection before. Connections were initiated manually with the API or by other peers connecting to us.
Now if we want to relay to a new peer, we first create the Peer object and then try to establish a connection. If the connection fails for a peer that we've just created that way, we should get rid of it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! It looks good to me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We just discussed this offline, I would have created a special-purpose actor for the task of connecting to the peer and sending messages. This would be consistent with what we do for relaying payments (and also waking up peers in feature branches) and would reduce added complexity to the Peer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we update that in a separate PR where we rework the peer aspect more in-depth?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, it's not a blocker.

- Relay onion messages
- Add functions to create onion messages
@thomash-acinq
Copy link
Member Author

Thank you for the improvements in #2057, I've merged them.
I've duplicated the onion packet codec so that it doesn't affect the routing part.

t-bast
t-bast previously approved these changes Nov 8, 2021
Copy link
Member

@t-bast t-bast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, let's get this first version on master, well done!

t-bast
t-bast previously approved these changes Nov 8, 2021
@thomash-acinq thomash-acinq merged commit 083dc3c into master Nov 9, 2021
@thomash-acinq thomash-acinq deleted the onion-messages branch November 9, 2021 09:16
@thomash-acinq thomash-acinq changed the title Relay onion messages Preparing for onion messages Nov 10, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants