Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update routing graph with rate-limited gossip #5239

Merged

Conversation

endothermicdev
Copy link
Collaborator

@endothermicdev endothermicdev commented May 4, 2022

The gossip rate-limiting feature is beneficial for the broader lightning network as a DoS prevention measure, but the practical reality is that all other implementations currently do little-to-no rate-limiting, choosing to benefit their node by utilizing the latest gossip updates. This PR lets us continue to squelch outbound gossip that has been rate-limited, but not penalize our own node in the process.

The ideal way to handle rate-limited gossip is unclear. Options are:

  1. Skip the subject channel_update or node_announcement entirely until a new, valid one comes in.
  2. Rebroadcast the last message (now outdated) that was not rate-limited.

This PR uses option 1, which modifies current node behavior from option 2.

To Do:

  • Determine optimal broadcast behavior when rate-limited gossip is encountered.

@endothermicdev endothermicdev requested a review from cdecker as a code owner May 4, 2022 22:44
Copy link
Contributor

@vincenzopalazzo vincenzopalazzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tomorrow I will look in the CI for now I have only a couple of comments

common/gossip_store.c Outdated Show resolved Hide resolved
gossipd/routing.c Show resolved Hide resolved
gossipd/routing.c Outdated Show resolved Hide resolved
tests/test_gossip.py Outdated Show resolved Hide resolved
@endothermicdev endothermicdev force-pushed the gossip-unlimited branch 2 times, most recently from 2a3c00f to 3b82778 Compare May 5, 2022 21:44
@endothermicdev endothermicdev marked this pull request as draft May 6, 2022 11:28
@endothermicdev endothermicdev force-pushed the gossip-unlimited branch 3 times, most recently from 068c10f to 64de379 Compare May 20, 2022 13:59
@endothermicdev
Copy link
Collaborator Author

gossip_store will now add both versions of the latest gossip (rate-limited and okay-to-broadcast.) Gossipd skips messages with the rate-limited flag, while gossmap checks that a message is in fact the latest version as it reads from the store. This restores the original functionality (option 2 above.)

@endothermicdev endothermicdev marked this pull request as ready for review May 20, 2022 15:32
tests/test_gossip.py Outdated Show resolved Hide resolved
gossipd/routing.c Outdated Show resolved Hide resolved
gossipd/routing.c Outdated Show resolved Hide resolved
gossipd/routing.c Outdated Show resolved Hide resolved
gossipd/routing.c Outdated Show resolved Hide resolved
Comment on lines 1662 to 1663
if(!node->rgraph.index)
node->rgraph.timestamp = timestamp;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a comment at lesat!

# gossip_store contains both broadcastable and routing-only updates
# Check that we're not replacing a newer update with broadcastable
elif (half.timestamp > self.half_channels[direction].timestamp):
self.half_channels[direction] = half
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should filter when we enter the gossip store (if we don't already?) such that later messages will always have greater timestamp than earlier ones? So this logic is simply "override previous".

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would normally be the case. I was concerned with an edge case where we have a normal message with timestamp 1, then a rate-limited message with timestamp 3. Some time later, our rate-limit clears and we receive a message with timestamp 2. This could replace the original broadcastable message, but would not supersede the later-timestamped gossip that's still useful for the routing graph.
This edge case wouldn't normally occur if we're well connected and the originating node is always incrementing gossip timestamps, but it is a valid case. We could ignore messages in this edge case and keep the gossip_store in ascending order. Alternatively, we could discard message 2 and unflag message 3 (now okay to broadcast also.) Would probably be more useful to delete and rewrite it to the store without the spam flag, but same idea. I kind of like this behavior now that I think about it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! We should be comparing any incoming against the latest we have, spam or no.

Don't worry too much about the case where we might consider something "no longer spam" sometime later, either.

Copy link
Contributor

@vincenzopalazzo vincenzopalazzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK, I have only a couple of smaller comments.

And also we need the data in the binary like contrib/pyln-client/tests/data/gossip_store-part1.xz?

common/gossmap.c Outdated Show resolved Hide resolved
common/gossmap.c Outdated Show resolved Hide resolved
gossipd/gossip_store.c Outdated Show resolved Hide resolved
@endothermicdev
Copy link
Collaborator Author

And also we need the data in the binary like contrib/pyln-client/tests/data/gossip_store-part1.xz?

I only updated the gossip_store version byte to the latest in these. gossip_store-part2 I did not change. I'm not sure why it has a higher version byte already, but in use, it seems to only be opened in append mode, so it should be inconsequential.

Copy link
Contributor

@rustyrussell rustyrussell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, some more simplifications possible.
Some weird commits in the middle, cherry-picked from elsewhere.
Some later commits need to be squashed into earlier ones.
Style is still not following doc/STYLE.md (see https://www.kernel.org/doc/html/v4.10/process/coding-style.html too)

# gossip_store contains both broadcastable and routing-only updates
# Check that we're not replacing a newer update with broadcastable
elif (half.timestamp > self.half_channels[direction].timestamp):
self.half_channels[direction] = half
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! We should be comparing any incoming against the latest we have, spam or no.

Don't worry too much about the case where we might consider something "no longer spam" sometime later, either.

gossipd/routing.c Outdated Show resolved Hide resolved
gossipd/routing.c Outdated Show resolved Hide resolved
Comment on lines 1662 to 1663
if(!node->rgraph.index)
node->rgraph.timestamp = timestamp;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a comment at lesat!

common/gossmap.c Outdated Show resolved Hide resolved
gossipd/gossip_store.c Outdated Show resolved Hide resolved
@endothermicdev endothermicdev force-pushed the gossip-unlimited branch 6 times, most recently from d764188 to 7c1faa2 Compare June 23, 2022 15:26
Copy link
Contributor

@rustyrussell rustyrussell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor changes only; this now looks superb!

@@ -9,7 +9,7 @@
import struct

# These duplicate constants in lightning/common/gossip_store.h
GOSSIP_STORE_VERSION = 9
GOSSIP_STORE_VERSION = 0x0a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, to be nice, can we change this to:

GOSSIP_STORE_VERSIONS = [ 0x09, 0x0a ]

Then check version[0] in GOSSIP_STORE_VERSIONS? That lets us use older stores, too...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea! Implemented and tested.

@@ -173,7 +173,7 @@ def __init__(self, store_filename: str = "gossip_store"):
self._last_scid: Optional[str] = None
version = self.store_file.read(1)
if version[0] != GOSSIP_STORE_VERSION:
raise ValueError("Invalid gossip store version {}".format(int(version)))
raise ValueError("Invalid gossip store version {}".format(int.from_bytes(version, "big")))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no endiannes on single values. But this should have been int(version[0]). Or better, set version = self.store_file.read(1)[0] and clean this all up?

Comment on lines -1152 to 1133
f.write(bytearray.fromhex("09" # GOSSIP_STORE_VERSION
f.write(bytearray.fromhex("0a" # GOSSIP_STORE_VERSION
"000001b0" # len
"fea676e8" # csum
"5b8d9b44" # timestamp
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could deliberately leave this, right, and test the old version still works?

gossipd/routing.c Outdated Show resolved Hide resolved
gossipd/routing.c Outdated Show resolved Hide resolved
@niftynei niftynei added this to the v0.11.2 milestone Jun 24, 2022
Copy link
Contributor

@niftynei niftynei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was fun to review w/ @endothermicdev. learned a lot about how gossip store works.

this should be a great improvement over the existing impact of "spam" on our routing graph

mostly logic nits/readability things!

} else {
/* Remove the prior spam update if it exists. */
if (hc->rgraph.index != hc->bcast.index) {
gossip_store_delete(rstate->gs, &hc->rgraph,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so the bcast entry is for the prior update (before this 'spammy' message arrived)

here we're removing the prior update from the rgraph (routing graph) because it's been superceded by this new update (which is spam, so we don't update bcast)

= gossip_store_add(rstate->gs, msg,
node->bcast.timestamp,
/* Don't add to the store if it was loaded from the store. */
if (index) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this accomplishes the same thing more succinctly?

if (!index) {
    index = gossip_store_add ...
    if (node->bcast.timestamp...)
}
    
node->rgraph.index = index;

if (!spam)
    node->bcast.index = index

@@ -1902,8 +1903,7 @@ def channel_fees(node):
check=True,
timeout=TIMEOUT,
stdout=subprocess.PIPE).stdout.decode('utf8')
# Used in routing graph, but not passed to gossip peers.
assert("fee_proportional_millionths=1005" not in decoded)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe check both? (e.g. leave this check in?)

@endothermicdev endothermicdev modified the milestones: v0.11.2, v0.12 Jun 27, 2022
@endothermicdev endothermicdev force-pushed the gossip-unlimited branch 2 times, most recently from 1451c89 to 44674bd Compare June 28, 2022 15:41
@endothermicdev endothermicdev force-pushed the gossip-unlimited branch 4 times, most recently from a31cac5 to e1b4950 Compare July 1, 2022 22:54
This will be used to decouple internal use of gossip from what is
passed to gossip peers. Updates GOSSIP_STORE_VERION to 10.

Changelog-Changed: gossip_store updated to version 10.
routing.c now flags rate-limited gossip as it enters the gossip_store but
makes use of it in updating the routing graph. Flagged gossip is not
rebroadcast to gossip peers.

Changelog-Changed: gossipd: now accepts spam gossip, but squelches it for
peers.
This grows the routing state in order to index both okay-to-broadcast
and rate-limited gossip. The gossip_store also logs the rate-limited
gossip if useful. This allows the broadcast of the last non-rate-limited
gossip.
@rustyrussell
Copy link
Contributor

Ack d7ec151

@rustyrussell rustyrussell merged commit 87a66c1 into ElementsProject:master Jul 6, 2022
cdecker added a commit to cdecker/lightning that referenced this pull request Jul 11, 2022
The gossip_store version was bumbed to 10 in PR ElementsProject#5239 and ElementsProject#5375 has
apparently not been rebased on top of that when it was merged, causing
the `run-gossmap_canned` test to fail when parsing the v9 gossip
store.

Changelog-None
cdecker added a commit to cdecker/lightning that referenced this pull request Jul 11, 2022
The gossip_store version was bumbed to 10 in PR ElementsProject#5239 and ElementsProject#5375 has
apparently not been rebased on top of that when it was merged, causing
the `run-gossmap_canned` test to fail when parsing the v9 gossip
store.

Changelog-None
cdecker added a commit to cdecker/lightning that referenced this pull request Jul 11, 2022
The gossip_store version was bumbed to 10 in PR ElementsProject#5239 and ElementsProject#5375 has
apparently not been rebased on top of that when it was merged, causing
the `run-gossmap_canned` test to fail when parsing the v9 gossip
store.

Changelog-None
cdecker added a commit to cdecker/lightning that referenced this pull request Jul 11, 2022
The gossip_store version was bumbed to 10 in PR ElementsProject#5239 and ElementsProject#5375 has
apparently not been rebased on top of that when it was merged, causing
the `run-gossmap_canned` test to fail when parsing the v9 gossip
store.

Changelog-None
cdecker added a commit to cdecker/lightning that referenced this pull request Jul 11, 2022
The gossip_store version was bumbed to 10 in PR ElementsProject#5239 and ElementsProject#5375 has
apparently not been rebased on top of that when it was merged, causing
the `run-gossmap_canned` test to fail when parsing the v9 gossip
store.

Changelog-None
niftynei pushed a commit that referenced this pull request Jul 11, 2022
The gossip_store version was bumbed to 10 in PR #5239 and #5375 has
apparently not been rebased on top of that when it was merged, causing
the `run-gossmap_canned` test to fail when parsing the v9 gossip
store.

Changelog-None
whitslack pushed a commit to whitslack/lightning that referenced this pull request Oct 8, 2022
The gossip_store version was bumbed to 10 in PR ElementsProject#5239 and ElementsProject#5375 has
apparently not been rebased on top of that when it was merged, causing
the `run-gossmap_canned` test to fail when parsing the v9 gossip
store.

Changelog-None
whitslack pushed a commit to whitslack/lightning that referenced this pull request Oct 28, 2022
The gossip_store version was bumbed to 10 in PR ElementsProject#5239 and ElementsProject#5375 has
apparently not been rebased on top of that when it was merged, causing
the `run-gossmap_canned` test to fail when parsing the v9 gossip
store.

Changelog-None
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants