-
Notifications
You must be signed in to change notification settings - Fork 912
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update routing graph with rate-limited gossip #5239
Update routing graph with rate-limited gossip #5239
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tomorrow I will look in the CI for now I have only a couple of comments
2a3c00f
to
3b82778
Compare
068c10f
to
64de379
Compare
gossip_store will now add both versions of the latest gossip (rate-limited and okay-to-broadcast.) Gossipd skips messages with the rate-limited flag, while gossmap checks that a message is in fact the latest version as it reads from the store. This restores the original functionality (option 2 above.) |
gossipd/routing.c
Outdated
if(!node->rgraph.index) | ||
node->rgraph.timestamp = timestamp; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs a comment at lesat!
# gossip_store contains both broadcastable and routing-only updates | ||
# Check that we're not replacing a newer update with broadcastable | ||
elif (half.timestamp > self.half_channels[direction].timestamp): | ||
self.half_channels[direction] = half |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should filter when we enter the gossip store (if we don't already?) such that later messages will always have greater timestamp than earlier ones? So this logic is simply "override previous".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would normally be the case. I was concerned with an edge case where we have a normal message with timestamp 1, then a rate-limited message with timestamp 3. Some time later, our rate-limit clears and we receive a message with timestamp 2. This could replace the original broadcastable message, but would not supersede the later-timestamped gossip that's still useful for the routing graph.
This edge case wouldn't normally occur if we're well connected and the originating node is always incrementing gossip timestamps, but it is a valid case. We could ignore messages in this edge case and keep the gossip_store in ascending order. Alternatively, we could discard message 2 and unflag message 3 (now okay to broadcast also.) Would probably be more useful to delete and rewrite it to the store without the spam flag, but same idea. I kind of like this behavior now that I think about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes! We should be comparing any incoming against the latest we have, spam or no.
Don't worry too much about the case where we might consider something "no longer spam" sometime later, either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK, I have only a couple of smaller comments.
And also we need the data in the binary like contrib/pyln-client/tests/data/gossip_store-part1.xz
?
e72b0a9
to
b87e226
Compare
I only updated the gossip_store version byte to the latest in these. gossip_store-part2 I did not change. I'm not sure why it has a higher version byte already, but in use, it seems to only be opened in append mode, so it should be inconsequential. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, some more simplifications possible.
Some weird commits in the middle, cherry-picked from elsewhere.
Some later commits need to be squashed into earlier ones.
Style is still not following doc/STYLE.md (see https://www.kernel.org/doc/html/v4.10/process/coding-style.html too)
# gossip_store contains both broadcastable and routing-only updates | ||
# Check that we're not replacing a newer update with broadcastable | ||
elif (half.timestamp > self.half_channels[direction].timestamp): | ||
self.half_channels[direction] = half |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes! We should be comparing any incoming against the latest we have, spam or no.
Don't worry too much about the case where we might consider something "no longer spam" sometime later, either.
gossipd/routing.c
Outdated
if(!node->rgraph.index) | ||
node->rgraph.timestamp = timestamp; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs a comment at lesat!
d764188
to
7c1faa2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor changes only; this now looks superb!
@@ -9,7 +9,7 @@ | |||
import struct | |||
|
|||
# These duplicate constants in lightning/common/gossip_store.h | |||
GOSSIP_STORE_VERSION = 9 | |||
GOSSIP_STORE_VERSION = 0x0a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, to be nice, can we change this to:
GOSSIP_STORE_VERSIONS = [ 0x09, 0x0a ]
Then check version[0] in GOSSIP_STORE_VERSIONS
? That lets us use older stores, too...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea! Implemented and tested.
@@ -173,7 +173,7 @@ def __init__(self, store_filename: str = "gossip_store"): | |||
self._last_scid: Optional[str] = None | |||
version = self.store_file.read(1) | |||
if version[0] != GOSSIP_STORE_VERSION: | |||
raise ValueError("Invalid gossip store version {}".format(int(version))) | |||
raise ValueError("Invalid gossip store version {}".format(int.from_bytes(version, "big"))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no endiannes on single values. But this should have been int(version[0])
. Or better, set version = self.store_file.read(1)[0]
and clean this all up?
f.write(bytearray.fromhex("09" # GOSSIP_STORE_VERSION | ||
f.write(bytearray.fromhex("0a" # GOSSIP_STORE_VERSION | ||
"000001b0" # len | ||
"fea676e8" # csum | ||
"5b8d9b44" # timestamp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could deliberately leave this, right, and test the old version still works?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this was fun to review w/ @endothermicdev. learned a lot about how gossip store works.
this should be a great improvement over the existing impact of "spam" on our routing graph
mostly logic nits/readability things!
} else { | ||
/* Remove the prior spam update if it exists. */ | ||
if (hc->rgraph.index != hc->bcast.index) { | ||
gossip_store_delete(rstate->gs, &hc->rgraph, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so the bcast
entry is for the prior update (before this 'spammy' message arrived)
here we're removing the prior update from the rgraph
(routing graph) because it's been superceded by this new update (which is spam, so we don't update bcast
)
= gossip_store_add(rstate->gs, msg, | ||
node->bcast.timestamp, | ||
/* Don't add to the store if it was loaded from the store. */ | ||
if (index) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this accomplishes the same thing more succinctly?
if (!index) {
index = gossip_store_add ...
if (node->bcast.timestamp...)
}
node->rgraph.index = index;
if (!spam)
node->bcast.index = index
tests/test_gossip.py
Outdated
@@ -1902,8 +1903,7 @@ def channel_fees(node): | |||
check=True, | |||
timeout=TIMEOUT, | |||
stdout=subprocess.PIPE).stdout.decode('utf8') | |||
# Used in routing graph, but not passed to gossip peers. | |||
assert("fee_proportional_millionths=1005" not in decoded) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe check both? (e.g. leave this check in?)
1451c89
to
44674bd
Compare
a31cac5
to
e1b4950
Compare
This will be used to decouple internal use of gossip from what is passed to gossip peers. Updates GOSSIP_STORE_VERION to 10. Changelog-Changed: gossip_store updated to version 10.
routing.c now flags rate-limited gossip as it enters the gossip_store but makes use of it in updating the routing graph. Flagged gossip is not rebroadcast to gossip peers. Changelog-Changed: gossipd: now accepts spam gossip, but squelches it for peers.
This grows the routing state in order to index both okay-to-broadcast and rate-limited gossip. The gossip_store also logs the rate-limited gossip if useful. This allows the broadcast of the last non-rate-limited gossip.
e1b4950
to
d7ec151
Compare
Ack d7ec151 |
The gossip_store version was bumbed to 10 in PR ElementsProject#5239 and ElementsProject#5375 has apparently not been rebased on top of that when it was merged, causing the `run-gossmap_canned` test to fail when parsing the v9 gossip store. Changelog-None
The gossip_store version was bumbed to 10 in PR ElementsProject#5239 and ElementsProject#5375 has apparently not been rebased on top of that when it was merged, causing the `run-gossmap_canned` test to fail when parsing the v9 gossip store. Changelog-None
The gossip_store version was bumbed to 10 in PR ElementsProject#5239 and ElementsProject#5375 has apparently not been rebased on top of that when it was merged, causing the `run-gossmap_canned` test to fail when parsing the v9 gossip store. Changelog-None
The gossip_store version was bumbed to 10 in PR ElementsProject#5239 and ElementsProject#5375 has apparently not been rebased on top of that when it was merged, causing the `run-gossmap_canned` test to fail when parsing the v9 gossip store. Changelog-None
The gossip_store version was bumbed to 10 in PR ElementsProject#5239 and ElementsProject#5375 has apparently not been rebased on top of that when it was merged, causing the `run-gossmap_canned` test to fail when parsing the v9 gossip store. Changelog-None
The gossip_store version was bumbed to 10 in PR ElementsProject#5239 and ElementsProject#5375 has apparently not been rebased on top of that when it was merged, causing the `run-gossmap_canned` test to fail when parsing the v9 gossip store. Changelog-None
The gossip_store version was bumbed to 10 in PR ElementsProject#5239 and ElementsProject#5375 has apparently not been rebased on top of that when it was merged, causing the `run-gossmap_canned` test to fail when parsing the v9 gossip store. Changelog-None
The gossip rate-limiting feature is beneficial for the broader lightning network as a DoS prevention measure, but the practical reality is that all other implementations currently do little-to-no rate-limiting, choosing to benefit their node by utilizing the latest gossip updates. This PR lets us continue to squelch outbound gossip that has been rate-limited, but not penalize our own node in the process.
The ideal way to handle rate-limited gossip is unclear. Options are:
This PR uses option 1, which modifies current node behavior from option 2.
To Do: