feat:networking: (Synchronous) Consistent Broadcast for Filecoin EC #9858
Conversation
Oh whoa, I didn't even notice this until now! Thank you Alfonso for getting this in! @filecoin-project/lotus-tse, can one of you throw this branch on your miner and confirm win post stability? (AFTER WINTER BREAK)
itests/kit/init.go
Outdated
```go
// NOTE: If we want the payment channel itests to pass that use a
// block time of 5*millisecond, we need to set the consistent broadcast
// delay to something lower than that block time.
// todo: configure such a low delay only for paych tests.
build.CBDeliveryDelay = 2 * time.Millisecond
```
Can we fully disable this for most tests? The wdpost tests, for example, use a 2ms block time, and it would be good not to slow them down even more.
I completely disabled consistent broadcast for itests. That being said, it would be great if we could at least enable it on a smoke, consensus, or deals itest with a decent block time (if there is any), just as a sanity check that this doesn't break anything in the future.
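To illustrate that kind of opt-in, here is a minimal hedged sketch (a hypothetical helper, not the actual itests kit API; it only assumes the kit keeps toggling `build.CBDeliveryDelay`, as the itests already do):

```go
package kit

import (
	"time"

	"github.com/filecoin-project/lotus/build"
)

// WithConsistentBroadcastDelay is a hypothetical helper a single smoke or
// deals test could use to re-enable the consistent-broadcast delay, while
// the rest of the suite keeps it disabled. It returns a restore function
// for the caller to defer.
func WithConsistentBroadcastDelay(d time.Duration) (restore func()) {
	prev := build.CBDeliveryDelay
	build.CBDeliveryDelay = d
	return func() { build.CBDeliveryDelay = prev }
}
```

A test with a longer block time could then run `defer kit.WithConsistentBroadcastDelay(100 * time.Millisecond)()` to exercise the broadcast path without slowing down the rest of the suite.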
(meant to comment, not approve)
Broadly LGTM, a few questions / suggestions.
```go
type blksInfo struct {
	ctx    context.Context
	cancel context.CancelFunc
	blks   []cid.Cid
}
```
I'm unsure about the thread-safety of this `blks` slice. We seem to be reading & modifying it somewhat unsafely, and while it's probably the case that the system won't ever race here, I'm... not sure. Can we reason through this a bit?
I am using a `sync.Map` to prevent parallel reads/writes and avoid requiring a specific lock for every `blksInfo`, which may be harder to handle. I treat `blksInfo` as a single unit of data and never read the inner block data structure directly; hence the `load` and `store` methods.

That being said, I may have missed some subtle detail. Let me know if this makes sense.
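For readers following along, a minimal sketch of the load/store pattern being described (illustrative names, not the exact PR code; `blksInfo` is the struct shown above):

```go
package bcast

import "sync"

// bcastDict wraps a sync.Map so that blksInfo values are always read and
// replaced as whole units, never mutated field-by-field from outside.
type bcastDict struct {
	m sync.Map // key (string) -> *blksInfo
}

func (d *bcastDict) load(key string) (*blksInfo, bool) {
	v, ok := d.m.Load(key)
	if !ok {
		return nil, false
	}
	return v.(*blksInfo), true
}

func (d *bcastDict) store(key string, bi *blksInfo) {
	d.m.Store(key, bi)
}
```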
So my concern is that we load the `blksInfo` out of the `sync.Map`, but then read and modify the `blks` field. I think if we have 2 calls to `RcvBlock`, they both load the same `bInfo` object (safely). After that, however, they're gonna be concurrently accessing the `blks` slice, which I don't think is safe (e.g. iterating over the `blks` field while it's also being modified).

I'm not 100% sure that I'm right, and even if so, I'm not sure whether simultaneous calls to `RcvBlock` are possible (need to 👀 more), but does that make sense?
Actually, looking at the code more closely, we don't ever modify it. So we're probably okay here.
I think we are OK on that, but you are right that there may be a race here: one goroutine may load the dict while another loads it in parallel, before the first one has been able to store any potential changes.
I think this is fixed by moving this lock to after the load statement on line 125. WDYT?
@adlrocha I don't think that's sufficient. I think it needs to be held until no more writing happens, which is the end of the method (as currently written). Consider the following situation:

- 2 blocks A and B come in with the same VRFProof.
- A gets the lock first, creates an empty BCastDict (line 119), doesn't find a match for the common `key` on line 125, drops the lock.
- B gets the lock, loads the empty BCastDict created by A, also doesn't find a match for the common `key` on line 125, drops the lock.
- A `store`s `blkACid` as the only entry corresponding to `key` on line 141, returns.
- B overwrites that, storing `blkBCid` as the only entry corresponding to `key` on line 141, returns.
- Both A and B succeed their respective calls to `WaitForDelivery`.
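To make the suggested fix concrete, a hedged sketch (illustrative, not the exact PR code): hold a single lock across the whole load/check/store sequence so the interleaving above cannot happen.

```go
package bcast

import (
	"sync"

	"github.com/ipfs/go-cid"
)

var (
	rcvLk   sync.Mutex
	rcvDict = map[string]*blksInfo{} // blksInfo as in the struct shown earlier
)

// RcvBlock holds rcvLk until the end of the method, after the final write,
// so two blocks arriving with the same key cannot both observe an empty
// entry and then silently overwrite each other.
func RcvBlock(key string, blk cid.Cid) {
	rcvLk.Lock()
	defer rcvLk.Unlock()

	bi, ok := rcvDict[key]
	if !ok {
		bi = &blksInfo{}
		rcvDict[key] = bi
	}
	// Safe under rcvLk: no concurrent RcvBlock can be reading or appending.
	bi.blks = append(bi.blks, blk)
}
```

With a coarse lock like this the sync.Map is no longer strictly necessary; the trade-off is that unrelated keys now contend on the same mutex.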
build/params_mainnet.go
Outdated
```go
// CBDeliveryDelay is the delay before delivery in the synchronous consistent broadcast.
// This determines the wait time for the detection of potential equivocations.
var CBDeliveryDelay = 6 * time.Second
```
- Does this impact block propagation? I guess no since pubsub checks don't touch this logic.
- Can we set this to a lower value while still getting the expected security benefits?
- My concern is that if we merge this now, without fixes for market cron, we'll push quite deep into danger territory on how long it takes to get through an epoch (propagation delay + CB delay + state compute, which can be 14s today, + winning post).
> Does this impact block propagation? I guess no since pubsub checks don't touch this logic.
It shouldn't, right? The only thing we are delaying is the delivery of the block to the syncer (i.e. the call to `InformNewBlock`), but this is done after the Gossipsub validation, so the propagation of the block should follow its course seamlessly.
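A hedged sketch of that flow (simplified, with illustrative function names rather than the exact lotus hooks; only `build.CBDeliveryDelay` is taken from this PR):

```go
package sub

import (
	"context"
	"time"

	"github.com/filecoin-project/lotus/build"
)

// scheduleDelivery defers the hand-off to the syncer (the InformNewBlock
// call discussed above) by the consistent-broadcast guard window. Pubsub
// validation and re-forwarding have already happened by this point, so
// block propagation is unaffected.
func scheduleDelivery(ctx context.Context, deliverToSyncer func()) {
	go func() {
		select {
		case <-time.After(build.CBDeliveryDelay):
			// Guard window elapsed with no equivocation observed: deliver.
			deliverToSyncer()
		case <-ctx.Done():
			// Delivery aborted (equivocation detected or shutting down).
		}
	}()
}
```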
> Can we set this to a lower value while still getting the expected security benefits?
This is a great point, and it could definitely be a problem. These 6 seconds are the theoretical time it takes for a block to be propagated to the whole network by Gossipsub. The assumption is that after this time there is a high probability that any equivocation has been propagated widely enough to be caught by a majority of miners. Lowering this value will impact the effectiveness of the scheme, but I don't know to what extent, or what the minimum threshold is. I know @jsoares and @guy.goren were working on getting some real data from mainnet with ProbeLab to see empirically what the optimal delay is. I'll let them chime in in case they have more info.
> Does this impact block propagation? I guess no since pubsub checks don't touch this logic.
As @adlrocha said, propagation shouldn't be affected. Anything else would negate the benefits of the patch: we want the guard time to be larger than the propagation time, which is easier the faster propagation is.
> Can we set this to a lower value while still getting the expected security benefits?
The value can be set lower. In fact, there is no "cliff" behaviour and the security gains can be approximated (if not formally, at least intuitively) as a logistic function centred on the average propagation delay: while you don't necessarily need to cover the 99th-percentile, too low thresholds would catch few attacks and you'd always want to at least cover the average case.
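To make that intuition concrete, one hedged way to write it (my notation, not taken from the FIP): the probability of catching an equivocation as a function of the chosen delay $d$ looks roughly like

$$P_{\text{catch}}(d) \approx \frac{1}{1 + e^{-k\,(d - \bar{t}_{\text{prop}})}}$$

where $\bar{t}_{\text{prop}}$ is the average block propagation delay and $k$ reflects how concentrated the propagation-time distribution is. Setting $d$ well below $\bar{t}_{\text{prop}}$ catches few equivocations, while the gains taper off once $d$ covers the upper percentiles.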
Alas, our only real-world data was collected from the bootstrapper nodes (via Sentinel). Our analysis of that data shows that the vast majority of blocks reach the bootstrappers in < 1s. However, there are a few weaknesses here, primarily that (i) we're trusting source timestamps, and (ii) bootstrapper nodes are not necessarily representative of the average node in terms of connectivity (both link & centrality, even mesh configuration).
In the meantime, ProbeLab is working on a network-wide GossipSub study that will yield a higher-confidence distribution for block propagation. Alas, that work is taking longer than we expected and there is still no time frame for the results. The plan was, therefore, to start with the 6s used as the assumption in the GossipSub security analysis, keep it user-configurable, and push better defaults once we have the data. I expect we'll land in the 1 to 2 second range, but don't hold me to that.
Note that here I'm more concerned about a block arriving late making SPs late with their own block.
Propagation delay is 10s, and state compute today takes up to 14s due to cron issues. With this patch, in the worst case where some set of SPs chooses to send blocks late, we'd have essentially no spare time for WindowPoSt, essentially cutting into the propagation delay for the next epoch, which AFAICT would be comparable to or worse than the epoch boundary.
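For a rough sense of that worst case (my arithmetic, using the numbers above and Filecoin's 30s epochs):

$$10\,\mathrm{s}\ (\text{propagation}) + 6\,\mathrm{s}\ (\text{CB delay}) + 14\,\mathrm{s}\ (\text{state compute}) = 30\,\mathrm{s}$$

i.e. the entire epoch, leaving no slack for proving and block production before cutting into the next epoch's propagation window.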
Right, that's a valid concern given we're right at the threshold. How high would you be comfortable going? Given the context above, I think we could conceivably adjust down a second or two to leave some buffer time -- and still reassess when the study results are available.
Co-authored-by: Aayush Rajasekaran <arajasek94@gmail.com>
So I think I share @magik6k's concern here -- this is unfortunate, but the good news is that we're expecting things to get better fairly soon. What would we think about shipping this with an artificially low value (2 seconds?) and running that until nv19? That gives time for any CRITICAL issues with the code to emerge (not that I expect any). Upon nv19, we can set this to the desired 6 seconds. We also probably want to lower the default propagation delay when this is set to 6 seconds, I think?
(note: I'm responding to this but I'm not the author of the FIP...) That seems reasonable to me. I'm not sure it will be that far off the final numbers. As for adjusting the cutoff, I think the propagation study will definitely provide good data to inform this. That said:
That's correct, but in the current implementation it extends the latest time at which block validation begins. Ideally we'd start validating those blocks "optimistically", and somehow retro-reject them if we see another within `CBDeliveryDelay`.
Indeed, and that's suggested in the FIP as a potential optimisation. It's unclear to me how much of a gain that represents in practice, and it may add a bit of complexity, but it's a possible direction if we are indeed time-constrained. My expectation is still that the final parameters will be very comfortable with respect to the available window.
@arajasek, all comments addressed. I changed the locking strategy to make it less fine-grained but easier to reason about (as discussed above). We could recover the fine-grained locking and make it more understandable by using https://github.com/gammazero/keymutex, but I didn't want to introduce any additional dependency. (For context, we used this dependency extensively for storetheindex, and it is written and maintained by Andrew.)
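For context on that trade-off, a hedged sketch of what the finer-grained alternative could look like (a generic hand-rolled per-key mutex, not the keymutex API):

```go
package bcast

import "sync"

// keyLocks hands out one mutex per broadcast key, so blocks for different
// miners/epochs don't contend on a single global lock. The cost is the
// extra bookkeeping (and eventual cleanup) of the per-key mutexes.
type keyLocks struct {
	mu    sync.Mutex
	locks map[string]*sync.Mutex
}

func newKeyLocks() *keyLocks {
	return &keyLocks{locks: make(map[string]*sync.Mutex)}
}

// lock acquires the mutex for key, creating it on first use, and returns a
// function the caller can defer to release it.
func (k *keyLocks) lock(key string) (unlock func()) {
	k.mu.Lock()
	l, ok := k.locks[key]
	if !ok {
		l = &sync.Mutex{}
		k.locks[key] = l
	}
	k.mu.Unlock()
	l.Lock()
	return l.Unlock
}
```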
Co-authored-by: Aayush Rajasekaran <arajasek94@gmail.com>
…ast" This reverts commit 8b2208f, reversing changes made to 2db6b12. Unfortunately, we've found multiple issues in this code that could lead to unbounded resource usage, block censorship, crashes, etc. Luckily, this code isn't actually needed by the main Filecoin chain, which relies on reporting consensus faults. So we can just try again later.
…ast" This reverts commit 8b2208f, reversing changes made to 2db6b12. Unfortunately, this is rather tricky code. We've found several issues so far and, while we've fixed a few, there are outstanding issues that would require complex fixes we don't have time to tackle right now. Luckily, this code isn't actually needed by the main Filecoin chain which relies on consensus fault reporting to handle equivocation. So we can just try again later.
Revert #9858 (consistent broadcast changes)
Related Issues
Implementation of filecoin-project/FIPs#527
Proposed Changes
This PR implements a synchronous consistent broadcast to detect potential consensus equivocations.
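For reviewers skimming the diff, a hedged sketch of the core mechanism (simplified and illustrative, not the exact code in this PR): blocks are keyed by their equivocation-relevant identity, delivery to the syncer waits out a guard window, and seeing two different block CIDs under the same key within that window aborts delivery.

```go
package bcast

import (
	"context"
	"sync"
	"time"

	"github.com/ipfs/go-cid"
)

type ConsistentBCast struct {
	mu    sync.Mutex
	delay time.Duration
	seen  map[string]cid.Cid
	abort map[string]context.CancelFunc
}

func NewConsistentBCast(delay time.Duration) *ConsistentBCast {
	return &ConsistentBCast{
		delay: delay,
		seen:  make(map[string]cid.Cid),
		abort: make(map[string]context.CancelFunc),
	}
}

// RcvBlock records a block under its key and returns a context that is
// cancelled if an equivocation (a different CID under the same key) shows up.
func (cb *ConsistentBCast) RcvBlock(key string, blk cid.Cid) context.Context {
	cb.mu.Lock()
	defer cb.mu.Unlock()

	if prev, ok := cb.seen[key]; ok && prev != blk {
		// Equivocation: abort the earlier block and reject this one outright.
		cb.abort[key]()
		ctx, cancel := context.WithCancel(context.Background())
		cancel()
		return ctx
	}
	ctx, cancel := context.WithCancel(context.Background())
	cb.seen[key] = blk
	cb.abort[key] = cancel
	return ctx
}

// WaitForDelivery holds the block for the guard window and fails if an
// equivocation was detected in the meantime.
func (cb *ConsistentBCast) WaitForDelivery(ctx context.Context) error {
	select {
	case <-time.After(cb.delay):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}
```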
Checklist
Before you mark the PR ready for review, please make sure that:
- The PR title is in the form of `<PR type>: <area>: <change being made>`, e.g. `fix: mempool: Introduce a cache for valid signatures`
  - `PR type`: fix, feat, build, chore, ci, docs, perf, refactor, revert, style, test
  - `area`: e.g. api, chain, state, market, mempool, multisig, networking, paych, proving, sealing, wallet, deps