-
Notifications
You must be signed in to change notification settings - Fork 1.6k
validation host is too eager to deem a candidate invalid #3192
Comments
I think that from the perspective of the high-level code the 'validate candidate' API should make a best effort to validate the candidate and if that fails then to be clear that there has been an internal failure. We must make sure that malicious code cannot cause issues presenting as an internal failure. When internal issues are encountered the candidate validation subsystem might retry a few times to present an API which the high level code can use simply to answer the question of validity |
Yes exactly, we should report how we fail though when starting a dispute. We should relevant logs related to disputes too. |
I was thinking about this and I am not sure if I agree. Imagine a malicious PVF crafted so that it would OOM on 50% of machines during compilation. This would make one side of the chasm to think the candidates for that PVF are valid and the other side would think it is invalid. Is that really worse than if the second side thought that this is an internal failure? |
Yes, we prefer if compilation errors are separate from block errors. At first blush, it sounds easy for parachain code if validators compile core ahead of time, but actually yes we likely need a dispute procedure for parachain code even there eventually. In parathreads, we do not want most validators to ever build most parachain code, so I suggested that code updates be handled like a special block, meaning approval checkers build the code and then we conclude the code is valid. If we expect 16 honest approval checkers then that's p^16 for code that fails build on a proportion p of machines. If p > 2/3 then initially we slash those claiming it builds 100%. If p < 1/3 then initially we slash those claiming it fails maybe 10%. If 1/3 < p < 2/3 then we slash nobody but declare the code invalid. In both slashing cases, we need governance toi perform forensics to revert the slashes, but importantly since forensics concerns only a bad build it can afford to be rather generous. I suppose parachains could adopt the same code upgrade procedure too, so no separate dispute procedure. As Adam Langley said, "When you design a protocol have one joint and keep it well oiled" or similarly "do one thing and do it well". I'm kinda ambivalent what happens in the shorter term, but longer term once parachains work properly, including #1656, then they should become out well oiled joint, even though they're crazy complex. |
Can't this be averted somehow? We must have some kind of threat model for parachains where the compilation can succeed on all machines |
Recently we observed that the validation host may fail to communicate with the child process that is responsible for compiling the PVF. This now leads to validation host thinking that the PVF that is broken. Therefore, all following execution requests will fail with "invalid candidate".
The logic there is wrong because failing to communicate with the child process should not lead to claiming that the candidate is invalid.
Besides that, it is not clear to me if we should deem the candidate invalid if the PVF itself is invalid.
The text was updated successfully, but these errors were encountered: