pvf: Update docs for PVF artifacts #6551
Changes from 1 commit
@@ -14,6 +14,46 @@
// You should have received a copy of the GNU General Public License
// along with Polkadot. If not, see <http://www.gnu.org/licenses/>.

//! PVF artifacts (final compiled code blobs).
//!
//! # Lifecycle of an artifact
//!
//! 1. During node start-up, the artifacts cache is cleaned up.
//!
//! 2. In order to be executed, a PVF should be prepared first. This means that artifacts should
//!    have an [`ArtifactState::Prepared`] entry for that artifact. If not, the preparation process
//!    kicks in. The execution request is stashed until after the preparation is done, and the
//!    artifact state in the host is set to [`ArtifactState::Preparing`]. Preparation goes through
//!    the preparation queue and the pool.
//!
//!    1. If the artifact is already being processed, we add another execution request to the
//!       existing preparation job, without starting a new one.
//!
//!    2. Note that if the state is [`ArtifactState::FailedToProcess`], we usually do not retry
We do not currently have any "I'm taking a long time" messages, so if we send out approval assignments but do artifact builds lazily, then we'll cause no shows, given that builds can take more than the 12? second no show time out. In theory, we could send messages for "building artifact" and/or "It's slow but I'm here", but @rphmeier wanted to avoid complicating the approval process with such messages, probably a wise decision. We therefore need PVF artifacts to be built in advance, or else we suck up the risk of correlated artifact builds creating de facto escalations.

Indeed, I was wondering about this a couple of times already myself. I think for the time being, preparation is usually pretty fast so there are no issues. The problem with preparation in advance is that this will likely result in wasted effort in case of parathreads, as all validators would need to prepare a PVF, although only 30 approval checkers will actually need it. Might be fine. Other options:

We should imho avoid clearing the artifacts cache when the host did not change. We could recompile only the parachain blocks when the host version did change, then lazily recompile the parathread ones. We should've timings of course, but we'll never stop people building wasm blobs that screw up build times intentionally. Interpreting kinda works. We have consensus upon who gets compiled vs interpreted, so interpreted then runs with different approval time parameters. We could similarly adjust approval parameters to include recompiling parathreads each block. This makes parathreads more expensive and second class though. We could've parathreads that "buy" being compiled in advance like parachains. We do still have everyone compile the parathread when the PVF initially gets uploaded though, yes? I'd think this suggests parathreads and parachains should all be precompiled, which just makes uploading a PVF more expensive. Implicitly then host upgrades become relatively more expensive, but this makes sense too.

We'd need an intelligent garbage collector, then. Imagine the node is restarted and has a hundred artifacts in the cache. How do we know which ones we will use and which are stale?

Why? They would not pass the pre-checking phase, assuming:

We'd need to garbage collect PVFs when parachains deregister or when the PVF gets superseded by later PVFs. We could do pre-checking after a PVF upload gets finalized, so then we avoid fork concerns and each parachain has at most two PVFs in the cache. We'd lose the ability to upgrade parachain PVFs when finality stalls though, so system parachains could require some escape hatch here. As always we pay for optimizations with complexity. Ain't clear how far this should go right now of course. We could stick with the current proposal for now, but make an issue for smarter PVF garbage collection in future.

It'll be possible to pass the pre-checking but be quite slow compared with average PVF builds. It'll occasionally be possible to pass the pre-checking on one host, but be abysmal on some host upgrade in the pipeline. We could imho ignore this risk though, so yeah maybe you're right.

That does sound like it would introduce some complexity. I guess my question would be, is it really necessary? How much disk space does each compiled PVF actually require? How bad is it to keep old artifacts around? So far, it seems that we have not had issues with the 24-hour TTL of artifacts AFAIK, so my grug brain thinks that we shouldn't introduce unnecessary optimizations. For parathreads I could see the artifacts needing to stay around for longer - but in that case I would just have a longer TTL for those, and not worry about the extra used disk space. 😛

All this complexity comes from one scenario: An adversary can create many relay chain forks, so they can upload one artifact on each fork. We support all forks until we know which fork survives. We've rough consensus on artifact age so we could use duration as a proxy for finality though: We retain all artifacts compatible with the current host, so long as either the artifact is active on some relay chain fork, or else the artifact is less than 24 hours old. We add some abandon artifact call for artifacts uploaded but not activated, or else force activation at some block height, or something like that. It's messy to create many relay chain forks without equivocation, so the attack might already result in slashing, which maybe suffices. If you've a run of blocks, then you could've some forks without equivocations, but not too many. It's maybe just easier to wait for finality and have some override for system parachains, not sure.

Raised https://github.com/paritytech/polkadot/issues/6941 to continue this discussion.

//!       preparation, though we may under certain conditions.
//!
//! 3. The pool gets an available worker and instructs it to work on the given PVF. The worker
//!    starts compilation. When the worker finishes successfully, it writes the serialized artifact
//!    into a temporary file and notifies the host that it's done. The host atomically moves
//!    (renames) the temporary file to the destination filename of the artifact.
//!
//! 4. If the worker concluded successfully or returned an error, then the pool notifies the queue.
//!    In both cases, the queue reports to the host that the result is ready.
//!
//! 5. The host will react by changing the artifact state to either [`ArtifactState::Prepared`] or
//!    [`ArtifactState::FailedToProcess`] for the PVF in question. On success, the
//!    `last_time_needed` will be set to the current time. It will also dispatch the pending
//!    execution requests.
//!
//! 6. On success, the execution request will come through the execution queue and ultimately be
//!    processed by an execution worker. When this worker receives the request, it will read the
//!    requested artifact. If it doesn't exist it reports an internal error. A request for execution
//!    will bump the `last_time_needed` to the current time.
//!
//! 7. There is a separate process for pruning the prepared artifacts whose `last_time_needed` is
//!    older by a predefined parameter. This process is run very rarely (say, once a day). Once the
//!    artifact is expired it is removed from disk eagerly atomically.

use crate::{error::PrepareError, host::PrepareResultSender};
use always_assert::always;
use polkadot_parachain::primitives::ValidationCodeHash;
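The lifecycle in the doc comment above can be read as a small state table keyed by PVF hash. The following is a simplified sketch of steps 2, 5, 6 and 7 only, not the code in this PR: the `CodeHash` alias, the `Artifacts` struct layout, the `waiting_requests` counter and all method names are assumptions made for illustration, and the real `ArtifactState` variants carry more data.

```rust
use std::collections::HashMap;
use std::time::{Duration, SystemTime};

/// Assumed stand-in for `ValidationCodeHash`.
type CodeHash = [u8; 32];

/// Simplified version of the states named in the doc comment.
#[derive(Debug, Clone, PartialEq)]
enum ArtifactState {
    Prepared { last_time_needed: SystemTime },
    Preparing { waiting_requests: usize },
    FailedToProcess,
}

/// Hypothetical in-memory table of known artifacts.
#[derive(Default)]
struct Artifacts {
    table: HashMap<CodeHash, ArtifactState>,
}

impl Artifacts {
    /// Steps 2 and 6: an execution request arrives.
    /// Returns `true` if the artifact is ready and execution can proceed.
    fn request_execution(&mut self, hash: CodeHash, now: SystemTime) -> bool {
        match self.table.get_mut(&hash) {
            Some(ArtifactState::Prepared { last_time_needed }) => {
                // A request for execution bumps `last_time_needed`.
                *last_time_needed = now;
                true
            }
            Some(ArtifactState::Preparing { waiting_requests }) => {
                // Already being prepared: stash the request on the existing job.
                *waiting_requests += 1;
                false
            }
            // This sketch never retries; the doc comment says the host may
            // retry under certain conditions.
            Some(ArtifactState::FailedToProcess) => false,
            None => {
                // Unknown PVF: preparation kicks in.
                self.table
                    .insert(hash, ArtifactState::Preparing { waiting_requests: 1 });
                false
            }
        }
    }

    /// Step 5: the queue reports that preparation concluded.
    fn preparation_finished(&mut self, hash: CodeHash, ok: bool, now: SystemTime) {
        let state = if ok {
            ArtifactState::Prepared { last_time_needed: now }
        } else {
            ArtifactState::FailedToProcess
        };
        self.table.insert(hash, state);
    }

    /// Step 7: drop prepared artifacts not needed for longer than `ttl`.
    /// Returns the removed hashes so the caller can delete the files.
    fn prune(&mut self, ttl: Duration, now: SystemTime) -> Vec<CodeHash> {
        let mut expired = Vec::new();
        self.table.retain(|hash, state| match state {
            ArtifactState::Prepared { last_time_needed } => {
                let stale = now
                    .duration_since(*last_time_needed)
                    .map(|age| age > ttl)
                    .unwrap_or(false);
                if stale {
                    expired.push(*hash);
                }
                !stale
            }
            _ => true,
        });
        expired
    }
}
```

In this sketch a request that finds no entry only records the `Preparing` state; handing the job to the preparation queue and dispatching the stashed requests afterwards are left out.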
Do we need to re-prepare all PVF artifacts on each node restart? Or does "clean up" mean the artifacts' build host version matches the current host version?
Yes, all local artifacts are cleared - we'll re-prepare the PVF the first time a new execute request comes in. I'll update the doc to make it more clear.
Replying here to keep the threads focused. Yeah, thinking about it, I actually don't see why we want to clear the artifacts cache. On start-up, we could instead re-populate the `Artifacts` table from the compiled artifacts on disk -- the PVF hash should already be in the filename -- and re-start the 24-hour TTL timers for each artifact. Or we could even use the system's last-modified/accessed metadata for the files (with some sanity checks). Then instead of lazily re-compiling the PVFs, we would lazily delete the ones we end up not needing, which seems a lot more efficient. 🙂

@s0me0ne-unkn0wn Would this need coordination with your execution environment PR?
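A rough sketch of the re-population idea from the comment above, using the last-modified variant. Nothing here is taken from the PR: the `artifact_` file-name prefix, the 64-character hex encoding of the hash, and the function names `restore_entry` and `scan_cache` are all assumptions for illustration. The sanity checks are that the name parses and that the timestamp is neither in the future nor older than the TTL.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::{Duration, SystemTime};

/// Hypothetical file-name prefix; the real naming scheme is not shown here.
const ARTIFACT_PREFIX: &str = "artifact_";

/// Decide whether one on-disk file becomes a table entry.
/// Returns the hex-encoded PVF hash and the time to use as `last_time_needed`.
fn restore_entry(
    file_name: &str,
    modified: SystemTime,
    now: SystemTime,
    ttl: Duration,
) -> Option<(String, SystemTime)> {
    let hex = file_name.strip_prefix(ARTIFACT_PREFIX)?;
    // Sanity check on the name: assumed to be 32 bytes of hash, hex-encoded.
    if hex.len() != 64 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    // Sanity check on the timestamp: a modification time in the future
    // makes `duration_since` fail, and such a file is not trusted.
    let age = now.duration_since(modified).ok()?;
    if age > ttl {
        return None;
    }
    Some((hex.to_owned(), modified))
}

/// Walk the cache directory and collect the entries worth keeping.
fn scan_cache(
    dir: &Path,
    now: SystemTime,
    ttl: Duration,
) -> io::Result<Vec<(String, SystemTime)>> {
    let mut restored = Vec::new();
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let modified = entry.metadata()?.modified()?;
        let name = entry.file_name();
        if let Some(found) = name
            .to_str()
            .and_then(|n| restore_entry(n, modified, now, ttl))
        {
            restored.push(found);
        }
    }
    Ok(restored)
}
```

Files that `scan_cache` does not return would be the ones to delete. Checking that an artifact was built by the current host version is not covered by this sketch.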
Ah, I didn't see the last comment yet and continued in that thread. Yes, I forgot about 24h timer, it solves the problem. Do I understand correctly that the proposal is to keep artifacts only if the node is not upgraded? In that case, it might work. But we should always purge the artifacts if the node is upgraded.
Special coordination is not needed, I'm already used to merging master to that branch three times a week 😅
Yes, makes sense to me!
Raised https://github.com/paritytech/polkadot/issues/6940 to continue this discussion.