
FIP: introducing the Filecoin Virtual Machine. #288

Merged: 9 commits, Feb 17, 2022

Conversation


@raulk raulk commented Feb 2, 2022

The first of several FIP submissions concerning the introduction of the Filecoin Virtual Machine as the new execution layer of the Filecoin blockchain.

This is an early draft that needs some refinement.

See discussion at #287.


Thanks to @Stebalien and @anorth for early reviews.

@raulk raulk force-pushed the raulk/introducing-fvm branch 3 times, most recently from f46de6c to 0897e69 Compare February 2, 2022 21:49
@raulk raulk force-pushed the raulk/introducing-fvm branch from 0897e69 to b27ab8c Compare February 2, 2022 21:51
@anorth anorth left a comment

The IPLD model raises some interesting questions. My impression is that it's landed somewhere in an awkward middle ground, where the encoding is done in actors, but the VM must be able to interpret that encoded data and fundamentally models reachability.

I think either extreme of the VM being oblivious, or the VM actually implementing the IPLD model could be better. Oblivious means actors APIs and state are just bytes, the VM is simpler, but actors would have to do some work to ensure they don't store unreachable blocks, etc. The VM actually implementing IPLD is very opinionated (which... I think we want?) but efficient and a great programming model for actors, at the cost of more complex VM. The middle ground looks like it's going to need most of the complexity anyway, but with added complexity/cost of needing to validate encoding done in the actors.

Wasm with 32-bit memories supports 64-bit integer types and arithmetic, it just
can't address memory beyond 4GiB. This size is deemed sufficient for blockchain
state transition use cases, but may be limiting for future and more generalised
applications of the FVM.
Member

For discussion: this might limit possible universality of FVM as an execution environment outside the blockchain state transition function. For example, an FVM rollup could have far greater demands on hardware, including memory, to enable much more complex and resource-intensive programs to be written, while retaining Filecoin's blockchain security. Similarly, for compute-over-data projects there's a potential world in which something very like the FVM is the target for code to run inside compute providers, and support very large memory execution environments. It could be very powerful to have seamless, binary interop between rollups and off-chain compute. Taken to the extreme, it might be that we develop a zk-FVM which could do both.

I get the current requirements are rightly focussed on the blockchain state use case, but is there a big gain there from taking the 32-bit route? Is there a possible path to 64-bit in the future (like a compiled contract indicating the wordsize it's built for)?

Member Author

64-bit memory addressing is a proposed extension to the WASM spec; it doesn't seem to be final yet. This is the latest I could find on a quick search: https://twitter.com/wvo/status/1362458368107941888.

@raulk raulk Feb 8, 2022

I agree that other use cases that need virtualized runtimes would benefit from 64-bit addressing. However, I'm more skeptical on whether the FVM is the exact virtualized runtime they'd adopt. The FVM is built for state tree manipulation within a blockchain, uses blockchain messages as units of execution, and exposes blockchain-related operations through syscalls.

However, I do feel that the FVM is an instantiation of a "thin waist" interplanetary VM substrate that is WASM-based and IPLD-native. In the future, I would be agreeable to formally defining such a VM substrate (by extracting the relevant parts from this spec), and rebasing the FVM spec on top of it. That would probably get more groups involved.

Specialized VMs for compute-over-data (off-chain), Lurk, etc. would also be instantiations of that VM, albeit with support for different WASM features (memory64, SIMD, interface types, threads, etc.) and different syscalls available to hosted code.

The signature of an actor entrypoint is:

```rust
pub fn invoke(params_block_id: u32) -> u32 { }
```
Member

To implement internal dispatch in an extensible way, compatible with standard APIs etc, I guess that actors will all need to define something like:

```
struct Params {
  MethodName string
  Params []byte // CBOR-encoded, to be decoded by receiver
}
```
  • An SDK can wrap this, but for interoperability we should probably define a standard for how actors should interpret parameters.
  • This would be incompatible with the current built-in actors' parameter shapes.

This is good in a sense of giving actors full control over their API (they don't even have to use IPLD/CBOR), but a bit awkward. For ease of use and better interop between the built-in and new actors, perhaps we should consider the VM taking a slightly more active role in dispatch? E.g. the VM could interpret that params struct above, along with the method number, and then invoke the target actor with something like invoke(method_name, method_params)? Perhaps actors could declare which calling convention they use, leaving room for others (including the "raw" method currently specced)?

Member

We definitely need sane defaults, but I'd really like actors to be able to specify their own ABI and API via some form of schema. For example:

```
type Bar struct {
...
}

type MyActor interface {
    func Transfer(foo int, ...) Bar (id:1)
} representation "inline"
```

The interface would likely compile to some mapping between methods and message "templates". E.g.:

  • Transfer(foo) -> {"method": 1, "foo": $foo}
  • Transfer(foo) -> (1, $foo)

This gives us some flexibility as actors can use method numbers, names, hashed names, polymorphic whatever, etc.


That being said... as long as we have method numbers, there's no reason not to pass them in invoke. This is more of a "if we find a nice way to get rid of them".


E.g. the VM could interpret that params struct above, along with the method number, and then invoke the target actor with something like invoke(method_name, method_params)?

Note: passing data into actors via invoke is non-trivial because the actor can't access the host's memory. The current approach lets the actor "read" the data from the system, giving it a chance to allocate the necessary space.

Member Author

Encapsulating all call data into a single object enables call options like proxying/delegation, pattern matching, etc. Furthermore, having to choose how to identify methods in an actor whose code can be updated (assuming we do commit to implement code updates) makes the user think upfront about how they'd identify/deal with each invocable behaviour across updates.

Long term, the idea is to have an IPLD-based IDL that actors can advertise as metadata to enable discoverability by other developers (e.g. for composability) and by tooling (e.g. message inspectors). The IDL will express which behaviours an actor offers, its interface (inputs, outputs, errors), and how those behaviours can be called (call patterns). Because the method number is so ingrained in the protocol, it's likely that it will remain there forever, and it'll be a glorified call pattern available for users.


```rust
/// Creates a new block, returning the block's ID. The block's children must be in the reachable
/// set. The new block isn't added to the reachable set until the CID is computed.
pub fn create(codec: u64, data: *const u8, len: u32) -> Result<u32>;
```
Member

Could you confirm my inference here that encoding of the block is happening in contract space, prior to this call? Thus the codec parameter is expected to match the data (how is it checked?), but isn't instructing the VM how to do the encoding internally. So any (?) codec, including future ones is allowed? This will make tracking the links difficult.

It appears that we must specify a supported list of codecs that the VM has code to read. And if we're doing that, it's a small step to also do the encoding in the VM, rather than in WASM.

Member

IPLD blocks are encoded actor-side and checked VM-side. For now, CBOR will be the only expected codec but we expect to add support for other codecs later.

This will make tracking the links difficult.

At a minimum, the VM will need a single function (either native or some WASM module) per-codec to extract links from blocks.

And if we're doing that, it's a small step to also do the encoding in the VM, rather than in WASM.

Unfortunately not. Encoding VM-side is non-trivial as we need to get the data across the FFI boundary somehow. We'd need to specify something like serde's builder pattern (and will lose a fair amount of performance by repeatedly crossing the FFI boundary and not being able to inline anything).

I responded more fully below in #288 (comment).

Member Author

Sidenote: the codec passed here is later used when calculating the CID for the block.

@raulk raulk Feb 8, 2022

A visitor pattern over FFI sounds slow. The VM doing the encoding doesn't seem plausible at all, unless it has knowledge of the memory layout and data types of the programming language the actor was developed with, which is undesirable. Alternatively, we could marshal the data across the boundary in some canonical form... but wait, isn't that stepping on the toes of IPLD? 😉

Member

A canonical form doesn't sound that bad to me. E.g. something like the Python struct module.

I don't think IPLD is trying to be an efficient in-memory format. It's a storage/distribution format.

```rust
pub fn receiver() -> Result<u64>;

/// Returns the method number from the message.
pub fn method_number() -> Result<u64>;
```
Member

How about passing this as parameter to invoke instead? Do we want to be opinionated about its use?

Member

We didn't do that because we'd like to find a way for this to go away (using only the parameters). But if that's too difficult (from a network upgrade standpoint), I agree that passing this to invoke makes sense.

Member

For clarity, I meant passing it to invoke and removing this method, thus making it harder for anyone to misuse and setting a path for it to go away.

Member

For clarity, I meant passing it to invoke and removing this method, thus making it harder for anyone to misuse and setting a path for it to go away.

Well, we'd like to make the method number itself go away, not just the syscall.

@raulk raulk Feb 8, 2022

@anorth That would further bless the "method number" call pattern into the FVM, something we're trying to avoid entirely.

Member

I'm on board with trying to get rid of it entirely. I don't see how having a syscall that's accessible from any actor code all the time is less of a blessing than passing it only to a single entry point method and forcing the callee to do more complex programming if they want to propagate it. But ok, this is your design call.

FIPS/fip-introducing_fvm.md (outdated; resolved)
IPLD blocks are dynamically-sized blobs of data. When loading a block by CID,
the size is unknown, so the actor cannot allocate memory upfront. Moreover,
host-side memory is inaccessible to Wasm code, so it's not possible for the
Kernel to return a pointer to host-side memory for Wasm code to read.
Member

If the VM decodes, the actor can allocate the structure to decode into up-front.

Member

Not without a lot of back-and-forth. I.e.:

  1. Actor: Load CID1.
  2. VM: Found a list of length 5.
  3. Actor: Ok.
  4. VM: The first item is a number.
  5. Actor: Write the number to *some_ptr.
  6. VM: The second item is a string of length 10.
  7. Actor: write it to *some_other_ptr.
  8. etc, etc.

@anorth anorth Feb 7, 2022

Ok I see why that API would be troublesome, but can't we do better? How about:

  1. Actor: Load CID1, expecting schema A
  2. VM: Ok, here's a pointer to newly allocated memory, which you can cast

Obviously I glossed over details, but would it be unreasonable for the VM to, say, allocate memory dynamically to decode into? Maybe with a gas limit for the call?

@Stebalien Stebalien Feb 8, 2022

Not without making a lot of assumptions about the actor's programming language (or defining some form of flat-buffer abstraction).

  1. Currently, the VM never allocates memory because the VM knows nothing about the actor's memory layout (i.e., the actor "owns" its entire memory). To allocate, it would have to recursively call back into the actor (which I'd like to avoid if possible).
  2. Different programming languages have different concepts of strings, vectors, data layout, etc. We can pass trivial structs (structs of ints, arrays of ints, etc.) back and forth, but nothing more complex without encoding through something like CBOR.

Member

I ack that designing a language-independent in-memory form for this is not a simple problem. Fine to leave it as something to maybe investigate later.

## Backwards Compatibility

Existing built-in actor and system functionality will not be affected by the
introduction of the Filecoin Virtual Machine.
Member

I think there's more to say here. Perhaps spell out that this FIP actually doesn't represent a protocol change, only a suggested re-architecture. Point to the future FIPs that will break compatibility.

## Test Cases

FVM implementations can subject themselves to the FVM-specific test vectors
hosted at: https://github.com/filecoin-project/fvm-test-vectors.
Member

This link doesn't resolve.

Because this proposal doesn't actually change the protocol, should we also point to (and execute) all the existing test vectors? Also all the existing actor unit and integration tests?

Worth noting here too that the roll-out plan is to demonstrate compatibility with the existing chain, resyncing since the last actors version. Actually, I just realised this FIP doesn't address the intention to trigger a network upgrade that switches the canonical spec of the actors to the Rust/WASM version? Am I right to infer that belongs here (because the next FIP will be the gas model change)?

Member

This link doesn't resolve.

Try again? It does for me.

Because this proposal doesn't actually change the protocol, should we also point to (and execute) all the existing test vectors? Also all the existing actor unit and integration tests?

  1. Most of the existing test vectors tested prior network versions. The FVM only supports actors v6 (and soon, v7).
  2. Some of the test vectors were extracted from specs-actors.
  3. In terms of integration tests, lotus integration is underway but the tests won't pass until we have v7 support.

Actually, I just realised this FIP doesn't address the intention to trigger a network upgrade that switches the canonical spec of the actors to the Rust/WASM version? Am I right to infer that belongs here (because the next FIP will be the gas model change)?

Raul is working on a separate FIP proposing a network upgrade. This FIP is setting design direction: switching Filecoin over to a WASM/IPLD VM.

Member

Link works now.

Does the Rust actors codebase faithfully replicate all the unit tests from the Go one? Or did it rely on the comprehensiveness of the Go code for logical correctness checking, and focus on conformance?

Member

The rust actor codebase has very few unit tests. Pretty much all of our test coverage currently comes from extracted test vectors (both from the specs-actors tests and from the network).

Member Author

This is a problem we should solve on the road to M1. Given that Rust actors are likely becoming the canonical actors, they should be as well-tested as Go actors, if not better.

@Stebalien

Stebalien commented Feb 7, 2022

@anorth

The IPLD model raises some interesting questions. My impression is that it's landed somewhere in an awkward middle ground, where the encoding is done in actors, but the VM must be able to interpret that encoded data and fundamentally models reachability.

Specifically, it must be able to:

  1. Validate data (so that FVM implementations deterministically accept/reject objects).
  2. Extract links.

I think either extreme of the VM being oblivious, or the VM actually implementing the IPLD model could be better. Oblivious means actors APIs and state are just bytes, the VM is simpler, but actors would have to do some work to ensure they don't store unreachable blocks, etc. The VM actually implementing IPLD is very opinionated (which... I think we want?) but efficient and a great programming model for actors, at the cost of more complex VM. The middle ground looks like it's going to need most of the complexity anyway, but with added complexity/cost of needing to validate encoding done in the actors.

If we keep everything inside actors, managing IPLD data becomes tricky:

  • Garbage collection (all block management, really) now needs to ask the actor what blocks it thinks should be reachable. Worse, garbage collection would likely need to be eager to ensure that data is deterministically (un)reachable.
  • If IPLD data is passed from actor A to actor B and these actors have different "opinions" on reachability, we'll run into some really strange issues.

Doing all IPLD operations outside is more workable, but will likely lead to worse performance:

  • We'll lose lots of optimization/inlining opportunities and we'll have to repeatedly cross the invocation container boundary. Internally, CBOR decoding can be optimized by the compiler because it knows the datastructure it's decoding into. On the other hand, a "good enough" decoding API (likely modeled after serde) might be efficient.
  • Gas accounting becomes tricky as we're doing more work VM-side.
  • The API surface gets pretty large (at least if we want any amount of performance).

The third option is to implement the codecs as separate WASM modules. That way:

  1. Actors can use whatever codec implementation they want internally (for performance). Or "import" the system's module for smaller code.
  2. The VM no longer needs to understand specific codecs.
  3. The single "system" codec determines reachability so there are no funny surprises.
  4. We can eventually look into supporting custom user codecs.
  5. We can share these codec implementations with IPFS, etc.


- Explicit messages are sent by secp256k1 or BLS account actors, appear on-chain
when executed, and carry a signature to prove their authenticity.
- Implicit messages are generated by the system, do not appear on chain, and
@frrist frrist Feb 9, 2022

Projects such as Lily (and consumers of its data) must be able to observe implicit and internal messages (e.g. sends from multisig actors that do not appear on chain). Currently this is handled via the ExecMonitor interface. Can we ensure the switch to the FVM doesn't break this observability requirement?

Member Author

I investigated this and concluded that the FVM doesn't need to know about the ExecMonitor callback; it is entirely handled at the node level.

The FVM does need to generate execution traces, but I'd be inclined to say that this is an optional feature which Lotus supports. I'm not sure we need to specify it in the spec, but I'll give it some extra thought.

Member Author

Tracking here now: filecoin-project/ref-fvm#318.

@raulk raulk force-pushed the raulk/introducing-fvm branch from a6e8820 to 5d7d0cb Compare February 11, 2022 14:37
@anorth anorth left a comment

With my FIP editor hat, approving this as a well-formed and -motivated proposal.

As a note to myself and other collaborators, almost all the comments made here would have been better made in the associated discussion thread #287. That way, the discussion would remain discoverable and in-flight while the FIP progresses. We should attempt to limit comments on FIP documents to those about the appropriateness, form and completeness of a proposal, rather than its substantive content.

@jennijuju jennijuju merged commit 27920d2 into master Feb 17, 2022
@jennijuju jennijuju deleted the raulk/introducing-fvm branch February 17, 2022 01:28