Skip to content

Latest commit

 

History

History
205 lines (143 loc) · 10.8 KB

fip-0072.md

File metadata and controls

205 lines (143 loc) · 10.8 KB
fip title author discussions-to status type created
0072
Improved Event Syscall API
Fridrik Asmundsson (@fridrik01), Steven Allen (@stebalien)
Final
Technical Core
2023-08-10

FIP-0072: Improved Event Syscall API

Simple Summary

Revise the internal emit_event syscall by replacing the single CBOR encoded ActorEvent parameter with three separate fixed-sized byte arrays. This change enhances gas charge accuracy and eliminates the need for parsing.

Abstract

The current procedure for emitting an event through the FVM SDK involves sending an ActorEvent structure as input. This structure is encoded into a byte array with CBOR formatting and then sent to the kernel.

Inside the kernel, complete CBOR deserialization is necessary before any validation can occur, which results in an upfront gas charge. However, accurately estimating the gas charge is challenging, as the original size of the ActorEvent remains unknown until the CBOR byte array is fully deserialized. Consequently, to ensure sufficient gas, a slightly higher gas fee is imposed, making this process more costly than necessary.

Change Motivation

This proposal aims to optimize the emit_event syscall by eliminating the need for CBOR encoding altogether. This change enables precise gas charging since the original size of the ActorEvent is known including total sizes of key and value lengths allowing us to charge them separately in addition to fixed overhead of each event entry.

Moreover, the current gas charge is estimated based on the cost of emitting EVM events and won't be accurate for general events emitted by Wasm actors. Thus, we propose a new approach to calculating the gas cost, which should account for any type of emitted events.

Furthermore, this proposal allows validation to be performed concurrently during the deserialization of the input.

Specification

Currently the emit_event syscall is defined as:

pub fn emit_event(
    event_off: *const u8,
    event_len: u32
) -> Result<()>;

The syscall takes in event_off and event_len which refer to a pointer and length in WASM memory where the CBOR encoded ActorEvent has been written. An ActorEvent structure is defined as:

pub struct ActorEvent {
    pub entries: Vec<Entry>,
}

pub struct Entry {
    pub flags: Flags,
    pub key: String,
    pub codec: u64,
    pub value: Vec<u8>,
}

Proposed change

We propose changing the emit_event syscall shown above to the following:

pub fn emit_event(
    event_off: *const EventEntry,
    event_len: u32,
    key_off: *const u8,
    key_len: u32,
    value_off: *const u8,
    value_len: u32,
) -> Result<()>;

We also introduce a new struct EventEntry which is encoded as a packed tuple in field order (u64, u64, u32, u32) where flags/codec fields are copied from Entry directly and key_size/value_size are the size (in bytes) of their respective key/value fields in Entry:

#[repr(C, packed)]
pub struct EventEntry {
    pub flags: u64,   // copy of Entry::flags
    pub codec: u64 ,  // copy Entry::codec
    pub key_size: u32, // Entry::key.len()
    pub value_size: u32, // Entry::value.len()
}

In the new emit_event syscall, instead of taking the whole ActorEvent data encoded using CBOR, this proposed version splits the ActorEvent instance into three different buffers that each refer to a pointer and length in WASM memory where their data has been written. They are as follows:

  • event_off/event_len: Pointer to an array of EventEntry where event_len is the number of elements in the array.
  • key_off/key_len: Pointer to a buffer of size key_len bytes of all the keys from each entry are concatenated.
  • value_off/value_len: Pointer to a buffer of size value_len bytes of all the values from each entry are concatenated.

This syscall will:

  1. Validate that the pointers passed to the syscall are in-bounds.
  2. Validate that the actor is not currently executing in "read-only" mode. If so, the syscall fail with a "ReadOnly" (13) syscall error.
  3. Charge gas as described below.
  4. Validate the event.
  5. Record the event.

Importantly, implementations must charge gas before validating.

Serialization

The serialization from the syscall to FVM is implemented as follows:

Given an ActorEvent with N number of Entry elements, declare three buffers:

  • entries an array of EventEntry of size N.
  • keys a byte buffer of size equal to the total number of keys in all entries.
  • values a buffer of size equal to the total number of values in all entries.

Then for each entry e in ActorEvent:

  • Create EventEntry from e as described above and add it to entries.
  • Add e.keys to the keys buffer.
  • Add e.values to the values buffer.

Deserialization in FVM is done in a similar fashion where we read each EventEntry from entries and create a new Entry object by using the flags/codec from EventEntry and reconstructing its keys/values by reading exactly key_size/value_size from keys/values respectively.

Validation

In the FVM we perform the following validation while deserializing the input:

  1. Initial validation:
    1. Validate that the size of entries is 255 or less. Previously 256 entries were accepted but reducing it by 1 ensures that the resulting CBOR encoding can store the length in a single byte. Otherwise, fail with LimitExceeded.
    2. Validate that the size of values does not exceed 8KiB. Otherwise, fail with LimitExceeded.
    3. Validate that the keys are valid utf8 (validate the entire buffer, all at once).
  2. Initialize total_key_size and total_value_size to 0. For each event entry ee in entries:
    1. Validate that the ee.flags bits are valid (see FIP-0049#New chain types for more details). Otherwise, fail with InvalidArgument.
    2. Validate that the ee.key_size is 31 bytes or less and starts/ends on unicode character boundaries. Previously 32 byte sized keys were accepted but reducing it by 1 ensures compact CBOR encoding. Otherwise, fail with LimitExceeded.
    3. Add ee.key_size to total_key_size, add ee.value_size to total_value_size. Fail with IllegalArgument if either exceed key_size or value_size respectively.
    4. Validate that the ee.codec is acceptable. Currently, only IPLD_RAW (0x55) is allowed. Otherwise, fail with IllegalCodec.
  3. Validate that total_value_size == value_size and total_key_size == key_size. Otherwise, fail with IllegalArgument.

Gas

This FIP makes an adjustment to the gas cost when emitting an event as originally defined in FIP-0049#Gas costs. Instead of having a separate gas fee for validation and processing of an event, we instead charge a single upfront gas fee covering both validation and processing. This fee is applied after checking for read-only but before any other validation, allocation, or deserialization is performed as described in the specification above.

Notable changes from previous gas fee is that we:

  • Removed the indexing cost associated with specifying flags that indicate that keys/values should be indexed by the client. This was not used and therefore removed here.
  • We charge an additional memory allocation per event.
  • Refactored the gas charges allowing us to tune gas for entry and key validation separately.

We define 4 new gas charge parameters:

Name Gas Comment
p_gas_event 2000 Base fee per event emitted
p_gas_event_per_entry 1400 Fee per entry in the event
p_gas_utf8_validation 500 Base fee for utf8 validation
p_gas_utf8_validation_per_byte 16 Per-byte utf8 validation fee

We also use the following existing gas parameters:

Name Gas Comment
p_block_alloc_per_byte 2 Cost of allocating memory
p_memcpy_per_byte 0.4 Cost of copying memory
p_hash_cost[Blake2b-256b1] 10 Cost of hashing

Then, given event_len, key_len, and value_len as passed to the emit_event syscall, we charge gas as follows:

  1. We compute the upper bound of the size of the serialized events (size) as 12 + 9 * event_len + key_len + value_len.
  2. We charge for the event: p_gas_event + event_len * p_gas_event_per_entry.
  3. We charge for utf8 validation: p_gas_utf8_validation + key_len * p_gas_utf8_validation_per_byte.
  4. We charge for copying/allocation 3 times: 3 * (p_block_alloc_per_byte + p_block_memcpy_per_byte) * size.
  5. We charge for hashing the final events to form the event AMT: p_hash_cost[Blake2b-256b1] * size.

Design Rationale

Serialize a single buffer instead of using three

We considered serializing the ActorEvent using a single buffer (as with CBOR) for simplicity but decided against that since it required some parsing if we wanted to charge gas for keys/values separately.

Memory Gas

We charge for 3 memory copies/allocations because we expect:

  1. An immediate copy from the actor's memory to the host's memory.
  2. A copy when forming the event AMT.
  3. A copy when returning events to the client.

Backwards Compatibility

This FIP is consensus breaking, but should have no other impact on backwards compatibility.

Although this proposal restricts the number of entries accepted (from 256 to 255) and the key size (from 32 to 31) it will not have any compatibility issues with current EVM runtime as it produces events with maximum 5 entries and 5 keys.

Test Cases

Provided with implementation.

Security Considerations

This FIP improves security by improving the gas model.

Product Considerations

This FIP makes it possible to safely emit events from actors other than the EVM.

The downside is that it will slightly increase the cost of emitting events from the EVM as the gas model is slightly more pessimistic (and more general). On the other hand, the increase shouldn't matter for most purposes: the base cost (ignoring the cost of the data, which hasn't changed) was previously ~18k gas per event and it now varies from 19k-22k (approximately), based on the number of entries in the event.

Implementation

filecoin-project/ref-fvm#1807

Copyright

Copyright and related rights waived via CC0.