fip | title | author | discussions-to | status | type | created |
---|---|---|---|---|---|---|
0072 |
Improved Event Syscall API |
Fridrik Asmundsson (@fridrik01), Steven Allen (@stebalien) |
Final |
Technical Core |
2023-08-10 |
Revise the internal emit_event
syscall by replacing the single CBOR encoded ActorEvent
parameter with three separate fixed-sized byte arrays. This change enhances gas charge accuracy and eliminates the need for parsing.
The current procedure for emitting an event through the FVM SDK involves sending an ActorEvent
structure as input. This structure is encoded into a byte array with CBOR formatting and then sent to the kernel.
Inside the kernel, complete CBOR deserialization is necessary before any validation can occur, which results in an upfront gas charge. However, accurately estimating the gas charge is challenging, as the original size of the ActorEvent
remains unknown until the CBOR byte array is fully deserialized. Consequently, to ensure sufficient gas, a slightly higher gas fee is imposed, making this process more costly than necessary.
This proposal aims to optimize the emit_event
syscall by eliminating the need for CBOR encoding altogether. This change enables precise gas charging since the original size of the ActorEvent
is known including total sizes of key and value lengths allowing us to charge them separately in addition to fixed overhead of each event entry.
Moreover, the current gas charge is estimated based on the cost of emitting EVM events and won't be accurate for general events emitted by Wasm actors. Thus, we propose a new approach to calculating the gas cost, which should account for any type of emitted events.
Furthermore, this proposal allows validation to be performed concurrently during the deserialization of the input.
Currently the emit_event
syscall is defined as:
pub fn emit_event(
event_off: *const u8,
event_len: u32
) -> Result<()>;
The syscall takes in event_off
and event_len
which refer to a pointer and length in WASM memory where the CBOR encoded ActorEvent
has been written. An ActorEvent
structure is defined as:
pub struct ActorEvent {
pub entries: Vec<Entry>,
}
pub struct Entry {
pub flags: Flags,
pub key: String,
pub codec: u64,
pub value: Vec<u8>,
}
We propose changing the emit_event
syscall shown above to the following:
pub fn emit_event(
event_off: *const EventEntry,
event_len: u32,
key_off: *const u8,
key_len: u32,
value_off: *const u8,
value_len: u32,
) -> Result<()>;
We also introduce a new struct EventEntry
which is encoded as a packed tuple in field order (u64, u64, u32, u32)
where flags
/codec
fields are copied from Entry
directly and key_size
/value_size
are the size (in bytes) of their respective key
/value
fields in Entry
:
#[repr(C, packed)]
pub struct EventEntry {
pub flags: u64, // copy of Entry::flags
pub codec: u64 , // copy Entry::codec
pub key_size: u32, // Entry::key.len()
pub value_size: u32, // Entry::value.len()
}
In the new emit_event
syscall, instead of taking the whole ActorEvent
data encoded using CBOR, this proposed version splits the ActorEvent
instance into three different buffers that each refer to a pointer and length in WASM memory where their data has been written. They are as follows:
event_off/event_len
: Pointer to an array ofEventEntry
whereevent_len
is the number of elements in the array.key_off/key_len
: Pointer to a buffer of sizekey_len
bytes of all the keys from each entry are concatenated.value_off/value_len
: Pointer to a buffer of sizevalue_len
bytes of all the values from each entry are concatenated.
This syscall will:
- Validate that the pointers passed to the syscall are in-bounds.
- Validate that the actor is not currently executing in "read-only" mode. If so, the syscall fail with a "ReadOnly" (13) syscall error.
- Charge gas as described below.
- Validate the event.
- Record the event.
Importantly, implementations must charge gas before validating.
The serialization from the syscall to FVM is implemented as follows:
Given an ActorEvent
with N number of Entry
elements, declare three buffers:
entries
an array ofEventEntry
of size N.keys
a byte buffer of size equal to the total number of keys in all entries.values
a buffer of size equal to the total number of values in all entries.
Then for each entry e
in ActorEvent
:
- Create
EventEntry
frome
as described above and add it toentries
. - Add
e.keys
to thekeys
buffer. - Add
e.values
to thevalues
buffer.
Deserialization in FVM is done in a similar fashion where we read each EventEntry
from entries
and create a new Entry
object by using the flags/codec from EventEntry
and reconstructing its keys/values by reading exactly key_size
/value_size
from keys
/values
respectively.
In the FVM we perform the following validation while deserializing the input:
- Initial validation:
- Validate that the size of
entries
is 255 or less. Previously 256 entries were accepted but reducing it by 1 ensures that the resulting CBOR encoding can store the length in a single byte. Otherwise, fail withLimitExceeded
. - Validate that the size of
values
does not exceed 8KiB. Otherwise, fail withLimitExceeded
. - Validate that the keys are valid utf8 (validate the entire buffer, all at once).
- Validate that the size of
- Initialize
total_key_size
andtotal_value_size
to 0. For each event entryee
inentries
:- Validate that the
ee.flags
bits are valid (see FIP-0049#New chain types for more details). Otherwise, fail withInvalidArgument
. - Validate that the
ee.key_size
is 31 bytes or less and starts/ends on unicode character boundaries. Previously 32 byte sized keys were accepted but reducing it by 1 ensures compact CBOR encoding. Otherwise, fail withLimitExceeded
. - Add
ee.key_size
tototal_key_size
, addee.value_size
tototal_value_size
. Fail withIllegalArgument
if either exceedkey_size
orvalue_size
respectively. - Validate that the
ee.codec
is acceptable. Currently, onlyIPLD_RAW
(0x55) is allowed. Otherwise, fail withIllegalCodec
.
- Validate that the
- Validate that
total_value_size == value_size
andtotal_key_size
==key_size
. Otherwise, fail withIllegalArgument
.
This FIP makes an adjustment to the gas cost when emitting an event as originally defined in FIP-0049#Gas costs. Instead of having a separate gas fee for validation and processing of an event, we instead charge a single upfront gas fee covering both validation and processing. This fee is applied after checking for read-only but before any other validation, allocation, or deserialization is performed as described in the specification above.
Notable changes from previous gas fee is that we:
- Removed the indexing cost associated with specifying flags that indicate that keys/values should be indexed by the client. This was not used and therefore removed here.
- We charge an additional memory allocation per event.
- Refactored the gas charges allowing us to tune gas for entry and key validation separately.
We define 4 new gas charge parameters:
Name | Gas | Comment |
---|---|---|
p_gas_event |
2000 | Base fee per event emitted |
p_gas_event_per_entry |
1400 | Fee per entry in the event |
p_gas_utf8_validation |
500 | Base fee for utf8 validation |
p_gas_utf8_validation_per_byte |
16 | Per-byte utf8 validation fee |
We also use the following existing gas parameters:
Name | Gas | Comment |
---|---|---|
p_block_alloc_per_byte |
2 | Cost of allocating memory |
p_memcpy_per_byte |
0.4 | Cost of copying memory |
p_hash_cost[Blake2b-256b1] |
10 | Cost of hashing |
Then, given event_len
, key_len
, and value_len
as passed to the emit_event
syscall, we charge gas as follows:
- We compute the upper bound of the size of the serialized events (
size
) as12 + 9 * event_len + key_len + value_len
. - We charge for the event:
p_gas_event + event_len * p_gas_event_per_entry
. - We charge for utf8 validation:
p_gas_utf8_validation + key_len * p_gas_utf8_validation_per_byte
. - We charge for copying/allocation 3 times:
3 * (p_block_alloc_per_byte + p_block_memcpy_per_byte) * size
. - We charge for hashing the final events to form the event AMT:
p_hash_cost[Blake2b-256b1] * size
.
We considered serializing the ActorEvent
using a single buffer (as with CBOR) for simplicity but decided against that since it required some parsing if we wanted to charge gas for keys/values separately.
We charge for 3 memory copies/allocations because we expect:
- An immediate copy from the actor's memory to the host's memory.
- A copy when forming the event AMT.
- A copy when returning events to the client.
This FIP is consensus breaking, but should have no other impact on backwards compatibility.
Although this proposal restricts the number of entries accepted (from 256 to 255) and the key size (from 32 to 31) it will not have any compatibility issues with current EVM runtime as it produces events with maximum 5 entries and 5 keys.
Provided with implementation.
This FIP improves security by improving the gas model.
This FIP makes it possible to safely emit events from actors other than the EVM.
The downside is that it will slightly increase the cost of emitting events from the EVM as the gas model is slightly more pessimistic (and more general). On the other hand, the increase shouldn't matter for most purposes: the base cost (ignoring the cost of the data, which hasn't changed) was previously ~18k gas per event and it now varies from 19k-22k (approximately), based on the number of entries in the event.
Copyright and related rights waived via CC0.