aip | title | author | discussions-to (*optional) | Status | last-call-end-date (*optional) | type | created | updated (*optional) | requires (*optional) |
---|---|---|---|---|---|---|---|---|---|
105 |
Value manipulation move stdlib native utilities |
igor-aptos |
Draft |
<mm/dd/yyyy the last date to leave feedbacks and reviews> |
Framework |
<mm/dd/yyyy> |
<mm/dd/yyyy> |
<AIP number(s)> |
This AIP covers adding a set of new native functions, for value manipulation, that improve functionality and performance:
mem::swap
- Swaps contents of two mutable references.vector::move_range
- (efficiently) Moves range of elements from one vector to another vector at specific position, keep the order of the rest of the elements.cmp::compare
- Compares two move values of the same type.bcs::constant_serialized_size
- If the type has known constant (always the same, independent of instance) serialized size in BCS format, allows obtaining it.
Only above utility functions are within scope.
Today, in Move, it is impossible to update a value (for example field in a struct) without either dropping the existing value there, or updating all fields recursively. (which cannot be done if module doesn't own all those types).
For copy+drop types, you can have a method to exchange contents of two mutable references
public native fun swap<T: copy+drop>(left: &mut T, right: &mut T) {
let tmp = *left;
*left = *right;
*right = tmp;
}
but that is impossible to do with types that are not copy+drop
.
As an example, it would be impossible to implement API of the current Option
type, as an enum (instead of current vector):
enum Option<Element> {
None,
Some { value: Element, },
}
/// Convert the none option `self` to a some option by adding `e`.
/// Aborts if `self` already holds a value
public fun fill<Element>(self: &mut Option<Element>, e: Element) {
// cannot be implemented
}
New mem
module, with native fun swap<T>(left: &mut T, right: &mut T)
and easily derived fun replace<T>(ref: &mut T, new: T): T
functions enables a content of mutable reference to be updated (swapped or replaced) without copies and drops.
With it, above example can be implemented as:
public fun fill<Element>(self: &mut Option<Element>, e: Element) {
assert!(self is Option::None<Element>, EOPTION_IS_SET);
let new_value = Option::Some { value : e };
mem::swap(self, &mut new_value);
let Option::None = new_value;
}
Or more simply with mem::replace
:
public fun fill<Element>(self: &mut Option<Element>, e: Element) {
assert!(self is Option::None<Element>, EOPTION_IS_SET);
let Option::None = mem::replace(self, Option::Some { value : e });
}
A lot of common operations on vectors (like insert
, remove
, append
, trim
), require shifting/moving portion of the vector. Currently only individual elements can be exchanged with vector::swap
, and so all those operations are implemented in move, with individually swapping one element at the time.
In most languages such operations are implemented through extremely efficient memcopy
functions (that are specialized for the hardware they are running on). In Rust, Vec
operations use ptr::copy_nonoverlapping
to perform such operations efficiently.
We introduce here native fun range_move<T>(from: &mut vector<T>, removal_position: u64, length: u64, to: &mut vector<T>, insert_position: u64);
, which moves range of elements from one vector to another vector at specific position, keep the order of the rest of the elements.
It is a single method that generalizes all 4 above operations, and more. With it, for example append
can be efficiently implemented as:
public fun append<Element>(self: &mut vector<Element>, other: vector<Element>) {
let self_length = length(self);
let other_length = length(&other);
range_move(&mut other, 0, other_length, self, self_length);
destroy_empty(other);
}
Performance improvement (and correspondingly gas costs) are:
num elements from the back | 10 | 100 | 1000 |
---|---|---|---|
remove+insert | -28.8% | -87.3% | -98.4% |
trim+append | -66.1% | -89.5% | -99.0% |
Currently, in Move, only primitive types can be compared (using the <
and other corresponding operators). And it cannot be done on a generic type at all.
You could serialize values into BCS, and then compare byte vectors, but:
- that is very inefficient
- That generally doesn't perform comparison intuitively, as it might not match the
<
operator.
This prevents from being able to create generic utility methods - like sort, binary_search, or generic datastructures - like OrderedMap, PriorityQueue, etc.
Here we introduce native fun compare<T>(first: &T, second: &T): Ordering
method, which compares two values both efficiently, and with the natural ordering:
- native types are compared identically to
<
and other operators - Complex types - Structs and vectors - are compared lexicographically - first field/element is compared first, and if equal we proceed to the next.
- enum's are compared first by their variant, and if equal - they are compared as structs are.
Some types have variable serialized size (vectors, enums), and some types have constant (independent of instance) serialized size (primitive types, structs composed of constant sized types).
It is useful to be able to tell the distinction - and in a generic method be able to have different logic based on whether the type has constant serialized size. It is also useful to be able to get the size of a type that has constant size without having a single instance provided.
For example usecase, individual resources onchain have upper limits, and so different libraries need to make sure resources don't exceed those limits.
If reources is a vector of elements, it might need to check if adding element will exceed the limit. It could track the running size, and compute the size of provided value, to evaluate that.
But if elements have constant size, it can be computed once initially, and translated into maximal number of elements to put in that resource, avoiding needing to call bcs::serialized_size
on every value added.
mem::swap
vector::move_range
and it's usage to optimize common vector operations.cmp::compare
and it's move part.bcs::constant_serialized_size
Each native function is provided with a set of tests, as well as performance benchmark as appropriate.
Adding new native functions always raises questions/concerns on whether it is really needed to do so - given the larger complexity/harder maintanability of native code. We've provided above
mem::swap
and vector::move_range
directly interact and modify internal VM value representations, and so any bugs there could lead to type confusion (which could then lead to very serious issues - like loss of funds).
So those two need to be thoroughly scrutinized. Their implementations are luckily very short. Additional security reviews will be performed.
These new native functions open doors for a lot of new things to be built on top of them:
- generic ordered datastructures
- efficient vector-backed datastructures
- expand what can be done with enums
- etc
Full implementation is ready at this point.
Optional: Indicate a future release version as a rough estimate for when the community should expect to see this deployed on our three networks (e.g., release 1.7). You are responsible for updating this AIP with a better estimate, if any, after the AIP passes the gatekeeper’s design review.
- On devnet?
- On testnet?
- On mainnet?
...
Q&A here, some of them can have answers some of those questions can be things we have not figured out but we should
...