Skip to content
This repository has been archived by the owner on Aug 17, 2022. It is now read-only.

Update string.(lift|lower)_memory to read u32 + string.size #102

Closed
wants to merge 1 commit into from

Conversation

Hywan
Copy link
Contributor

@Hywan Hywan commented Mar 27, 2020

Initially, string.lift_memory reads a pair of i32 to represent the pointer/the base, and the length of the string. string.lower_memory also reads a i32 to represent the pointer/the base. Finally, string.size returns a i32.

This patch proposes to change i32 by u32. Indeed, a pointer/a base cannot be negative, so it can be represented by u32. The length of the string can also not be negative, so it is by definition a u32.

In addition, using i32 restricts ourself to a memory of 2Gib, whilst u32 allows to address a memory of 4Gib.

Thoughts?

(Originates from wasmerio/wasmer#1329 and wasmerio/wasmer#1337)

Initially, `string.lift_memory` reads a pair of `i32` to represent the pointer/the base, and the length of the string. `string.lower_memory` also reads a `i32` to represent the pointer/the base. Finally, `string.size` returns a `i32`.

This patch proposes to change `i32` by `u32`. Indeed, a pointer/a base cannot be negative, so it can be represented by `u32`. The length of the string can also not be negative, so it is by definition a `u32`.

In addition, using `i32` restricts ourself to a memory of 2Gib, whilst `u32` allows to address a memory of 4Gib.
@eqrion
Copy link
Contributor

eqrion commented Mar 27, 2020

Technically, i32 is a core wasm type and so the interpretation of the bits (i.e signedness) is determined by the instruction that consumes it. This is the case for the core wasm instructions, like i32.load which interpret the i32 as an unsigned integer. The proposed string.lift/lower_memory instructions follow that precedent and interpret the i32 as an unsigned offset into linear memory.

This does raise an interesting question though of when using an interface integer type is more appropriate than a core integer type. The fact that both are available in interface-types raises a good amount of confusion.

I would lean towards using the core i32 type here to match how core wasm instructions specify linear memory offsets. In the future with wasm64, there may be a new abstraction for whether the offset is an i32 or i64, and I can imagine we would want to be compatible with that.

@Hywan
Copy link
Contributor Author

Hywan commented Mar 30, 2020

Thank you for the detailed answer.

My understanding is that it depends of the intent. Do we want to express the semantics of the instruction, or the “implementation”/low-level representation of the instruction (as Wasm core would do)?

My personal opinion is that the document should reflect the semantics of the instructions, and it's up to the implementor to find the best strategy. Because, since WIT allows to express more types, why the document should restrict itself to wasm-core types only?

@lukewagner
Copy link
Member

Lifting inherently converts from core wasm types to interface types, so having string.lift/lower take/produce a u32 would require an additional otherwise-unnecessary instruction to lift/lower the core i32 which is ultimately flowed from/to core wasm.

@Hywan Hywan closed this Mar 31, 2020
@Hywan
Copy link
Contributor Author

Hywan commented Mar 31, 2020

Thanks :-)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants