Binary Encoding

This document describes the portable binary encoding of the WebAssembly modules.

The binary encoding is a dense representation of module information that enables small files, fast decoding, and reduced memory usage. See the rationale document for more detail.

🦄 = Planned future feature

The encoding is split into three layers:

Layer 0 is a simple binary encoding of the bytecode instructions and related data structures. The encoding is dense and trivial to interact with, making it suitable for scenarios like JIT, instrumentation tools, and debugging.
Layer 1 🦄 provides structural compression on top of layer 0, exploiting specific knowledge about the nature of the syntax tree and its nodes. The structural compression introduces more efficient encoding of values, rearranges values within the module, and prunes structurally identical tree nodes.
Layer 2 🦄 Layer 2 applies generic compression algorithms, like gzip and Brotli, that are already available in browsers and other tooling.

Most importantly, the layering approach allows development and standardization to occur incrementally. For example, Layer 1 and Layer 2 encoding techniques can be experimented with by application-level decompressing to the layer below. As compression techniques stabilize, they can be standardized and moved into native implementations.

See proposed layer 1 compression 🦄 for a proposal for layer 1 structural compression.

Data types

Numbers

`uintN`

An unsigned integer of N bits, represented in N/8 bytes in little endian order. N is either 8, 16, or 32.

`varuintN`

A LEB128 variable-length integer, limited to N bits (i.e., the values [0, 2^N-1]), represented by at most ceil(N/7) bytes that may contain padding 0x80 bytes.

Note: Currently, the only sizes used are varuint1, varuint7, and varuint32, where the former two are used for compatibility with potential future extensions.

`varintN`

A Signed LEB128 variable-length integer, limited to N bits (i.e., the values [-2^(N-1), +2^(N-1)-1]), represented by at most ceil(N/7) bytes that may contain padding 0x80 or 0xFF bytes.

Note: Currently, the only sizes used are varint7, varint32 and varint64.

Language Types

All types are distinguished by a negative varint7 values that is the first byte of their encoding (representing a type constructor):

Opcode	Type constructor
`-0x01` (i.e., the byte `0x7f`)	`i32`
`-0x02` (i.e., the byte `0x7e`)	`i64`
`-0x03` (i.e., the byte `0x7d`)	`f32`
`-0x04` (i.e., the byte `0x7c`)	`f64`
`-0x10` (i.e., the byte `0x70`)	`anyfunc`
`-0x20` (i.e., the byte `0x60`)	`func`
`-0x40` (i.e., the byte `0x40`)	pseudo type for representing an empty `block_type`

Some of these will be followed by additional fields, see below.

Note: Gaps are reserved for future 🦄 extensions. The use of a signed scheme is so that types can coexist in a single space with (positive) indices into the type section, which may be relevant for future extensions of the type system.

`value_type`

A varint7 indicating a value type. One of:

i32
i64
f32
f64

as encoded above.

`block_type`

A varint7 indicating a block signature. These types are encoded as:

either a value_type indicating a signature with a single result
or -0x40 (i.e., the byte 0x40) indicating a signature with 0 results.

`elem_type`

A varint7 indicating the types of elements in a table. In the MVP, only one type is available:

anyfunc

Note: In the future 🦄, other element types may be allowed.

`func_type`

The description of a function signature. Its type constructor is followed by an additional description:

Field	Type	Description
form	`varint7`	the value for the `func` type constructor as defined above
param_count	`varuint32`	the number of parameters to the function
param_types	`value_type*`	the parameter types of the function
return_count	`varuint1`	the number of results from the function
return_type	`value_type?`	the result type of the function (if return_count is 1)

Note: In the future 🦄, return_count and return_type might be generalised to allow multiple values.

Other Types

`global_type`

The description of a global variable.

Field	Type	Description
content_type	`value_type`	type of the value
mutability	`varuint1`	`0` if immutable, `1` if mutable

`table_type`

The description of a table.

Field	Type	Description
element_type	`elem_type`	the type of elements
limits	`resizable_limits`	see below

`memory_type`

The description of a memory.

Field	Type	Description
limits	`resizable_limits`	see below

`external_kind`

A single-byte unsigned integer indicating the kind of definition being imported or defined:

0 indicating a Function import or definition
1 indicating a Table import or definition
2 indicating a Memory import or definition
3 indicating a Global import or definition

`resizable_limits`

A packed tuple that describes the limits of a table or memory:

Field	Type	Description
flags	`varuint32`	bit `0x1` is set if the maximum field is present
initial	`varuint32`	initial length (in units of table elements or wasm pages)
maximum	`varuint32`?	only present if specified by `flags`

Note: In the future 🦄, the "flags" field may be extended, e.g., to include a flag for sharing between threads.

`init_expr`

The encoding of an initializer expression is the normal encoding of the expression followed by the end opcode as a delimiter.

Note that get_global in an initializer expression can only refer to immutable imported globals and all uses of init_expr can only appear after the Imports section.

Module structure

The following documents the current prototype format. This format is based on and supersedes the v8-native prototype format, originally in a public design doc.

High-level structure

The module starts with a preamble of two fields:

Field	Type	Description
magic number	`uint32`	Magic number `0x6d736100` (i.e., '\0asm')
version	`uint32`	Version number, currently 0xd. The version for MVP will be reset to 1.

The module preamble is followed by a sequence of sections. Each section is identified by a 1-byte section code that encodes either a known section or a user-defined section. The section length and payload data then follow. Known sections have non-zero ids, while unknown sections have a 0 id followed by an identifying string as part of the payload. Unknown sections are ignored by the WebAssembly implementation, and thus validation errors within them do not invalidate a module.

Field	Type	Description
id	`varuint7`	section code
payload_len	`varuint32`	size of this section in bytes
name_len	`varuint32` ?	length of the section name in bytes, present if `id == 0`
name	`bytes` ?	section name string, present if `id == 0`
payload_data	`bytes`	content of this section, of length `payload_len - sizeof(name) - sizeof(name_len)`

Each section is optional and may appear at most once. Known sections from this list may not appear out of order. The content of each section is encoded in its payload_data.

Section Name	Code	Description
Type	`1`	Function signature declarations
Import	`2`	Import declarations
Function	`3`	Function declarations
Table	`4`	Indirect function table and other tables
Memory	`5`	Memory attributes
Global	`6`	Global declarations
Export	`7`	Exports
Start	`8`	Start function declaration
Element	`9`	Elements section
Code	`10`	Function bodies (code)
Data	`11`	Data segments

The end of the last present section must coincide with the last byte of the module. The shortest valid module is 8 bytes (magic number, version, followed by zero sections).

Type section

The type section declares all function signatures that will be used in the module.

Field	Type	Description
count	`varuint32`	count of type entries to follow
entries	`func_type*`	repeated type entries as described below

Note: In the future 🦄, this section may contain other forms of type entries as well, which can be distinguished by the form field of the type encoding.

Import section

The import section declares all imports that will be used in the module.

Field	Type	Description
count	`varuint32`	count of import entries to follow
entries	`import_entry*`	repeated import entries as described below

Import entry

Field	Type	Description
module_len	`varuint32`	module string length
module_str	`bytes`	module string of `module_len` bytes
field_len	`varuint32`	field name length
field_str	`bytes`	field name string of `field_len` bytes
kind	`external_kind`	the kind of definition being imported

Followed by, if the kind is Function:

Field	Type	Description
type	`varuint32`	type index of the function signature

or, if the kind is Table:

Field	Type	Description
type	`table_type`	type of the imported table

or, if the kind is Memory:

Field	Type	Description
type	`memory_type`	type of the imported memory

or, if the kind is Global:

Field	Type	Description
type	`global_type`	type of the imported global

Note that, in the MVP, only immutable global variables can be imported.

Function section

The function section declares the signatures of all functions in the module (their definitions appear in the code section).

Field	Type	Description
count	`varuint32`	count of signature indices to follow
types	`varuint32*`	sequence of indices into the type section

Table section

The encoding of a Table section:

Field	Type	Description
count	`varuint32`	indicating the number of tables defined by the module
entries	`table_type*`	repeated `table_type` entries as described below

In the MVP, the number of tables must be no more than 1.

Memory section

ID: memory

The encoding of a Memory section:

Field	Type	Description
count	`varuint32`	indicating the number of memories defined by the module
entries	`memory_type*`	repeated `memory_type` entries as described below

Note that the initial/maximum fields are specified in units of WebAssembly pages.

In the MVP, the number of memories must be no more than 1.

Global section

The encoding of the Global section:

Field	Type	Description
count	`varuint32`	count of global variable entries
globals	`global_variable*`	global variables, as described below

Global Entry

Each global_variable declares a single global variable of a given type, mutability and with the given initializer.

Field	Type	Description
type	`global_type`	type of the variables
init	`init_expr`	the initial value of the global

Note that, in the MVP, only immutable global variables can be exported.

Export section

The encoding of the Export section:

Field	Type	Description
count	`varuint32`	count of export entries to follow
entries	`export_entry*`	repeated export entries as described below

Export entry

Field	Type	Description
field_len	`varuint32`	field name string length
field_str	`bytes`	field name string of `field_len` bytes
kind	`external_kind`	the kind of definition being exported
index	`varuint32`	the index into the corresponding index space

For example, if the "kind" is Function, then "index" is a function index. Note that, in the MVP, the only valid index value for a memory or table export is 0.

Start section

The start section declares the start function.

Field	Type	Description
index	`varuint32`	start function index

Element section

The encoding of the Elements section:

Field	Type	Description
count	`varuint32`	count of element segments to follow
entries	`elem_segment*`	repeated element segments as described below

a elem_segment is:

Field	Type	Description
index	`varuint32`	the table index (0 in the MVP)
offset	`init_expr`	an `i32` initializer expression that computes the offset at which to place the elements
num_elem	`varuint32`	number of elements to follow
elems	`varuint32*`	sequence of function indices

Code section

ID: code

The code section contains a body for every function in the module. The count of function declared in the function section and function bodies defined in this section must be the same and the ith declaration corresponds to the ith function body.

Field	Type	Description
count	`varuint32`	count of function bodies to follow
bodies	`function_body*`	sequence of Function Bodies

Data section

The data section declares the initialized data that is loaded into the linear memory.

Field	Type	Description
count	`varuint32`	count of data segments to follow
entries	`data_segment*`	repeated data segments as described below

a data_segment is:

Field	Type	Description
index	`varuint32`	the linear memory index (0 in the MVP)
offset	`init_expr`	an `i32` initializer expression that computes the offset at which to place the data
size	`varuint32`	size of `data` (in bytes)
data	`bytes`	sequence of `size` bytes

Name section

User-defined section string: "name"

The names section does not change execution semantics, and thus is not allocated a section code. It is encoded as an unknown section (id 0) followed by the identification string "name". Like all unknown sections, a validation error in this section does not cause validation of the module to fail. The expectation is that, when a binary WebAssembly module is viewed in a browser or other development environment, the names in this section will be used as the names of functions and locals in the text format.

Field	Type	Description
count	`varuint32`	count of entries to follow
entries	`function_names*`	sequence of names

The sequence of function_names assigns names to the corresponding function index. The count may be greater or less than the actual number of functions.

Function names

Field	Type	Description
fun_name_len	`varuint32`	string length, in bytes
fun_name_str	`bytes`	valid utf8 encoding
local_count	`varuint32`	count of local names to follow
local_names	`local_name*`	sequence of local names

The sequence of local_name assigns names to the corresponding local index. The count may be greater or less than the actual number of locals.

Local name

Field	Type	Description
local_name_len	`varuint32`	string length, in bytes
local_name_str	`bytes`	valid utf8 encoding

Function Bodies

Function bodies consist of a sequence of local variable declarations followed by bytecode instructions. Each function body must end with the end opcode.

Field	Type	Description
body_size	`varuint32`	size of function body to follow, in bytes
local_count	`varuint32`	number of local entries
locals	`local_entry*`	local variables
code	`byte*`	bytecode of the function
end	`byte`	`0x0f`, indicating the end of the body

Local Entry

Each local entry declares a number of local variables of a given type. It is legal to have several entries with the same type.

Field	Type	Description
count	`varuint32`	number of local variables of the following type
type	`value_type`	type of the variables

Control flow operators (described here)

Name	Opcode	Immediates	Description
`unreachable`	`0x00`		trap immediately
`nop`	`0x01`		no operation
`block`	`0x02`	sig : `block_type`	begin a sequence of expressions, yielding 0 or 1 values
`loop`	`0x03`	sig : `block_type`	begin a block which can also form control flow loops
`if`	`0x04`	sig : `block_type`	begin if expression
`else`	`0x05`		begin else expression of if
`end`	`0x0b`		end a block, loop, or if
`br`	`0x0c`	relative_depth : `varuint32`	break that targets an outer nested block
`br_if`	`0x0d`	relative_depth : `varuint32`	conditional break that targets an outer nested block
`br_table`	`0x0e`	see below	branch table control flow construct
`return`	`0x0f`		return zero or one value from this function

The sig fields of block and if operators specify function signatures which describe their use of the operand stack.

The br_table operator has an immediate operand which is encoded as follows:

Field	Type	Description
target_count	`varuint32`	number of entries in the target_table
target_table	`varuint32*`	target entries that indicate an outer block or loop to which to break
default_target	`varuint32`	an outer block or loop to which to break in the default case

The br_table operator implements an indirect branch. It accepts an optional value argument (like other branches) and an additional i32 expression as input, and branches to the block or loop at the given offset within the target_table. If the input value is out of range, br_table branches to the default target.

Note: Gaps in the opcode space, here and elsewhere, are reserved for future 🦄 extensions.

Call operators (described here)

Name	Opcode	Immediates	Description
`call`	`0x10`	function_index : `varuint32`	call a function by its index
`call_indirect`	`0x11`	type_index : `varuint32`, reserved : `varuint1`	call a function indirect with an expected signature

The call_indirect operator takes a list of function arguments and as the last operand the index into the table. Its reserved immediate is for future 🦄 use and must be 0 in the MVP.

Parametric operators (described here)

Name	Opcode	Immediates	Description
`drop`	`0x1a`		ignore value
`select`	`0x1b`		select one of two values based on condition

Variable access (described here)

Name	Opcode	Immediates	Description
`get_local`	`0x20`	local_index : `varuint32`	read a local variable or parameter
`set_local`	`0x21`	local_index : `varuint32`	write a local variable or parameter
`tee_local`	`0x22`	local_index : `varuint32`	write a local variable or parameter and return the same value
`get_global`	`0x23`	global_index : `varuint32`	read a global variable
`set_global`	`0x24`	global_index : `varuint32`	write a global variable

Memory-related operators (described here)

Name	Opcode	Immediate	Description
`i32.load`	`0x28`	`memory_immediate`	load from memory
`i64.load`	`0x29`	`memory_immediate`	load from memory
`f32.load`	`0x2a`	`memory_immediate`	load from memory
`f64.load`	`0x2b`	`memory_immediate`	load from memory
`i32.load8_s`	`0x2c`	`memory_immediate`	load from memory
`i32.load8_u`	`0x2d`	`memory_immediate`	load from memory
`i32.load16_s`	`0x2e`	`memory_immediate`	load from memory
`i32.load16_u`	`0x2f`	`memory_immediate`	load from memory
`i64.load8_s`	`0x30`	`memory_immediate`	load from memory
`i64.load8_u`	`0x31`	`memory_immediate`	load from memory
`i64.load16_s`	`0x32`	`memory_immediate`	load from memory
`i64.load16_u`	`0x33`	`memory_immediate`	load from memory
`i64.load32_s`	`0x34`	`memory_immediate`	load from memory
`i64.load32_u`	`0x35`	`memory_immediate`	load from memory
`i32.store`	`0x36`	`memory_immediate`	store to memory
`i64.store`	`0x37`	`memory_immediate`	store to memory
`f32.store`	`0x38`	`memory_immediate`	store to memory
`f64.store`	`0x39`	`memory_immediate`	store to memory
`i32.store8`	`0x3a`	`memory_immediate`	store to memory
`i32.store16`	`0x3b`	`memory_immediate`	store to memory
`i64.store8`	`0x3c`	`memory_immediate`	store to memory
`i64.store16`	`0x3d`	`memory_immediate`	store to memory
`i64.store32`	`0x3e`	`memory_immediate`	store to memory
`current_memory`	`0x3f`	reserved : `varuint1`	query the size of memory
`grow_memory`	`0x40`	reserved : `varuint1`	grow the size of memory

The memory_immediate type is encoded as follows:

Name	Type	Description
flags	`varuint32`	a bitfield which currently contains the alignment in the least significant bits, encoded as `log2(alignment)`
offset	`varuint32`	the value of the offset

As implied by the log2(alignment) encoding, the alignment must be a power of 2. As an additional validation criteria, the alignment must be less or equal to natural alignment. The bits after the log(memory-access-size) least-significant bits must be set to 0. These bits are reserved for future 🦄 use (e.g., for shared memory ordering requirements).

The reserved immediate to the current_memory and grow_memory operators is for future 🦄 use and must be 0 in the MVP.

Constants (described here)

Name	Opcode	Immediates	Description
`i32.const`	`0x41`	value : `varint32`	a constant value interpreted as `i32`
`i64.const`	`0x42`	value : `varint64`	a constant value interpreted as `i64`
`f32.const`	`0x43`	value : `uint32`	a constant value interpreted as `f32`
`f64.const`	`0x44`	value : `uint64`	a constant value interpreted as `f64`

Comparison operators (described here)

Name	Opcode	Immediate	Description
`i32.eqz`	`0x45`
`i32.eq`	`0x46`
`i32.ne`	`0x47`
`i32.lt_s`	`0x48`
`i32.lt_u`	`0x49`
`i32.gt_s`	`0x4a`
`i32.gt_u`	`0x4b`
`i32.le_s`	`0x4c`
`i32.le_u`	`0x4d`
`i32.ge_s`	`0x4e`
`i32.ge_u`	`0x4f`
`i64.eqz`	`0x50`
`i64.eq`	`0x51`
`i64.ne`	`0x52`
`i64.lt_s`	`0x53`
`i64.lt_u`	`0x54`
`i64.gt_s`	`0x55`
`i64.gt_u`	`0x56`
`i64.le_s`	`0x57`
`i64.le_u`	`0x58`
`i64.ge_s`	`0x59`
`i64.ge_u`	`0x5a`
`f32.eq`	`0x5b`
`f32.ne`	`0x5c`
`f32.lt`	`0x5d`
`f32.gt`	`0x5e`
`f32.le`	`0x5f`
`f32.ge`	`0x60`
`f64.eq`	`0x61`
`f64.ne`	`0x62`
`f64.lt`	`0x63`
`f64.gt`	`0x64`
`f64.le`	`0x65`
`f64.ge`	`0x66`

Numeric operators (described here)

Name	Opcode	Immediate	Description
`i32.clz`	`0x67`
`i32.ctz`	`0x68`
`i32.popcnt`	`0x69`
`i32.add`	`0x6a`
`i32.sub`	`0x6b`
`i32.mul`	`0x6c`
`i32.div_s`	`0x6d`
`i32.div_u`	`0x6e`
`i32.rem_s`	`0x6f`
`i32.rem_u`	`0x70`
`i32.and`	`0x71`
`i32.or`	`0x72`
`i32.xor`	`0x73`
`i32.shl`	`0x74`
`i32.shr_s`	`0x75`
`i32.shr_u`	`0x76`
`i32.rotl`	`0x77`
`i32.rotr`	`0x78`
`i64.clz`	`0x79`
`i64.ctz`	`0x7a`
`i64.popcnt`	`0x7b`
`i64.add`	`0x7c`
`i64.sub`	`0x7d`
`i64.mul`	`0x7e`
`i64.div_s`	`0x7f`
`i64.div_u`	`0x80`
`i64.rem_s`	`0x81`
`i64.rem_u`	`0x82`
`i64.and`	`0x83`
`i64.or`	`0x84`
`i64.xor`	`0x85`
`i64.shl`	`0x86`
`i64.shr_s`	`0x87`
`i64.shr_u`	`0x88`
`i64.rotl`	`0x89`
`i64.rotr`	`0x8a`
`f32.abs`	`0x8b`
`f32.neg`	`0x8c`
`f32.ceil`	`0x8d`
`f32.floor`	`0x8e`
`f32.trunc`	`0x8f`
`f32.nearest`	`0x90`
`f32.sqrt`	`0x91`
`f32.add`	`0x92`
`f32.sub`	`0x93`
`f32.mul`	`0x94`
`f32.div`	`0x95`
`f32.min`	`0x96`
`f32.max`	`0x97`
`f32.copysign`	`0x98`
`f64.abs`	`0x99`
`f64.neg`	`0x9a`
`f64.ceil`	`0x9b`
`f64.floor`	`0x9c`
`f64.trunc`	`0x9d`
`f64.nearest`	`0x9e`
`f64.sqrt`	`0x9f`
`f64.add`	`0xa0`
`f64.sub`	`0xa1`
`f64.mul`	`0xa2`
`f64.div`	`0xa3`
`f64.min`	`0xa4`
`f64.max`	`0xa5`
`f64.copysign`	`0xa6`

Conversions (described here)

Name	Opcode	Immediate	Description
`i32.wrap/i64`	`0xa7`
`i32.trunc_s/f32`	`0xa8`
`i32.trunc_u/f32`	`0xa9`
`i32.trunc_s/f64`	`0xaa`
`i32.trunc_u/f64`	`0xab`
`i64.extend_s/i32`	`0xac`
`i64.extend_u/i32`	`0xad`
`i64.trunc_s/f32`	`0xae`
`i64.trunc_u/f32`	`0xaf`
`i64.trunc_s/f64`	`0xb0`
`i64.trunc_u/f64`	`0xb1`
`f32.convert_s/i32`	`0xb2`
`f32.convert_u/i32`	`0xb3`
`f32.convert_s/i64`	`0xb4`
`f32.convert_u/i64`	`0xb5`
`f32.demote/f64`	`0xb6`
`f64.convert_s/i32`	`0xb7`
`f64.convert_u/i32`	`0xb8`
`f64.convert_s/i64`	`0xb9`
`f64.convert_u/i64`	`0xba`
`f64.promote/f32`	`0xbb`

Reinterpretations (described here)

Name	Opcode	Immediate	Description
`i32.reinterpret/f32`	`0xbc`
`i64.reinterpret/f64`	`0xbd`
`f32.reinterpret/i32`	`0xbe`
`f64.reinterpret/i64`	`0xbf`

Files

BinaryEncoding.md

Latest commit

History