Fill boc serialization page #524
Thanks for the updates to the BoC serialization docs and examples. I found several clarity and consistency issues in tvm/serialization/boc.mdx that need fixes before merge.
Findings (19)
High (3)
[HIGH] Conflicting reuse of symbol `s`
- Choose an integer number `s`, such that `n ≤ 2^s`. Represent each cell `ci` by an integral number of bytes
as in standard representation cell algorithm, but using unsigned big-endian
s-bit integer `j` instead of hash `Hash(cj)` to represent internal references to cell `cj`. More precisely, each
individual cell `c` serialized as follows, provided `s` is a multiple of eight.
- Two descriptor bytes `d1` and `d2` are computed by
setting `d1 = r + 8s + 16h + 32l` and $d2 = \lfloor \frac{b}{8} \rfloor + \lceil \frac{b}{8}\rceil$ (for absent cells,
only `d1` is present, always equals to `7 + 16 + 32l`), where:
- `0 ≤ r ≤ 4` is the number of cell references present in cell `c`, if `c `is
absent from the bag of cells being serialized and is represented by
its hashes only, then `r` is set to `7`;
- `0 ≤ b ≤ 1023` is the number of data bits in cell `c`;
- `0 ≤ l ≤ 3 `is the level of cell `c`;
- `s = 1` for exotic cells and `s = 0` for ordinary cells;
- `h = 1` if the cell’s hashes are explicitly included into the serialization; otherwise, `h = 0` (when `r = 7`, we must be `h = 1`).
Description:
`s` is defined as the index-bit width and then reused as the 1‑bit exotic/ordinary flag in the descriptor, making `d1` ambiguous and risking incorrect implementations. Use distinct identifiers for index width and the exotic flag.
Suggestion:
Option 1: Rename the descriptor flag to `x` (formula and bullet):
- setting `d1 = r + 8s + 16h + 32l` and $d2 = \lfloor \frac{b}{8} \rfloor + \lceil \frac{b}{8}\rceil$ (for absent cells,
+ setting `d1 = r + 8x + 16h + 32l` and $d2 = \lfloor \frac{b}{8} \rfloor + \lceil \frac{b}{8}\rceil$ (for absent cells,
only `d1` is present, always equals to `7 + 16 + 32l`), where:
...
- - `s = 1` for exotic cells and `s = 0` for ordinary cells;
+ - `x = 1` for exotic cells and `x = 0` for ordinary cells;
Option 2: Rename the index bit‑width to `i_bits` and keep `s` for the flag:
- - Choose an integer number `s`, such that `n ≤ 2^s`. Represent each cell `ci` by an integral number of bytes
- as in standard representation cell algorithm, but using unsigned big-endian
- s-bit integer `j` instead of hash `Hash(cj)` to represent internal references to cell `cj`. More precisely, each
- individual cell `c` serialized as follows, provided `s` is a multiple of eight.
+ - Choose an integer number `i_bits`, such that `n ≤ 2^i_bits`. Represent each cell `ci` by an integral number of
+ bytes as in standard representation cell algorithm, but using an unsigned big‑endian
+ `i_bits`-bit integer `j` instead of `Hash(cj)` to represent internal references to cell `cj`. More precisely, each
+ individual cell `c` is serialized as follows, provided `i_bits` is a multiple of eight.
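To illustrate why the two quantities need distinct names, here is a minimal TypeScript sketch of the index width and the descriptor bytes, with the flag renamed to `x` as in Option 1. The names `CellShape`, `refIndexBits`, `encodeRefIndex`, and `descriptorBytes` are hypothetical (they come neither from the page nor from `@ton/core`); the formulas are taken from the quoted text.

```typescript
// Hypothetical shape describing the fields that feed the descriptor bytes.
interface CellShape {
  refs: number;        // r: number of references, 0..4
  bits: number;        // b: number of data bits, 0..1023
  level: number;       // l: cell level, 0..3
  exotic: boolean;     // x: 1 for exotic cells, 0 for ordinary cells
  withHashes: boolean; // h: 1 if hashes are stored explicitly
  absent: boolean;     // absent cells are represented by their hashes only
}

// Index width: the smallest multiple of eight with n <= 2^width (per the quoted text).
function refIndexBits(n: number): number {
  let bits = 8;
  while (n > Math.pow(2, bits)) {
    bits += 8;
  }
  return bits;
}

// Encodes reference index j as an unsigned big-endian integer of indexBits bits.
function encodeRefIndex(j: number, indexBits: number): number[] {
  const out: number[] = [];
  for (let shift = indexBits - 8; shift >= 0; shift -= 8) {
    out.push(Math.floor(j / Math.pow(2, shift)) % 256);
  }
  return out;
}

// Descriptor bytes: [d1, d2], or [d1] alone for an absent cell.
function descriptorBytes(c: CellShape): number[] {
  if (c.absent) {
    return [7 + 16 + 32 * c.level];
  }
  const x = c.exotic ? 1 : 0;
  const h = c.withHashes ? 1 : 0;
  const d1 = c.refs + 8 * x + 16 * h + 32 * c.level;
  const d2 = Math.floor(c.bits / 8) + Math.ceil(c.bits / 8);
  return [d1, d2];
}

// An ordinary level-0 cell with one reference and 64 data bits: d1 = 1, d2 = 16.
console.log(descriptorBytes({ refs: 1, bits: 64, level: 0, exotic: false, withHashes: false, absent: false }));
```

With separate names, the index width (8 or 16 in typical cases) can no longer be mistaken for the one-bit flag that enters `d1`.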
[HIGH] Incorrect root and absent-cell index ranges
Location:
mintlify-ton-docs/tvm/serialization/boc.mdx
Lines 84 to 91 in babdad4
- The number of root cells `k ≤ n` present in the serialization. The root
cells themselves are $c_1, \ldots, c_{k−1}$. All other cells present in the bag of
cells are expected to be reachable by chains of references starting from
the root cells.
- The number of absent cells `l ≤ n − k`, which represent cells that are
actually absent from this bag of cells, but are referred to from it. The
absent cells themselves are represented by $c_{n−l}, \ldots, c_{n−1}$, and only
these cells may (and also must) have `r = 7`. Complete bags of cells
Description:
The cell list is one-based (`c1, ..., cn`), but both ranges keep zero-based upper bounds: the root range “c1, …, c_{k−1}” yields only `k−1` roots, and the absent range “c_{n−l}, …, c_{n−1}” contains `l` items but excludes the last cell `cn`. Fix both ranges to match one‑based indexing.
Suggestion:
- cells themselves are $c_1, \ldots, c_{k−1}$. All other cells present in the bag of
+ cells themselves are $c_1, \ldots, c_{k}$. All other cells present in the bag of
@@
- absent cells themselves are represented by $c_{n−l}, \ldots, c_{n−1}$, and only
+ absent cells themselves are represented by $c_{n−l+1}, \ldots, c_{n}$, and only
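A quick way to sanity-check the corrected ranges is to enumerate them. The sketch below assumes one-based indexing as in the suggestion; `range`, `rootIndices`, and `absentIndices` are hypothetical helper names used only for illustration.

```typescript
// Inclusive integer range [from, to]; empty when from > to.
function range(from: number, to: number): number[] {
  const out: number[] = [];
  for (let i = from; i <= to; i++) {
    out.push(i);
  }
  return out;
}

// One-based indices of the k root cells: c_1, ..., c_k.
function rootIndices(k: number): number[] {
  return range(1, k);
}

// One-based indices of the l absent cells: c_{n-l+1}, ..., c_n.
function absentIndices(n: number, l: number): number[] {
  return range(n - l + 1, n);
}

// With n = 10 cells, k = 3 roots and l = 2 absent cells:
console.log(rootIndices(3));       // roots are cells 1, 2, 3
console.log(absentIndices(10, 2)); // absent cells are cells 9, 10
```

Both functions return exactly `k` and `l` indices, which is the property the original ranges failed to express consistently.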
[HIGH] Non copy-pasteable import path in TypeScript example
Location:
mintlify-ton-docs/tvm/serialization/boc.mdx
Lines 131 to 133 in babdad4
import { beginCell } from "@ton/core";
import { serializeBoc } from "@ton/core/src/boc/cell/serialization.ts"
// serializeBoc has two arguments:
Description:
The example imports from an internal deep path with a `.ts` extension, which is not part of the published API and breaks copy‑pasteability.
Suggestion:
-import { beginCell } from "@ton/core";
-import { serializeBoc } from "@ton/core/src/boc/cell/serialization.ts"
+import { beginCell, serializeBoc } from "@ton/core";
Medium (12)
[MEDIUM] Incorrect abbreviation and casing for “BoC/BoCs” and “TON Blockchain”
Location:
mintlify-ton-docs/tvm/serialization/boc.mdx
Lines 8 to 12 in babdad4
32-byte [representation hash](/tvm/serialization/cells) of the cell referred to. Thus a _bag of cells (Boc)_ is obtained.
In general, a Boc can be obtained from several trees of cells, thus forming a forest. By convention, the root of the original tree of
cells is a marked element of the resulting bag of cells, so that anybody receiving this bag of cells and knowing
the marked element can reconstruct the original DAG of cells, hence also the original tree of cells.
However, this Boc needs to be serialized into a file, suitable for disk storage or network transfer.
Description:
Multiple instances use “Boc/Bocs” instead of “BoC/BoCs”, and “Ton Blockchain” instead of “TON Blockchain”, reducing consistency and searchability.
Suggestion:
- Thus a _bag of cells (Boc)_ is obtained.
+ Thus a bag of cells (BoC) is obtained.
- In general, a Boc can be obtained from several trees of cells, thus forming a forest.
+ In general, a BoC can be obtained from several trees of cells, thus forming a forest.
- However, this Boc needs to be serialized into a file, suitable for disk storage or network transfer.
+ However, this BoC needs to be serialized into a file, suitable for disk storage or network transfer.
Also update (same replacements):
- tvm/serialization/boc.mdx:9, 12, 20, 22–27, 31, 39, 122, 127, 129: “Boc/Bocs” → “BoC/BoCs”
- tvm/serialization/boc.mdx:99: “Ton Blockchain” → “TON Blockchain”
[MEDIUM] Typos in heading and body: “adsent” and plural form
Location:
mintlify-ton-docs/tvm/serialization/boc.mdx
Lines 20 to 27 in babdad4
### Internal references, adsent cells, and complete Bocs
Let's fix an arbitrary cell `c` in a given Boc. A reference of `c` is called _internal_ if
the cell corresponding to the reference is also represented in Boc. Otherwise, the reference is called _external_ and the corresponding
cell is called _absent_ from that Boc. In turn, a Boc is called _complete_ if it does not contain any external references.
Although most real-world cases only deal with complete Bocs, in general, the serialization of adsent cells in Boc
differs from the serialization of included cells. Therefore, it is very important to be able to identify the type of references.
Description:
The heading/body misspell “adsent” and use “Bocs” instead of “BoCs,” which reduces clarity and polish.
Suggestion:
- ### Internal references, adsent cells, and complete Bocs
+ ### Internal references, absent cells, and complete BoCs
- Although most real-world cases only deal with complete Bocs, in general, the serialization of adsent cells in Boc
+ Although most real-world cases only deal with complete BoCs, in general, the serialization of absent cells in a BoC
[MEDIUM] Ambiguous “Let’s”/“we”; use neutral/second-person voice
Location:
mintlify-ton-docs/tvm/serialization/boc.mdx
Lines 22 to 24 in babdad4
Let's fix an arbitrary cell `c` in a given Boc. A reference of `c` is called _internal_ if
the cell corresponding to the reference is also represented in Boc. Otherwise, the reference is called _external_ and the corresponding
cell is called _absent_ from that Boc. In turn, a Boc is called _complete_ if it does not contain any external references.
Description:
“Let’s fix an arbitrary cell …” uses inclusive “we,” which the style guide discourages. Neutral phrasing avoids ambiguity.
Suggestion:
- Let's fix an arbitrary cell `c` in a given Boc. A reference of `c` is called _internal_ if
+ Consider an arbitrary cell `c` in a given BoC. A reference of `c` is called internal if
[MEDIUM] Missing verb in sentence (“is serialized”)
Location:
individual cell `c` serialized as follows, provided `s` is a multiple of eight.
Description:
“each individual cell `c` serialized as follows” is missing “is,” which hinders readability.
Suggestion:
- individual cell `c` serialized as follows, provided `s` is a multiple of eight.
+ individual cell `c` is serialized as follows, provided `s` is a multiple of eight.
[MEDIUM] Code span errors and stray backticks
Location:
mintlify-ton-docs/tvm/serialization/boc.mdx
Lines 48 to 53 in babdad4
- `0 ≤ r ≤ 4` is the number of cell references present in cell `c`, if `c `is
absent from the bag of cells being serialized and is represented by
its hashes only, then `r` is set to `7`;
- `0 ≤ b ≤ 1023` is the number of data bits in cell `c`;
- `0 ≤ l ≤ 3 `is the level of cell `c`;
- `s = 1` for exotic cells and `s = 0` for ordinary cells;
Description:
There is a stray backtick and extra space inside code spans (“c is” and “0 ≤ l ≤ 3 is”), which render incorrectly.
Suggestion:
- - `0 ≤ r ≤ 4` is the number of cell references present in cell `c`, if `c `is
+ - `0 ≤ r ≤ 4` is the number of cell references present in cell `c`, if `c` is
- - `0 ≤ l ≤ 3 `is the level of cell `c`;
+ - `0 ≤ l ≤ 3` is the level of cell `c`;
[MEDIUM] Punctuation inside code span and missing preposition
Location:
mintlify-ton-docs/tvm/serialization/boc.mdx
Lines 62 to 66 in babdad4
- Optionally, an index can be constructed that consists of `n + 1` t-bit integer entries `L1, ..., Ln,` where `Li`
is the total length (in bytes) of the representations of cells `cj` with `j ≤ i`, and integer `t ≥ 0` is chosen so
that `Ln ≤ 2^t`. If the indexes are included, any cell `ci` the serialized bag of cells may be easily
accessed by its index `i` without deserializing all other cells, or even without
loading the entire serialized bag of cells in memory.
Description:
The trailing comma is inside the code span for entries `L1, ..., Ln,`. Also, “any cell `ci` the serialized bag of cells” is missing “in.” Align singular “index” with the earlier construction.
Suggestion:
- - Optionally, an index can be constructed that consists of `n + 1` t-bit integer entries `L1, ..., Ln,` where `Li`
+ - Optionally, an index can be constructed that consists of `n + 1` t-bit integer entries `L1, ..., Ln`, where `Li`
- that `Ln ≤ 2^t`. If the indexes are included, any cell `ci` the serialized bag of cells may be easily
+ that `Ln ≤ 2^t`. If the index is included, any cell `ci` in the serialized bag of cells may be easily
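For readers of the page, the index is easier to grasp as running totals. The following sketch is an illustration under stated assumptions: it stores exactly `n` entries `L1, ..., Ln` (the quoted text mentions `n + 1` entries, so the exact count is an open point), and `buildIndex` and `cellByteRange` are hypothetical names.

```typescript
// L_i = total length in bytes of the representations of cells c_1..c_i.
function buildIndex(cellSizes: number[]): number[] {
  const index: number[] = [];
  let total = 0;
  for (let i = 0; i < cellSizes.length; i++) {
    total += cellSizes[i];
    index.push(total);
  }
  return index;
}

// Byte range [start, end) of cell c_i (one-based i) within the concatenated cell data.
function cellByteRange(index: number[], i: number): [number, number] {
  const start = i === 1 ? 0 : index[i - 2];
  return [start, index[i - 1]];
}

// Two cells of 11 and 4 bytes: the index is [11, 15], and cell 2 occupies bytes 11..14.
const exampleIndex = buildIndex([11, 4]);
console.log(exampleIndex, cellByteRange(exampleIndex, 2));
```

This is what makes random access possible: locating cell `ci` needs only two index entries, not a scan over all preceding cells.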
[MEDIUM] Unstable link to source; use a stable permalink
Location:
mintlify-ton-docs/tvm/serialization/boc.mdx
Lines 126 to 127 in babdad4
According to the TL-B scheme above there is the [SDK](https://github.com/ton-org/ton-core/blob/main/src/boc/cell/serialization.ts#L1)
for serialization and parsing Boc.
Description:
The SDK link targets the moving `main` branch, which can drift. Pin to a commit permalink for stability.
Suggestion:
Replace “https://github.com/ton-org/ton-core/blob/main/src/boc/cell/serialization.ts#L1” with a commit permalink, e.g. “https://github.com/ton-org/ton-core/blob//src/boc/cell/serialization.ts#L1”.
[MEDIUM] Banned time-relative “Currently”
Location:
mintlify-ton-docs/tvm/serialization/boc.mdx
Line 129 in babdad4
Currently, only serialization of Bocs containing only one root and without absent cells is supported.
Description:
Time-relative wording stales quickly. Prefer timeless phrasing.
Suggestion:
- Currently, only serialization of Bocs containing only one root and without absent cells is supported.
+ Only serialization of BoCs with one root and no absent cells is supported.
[MEDIUM] Code identifiers must use code font
Location:
mintlify-ton-docs/tvm/serialization/boc.mdx
Line 115 in babdad4
Field cells is `n`, roots is `k`, absent is `l`, and `tot_cells_size` is `Ln` (the total
Description:
Field names appear in prose without code formatting while others are formatted. Apply code spans consistently.
Suggestion:
-Field cells is `n`, roots is `k`, absent is `l`, and `tot_cells_size` is `Ln` (the total
+Field `cells` is `n`, `roots` is `k`, `absent` is `l`, and `tot_cells_size` is `Ln` (the total
[MEDIUM] Grammar — “we must be h = 1”
Location:
- `h = 1` if the cell’s hashes are explicitly included into the serialization; otherwise, `h = 0` (when `r = 7`, we must be `h = 1`).
Description:
“we must be `h = 1`” is ungrammatical. State the requirement directly.
Suggestion:
- - `h = 1` if the cell’s hashes are explicitly included into the serialization; otherwise, `h = 0` (when `r = 7`, we must be `h = 1`).
+ - `h = 1` if the cell’s hashes are explicitly included into the serialization; otherwise, `h = 0` (when `r = 7`, `h` must be `1`).
[MEDIUM] CRC term inconsistency (CRC32 vs CRC32C)
Location:
- An optional `CRC32` may be appended to the serialization for integrity
Description:
The narrative mentions “CRC32” while the TL‑B scheme below uses `has_crc32c`/`crc32c`. Align terminology.
Suggestion:
-- An optional `CRC32` may be appended to the serialization for integrity
+- An optional `CRC32C` may be appended to the serialization for integrity
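The distinction matters to implementers because CRC32 and CRC32C use different polynomials and produce different checksums for the same bytes. Below is a minimal bitwise sketch of CRC32C (Castagnoli, reflected polynomial `0x82F63B78`); the function name `crc32c` is chosen for illustration and is not taken from `@ton/core`, and how the checksum is laid out inside the serialized BoC is not shown here.

```typescript
// Bitwise CRC32C (Castagnoli) over a byte array; returns an unsigned 32-bit value.
function crc32c(data: Uint8Array): number {
  let crc = 0xffffffff;
  for (let i = 0; i < data.length; i++) {
    crc ^= data[i];
    for (let bit = 0; bit < 8; bit++) {
      // When the low bit is set, shift right and XOR with the reflected polynomial.
      crc = (crc >>> 1) ^ (0x82f63b78 & -(crc & 1));
    }
  }
  return (crc ^ 0xffffffff) >>> 0;
}

// Standard check input: the ASCII bytes of "123456789".
const checkInput = new Uint8Array([49, 50, 51, 52, 53, 54, 55, 56, 57]);
console.log(crc32c(checkInput).toString(16)); // e3069283
```

A table-driven variant is faster for large inputs; the bitwise form is used here only because it is short enough to verify by eye.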
[MEDIUM] Assumes a single root in general outline
Location:
mintlify-ton-docs/tvm/serialization/boc.mdx
Lines 40 to 41 in babdad4
- List the cells from B in a chosen order: `c1, ..., cn` with `c1` being the root cell.
- Choose an integer number `s`, such that `n ≤ 2^s`. Represent each cell `ci` by an integral number of bytes
Description:
The outline implies a single root, while the spec supports multiple roots. Generalize to avoid confusion.
Suggestion:
-- List the cells from B in a chosen order: `c1, ..., cn` with `c1` being the root cell.
+- List the cells from B in a chosen order: `c1, ..., cn` (with `c1, ..., ck` as root cells).
Low (4)
[LOW] “indexes” vs “indices” inconsistency
Location:
In the process of Boc serialization, the assignment of indexes of its cells plays an important role.
Description:
Uses “indexes” where “indices” is preferred for consistency with the rest of the document.
Suggestion:
- In the process of Boc serialization, the assignment of indexes of its cells plays an important role.
+ In the process of BoC serialization, the assignment of indices of its cells plays an important role.
[LOW] Missing newline at end of file
Location:
mintlify-ton-docs/tvm/serialization/boc.mdx
Line 144 in babdad4
```
Description:
The file lacks a trailing newline, which some tools expect and which avoids spurious diffs.
Suggestion:
Add a single newline at the end of the file.
[LOW] Unnecessary comma in opening sentence
Location:
[DAG](https://en.wikipedia.org/wiki/Directed_acyclic_graph) of cells, by identifying identical cells in the
Description:
There is an unnecessary comma after “cells,” which interrupts flow.
Suggestion:
-[DAG](https://[REDACTED]/wiki/Directed_acyclic_graph) of cells, by identifying identical cells in the
+[DAG](https://[REDACTED]/wiki/Directed_acyclic_graph) of cells by identifying identical cells in the
[LOW] Acronym casing and wording in SDK comment
Location:
mintlify-ton-docs/tvm/serialization/boc.mdx
Lines 133 to 135 in babdad4
// serializeBoc has two arguments:
// root: Cell. A root cell of a given tree of cells
// opt: { idx: boolean, crc32: boolean }. Two flags indicating whether indexes and crc32 will be included in serialization
Description:
The comment should use “indices” and uppercase “CRC32,” and present tense reads better.
Suggestion:
- // opt: { idx: boolean, crc32: boolean }. Two flags indicating whether indexes and crc32 will be included in serialization
+ // opt: { idx: boolean, crc32: boolean }. Two flags indicating whether indices and CRC32 are included in serialization
import { beginCell, serializeBoc } from "@ton/core";
// serializeBoc has two arguments:
// root: Cell. A root cell of a given tree of cells
// opt: { idx: boolean, crc32: boolean }. Two flags indicating whether indexes and CRC32C will be included in serialization

const innerCell = beginCell().storeUint(456, 16).endCell();

const rootCell = beginCell().storeUint(0, 64).storeRef(innerCell).endCell();

const serialized_boc = serializeBoc(rootCell, { idx: false, crc32: false });

const serialized_boc_with_indexes_and_crc32 = serializeBoc(rootCell, { idx: true, crc32: true });
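To make the example concrete with actual byte values, here is a dependency-free sketch of how the two cells above would look at the cell level, following the descriptor formulas quoted in this review. It is an illustration under assumptions, not the SDK's implementation: it handles only ordinary level-0 cells with byte-aligned data, writes each reference index as one byte, and assumes `rootCell` gets index 0 and `innerCell` index 1. The helpers `serializeSimpleCell` and `toHex` are hypothetical, and the BoC header, optional index, and checksum are left out.

```typescript
// Serializes one ordinary, level-0 cell whose data is a whole number of bytes.
// refs holds the indices of referenced cells, each written as a single byte.
function serializeSimpleCell(data: number[], refs: number[]): number[] {
  const bits = data.length * 8;
  const d1 = refs.length; // r, with exotic flag, hash flag and level all zero
  const d2 = Math.floor(bits / 8) + Math.ceil(bits / 8);
  return [d1, d2].concat(data, refs);
}

// Lowercase hex string of a byte array.
function toHex(bytes: number[]): string {
  let hex = "";
  for (let i = 0; i < bytes.length; i++) {
    hex += (bytes[i] < 16 ? "0" : "") + bytes[i].toString(16);
  }
  return hex;
}

// innerCell: the value 456 stored in 16 bits is 0x01c8; no references.
const innerBytes = serializeSimpleCell([0x01, 0xc8], []);
// rootCell: 64 zero bits and one reference to the cell with index 1.
const rootBytes = serializeSimpleCell([0, 0, 0, 0, 0, 0, 0, 0], [1]);

console.log(toHex(rootBytes));  // d1 = 01, d2 = 10, eight zero bytes, ref index 01
console.log(toHex(innerBytes)); // 000401c8
```

The root cell takes 11 bytes and the inner cell 4 bytes, so the concatenated cell data is 15 bytes long.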
Co-authored-by: Anton Trunov <anton.a.trunov@gmail.com>
The serialization specification is correct.
However, the article reads as too terse and compressed, similar to Nikolai Durov’s original PDFs, which is not ideal for documentation.
The cell serialization duplicates “/tvm/serialization/cells”; it might be better to keep it in one place and link from the other.
The topic “What are multiroot BoCs for?” is not covered.
There are no examples of deserialization or of serialization with hex output.
I would recommend balancing the dry text with visuals and examples, as was done in the old documentation.
+1 to @pyAndr3w. I'd recommend showing the full serialization procedure on a concrete example, i.e., with numbers and values.
I briefly covered the multiroot topic in the intro: a bag of cells can be a forest, so, as far as I know, it is simply convenient in some situations to send a serialization of more than one tree of cells. If you know a clearer reason to use them, please let me know. Yes, that is a good idea, and I added a link to cell serialization. I also ported the example of manual serialization from the old docs and added code examples related to the mentioned issues.
To be honest, I did not see such an option.
@Karkarmath ping
Want to move this from TVM to Foundations
@verytactical
Want to have this described formally, in TL-B and/or C-like structures, both outer boc structure and cell serialization. It is hard to understand informal structure as in tblkch.pdf.
Closes #229