Resolve open issues for #255

Two questions remain. First, for consistency with other sections, allow the IDPF to define its own type for the public share and express the encoding of the correction words for {{BBCGG21}} in TLS syntax. Second, for consistency with other sections, express the encoding of `Poplar1`'s aggregation parameter in TLS syntax. The second change is slightly hairy because of the prefix packing procedure, but that procedure can be factored out cleanly.
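
For reviewers, the two packing procedures touched by this change can be sketched in isolation. This is a minimal, standalone illustration under stated assumptions, not the reference implementation: `pack_prefixes` and `unpack_prefixes` are hypothetical names, `pack_bits` and `unpack_bits` reuse the helper names the draft text used before this commit, bits are modeled as plain `int` values rather than `Field2`, and Python's built-in `int.to_bytes`/`int.from_bytes` stand in for the draft's `to_be_bytes`/`from_be_bytes`.

```python
def pack_bits(bits: list[int]) -> bytes:
    """Pack bits into bytes, LSB first within each byte.

    Unused high bits of the last byte are left as zero.
    """
    buf = [0] * ((len(bits) + 7) // 8)
    for i, bit in enumerate(bits):
        buf[i // 8] |= bit << (i % 8)
    return bytes(buf)


def unpack_bits(packed: bytes, length: int) -> list[int]:
    """Reverse of pack_bits(); rejects bad lengths and nonzero padding."""
    # Check the length first so that indexing below cannot fail.
    if (length + 7) // 8 != len(packed):
        raise ValueError('incorrect length')
    bits = [(packed[i // 8] >> (i % 8)) & 1 for i in range(length)]
    # Any bits above the last requested one must be zero.
    if length % 8 != 0 and packed[-1] >> (length % 8) != 0:
        raise ValueError('trailing bits')
    return bits


def pack_prefixes(level: int, prefixes: list[int]) -> bytes:
    """Pack (level + 1)-bit prefixes into one big-endian integer."""
    packed = 0
    for i, prefix in enumerate(prefixes):
        packed |= prefix << ((level + 1) * i)
    packed_len = ((level + 1) * len(prefixes) + 7) // 8
    return packed.to_bytes(packed_len, 'big')


def unpack_prefixes(level: int,
                    num_prefixes: int,
                    packed_prefixes: bytes) -> list[int]:
    """Reverse of pack_prefixes(), given the level and prefix count."""
    packed = int.from_bytes(packed_prefixes, 'big')
    mask = (1 << (level + 1)) - 1
    return [
        (packed >> ((level + 1) * i)) & mask
        for i in range(num_prefixes)
    ]
```

Note that the two procedures use different conventions: control bits are packed byte by byte in LSB-to-MSB order, while prefixes are first combined into a single integer and then serialized big-endian, so the first prefix ends up in the low-order bits of the last byte.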
cfrg · Aug 8, 2024 · d194e29
1 parent 9ccd754
commit d194e29

Showing 6 changed files with 233 additions and 191 deletions.
diff --git a/draft-irtf-cfrg-vdaf.md b/draft-irtf-cfrg-vdaf.md
@@ -4185,14 +4185,14 @@ denotes either a vector of inner node field elements or leaf node field
 elements.) The scheme is comprised of the following algorithms:
 
 * `idpf.gen(alpha: int, beta_inner: list[list[FieldInner]], beta_leaf:
-  list[FieldLeaf], nonce: bytes, rand: bytes) -> tuple[bytes,
-  list[bytes]]` is the randomized IDPF-key generation algorithm. Its inputs are the index `alpha`
-  the values `beta`, and a nonce string.
+  list[FieldLeaf], nonce: bytes, rand: bytes) -> tuple[PublicShare,
+  list[bytes]]` is the randomized IDPF-key generation algorithm. Its inputs are
+  the index `alpha`, the values `beta`, and a nonce string.
 
-  The output is a public part that is sent to all Aggregators
-  and a vector of private IDPF keys, one for each aggregator. The binder string
-  is used to derive the key in the underlying XofFixedKeyAes128 XOF that is used
-  for expanding seeds at each level.
+  The output is a public part (of type `PublicShare`) that is sent to all
+  Aggregators and a vector of private IDPF keys, one for each aggregator. The
+  binder string is used to derive the key in the underlying XofFixedKeyAes128
+  XOF that is used for expanding seeds at each level.
 
   Pre-conditions:
 
@@ -4202,20 +4202,16 @@ elements.) The scheme is comprised of the following algorithms:
        `range(BITS - 1)`.
     * `beta_leaf` MUST have length `VALUE_LEN`.
     * `rand` MUST be generated by a CSPRNG and have length `RAND_SIZE`.
-    * `nonce` MUST be of length `Idpf.NONCE_SIZE` and chosen uniformly at random by the Client (see
-      {{nonce-requirements}}).
+    * `nonce` MUST be of length `Idpf.NONCE_SIZE` and chosen uniformly at
+      random by the Client (see {{nonce-requirements}}).
 
-  > TODO(issue #255) Decide whether to treat the public share as an opaque byte
-  > string or to replace it with an explicit type.
-
-* `idpf.eval(agg_id: int, public_share: bytes, key: bytes, level:
-  int, prefixes: tuple[int, ...], nonce: bytes) -> Output` is the
-  deterministic, stateless IDPF-key evaluation algorithm run by each
-  Aggregator. Its inputs are the Aggregator's unique identifier, the public
-  share distributed to all of the Aggregators, the Aggregator's IDPF key, the
-  "level" at which to evaluate the IDPF, the sequence of candidate prefixes,
-  and a nonce string. It returns the share of the value corresponding to each
-  candidate prefix.
+* `idpf.eval(agg_id: int, public_share: PublicShare, key: bytes, level: int,
+  prefixes: tuple[int, ...], nonce: bytes) -> Output` is the deterministic,
+  stateless IDPF-key evaluation algorithm run by each Aggregator. Its inputs
+  are the Aggregator's unique identifier, the public share distributed to all
+  of the Aggregators, the Aggregator's IDPF key, the "level" at which to
+  evaluate the IDPF, the sequence of candidate prefixes, and a nonce string. It
+  returns the share of the value corresponding to each candidate prefix.
 
   The output type (i.e., `Output`) depends on the value of `level`: If `level <
   BITS-1`, the output is the value for an inner node, which has type
@@ -4247,18 +4243,19 @@ not include shared state across VDAF evaluations. In practice, of course,
 it will often be beneficial to expose a stateful API for IDPFs and carry the
 state across evaluations. See {{idpf-bbcggi21}} for details.
 
-| Parameter  | Description               |
-|:-----------|:--------------------------|
-| SHARES     | Number of IDPF keys output by IDPF-key generator |
-| BITS       | Length in bits of each input string |
-| VALUE_LEN  | Number of field elements of each output value |
-| RAND_SIZE  | Size of the random string consumed by the IDPF-key generator. Equal to twice the XOF's seed size. |
+| Parameter   | Description               |
+|:------------|:--------------------------|
+| SHARES      | Number of IDPF keys output by IDPF-key generator |
+| BITS        | Length in bits of each input string |
+| VALUE_LEN   | Number of field elements of each output value |
+| RAND_SIZE   | Size of the random string consumed by the IDPF-key generator. Equal to twice the XOF's seed size. |
 | NONCE_SIZE  | Size of the random nonce generated by the Client. |
-| KEY_SIZE   | Size in bytes of each IDPF key |
-| FieldInner | Implementation of `Field` ({{field}}) used for values of inner nodes |
-| FieldLeaf  | Implementation of `Field` used for values of leaf nodes |
-| Output     | Alias of `list[list[FieldInner]] | list[list[FieldLeaf]]` |
-| FieldVec   | Alias of `list[FieldInner] | list[FieldLeaf]` |
+| KEY_SIZE    | Size in bytes of each IDPF key |
+| FieldInner  | Implementation of `Field` ({{field}}) used for values of inner nodes |
+| FieldLeaf   | Implementation of `Field` used for values of leaf nodes |
+| PublicShare | Type of public share for this IDPF |
+| Output      | Alias of `list[list[FieldInner]] | list[list[FieldLeaf]]` |
+| FieldVec    | Alias of `list[FieldInner] | list[FieldLeaf]` |
 {: #idpf-param title="Constants and types defined by a concrete IDPF."}
 
 ### Encoding inputs as indices {#poplar1-idpf-index-encoding}
@@ -4302,7 +4299,7 @@ subsections. These methods make use of constants defined in {{poplar1-const}}.
 | `SHARES`          | `2`                                  |
 | `Measurement`     | `int`                                |
 | `AggParam`        | `tuple[int, Sequence[int]]`          |
-| `PublicShare`     | `bytes` (IDPF public share)          |
+| `PublicShare`     | same as the IDPF                    |
 | `InputShare`      | `tuple[bytes, bytes, list[FieldInner], list[FieldLeaf]]` |
 | `OutShare`        | `FieldVec`                           |
 | `AggShare`        | `FieldVec`                           |
@@ -4346,7 +4343,8 @@ def shard(
         self,
         measurement: int,
         nonce: bytes,
-        rand: bytes) -> tuple[bytes, list[Poplar1InputShare]]:
+        rand: bytes,
+    ) -> tuple[Poplar1PublicShare, list[Poplar1InputShare]]:
     if len(nonce) != self.NONCE_SIZE:
         raise ValueError("incorrect nonce size")
     if len(rand) != self.RAND_SIZE:
@@ -4481,7 +4479,7 @@ def prep_init(
         agg_id: int,
         agg_param: Poplar1AggParam,
         nonce: bytes,
-        public_share: bytes,
+        public_share: Poplar1PublicShare,
         input_share: Poplar1InputShare) -> tuple[
             Poplar1PrepState,
             FieldVec]:
@@ -4745,8 +4743,69 @@ opaque Poplar1FieldLeaf[Fl];
 
 #### Public Share
 
-The public share is equal to the IDPF public share, which is a byte string.
-(See {{idpf}}.)
+The public share of the IDPF scheme in {{idpf-bbcggi21}} consists of a sequence
+of "correction words". A correction word has three components:
+
+1. the XOF seed of type `bytes`;
+2. the control bits of type `tuple[Field2, Field2]`; and
+3. the payload of type `list[Field64]` for the first `BITS-1` words and
+   `list[Field255]` for the last word.
+
+The encoding is straightforward, except that the control bits are packed as
+tightly as possible. The encoded public share is structured as follows:
+
+~~~ tls-presentation
+struct {
+    Poplar1Seed seed;
+    Poplar1FieldInner payload[Fi * Poplar1.Idpf.VALUE_LEN];
+} Poplar1CWSeedAndPayloadInner;
+
+struct {
+    Poplar1Seed seed;
+    Poplar1FieldLeaf payload[Fl * Poplar1.Idpf.VALUE_LEN];
+} Poplar1CWSeedAndPayloadLeaf;
+
+struct {
+    opaque packed_control_bits[packed_len];
+    Poplar1CWSeedAndPayloadInner inner[Ci * (Poplar1.Idpf.BITS-1)];
+    Poplar1CWSeedAndPayloadLeaf leaf;
+} Poplar1PublicShare;
+~~~
+
+Here `Ci` denotes the length of `Poplar1CWSeedAndPayloadInner` and
+`packed_len = (2*Poplar1.Idpf.BITS + 7) // 8` is the length of the packed
+control bits.
+
+Field `packed_control_bits` is encoded with the following function:
+
+~~~ python
+packed_control_buf = [int(0)] * packed_len
+for i, bit in enumerate(control_bits):
+    packed_control_buf[i // 8] |= bit.as_unsigned() << (i % 8)
+packed_control_bits = bytes(packed_control_buf)
+~~~
+
+This procedure packs each group of eight bits into a byte, in LSB to MSB
+order, padding the most significant bits of the last byte with zeros as
+necessary, and produces the byte array. Decoding performs the reverse
+operation: it takes a byte array and a number of bits (`length`) and returns a
+list of bits, extracting eight bits from each byte in turn, in LSB to MSB
+order, and stopping after the requested number of bits. If the byte array has
+an incorrect length, or if unused bits in the last byte are not zero, it raises an error:
+
+~~~ python
+control_bits = []
+for i in range(length):
+    control_bits.append(Field2(
+        (packed_control_bits[i // 8] >> (i % 8)) & 1
+    ))
+leftover_bits = packed_control_bits[-1] >> (
+    (length + 7) % 8 + 1
+)
+if (length + 7) // 8 != len(packed_control_bits) or \
+        leftover_bits != 0:
+    raise ValueError('trailing bits')
+~~~
 
 #### Input Share
 
@@ -4854,41 +4913,35 @@ struct {
 
 The aggregation parameter is encoded as follows:
 
-> TODO(issue #255) Express the aggregation parameter encoding in TLS syntax.
-> Decide whether to RECOMMEND this encoding, and if so, add it to test vectors.
+~~~ tls-presentation
+struct {
+    uint16_t level;
+    uint32_t num_prefixes;
+    opaque packed_prefixes[packed_len];
+} Poplar1AggParam;
+~~~
+
+The fields in this struct are: `level`, the level of the IDPF tree of each
+prefix; `num_prefixes`, the number of prefixes to evaluate; and
+`packed_prefixes`, the sequence of prefixes packed into a byte string of
+length `packed_len`. The prefixes are encoded with the following procedure:
 
 ~~~ python
-def encode_agg_param(self, agg_param: Poplar1AggParam) -> bytes:
-    level, prefixes = agg_param
-    if level not in range(2 ** 16):
-        raise ValueError('level out of range')
-    if len(prefixes) not in range(2 ** 32):
-        raise ValueError('number of prefixes out of range')
-    encoded = bytes()
-    encoded += to_be_bytes(level, 2)
-    encoded += to_be_bytes(len(prefixes), 4)
-    packed = 0
-    for (i, prefix) in enumerate(prefixes):
-        packed |= prefix << ((level + 1) * i)
-    l = ((level + 1) * len(prefixes) + 7) // 8
-    encoded += to_be_bytes(packed, l)
-    return encoded
+packed = 0
+for (i, prefix) in enumerate(prefixes):
+    packed |= prefix << ((level + 1) * i)
+packed_len = ((level + 1) * len(prefixes) + 7) // 8
+packed_prefixes = to_be_bytes(packed, packed_len)
+~~~
 
-def decode_agg_param(self, encoded: bytes) -> Poplar1AggParam:
-    encoded_level, encoded = encoded[:2], encoded[2:]
-    level = from_be_bytes(encoded_level)
-    encoded_prefix_count, encoded = encoded[:4], encoded[4:]
-    prefix_count = from_be_bytes(encoded_prefix_count)
-    l = ((level + 1) * prefix_count + 7) // 8
-    encoded_packed, encoded = encoded[:l], encoded[l:]
-    packed = from_be_bytes(encoded_packed)
-    prefixes = []
-    m = 2 ** (level + 1) - 1
-    for i in range(prefix_count):
-        prefixes.append(packed >> ((level + 1) * i) & m)
-    if len(encoded) != 0:
-        raise ValueError('trailing bytes')
-    return (level, tuple(prefixes))
+Decoding involves the following procedure:
+
+~~~ python
+packed = from_be_bytes(packed_prefixes)
+prefixes = []
+m = 2 ** (level + 1) - 1
+for i in range(num_prefixes):
+    prefixes.append(packed >> ((level + 1) * i) & m)
 ~~~
 
 Implementation note: The aggregation parameter includes the level of the IDPF
@@ -4939,13 +4992,15 @@ def gen(
         beta_inner: list[list[Field64]],
         beta_leaf: list[Field255],
         nonce: bytes,
-        rand: bytes) -> tuple[bytes, list[bytes]]:
+        rand: bytes) -> tuple[list[CorrectionWord], list[bytes]]:
     if alpha not in range(2 ** self.BITS):
         raise ValueError("alpha out of range")
     if len(beta_inner) != self.BITS - 1:
         raise ValueError("incorrect beta_inner length")
     if len(rand) != self.RAND_SIZE:
         raise ValueError("incorrect rand size")
+    if len(nonce) != self.NONCE_SIZE:
+        raise ValueError("incorrect nonce size")
 
     key = [
         rand[:XofFixedKeyAes128.SEED_SIZE],
@@ -4954,7 +5009,7 @@ def gen(
 
     seed = key.copy()
     ctrl = [Field2(0), Field2(1)]
-    correction_words = []
+    public_share = []
     for level in range(self.BITS):
         field: type[Field]
         field = cast(type[Field], self.current_field(level))
@@ -4994,9 +5049,7 @@ def gen(
         for i in range(len(w_cw)):
             w_cw[i] *= mask
 
-        correction_words.append((seed_cw, ctrl_cw, w_cw))
-
-    public_share = self.encode_public_share(correction_words)
+        public_share.append((seed_cw, ctrl_cw, w_cw))
     return (public_share, key)
 ~~~
 {: #idpf-bbcggi21-gen title="IDPF-key generation algorithm of BBCGGI21."}
@@ -5013,7 +5066,7 @@ functions `extend()`, `convert()`, and `decode_public_share()` defined in
 def eval(
         self,
         agg_id: int,
-        public_share: bytes,
+        public_share: list[CorrectionWord],
         key: bytes,
         level: int,
         prefixes: Sequence[int],
@@ -5026,7 +5079,6 @@ def eval(
     if len(set(prefixes)) != len(prefixes):
         raise ValueError('prefixes must be unique')
 
-    correction_words = self.decode_public_share(public_share)
     out_share = []
     for prefix in prefixes:
         if prefix not in range(2 ** (level + 1)):
@@ -5060,7 +5112,7 @@ def eval(
             (seed, ctrl, y) = self.eval_next(
                 seed,
                 ctrl,
-                correction_words[current_level],
+                public_share[current_level],
                 current_level,
                 bit,
                 nonce,
@@ -5078,7 +5130,7 @@ def eval_next(
         self,
         prev_seed: bytes,
         prev_ctrl: Field2,
-        correction_word: CorrectionWordTuple,
+        correction_word: CorrectionWord,
         level: int,
         bit: int,
         nonce: bytes) -> tuple[bytes, Field2, FieldVec]:
@@ -5143,56 +5195,9 @@ def convert(
     field = self.current_field(level)
     w = xof.next_vec(field, self.VALUE_LEN)
     return (next_seed, cast(FieldVec, w))
-
-def encode_public_share(
-        self,
-        correction_words: list[CorrectionWordTuple]) -> bytes:
-    encoded = bytes()
-    control_bits = list(itertools.chain.from_iterable(
-        cw[1] for cw in correction_words
-    ))
-    encoded += pack_bits(control_bits)
-    for (level, (seed_cw, _, w_cw)) \
-            in enumerate(correction_words):
-        field = cast(type[Field], self.current_field(level))
-        encoded += seed_cw
-        encoded += field.encode_vec(cast(list[Field], w_cw))
-    return encoded
-
-def decode_public_share(
-        self,
-        encoded: bytes) -> list[CorrectionWordTuple]:
-    l = (2 * self.BITS + 7) // 8
-    encoded_ctrl, encoded = encoded[:l], encoded[l:]
-    control_bits = unpack_bits(encoded_ctrl, 2 * self.BITS)
-    correction_words = []
-    for level in range(self.BITS):
-        field = self.current_field(level)
-        ctrl_cw = (
-            control_bits[level * 2],
-            control_bits[level * 2 + 1],
-        )
-        l = XofFixedKeyAes128.SEED_SIZE
-        seed_cw, encoded = encoded[:l], encoded[l:]
-        l = field.ENCODED_SIZE * self.VALUE_LEN
-        encoded_w_cw, encoded = encoded[:l], encoded[l:]
-        w_cw = field.decode_vec(encoded_w_cw)
-        correction_words.append((seed_cw, ctrl_cw, w_cw))
-    if len(encoded) != 0:
-        raise ValueError('trailing bytes')
-    return correction_words
 ~~~
 {: #idpf-bbcggi21-helpers title="Helper functions for the IDPF."}
 
-Here, `pack_bits()` takes a list of bits, packs each group of eight bits into a
-byte, in LSB to MSB order, padding the most significant bits of the last byte
-with zeros as necessary, and returns the byte array. `unpack_bits()` performs
-the reverse operation: it takes in a byte array and a number of bits, and
-returns a list of bits, extracting eight bits from each byte in turn, in LSB to
-MSB order, and stopping after the requested number of bits. If the byte array
-has an incorrect length, or if unused bits in the last bytes are not zero, it
-throws an error.
-
 ## Instantiation {#poplar1-inst}
 
 By default, Poplar1 is instantiated with the IDPF in {{idpf-bbcggi21}} (`VALUE_LEN

diff --git a/poc/gen_test_vec.py b/poc/gen_test_vec.py
@@ -283,7 +283,7 @@ def gen_test_vec_for_idpf(idpf: Idpf,
         'beta_inner': printable_beta_inner,
         'beta_leaf': printable_beta_leaf,
         'nonce': nonce.hex(),
-        'public_share': public_share.hex(),
+        'public_share': idpf.test_vec_encode_public_share(public_share).hex(),
         'keys': printable_keys,
     }