From 7a1ecf4aeef6d0430bb74c8c2e0171131ebe07ba Mon Sep 17 00:00:00 2001 From: goverthrow Date: Fri, 27 May 2022 16:53:35 +0200 Subject: [PATCH 01/11] Add CIP --- CIP-?/README.md | 718 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 718 insertions(+) create mode 100644 CIP-?/README.md diff --git a/CIP-?/README.md b/CIP-?/README.md new file mode 100644 index 0000000000..2273ccc702 --- /dev/null +++ b/CIP-?/README.md @@ -0,0 +1,718 @@ +--- +CIP: \? +Title: Bitwise primitives +Author: Koz Ross , Maximilian König +Comments-URI: https://github.com/cardano-foundation/CIPs/wiki/Comments:CIP-\? +Status: Draft +Type: Standards Track +Created: 2022-05-27 +License: Apache-2.0 +--- + +# Abstract + +Add primitives for bitwise operations, based on `BuiltinByteString`, without requiring new data types. + +# Motivation + +Bitwise operations are one of the most fundamental building blocks of algorithms +and data structures. They can be used for a wide variety of applications, +ranging from representing and manipulating sets of integers efficiently, to +implementations of cryptographic primitives, to fast searches. Their wide +availability, law-abiding behaviour and efficiency are the key reasons why they +are widely used, and widely depended on. + +At present, Plutus lacks meaningful support for bitwise operations, which +significantly limits what can be usefully done on-chain. While it is possible to +mimic some of these capabilities with what currently exists, and it is always +possible to introduce new primitives for any task, this is extremely +unsustainable, and often leads to significant inefficiencies and duplication of +effort. + +We describe a list of bitwise operations, as well as their intended semantics, +designed to address this problem. + +## Example applications + +We provide a range of applications that could be useful or beneficial on-chain, +but are difficult or impossible to implement without some, or all, of the +primitives we propose. + +## Succinct data structures + +Due to the on-chain size limit, many data structures become impractical or +impossible, as they require too much space either for their elements, or their +overheads, to allow them to fit alongside the operations we want to perform on +them. Succinct data structures could serve as a solution to this, as they +represent data in an amount of space much closer to the entropy limit and ensure +only constant overheads. There are several examples of these, and all rely on +bitwise operations for their implementations. + +For example, consider wanting to store a set of `BuiltinInteger`s +on-chain. Given current on-chain primitives, the most viable option involves +some variant on a `BuiltinList` of `BuiltinInteger`s; however, +this is unviable in practice unless the set is small. To see why, suppose that +we have an upper limit of $k$ on the `BuiltinInteger`s we want to store; +this is realistic in practically all cases. To store $n$ +`BuiltinInteger`s under the above scheme requires + +$$n \cdot \left( \left\lceil \frac{\log_2(k)}{64} \right\rceil \cdot 64 + c\right) +$$ + +bits, where $c$ denotes the constant overhead for each cons cell of +the `BuiltinList` holding the data. If the set being represented is dense +(meaning that the number of entries is a sizeable fraction of $k$), this cost +becomes intolerable quickly, especially when taking into account the need to +also store the operations manipulating such a structure on-chain with the script +where the set is being used. 
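+
+For a rough sense of scale (an illustration only, with assumed numbers): take
+$k = 2^{16}$, $n = 20000$, and a per-cell overhead of $c = 64$ bits. The
+list-based scheme above then needs $20000 \cdot (64 + 64) = 2560000$ bits,
+roughly 320 kilobytes, before the operations that manipulate it are even
+counted; the bitmap representation described next needs only
+$2^{16} + 64 = 65600$ bits, or about 8 kilobytes, for the same set.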
+ +If we instead represented the same set as a bitmap based on +`BuiltinByteString`, the amount of space required would instead be + +$$\left\lceil \frac{k}{8} \right\rceil \cdot 8 + \left\lceil +\frac{\log_2(k)}{64} \right\rceil \cdot 64 +$$ + +bits. This is significantly better unless $n$ is small. Furthermore, +this representation would likely be more efficient in terms of time in practice, +as instead of having to crawl through a cons-like structure, we can implement +set operations on a memory-contiguous byte string: + +- The cardinality of the set can be computed as a population count. This +can have terrifyingly efficient implementations: the Muła-Kurz-Lemire +algorithm (the current state of the art) can process four kilobytes per loop +iteration, which amounts to over four thousand potential stored integers. +- Insertion or removal is a bit set or bit clear respectively. +- Finding the smallest element is a find-first-one. +- Testing for membership is a check to see if the bit is set. +- Set intersection is bitwise and. +- Set union is bitwise inclusive or. +- Set symmetric difference is bitwise exclusive or. + + +A potential implementation could use a range of techniques to make these +operations extremely efficient, by relying on SWAR (SIMD-within-a-register) +techniques if portability is desired, and SIMD instructions for maximum speed. +This would allow both potentially large integer sets to be represented on-chain +without breaking the size limit, and nodes to efficiently compute with such, +reducing the usage of resources by the chain. Lastly, in practice, if +compression techniques are used (which also rely on bitwise operations!), the +number of required bits can be reduced considerably in most cases without +compromising performance: the current state-of-the-art (Roaring Bitmaps) can be +used as an example of the possible gains. + +In order to make such techniques viable, bitwise primitives are mandatory. +Furthermore, succinct data structures are not limited to sets of integers, but +**all** require bitwise operations to be implementable. + +## Binary representations and encodings + +On-chain, space is at a premium. One way that space can be saved is with binary +representations, which can potentially represent something much closer to the +entropy limit, especially if the structure or value being represented has +significant redundant structure. While some possibilities for a more efficient +*packing* already exist in the form of `BuiltinData`, it is rather +idiosyncratic to the needs of Plutus, and its decoding is potentially quite +costly. + +Bitwise primitives would allow more compact binary encodings to be defined, +where complex structures or values are represented using fixed-size +`BuiltinByteString`s. The encoders and decoders for these could also be +implemented more efficiently than currently possible, as there exist numerous +bitwise techniques for this. + + +## Goals + +To ensure a focused and meaningful proposal, we specify our goals below. + +### Useful primitives + +The primitives provided should enable implementations of algorithms and data +structures that are currently impossible or impractical. Furthermore, the +primitives provided should have a high power-to-weight ratio: having them should +enable as much as possible to be implemented. 
+ +### Maintaining as many algebraic laws as possible + +Bitwise operations, via Boolean algebras, have a long and storied history of +algebraic laws, dating back to important results by the like of de Morgan, Post +and many others. These algebraic laws are useful for a range of reasons: they +guide implementations, enable easier testing (especially property testing) and +in some cases much more efficient implementations. To some extent, they also +formalize our intuition about how these operations *should work*. Thus, +maintaining as many of these laws in our implementation, and being clear about +them, is important. + +### Allowing efficient, portable implementations + +Providing primitives alone is not enough: they should also be efficient. This is +not least of all because many would associate *primitive operation* with a +notion of being *close to the machine*, and therefore fast. Thus, it is on us to +ensure that the implementations of the primitives we provide have to be +implementable in an efficient way, across a range of hardware. + +### Clear indication of failure + +While totality is desirable, in some cases, there isn't a sensible answer for us +to give. A good example is a division-by-zero: if we are asked to do such a +thing, the only choice we have is to reject it. However, we need to make it as +easy as possible for someone to realize why their program is failing, by +emitting a sensible message which can later be inspected. + +## Non-goals + +We also specify some specific non-goals of this proposal. + +### No metaphor-mixing between numbers and bits + +A widespread legacy of C is the mixing of treatment of numbers and blobs of +bits: specifically, the allowing of logical operations on representations of +numbers. This applies to Haskell as much as any other language: according to the +Haskell Report, it is in fact **required** that any type implementing +`Bits` implement `Num` first. While GHC Haskell only mandates +`Eq`, it still defines `Bits` instances for types clearly meant to +represent numbers. This is a bad choice, as it creates complex situations and +partiality in several cases, for arguably no real gain other than C-like bit +twiddling code. + +Even if two types share a representation, their type distinctness is meant to be +a semantic or abstraction boundary: just because a number is represented as a +blob of bits does not necessarily mean that arbitrary bit manipulations are +sensible. However, by defining such a capability, we create several semantic +problems: + +- Some operations end up needing multiple definitions to take this into +account. A good example are shifts: instead of simply having left or right +shifts, we now have to distinguish **arithmetic** versus **logical** +shifts, simply to take into account that a shift can be used on something +which is meant to be a number, which could be signed. This creates +unnecessary complexity and duplication of operations. +- As Plutus `BuiltinInteger`s are of arbitrary precision, certain +bitwise operations are not well-defined on them. A good example is bitwise +complement: the bitwise complement of $0$ cannot be defined sensibly, and in +fact, is partial in its `Bits` instance. +- Certain bitwise operations on `BuiltinInteger` would have quite +undesirable semantic changes in order to be implementable. A good example +are bitwise rotations: we should be able to *decompose* a rotation left or +right by $n$ into two rotations (by $m_1$ and $m_2$ such that $m_1 + m_2 = n$) +without changing the outcome. 
However, because trailing zeroes are not +tracked by the implementation, this can fail depending on the choice of +decomposition, which seems needlessly annoying for no good reason. +- Certain bitwise operations on `BuiltinInteger` would require +additional arguments and padding to define them sensibly. Consider bitwise +logical AND: in order to perform this sensibly on `BuiltinInteger`s +we would need to specify what *length* we assume they have, and some policy +of *padding* when the length requested is longer than one, or both, +arguments. This feels unnecessary, and it isn't even clear exactly how we +should do this: for example, how would negative numbers be padded? + + +These complexities, and many more besides, are poor choices, owing more to the +legacy of C than any real useful functionality. Furthermore, they feel like a +casual and senseless undermining of type safety and its guarantees for very +small and questionable gains. Therefore, defining bitwise operations on +`BuiltinInteger` is not something we wish to support. + +There are legitimate cases where a conversion from `BuiltinInteger` to +`BuiltinByteString` is desirable; this conversion should be provided, and +be both explicit and specified in a way that is independent of the machine or +the implementation of `BuiltinInteger`, as well as total and +round-tripping. Arguably, it is also desirable to provide built-in support for +`BuiltinByteString` literals specified in a way convenient to their +treatment as blobs of bytes (for example, hexadecimal or binary notation), but +this is outside the scope of this proposal. + + +# Specification + +## Proposed operations + +We propose several classes of operations. Firstly, we propose two operations for +inter-conversion between `BuiltinByteString` and `BuiltinInteger`: + +```haskell +integerToByteString :: BuiltinInteger -> BuiltinByteString +``` +Convert a number to a bitwise representation. + +--- +```haskell +byteStringToInteger :: BuiltinByteString -> BuiltinInteger +``` +Reinterpret a bitwise representation as a number. + + +--- +We also propose several logical operations on `BuiltinByteString`s: + + +```haskell +andByteString :: BuiltinByteString -> BuiltinByteString -> BuiltinByteString +``` +Perform a bitwise logical AND on arguments of the same +length, producing a result of the same length, erroring otherwise. + +--- +```haskell +iorByteString :: BuiltinByteString -> BuiltinByteString -> BuiltinByteString +``` +Perform a bitwise logical IOR on arguments of the same +length, producing a result of the same length, erroring otherwise. + +--- +```haskell +xorByteString :: BuiltinByteString -> BuiltinByteString -> BuiltinByteString +``` +Perform a bitwise logical XOR on arguments of the same +length, producing a result of the same length, erroring otherwise. + +--- +```haskell +complementByteString :: BuiltinByteString -> BuiltinByteString +``` +Complement all the bits in the argument, producing a +result of the same length. + + +--- +Lastly, we define the following additional operations: + + +```haskell +shiftByteString :: BuiltinByteString -> BuiltinInteger -> BuiltinByteString +``` +Performs a bitwise shift of the first argument by the +absolute value of the second argument, with padding, the direction being +indicated by the sign of the second argument. 
+ +--- +```haskell +rotateByteString :: BuiltinByteString -> BuiltinInteger -> BuiltinByteString +``` +Performs a bitwise rotation of the first argument by +the absolute value of the second argument, the direction being indicated by +the sign of the second argument. + +--- +```haskell +popCountByteString :: BuiltinByteString -> BuiltinInteger +``` +Returns the number of $1$ bits in the argument. + +--- +```haskell +testBitByteString :: BuiltinByteString -> BuiltinInteger -> BuiltinBool +``` +If the position given by the second argument is not in +bounds for the first argument, error; otherwise, if the bit given by that +position is $1$, return `True`, and `False` otherwise. + +--- +```haskell +writeBitByteString :: BuiltinByteString -> BuiltinInteger -> BuiltinBool -> BuiltinByteString +``` +If the position given by the second +argument is not in bound for the first argument, error; otherwise, set the +bit given by that position to $1$ if the third argument is `True`, +and $0$ otherwise. + +--- +```haskell +findFirstSetByteString :: BuiltinByteString -> BuiltinInteger +``` +Return the lowest index such that `testBitByteString` with the first +argument and that index would be `True`. If no such index exists, +return $-1$ instead. + + +## Semantics + +### Preliminaries + +We define $\mathbb{N}^{+} = \{ x \in \mathbb{N} \mid x \neq 0 \}$. We assume +that `BuiltinInteger` is a faithful representation of $\mathbb{Z}$. A +**bit sequence** $s = s_n s_{n-1} \ldots s_0$ is a sequence such that for +all $i \in \{0,1,\ldots,n\}$, $s_i \in \{0, 1\}$. A bit sequence $s = s_n s_{n-1} \ldots s_0$ is a **byte sequence** if $n = 8k - 1$ for some $k \in \mathbb{N}$. We denote the **empty bit sequence** (and, indeed, byte sequence +as well) by $\emptyset$. + +We intend that `BuiltinByteString`s represent byte sequences, with the +sequence of bits being exactly as the description above. For example, given the +byte sequence `0110111100001100`, the `BuiltinByteString` +corresponding to it would be `o\f`. + + +Let $i \in \mathbb{N}^{+}$. We define the sequence $\mathtt{binary}(i) = (d_0, m_0), (d_1, m_1), \ldots$ as + +- $m_0 = i \mod 2$, $d_0 = \frac{i}{2}$ if $i$ is even, and $\frac{i - 1}{2}$ if it is odd. +- $m_j = d_{j - 1} \mod 2$, $d_j = \frac{d_{j-1}}{2}$ if $d_j$ is even, +and $\frac{d_{j-1} - 1}{2}$ if it is odd. + + +### Representation of `BuiltinInteger` as `BuiltinByteString` and conversions + +We describe the translation of `BuiltinInteger` into +`BuiltinByteString` which is implemented as the +`integerToByteString` primitive. Informally, we represent +`BuiltinInteger`s with the least significant bit at bit position $0$, +using a twos-complement representation. More precisely, let $i \in \mathbb{N}^{+}$. We represent $i$ as the bit sequence $s = s_n s_{n-1} \ldots s_0$, such that: + +- $\sum_{j \in \{0, 1, \ldots, n\}} s_j \cdot 2^j = i$; and +- $s_n = 0$. +- Let $\mathtt{binary}(j) = (d_0, m_0), (d_1, m_1), \ldots$. For any $j \in \{0, 1, \ldots, n - 1\}$, $s_j = m_j$; and +- $n + 1 = 8k$ for the smallest $k \in \mathbb{N}^{+}$ consistent with the previous requirements. + + +For $0$, we represent it as the sequence `00000000` (one zero byte). We +represent any $i \in \{ x \in \mathbb{Z} \mid x < 0 \}$ as the twos-complement +of the representation of its additive inverse. We observe that any such sequence +is by definition a byte sequence. 
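+
+To make the description above concrete, the following is a small executable
+sketch in ordinary Haskell, modelling the conversion over `Integer` and lists
+of `Word8`. It is for illustration only: the `integerToBytesModel` and
+`bytesToIntegerModel` names are ours, not part of the proposal, and the real
+primitives would operate on `BuiltinInteger` and `BuiltinByteString`.
+
+```haskell
+import Data.Bits (shiftR)
+import Data.Word (Word8)
+
+-- Encoding described above: big-endian twos-complement, padded to the
+-- smallest whole number of bytes in which the magnitude's leading bit is 0.
+integerToBytesModel :: Integer -> [Word8]
+integerToBytesModel i
+  | i >= 0    = toBytes i
+  | otherwise = toBytes (i `mod` (2 ^ (8 * k)))  -- twos-complement at the width of |i|
+  where
+    -- smallest byte count k such that |i| <= 2^(8k - 1) - 1
+    k = head [ k' | k' <- [1 ..], abs i <= 2 ^ (8 * k' - 1) - 1 ]
+    toBytes v = [ fromIntegral (v `shiftR` (8 * p)) | p <- [k - 1, k - 2 .. 0] ]
+
+-- The interpretation described next: a leading 1 bit marks a negative number.
+-- Assumes a non-empty list of bytes.
+bytesToIntegerModel :: [Word8] -> Integer
+bytesToIntegerModel ws
+  | head ws >= 0x80 = raw - 2 ^ (8 * length ws)
+  | otherwise       = raw
+  where
+    raw = foldl (\acc w -> acc * 256 + fromIntegral w) 0 ws
+```
+
+For example, `integerToBytesModel 128` gives `[0x00, 0x80]` (an extra sign
+byte is needed), `integerToBytesModel (-1)` gives `[0xFF]`, and
+`bytesToIntegerModel (integerToBytesModel i) == i` for every `i`.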
+
+To interpret a byte sequence $s = s_n s_{n - 1} \ldots s_0$ as a
+`BuiltinInteger`, we use the following process:
+
+- If $s$ is `00000000`, then the result is $0$.
+- Otherwise, if $s_n = 1$, let $s^{\prime}$ be the twos-complement of $s$. Then the result is the additive inverse of the result of interpreting $s^{\prime}$.
+- Otherwise, the result is $\sum_{i \in \{0, 1, \ldots, n\}} s_i \cdot 2^i$.
+
+The above interpretation is implemented as the `byteStringToInteger`
+primitive. We observe that `byteStringToInteger` and
+`integerToByteString` form an isomorphism. More specifically:
+
+```haskell
+byteStringToInteger . integerToByteString =
+integerToByteString . byteStringToInteger =
+id
+```
+
+### Bitwise logical operations on `BuiltinByteString`
+
+Throughout, let $s = s_n s_{n-1} \ldots s_0$ and $t = t_m t_{m - 1} \ldots t_0$ be two byte sequences. Whenever we
+specify a **mismatched length error** result, its error message must contain
+at least the following information:
+
+- The name of the failed operation;
+- The reason (mismatched lengths); and
+- The lengths of the arguments.
+
+We describe the semantics of `andByteString`. For inputs $s$ and $t$, if
+$n \neq m$, the result is a mismatched length error. Otherwise, the result is
+the byte sequence $u = u_n u_{n - 1} \ldots u_0$ such that for all $i \in \{0, 1, \ldots, n\}$ we have
+
+$$u_i = \begin{cases}
+1 & s_i = t_i = 1 \\
+0 & \text{otherwise}
+\end{cases}
+$$
+
+For `iorByteString`, for inputs $s$ and $t$, if $n \neq m$, the result is
+a mismatched length error. Otherwise, the result is the byte sequence $u = u_n u_{n - 1} \ldots u_0$ such that for all $i \in \{0, 1, \ldots, n\}$ we have
+
+$$u_i = \begin{cases}
+1 & s_i = 1 \\
+1 & t_i = 1 \\
+0 & \text{otherwise}
+\end{cases}
+$$
+
+For `xorByteString`, for inputs $s$ and $t$, if $n \neq m$, the result is
+a mismatched length error. Otherwise, the result is the byte sequence $u = u_n u_{n-1} \ldots u_0$ such that for all $i \in \{0, 1, \ldots, n\}$ we have
+
+$$u_i = \begin{cases}
+0 & s_i = t_i \\
+1 & \text{otherwise}
+\end{cases}
+$$
+
+We observe that, for length-matched arguments, each of `andByteString`,
+`iorByteString` and `xorByteString` describes a commutative and
+associative operation. Furthermore, for any given length $k$, each of these
+operations has an identity element: for `iorByteString` and
+`xorByteString`, this is the bit sequence of length $k$ where each element
+is $0$, while for `andByteString`, it is the bit sequence of length $k$
+where each element is $1$. Lastly, for any length $k$, the bit sequence of length $k$ where
+each element is $0$ is an absorbing element for `andByteString`, and the
+bit sequence of length $k$ where each element is $1$ is an absorbing element for
+`iorByteString`.
+
+We now describe the semantics of `complementByteString`. For input $s$,
+the result is the byte sequence $u = u_n u_{n - 1} \ldots u_0$ such that for all
+$i \in \{0, 1, \ldots, n\}$ we have
+
+$$u_i = \begin{cases}
+1 & s_i = 0 \\
+0 & \text{otherwise}
+\end{cases}
+$$
+
+We observe that `complementByteString` is self-inverting.
+We also note
+the following equivalences hold assuming `b` and `b'` have the
+same length; these are the DeMorgan laws:
+
+```haskell
+complementByteString (andByteString b b') =
+iorByteString (complementByteString b) (complementByteString b')
+
+complementByteString (iorByteString b b') =
+andByteString (complementByteString b) (complementByteString b')
+```
+
+### Mixed operations
+
+Throughout this section, let $s = s_n s_{n-1} \ldots s_0$ and $t = t_m t_{m - 1} \ldots t_0$ be byte sequences, and let $i \in \mathbb{Z}$.
+
+We describe the semantics of `shiftByteString`. Informally, these are logical
+shifts, with negative shifts *moving* away from bit index $0$, and positive
+shifts *moving* towards bit index $0$. More precisely, given the arguments
+$s$ and $i$, the result of `shiftByteString` is the byte sequence
+$u = u_n u_{n - 1} \ldots u_0$, such that for all $j \in \{0, 1, \ldots, n \}$, we have
+
+$$u_j = \begin{cases}
+s_{j + i} & j + i \in \{0, 1, \ldots, n \} \\
+0 & \text{otherwise}
+\end{cases}
+$$
+
+We observe that for $k, \ell$ with the same sign and any `bs`, we have
+
+```haskell
+shiftByteString (shiftByteString bs k) l = shiftByteString bs (k + l)
+```
+
+We now describe `rotateByteString`, assuming the same inputs as the
+description of `shiftByteString` above. Informally, the *direction* of
+the rotations matches that of `shiftByteString` above. More precisely,
+the result of `rotateByteString` on the given inputs is the byte sequence
+$u = u_n u_{n - 1} \ldots u_0$ such that for all $j \in \{0, 1, \ldots, n\}$, we
+have $u_j = s_{(j + i) \bmod (n + 1)}$. We observe that for any $k, \ell$, and any
+`bs`, we have
+
+```haskell
+rotateByteString (rotateByteString bs k) l = rotateByteString bs (k + l)
+```
+
+We also note that
+
+```haskell
+rotateByteString bs 0 = shiftByteString bs 0 = bs
+```
+
+For `popCountByteString` with argument $s$, the result is
+
+$$\sum_{j \in \{0, 1, \ldots, n\}} s_j
+$$
+
+Informally, this is just the total count of $1$ bits. We observe that
+for any `bs` and `bs'`, we have
+
+```haskell
+popCountByteString bs + popCountByteString bs' =
+popCountByteString (appendByteString bs bs')
+```
+
+We now describe the semantics of `testBitByteString` and
+`writeBitByteString`. Throughout, whenever we specify an **out-of-bounds error** result, its error message must contain at least the
+following information:
+
+- The name of the failed operation;
+- The reason (out of bounds access);
+- What index was accessed out-of-bounds; and
+- The valid range of indexes.
+
+For `testBitByteString` with arguments $s$ and $i$, if $0 \leq i \leq n$,
+then the result is `True` if $s_i = 1$, and `False` if $s_i = 0$;
+otherwise, the result is an out-of-bounds error. Let `b :: BuiltinBool`;
+for `writeBitByteString` with arguments $s$, $i$ and `b`, if $0 \leq i \leq n$, then the result is the byte sequence $u = u_n u_{n - 1} \ldots u_0$
+such that for all $j \in \{0, 1, \ldots, n\}$, we have
+
+$$u_j = \begin{cases}
+1 & i = j \text{ and } \texttt{b} = \texttt{True} \\
+0 & i = j \text{ and } \texttt{b} = \texttt{False} \\
+s_j & \text{otherwise}
+\end{cases}
+$$
+
+If $i < 0$ or $i > n$, the result is an out-of-bounds error.
+
+Lastly, we describe the semantics of `findFirstSetByteString`. Given the
+argument $s$, if $s_j = 0$ for all $j \in \{0, 1, \ldots, n \}$, the result is
+$-1$; otherwise, the result is $k$ such that all of the following hold:
+
+- $k \in \{0, 1, \ldots, n\}$;
+- $s_k = 1$; and
+- For all $0 \leq k^{\prime} < k$, $s_{k^{\prime}} = 0$.
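+
+To make the bit-numbering convention above easier to check, here is a small
+executable model of three of these operations in ordinary Haskell over
+`Data.ByteString`. This is a reference sketch (useful, for example, for
+property-testing an implementation), not the proposed implementation; the
+`...Model` names are ours.
+
+```haskell
+import qualified Data.ByteString as BS
+import Data.Bits (popCount, testBit)
+
+-- Bit i is bit (i mod 8) of byte (length - 1 - i div 8): that is, bit 0 is
+-- the least significant bit of the final byte, as in the convention above.
+testBitModel :: BS.ByteString -> Int -> Maybe Bool
+testBitModel bs i
+  | i < 0 || i >= 8 * BS.length bs = Nothing  -- the builtin would raise an out-of-bounds error
+  | otherwise =
+      let (byteIx, bitIx) = i `quotRem` 8
+       in Just (testBit (BS.index bs (BS.length bs - 1 - byteIx)) bitIx)
+
+-- Total count of 1 bits; independent of bit ordering.
+popCountModel :: BS.ByteString -> Int
+popCountModel = BS.foldl' (\acc w -> acc + popCount w) 0
+
+-- Lowest set bit index, or -1 if no bit is set.
+findFirstSetModel :: BS.ByteString -> Int
+findFirstSetModel bs =
+  case [ j | j <- [0 .. 8 * BS.length bs - 1], testBitModel bs j == Just True ] of
+    []      -> -1
+    (j : _) -> j
+```
+
+For the example byte sequence `0110111100001100` given earlier (that is,
+`BS.pack [0x6F, 0x0C]`), `findFirstSetModel` returns `2` and `popCountModel`
+returns `8`.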
+ +### Costing + +All of the primitives we describe are linear in one of their arguments. For a +more precise description, see the table below. + +Primitive | Linear in +--- | --- +`integerToByteString` | Argument (only one) +`byteStringToInteger` | Argument (only one) +`andByteString` | One argument (same length for both) +`iorByteString` | One argument (same length for both) +`xorByteString` | One argument (same length for both) +`complementByteString` | Argument (only one) +`shiftByteString` | `BuiltinByteString` argument +`rotateByteString` | `BuiltinByteString` argument +`popCountByteString` | Argument (only one) +`testBitByteString` | `BuiltinByteString` argument +`writeBitByteString` | `BuiltinByteString` argument +`findFirstSetByteString` | Argument (only one) + +Primitives and which argument they are linear in + + +# Rationale + +## Why these operations? + +There needs to be a well-defined +interface between the *world* of `BuiltinInteger` and +`BuiltinByteString`. To provide this, we require +`integerToByteString` and `byteStringToInteger`, which is designed +to roundtrip (that is, describe an isomorphism). Furthermore, by spelling out a +precise description of the conversions, +we make this predictable and portable. + +Our choice of logical AND, IOR, XOR and complement as the primary logical +operations is driven by a mixture of prior art, utility and convenience. These +are the typical bitwise logical operations provided in hardware, and in most +programming languages; for example, in the x86 instruction set, the following +bitwise operations have existed since the 8086: + +- `AND`: Bitwise AND. +- `OR`: Bitwise IOR. +- `NOT`: Bitwise complement. +- `XOR`: Bitwise XOR. + + +Likewise, on the ARM instruction set, the following bitwise operations have +existed since ARM2: + +- `AND`: Bitwise AND. +- `ORR`: Bitwise IOR. +- `EOR`: Bitwise XOR. +- `ORN`: Bitwise IOR with complement of the second argument. +- `BIC`: Bitwise AND with complement of the second argument. + + +Going *up a level*, the C and Forth programming languages (according to C89 and +ANS Forth respectively) define bitwise AND (denoted `\&` and +`AND` respectively), bitwise IOR (denoted `|` and `OR` +respectively), bitwise XOR (denoted ` \^` and `XOR` respectively) +and bitwise complement (denoted ` \~` and `NOT` respectively) as +the primitive bitwise operations. This is followed by basically all languages +*higher-up* than C and Forth: Haskell's `Bits` type class defines these +same four as `.&.`, `.|.`, `xor` and `complement`. + +This ubiquity in choices leads to most algorithm descriptions that rely on +bitwise operations to assume that these four are primitive, and thus, +constant-time and cost. While we could reduce this number +(and, in fact, due to Post, we know that there exist two **sole** sufficient +operators), this would be both inconvenient and inefficient. As an example, +consider implementing XOR using AND, IOR and complement: this would translate +$x \text{ XOR } y$ into + +$$(\text{COMPLEMENT } x \text{ AND } y) \text{ IOR } (x \text{ AND COMPLEMENT } +y) +$$ + +This is both needlessly complex and also inefficient, as it requires copying the +arguments twice, only to throw away both copies. + +Like our *baseline* bitwise operations above, shifts and rotations are widely +used, and considered as primitive. For example, x86 platforms have had the +following available since the 8086: + +- `RCL`: Rotate left. +- `RCR`: Rotate right. +- `SHL`: Shift left. +- `SHR`: Shift right. 
+ + +Likewise, ARM platforms have had the following available since ARM2: + +- `ROR`: Rotate right. +- `LSL`: Shift left. +- `LSR`: Shift right. + + +While C and Forth both have shifts (denoted with `<<` and `>>` in +C, and `LSHIFT` and `RSHIFT` in Forth), they don't have rotations; +however, many higher-level languages do: Haskell's `Bits` type class has +`rotate`, which enables both left and right rotations. + +While `popCountByteString` could in theory be simulated using +`testBitByteString` and a fold, this is quite inefficient: the best way +to simulate this operation would involve using something similar to the +Harley-Seal algorithm, which requires a large lookup table, making it +impractical on-chain. Furthermore, population counting is important for several +classes of succinct data structure (particularly rank-select dictionaries and +bitmaps), and is in fact provided as part of the `SSE4.2` x86 instruction +set as a primitive `POPCNT`. + +In order to usefully manipulate individual bits, both `testBitByteString` +and `writeBitByteString` are needed. They can also be used as part of +specifying, and verifying, that other bitwise operations, both primitive and +non-primitive, are behaving correctly. They are also particularly essential for +binary encodings. + +`findFirstSetByteString` is an essential primitive for several succinct +data structures: both Roaring Bitmaps and rank-select dictionaries rely on it +being efficient for much of their usefulness. Furthermore, this operation is +provided in hardware by several instruction sets: on x86, there exist (at least) +`BSF`, `BSR`, `LZCNT` and `TZCNT`, which allow +finding both the first **and** last set bits, while on ARM, there exists +`CLZ`, which can be used to simulate finding the first set bit. The +instruction also exists in higher-level languages: for example, GHC's +`FiniteBits` type class has `countTrailingZeros` and +`countLeadingZeros`. The main reason we propose taking *finding the first set bit* as primitive, rather than *counting leading zeroes* or *counting trailing zeroes* is that finding the first set bit is required specifically for +several succinct data structures. + + +## On-chain vectors + +For linear structures on-chain, we are currently limited to `BuiltinList` +and `BuiltinMap`, which don't allow constant-time indexing. This is a +significant restriction, especially when many data structures and algorithms +rely on the broad availability of a constant-time-indexable linear structure, +such as a C array or Haskell `Vector`. While we could introduce a +primitive of this sort, this is a significant undertaking, and would require +both implementing and costing a possibly large API. + +While for variable-length data, we don't have any alternatives if constant-time +indexing is a goal, for fixed-length (or limited-length at least) data, there is +a possibility, based on a similar approach taken by the `finitary` +library. Essentially, given finitary data, we can transform any item into a +numerical index, which is then stored by embedding into a byte array. As the +indexes are of a fixed maximum size, this can be done efficiently, but only if +there is a way of converting indices into bitstrings, and vice versa. Such a +construction would allow using a (wrapper around) `BuiltinByteString` as +a constant-time indexable structure of any finitary type. This is not much of a +restriction in practice, as on-chain, fixed-width or size-bounded types are +preferable due to the on-chain size limit. 
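+
+As a sketch of this idea (ordinary Haskell over `Data.ByteString`; the width
+`w` and the `indexFixed` name are assumptions for illustration, and turning
+elements into `w`-byte indices is the role the proposed conversions would
+play):
+
+```haskell
+import qualified Data.ByteString as BS
+
+-- Constant-time indexing into a 'vector' of fixed-width encodings: element j
+-- of the packed byte string occupies bytes [w * j, w * (j + 1)). Slicing a
+-- strict ByteString is O(1), so no traversal of earlier elements is needed.
+indexFixed :: Int -> BS.ByteString -> Int -> BS.ByteString
+indexFixed w packed j = BS.take w (BS.drop (w * j) packed)
+```
+
+On chain, the analogous slicing could presumably be done with the existing
+`sliceByteString` primitive; the missing piece, as noted below, is the
+conversion between such fixed-width slices and `BuiltinInteger` indices.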
+ +Currently, all the pieces to make this work already exist: the only missing +piece is the ability to convert indices (which would have to be +`BuiltinInteger`s) into bit strings (which would have to be +`BuiltinByteString`s) and back again. With this capability, it would be +possible to use these techniques to implement something like an array or vector +without new primitive data types. + + +# Backwards compatibility + +# Path to Active + +# Copyright + +Apache-2.0 From 415e8931e34585be731ec0cf6aa1da16d0646a3d Mon Sep 17 00:00:00 2001 From: Koz Ross Date: Thu, 23 Jun 2022 09:37:46 +1200 Subject: [PATCH 02/11] Notation fix-ups --- CIP-?/README.md | 453 +++++++++++++++++++++++------------------------- 1 file changed, 219 insertions(+), 234 deletions(-) diff --git a/CIP-?/README.md b/CIP-?/README.md index 2273ccc702..d4fef1eddf 100644 --- a/CIP-?/README.md +++ b/CIP-?/README.md @@ -89,7 +89,6 @@ iteration, which amounts to over four thousand potential stored integers. - Set union is bitwise inclusive or. - Set symmetric difference is bitwise exclusive or. - A potential implementation could use a range of techniques to make these operations extremely efficient, by relying on SWAR (SIMD-within-a-register) techniques if portability is desired, and SIMD instructions for maximum speed. @@ -103,7 +102,7 @@ used as an example of the possible gains. In order to make such techniques viable, bitwise primitives are mandatory. Furthermore, succinct data structures are not limited to sets of integers, but -**all** require bitwise operations to be implementable. +*all* require bitwise operations to be implementable. ## Binary representations and encodings @@ -111,7 +110,7 @@ On-chain, space is at a premium. One way that space can be saved is with binary representations, which can potentially represent something much closer to the entropy limit, especially if the structure or value being represented has significant redundant structure. While some possibilities for a more efficient -*packing* already exist in the form of `BuiltinData`, it is rather +'packing' already exist in the form of `BuiltinData`, it is rather idiosyncratic to the needs of Plutus, and its decoding is potentially quite costly. @@ -121,6 +120,34 @@ where complex structures or values are represented using fixed-size implemented more efficiently than currently possible, as there exist numerous bitwise techniques for this. +## On-chain vectors + +For linear structures on-chain, we are currently limited to `BuiltinList` +and `BuiltinMap`, which don't allow constant-time indexing. This is a +significant restriction, especially when many data structures and algorithms +rely on the broad availability of a constant-time-indexable linear structure, +such as a C array or Haskell `Vector`. While we could introduce a primitive +data type like this, doing so would be a significant undertaking, and would +require both implementing and costing a large API. + +While for variable-length data, we don't have any alternatives if constant-time +indexing is a goal, for fixed-length (or limited-length at least) data, there is +a possibility, based on a similar approach taken by the `finitary` +library. Essentially, given finitary data, we can transform any item into a +numerical index, which is then stored by embedding into a byte array. As the +indexes are of a fixed maximum size, this can be done efficiently, but only if +there is a way of converting indices into bitstrings, and vice versa. 
Such a +construction would allow using a (wrapper around) `BuiltinByteString` as +a constant-time indexable structure of any finitary type. This is not much of a +restriction in practice, as on-chain, fixed-width or size-bounded types are +preferable due to the on-chain size limit. + +Currently, all the pieces to make this work already exist: the only missing +piece is the ability to convert indices (which would have to be +`BuiltinInteger`s) into bit strings (which would have to be +`BuiltinByteString`s) and back again. With this capability, it would be +possible to use these techniques to implement something like an array or vector +without new primitive data types. ## Goals @@ -140,15 +167,15 @@ algebraic laws, dating back to important results by the like of de Morgan, Post and many others. These algebraic laws are useful for a range of reasons: they guide implementations, enable easier testing (especially property testing) and in some cases much more efficient implementations. To some extent, they also -formalize our intuition about how these operations *should work*. Thus, +formalize our intuition about how these operations 'should work'. Thus, maintaining as many of these laws in our implementation, and being clear about them, is important. ### Allowing efficient, portable implementations Providing primitives alone is not enough: they should also be efficient. This is -not least of all because many would associate *primitive operation* with a -notion of being *close to the machine*, and therefore fast. Thus, it is on us to +not least of all because many would associate 'primitive operation' with a +notion of being 'close to the machine', and therefore fast. Thus, it is on us to ensure that the implementations of the primitives we provide have to be implementable in an efficient way, across a range of hardware. @@ -169,7 +196,7 @@ We also specify some specific non-goals of this proposal. A widespread legacy of C is the mixing of treatment of numbers and blobs of bits: specifically, the allowing of logical operations on representations of numbers. This applies to Haskell as much as any other language: according to the -Haskell Report, it is in fact **required** that any type implementing +Haskell Report, it is in fact *required* that any type implementing `Bits` implement `Num` first. While GHC Haskell only mandates `Eq`, it still defines `Bits` instances for types clearly meant to represent numbers. This is a bad choice, as it creates complex situations and @@ -184,7 +211,7 @@ problems: - Some operations end up needing multiple definitions to take this into account. A good example are shifts: instead of simply having left or right -shifts, we now have to distinguish **arithmetic** versus **logical** +shifts, we now have to distinguish *arithmetic* versus *logical* shifts, simply to take into account that a shift can be used on something which is meant to be a number, which could be signed. This creates unnecessary complexity and duplication of operations. @@ -194,7 +221,7 @@ complement: the bitwise complement of $0$ cannot be defined sensibly, and in fact, is partial in its `Bits` instance. - Certain bitwise operations on `BuiltinInteger` would have quite undesirable semantic changes in order to be implementable. A good example -are bitwise rotations: we should be able to *decompose* a rotation left or +are bitwise rotations: we should be able to 'decompose' a rotation left or right by $n$ into two rotations (by $m_1$ and $m_2$ such that $m_1 + m_2 = n$) without changing the outcome. 
However, because trailing zeroes are not tracked by the implementation, this can fail depending on the choice of @@ -202,12 +229,11 @@ decomposition, which seems needlessly annoying for no good reason. - Certain bitwise operations on `BuiltinInteger` would require additional arguments and padding to define them sensibly. Consider bitwise logical AND: in order to perform this sensibly on `BuiltinInteger`s -we would need to specify what *length* we assume they have, and some policy -of *padding* when the length requested is longer than one, or both, +we would need to specify what 'length' we assume they have, and some policy +of 'padding' when the length requested is longer than one, or both, arguments. This feels unnecessary, and it isn't even clear exactly how we should do this: for example, how would negative numbers be padded? - These complexities, and many more besides, are poor choices, owing more to the legacy of C than any real useful functionality. Furthermore, they feel like a casual and senseless undermining of type safety and its guarantees for very @@ -223,7 +249,6 @@ round-tripping. Arguably, it is also desirable to provide built-in support for treatment as blobs of bytes (for example, hexadecimal or binary notation), but this is outside the scope of this proposal. - # Specification ## Proposed operations @@ -242,11 +267,9 @@ byteStringToInteger :: BuiltinByteString -> BuiltinInteger ``` Reinterpret a bitwise representation as a number. - --- We also propose several logical operations on `BuiltinByteString`s: - ```haskell andByteString :: BuiltinByteString -> BuiltinByteString -> BuiltinByteString ``` @@ -274,16 +297,14 @@ complementByteString :: BuiltinByteString -> BuiltinByteString Complement all the bits in the argument, producing a result of the same length. - --- Lastly, we define the following additional operations: - ```haskell shiftByteString :: BuiltinByteString -> BuiltinInteger -> BuiltinByteString ``` Performs a bitwise shift of the first argument by the -absolute value of the second argument, with padding, the direction being +absolute value of the second argument, with zero padding, the direction being indicated by the sign of the second argument. --- @@ -312,184 +333,176 @@ position is $1$, return `True`, and `False` otherwise. ```haskell writeBitByteString :: BuiltinByteString -> BuiltinInteger -> BuiltinBool -> BuiltinByteString ``` -If the position given by the second -argument is not in bound for the first argument, error; otherwise, set the -bit given by that position to $1$ if the third argument is `True`, -and $0$ otherwise. +If the position given by the second argument is not in bound for the first +argument, error; otherwise, set the bit given by that position to $1$ if the +third argument is `True`, and $0$ otherwise. --- ```haskell findFirstSetByteString :: BuiltinByteString -> BuiltinInteger ``` -Return the lowest index such that `testBitByteString` with the first -argument and that index would be `True`. If no such index exists, -return $-1$ instead. - +Return the lowest index such that `testBitByteString` with the first argument +and that index would be `True`. If no such index exists, return $-1$ instead. ## Semantics ### Preliminaries -We define $\mathbb{N}^{+} = \{ x \in \mathbb{N} \mid x \neq 0 \}$. We assume +We define $\mathbb{N}^{+} = \\{ x \in \mathbb{N} \mid x \neq 0 \\}$. We assume that `BuiltinInteger` is a faithful representation of $\mathbb{Z}$. 
A -**bit sequence** $s = s_n s_{n-1} \ldots s_0$ is a sequence such that for -all $i \in \{0,1,\ldots,n\}$, $s_i \in \{0, 1\}$. A bit sequence $s = s_n s_{n-1} \ldots s_0$ is a **byte sequence** if $n = 8k - 1$ for some $k \in \mathbb{N}$. We denote the **empty bit sequence** (and, indeed, byte sequence -as well) by $\emptyset$. +*bit sequence* $s = s_n s_{n-1} \ldots s_0$ is a sequence such that for +all $i \in \\{ 0,1,\ldots,n \\}$, $s_i \in \\{ 0, 1 \\}$. A bit sequence +$s = s_n s_{n-1} \ldots s_0$ is a *byte sequence* if $n = 8k - 1$ for some +$k \in \mathbb{N}$. We denote the *empty bit sequence* (and, indeed, empty byte +sequence as well) by $\emptyset$. -We intend that `BuiltinByteString`s represent byte sequences, with the -sequence of bits being exactly as the description above. For example, given the -byte sequence `0110111100001100`, the `BuiltinByteString` -corresponding to it would be `o\f`. +We assume that `BuiltinByteString`s represent byte sequences, with the sequence +of bits being exactly as the description above. For example, given the byte +sequence `0110111100001100`, the corresponding `BuiltinByteString` would be +`"o\f"`. +Let $i \in \mathbb{N}^{+}$. +We define the sequence $\texttt{binary}(i) = (d_0, m_0), (d_1, m_1), \ldots$ as -Let $i \in \mathbb{N}^{+}$. We define the sequence $\mathtt{binary}(i) = (d_0, m_0), (d_1, m_1), \ldots$ as +- $m_0 = i \mod 2$, + $d_0 = \frac{i}{2}$ if $i$ is even, + and $\frac{i - 1}{2}$ otherwise. +- $m_j = d_{j - 1} \mod 2$, + $d_j = \frac{d_{j-1}}{2}$ if $d_j$ is even, + and $\frac{d_{j-1} - 1}{2}$ if it is odd. -- $m_0 = i \mod 2$, $d_0 = \frac{i}{2}$ if $i$ is even, and $\frac{i - 1}{2}$ if it is odd. -- $m_j = d_{j - 1} \mod 2$, $d_j = \frac{d_{j-1}}{2}$ if $d_j$ is even, -and $\frac{d_{j-1} - 1}{2}$ if it is odd. +Some examples follow. +- $\texttt{binary}(4) = (2, 0), (1, 0), (0, 1), (0, 0), (0, 0), \ldots$ +- $\texttt{binary}(17) = (8, 1), (4, 0), (2, 0), (1, 0), (0, 1), (0, 0), (0, 0), \ldots$ +- $\texttt{binary}(553) = (276, 1), (138, 0), (69, 0), (34, 1), (17, 0), (8, 1), (4, 0), (2, 0), (1, 0), (0, 1), (0, 0), (0, 0), \ldots$ ### Representation of `BuiltinInteger` as `BuiltinByteString` and conversions -We describe the translation of `BuiltinInteger` into -`BuiltinByteString` which is implemented as the -`integerToByteString` primitive. Informally, we represent -`BuiltinInteger`s with the least significant bit at bit position $0$, -using a twos-complement representation. More precisely, let $i \in \mathbb{N}^{+}$. We represent $i$ as the bit sequence $s = s_n s_{n-1} \ldots s_0$, such that: +We describe the translation of `BuiltinInteger` into `BuiltinByteString` which +is implemented as the `integerToByteString` primitive. Informally, we represent +`BuiltinInteger`s with the least significant bit at bit position $0$, using a +twos-complement representation. More precisely, let $i \in \mathbb{N}^{+}$. We +represent $i$ as the bit sequence $s = s_n s_{n-1} \ldots s_0$, such that: -- $\sum_{j \in \{0, 1, \ldots, n\}} s_j \cdot 2^j = i$; and +- $\sum_{j \in \\{0, 1, \ldots, n\\}} s_j \cdot 2^j = i$; and - $s_n = 0$. -- Let $\mathtt{binary}(j) = (d_0, m_0), (d_1, m_1), \ldots$. For any $j \in \{0, 1, \ldots, n - 1\}$, $s_j = m_j$; and -- $n + 1 = 8k$ for the smallest $k \in \mathbb{N}^{+}$ consistent with the previous requirements. - +- Let $\mathtt{binary}(i) = (d_0, m_0), (d_1, m_1), \ldots$. 
+ For any $j \in \\{0, 1, \ldots, n - 1\\}$, $s_j = m_j$; and +- $n + 1 = 8k$ for the smallest $k \in \mathbb{N}^{+}$ consistent with the + previous requirements. For $0$, we represent it as the sequence `00000000` (one zero byte). We -represent any $i \in \{ x \in \mathbb{Z} \mid x < 0 \}$ as the twos-complement -of the representation of its additive inverse. We observe that any such sequence -is by definition a byte sequence. +represent any $i \in \\{ x \in \mathbb{Z} \mid x < 0 \\}$ as the +twos-complement of the representation of its additive inverse. We observe that +any such sequence is by definition a byte sequence. To interpret a byte sequence $s = s_n s_{n - 1} \ldots s_0$ as a `BuiltinInteger`, we use the following process: - If $s$ is `00000000`, then the result is $0$. -- Otherwise, if $s_n = 1$, let $s^{\prime}$ be the twos-complement of $s$. Then the result is the additive inverse of the result of interpreting $s^{\prime}$. -- Otherwise, the result is $\sum_{i \in \{0, 1, \ldots, n\}} s_i \cdot 2^i$. +- Otherwise, if $s_n = 1$, let $s^{\prime}$ be the twos-complement of $s$. Then + the result is the additive inverse of the result of interpreting $s^{\prime}$. +- Otherwise, the result is $\sum_{i \in \\{0, 1, \ldots, n\\}} s_i \cdot 2^i$. + +We implement the above as the `byteStringToInteger` primitive. We observe that +`byteStringToInteger` and `integerToByteString` form an isomorphism. More +specifically, we have: +```haskell +byteStringToInteger . integerToByteString = id +``` -The above interpretation is implemented as the `byteStringToInteger` -primitive. We observe that `byteStringToInteger` and -`integerToByteString` form an isomorphism. More specifically: +and ```haskell -byteStringToInteger . integerToByteString = -integerToByteString . byteStringToInteger = -id +integerToByteString . byteStringToInteger = id ``` + ### Bitwise logical operations on `BuiltinByteString` -Throughout, let $s = s_n s_{n-1} \ldots s_0$ and $t = t_m t_{m - 1} \ldots t_0$ be two byte sequences. Whenever we -specify a **mismatched length error** result, its error message must contain -at least the following information: +Throughout, let $s = s_n s_{n-1} \ldots s_0$ and +$t = t_m t_{m - 1} \ldots t_0$ be two byte sequences. Whenever we specify a +*mismatched length error* result, its error message must contain at least the +following information: - The name of the failed operation; - The reason (mismatched lengths); and - The lengths of the arguments. - - We describe the semantics of `andByteString`. For inputs $s$ and $t$, if $n \neq m$, the result is a mismatched length error. Otherwise, the result is -the byte sequence $u = u_n u_{n - 1} \ldots, u_0$ such that for all $i \in \{0, 1, \ldots, n\}$ we have - -$$u_i = \begin{cases} -1 & s_i = t_i = 1 -0 & \text{otherwise} -\end{cases} -$$ - -For `iorByteString`, for inputs $s$ and $t$, if $n \neq m$, the result is -a mismatched length error. Otherwise, the result is the byte sequence $u = u_n u_{n - 1} \ldots u_0$ such that for all $i \in \{0, 1, \ldots, n\}$ we have - -$$u_i = \begin{cases} -1 & s_i = 1 -1 & t_i = 1 -0 & \text{otherwise} -\end{cases} -$$ +the byte sequence $u = u_n u_{n - 1} \ldots, u_0$ such that for all +$i \in \\{0, 1, \ldots, n\\}$ we have $u_i = 1$ if $s_i = t_i = 1$, and $0$ +otherwise. -For `xorByteString`, for inputs $s$ and $t$, if $n \neq m$, the result is -a mismatched length error. 
Otherwise, the result is the byte sequence $u = u_n u_{n-1} \ldots u_0$ such that for all $i \in \{0, 1, \ldots, n\}$ we have +For `iorByteString`, for inputs $s$ and $t$, if $n \neq m$, the result is a +mismatched length error. Otherwise, the result is the byte sequence +$u = u_n u_{n - 1} \ldots u_0$ such that for all $i \in \\{0, 1, \ldots, n\\}$ +we have $u_i = 1$ if at least one of $s_i, t_i$ is $1$, and $0$ otherwise. -$$u_i = \begin{cases} -0 & s_i = t_i -1 & \text{otherwise} -\end{cases} -$$ +For `xorByteString`, for inputs $s$ and $t$, if $n \neq m$, the result is a +mismatched length error. Otherwise, the result is the byte sequence +$u = u_n u_{n-1} \ldots u_0$ such that for all $i \in \\{0, 1, \ldots, n\\}$ +we have $u_i = 0$ if $s_i = t_i$, and $1$ otherwise. We observe that, for length-matched arguments, each of `andByteString`, -`iorByteString` and `xorByteString` describes a commutative and -associative operation. Furthermore, for any given length $k$, each of these -operations have an identity element: for `iorByteString`, this is the bit -sequence of length $k$ where each element is $0$, and for `andByteString` -and `xorByteString`, this is the bit sequence of length $k$ where each -element is $1$. Lastly, for any length $k$, the bit sequence of length $k$ where -each element is $0$ is an absorbing element for `andByteString`, and the -bit sequence of length $k$ where each element is $1$ is an absorbing element for -`iorByteString`. - +`iorByteString` and `xorByteString` describes a commutative and associative +operation. Furthermore, for any given length $k$, each of these operations has +an identity element: for `iorByteString`, this is the bit sequence of length +$k$ where each element is $0$, and for `andByteString` and `xorByteString`, +this is the bit sequence of length $k$ where each element is $1$. Lastly, for +any length $k$, the bit sequence of length $k$ where each element is $0$ is an +absorbing element for `andByteString`, and the bit sequence of length $k$ +where each element is $1$ is an absorbing element for `iorByteString`. We now describe the semantics of `complementByteString`. For input $s$, the result is the byte sequence $u = u_n u_{n - 1} \ldots u_0$ such that for all -$i \in \{0, 1, \ldots, n\}$ we have - -$$u_i = \begin{cases} -1 & s_i = 0 -0 & \text{otherwise} -\end{cases} -$$ +$i \in \{0, 1, \ldots, n\}$ we have $u_i = 0$ if $s_i = 1$, and $1$ otherwise. We observe that `complementByteString` is self-inverting. We also note the following equivalences hold assuming `b` and `b'` have the same length; these are the DeMorgan laws: ```haskell -complementByteString (andByteString b b') = -iorByteString (complementByteString b) (complementByteString b') +complementByteString (andByteString b b') = iorByteString (complementByteString b) (complementByteString b') +``` -complementByteString (iorByteString b b') = -andByteString (complementByteString b) (complementByteString b') +```haskell +complementByteString (iorByteString b b') = andByteString (complementByteString b) (complementByteString b') ``` ### Mixed operations -Throughout this section, let $s = s_n s_{n-1} \ldots s_0$ and $t = t_m t_{m - 1} \ldots t_0$ be byte sequences, and let $i \in \mathbb{Z}$. - +Throughout this section, let $s = s_n s_{n-1} \ldots s_0$ and +$t = t_m t_{m - 1} \ldots t_0$ be byte sequences, and let $i \in \mathbb{Z}$. We describe the semantics of `shiftByteString`. 
Informally, these are logical -shifts, with negative shifts *moving* away from bit index $0$, and positive -shifts *moving* towards bit index $0$. More precisely, given the argument +shifts, with negative shifts moving *away* from bit index $0$, and positive +shifts moving *towards* bit index $0$. More precisely, given the arguments $s$ and $i$, the result of `shiftByteString` is the byte sequence -$u_n u_{n - 1} \ldots u_0$, such that for all $j \in \{0, 1, \ldots, n \}$, we have +$u = u_n u_{n - 1} \ldots u_0$, such that for all +$j \in \\{0, 1, \ldots, n \\}$, we have $u_j = s_{j + i}$ if +$j - i \in \\{0, 1, \ldots, n \\}$, and $0$ otherwise. -$$u_j = \begin{cases} -s_{j + i} & j - i \in \{0, 1, \ldots, n \} -0 & \text{otherwise} -\end{cases} -$$ +Let $k, \ell \in \mathbb{Z}$ +such that either +$k$ or $\ell$ is $0$, or +$k$ and $\ell$ have the same sign. +We observe that, for any `bs`, we have -We observe that for $k, \ell$ with the same sign and any `bs`, we have ```haskell shiftByteString (shiftBytestring bs k) l = shiftByteString bs (k + l) ``` -We now describe `rotateByteString`, assuming the same inputs as the -description of `shiftByteString` above. Informally, the *direction* of -the rotations matches that of `shiftByteString` above. More precisely, -the result of `rotateByteString` on the given inputs is the byte sequence -$u_n u_{n - 1} \ldots u_0$ such that for all $j \in \{0, 1, \ldots, n\}$, we -have $u_j = s_{j + i \mod (n + 1)}$. We observe that for any $k, \ell$, and any +We now describe `rotateByteString`, assuming the same inputs as the description +of `shiftByteString` above. Informally, the 'direction' of rotations matches +that of `shiftByteString` above. More precisely, given then arguments $s$ and +$i$, the result of `rotateByteString` is the byte sequence +$u = u_n u_{n - 1} \ldots u_0$ such that for all $j \in \\{0, 1, \ldots, n\\}$, +we have $u_j = s_{j + i \mod (n + 1)}$. We observe that for any $k, \ell$, and any `bs`, we have ```haskell @@ -504,47 +517,43 @@ rotateByteString bs 0 = shiftByteString bs 0 = bs For `popCountByteString` with argument $s$, the result is -$$\sum_{j \in \{0, 1, \ldots, n\}} s_j -$$ +$$\sum_{j \in \\{0, 1, \ldots, n\\}} s_j$$ Informally, this is just the total count of $1$ bits. We observe that for any `bs` and `bs'`, we have ```haskell -popCountByteString bs + popCountByteString bs' = -popCountByteString (appendByteString bs bs') +popCountByteString bs + popCountByteString bs' = popCountByteString (appendByteString bs bs') ``` -We now describe the semantics of `testBitByteString` and -`writeBitByteString`. Throughout, whenever we specify an **out-of-bounds error** result, its error message must contain at least the -following information: +We now describe the semantics of `testBitByteString` and `writeBitByteString`. +Throughout, whenever we specify an *out-of-bounds error* result, its error +message must contain at least the following information: - The name of the failed operation; - The reason (out of bounds access); - What index was accessed out-of-bounds; and - The valid range of indexes. +For `testBitByteString` with arguments $s$ and $i$, if $0 \leq i \leq n$, then +the result is `True` if $s_i = 1$, and `False` if $s_i = 0$; otherwise, the +result is an out-of-bounds error. 
Let `b :: BuiltinBool`; for +`writeBitByteString` with arguments $s$, $i$ and `b`, if $0 \leq i \leq n$, +then the result is the byte sequence $u = u_n u_{n - 1} \ldots u_0$ such that +for all $j \in \{0, 1, \ldots, n\}$, we have: -For `testBitByteString` with arguments $s$ and $i$, if $0 \leq i \leq n$, -then the result is `True` if $s_i = 1$, and `False` if $s_i = 0$; -otherwise, the result is an out-of-bounds error. Let `b :: BuiltinBool`; -for `writeBitByteString` with arguments $s$, $i$ and `b`, if $0\leq i \leq n$, then the result is the byte sequence $u_n u_{n - 1} \ldots u_0$ -such that for all $j \in \{0, 1, \ldots, n\}$, we have - -$$u_j = \begin{cases} -1 & i = j \text{ and `b` } = \text{`True`} -0 & i = j \text{ and `b` } = \text{`False`} -s_j & \text{otherwise} -\end{cases} -$$ +- $u_j = 1$ when $i = j$ and `b == True`; +- $u_j = 0$ when $i = j$ and `b == False`; +- $u_j = s_j$ otherwise. -If $i < 0$ or $i > n$, the result is an out-of-bounds error. +For either `testBitByteString` or `writeBitByteString`, if $i < 0$ or $i > n$, +the result is an out-of-bounds error. Lastly, we describe the semantics of `findFirstSetByteString`. Given the -argument $s$, if for any $j \in \{0, 1, \ldots, n \}$, $s_j = 0$, the result is -$-1$; otherwise, the result is $k$ such that all of the following hold: +argument $s$, if for any $j \in \\{0, 1, \ldots, n \\}$, $s_j = 0$, the result +is $-1$; otherwise, the result is $k$ such that all of the following hold: -- $k \in \{0, 1, \ldots, n\}$; +- $k \in \\{0, 1, \ldots, n\\}$; - $s_k = 1$; and - For all $0 \leq k^{\prime} < k$, $s_{k^{\prime}} = 0$. @@ -568,22 +577,17 @@ Primitive | Linear in `writeBitByteString` | `BuiltinByteString` argument `findFirstSetByteString` | Argument (only one) -Primitives and which argument they are linear in - - # Rationale ## Why these operations? -There needs to be a well-defined -interface between the *world* of `BuiltinInteger` and -`BuiltinByteString`. To provide this, we require -`integerToByteString` and `byteStringToInteger`, which is designed -to roundtrip (that is, describe an isomorphism). Furthermore, by spelling out a -precise description of the conversions, -we make this predictable and portable. +There needs to be a well-defined interface between the 'world' of +`BuiltinInteger` and `BuiltinByteString`. To provide this, we require +`integerToByteString` and `byteStringToInteger`, which are designed to roundtrip +(that is, describe two halves of an isomorphism). Furthermore, by spelling out +a precise description of the conversions, we make this predictable and portable. -Our choice of logical AND, IOR, XOR and complement as the primary logical +Our choice of logical AND, IOR, XOR and complement as the primary logical operations is driven by a mixture of prior art, utility and convenience. These are the typical bitwise logical operations provided in hardware, and in most programming languages; for example, in the x86 instruction set, the following @@ -594,7 +598,6 @@ bitwise operations have existed since the 8086: - `NOT`: Bitwise complement. - `XOR`: Bitwise XOR. - Likewise, on the ARM instruction set, the following bitwise operations have existed since ARM2: @@ -604,32 +607,35 @@ existed since ARM2: - `ORN`: Bitwise IOR with complement of the second argument. - `BIC`: Bitwise AND with complement of the second argument. 
+Going 'up a level', the C and Forth programming languages (according to C89 and +ANS Forth respectively) define bitwise AND (denoted `&` and `AND` +respectively), bitwise IOR (denoted `|` and `OR` respectively), bitwise XOR +(denoted ` ^` and `XOR` respectively) and bitwise complement (denoted `~` and +`NOT` respectively) as primitive bitwise operations. These choices are mirrored +by basically all 'high-level' languages; for example, Haskell's `Bits` type +class defines these same four operations as `.&.`, `.|.`, `xor` and `complement` +respectively. + +This ubiquity in choices leads to most algorithm descriptions that rely on +bitwise operations to assume that these specific four operations are +'primitive', implying that they are constant-time and constant-cost. While we +could reduce the number of primitive bitwise operations (and, in fact, due to +Post, we know that there exist two operations that can implement all of them), +this would be both inconvenient and inefficient. As an example, consider +implementing XOR using AND, IOR and complement: this would translate `x XOR y` +into -Going *up a level*, the C and Forth programming languages (according to C89 and -ANS Forth respectively) define bitwise AND (denoted `\&` and -`AND` respectively), bitwise IOR (denoted `|` and `OR` -respectively), bitwise XOR (denoted ` \^` and `XOR` respectively) -and bitwise complement (denoted ` \~` and `NOT` respectively) as -the primitive bitwise operations. This is followed by basically all languages -*higher-up* than C and Forth: Haskell's `Bits` type class defines these -same four as `.&.`, `.|.`, `xor` and `complement`. - -This ubiquity in choices leads to most algorithm descriptions that rely on -bitwise operations to assume that these four are primitive, and thus, -constant-time and cost. While we could reduce this number -(and, in fact, due to Post, we know that there exist two **sole** sufficient -operators), this would be both inconvenient and inefficient. As an example, -consider implementing XOR using AND, IOR and complement: this would translate -$x \text{ XOR } y$ into - -$$(\text{COMPLEMENT } x \text{ AND } y) \text{ IOR } (x \text{ AND COMPLEMENT } -y) -$$ +``` +(COMPLEMENT x AND y) IOR (x AND COMPLEMENT y) +``` -This is both needlessly complex and also inefficient, as it requires copying the -arguments twice, only to throw away both copies. +This is both needlessly complex, and also inefficient, as it requires copying +the arguments twice, only to then throw away both copies. This is less of a +concern if copying is 'cheap', but given that we need to operate on +variable-width data (specifically `BuiltinByteString`s), this seems needlessly +wasteful. -Like our *baseline* bitwise operations above, shifts and rotations are widely +Like our 'baseline' bitwise operations above, shifts and rotations are widely used, and considered as primitive. For example, x86 platforms have had the following available since the 8086: @@ -638,30 +644,28 @@ following available since the 8086: - `SHL`: Shift left. - `SHR`: Shift right. - Likewise, ARM platforms have had the following available since ARM2: - `ROR`: Rotate right. - `LSL`: Shift left. - `LSR`: Shift right. +While C and Forth both have shifts (denoted with `<<` and `>>` in C, and +`LSHIFT` and `RSHIFT` in Forth), they don't have rotations; however, many +higher-level languages do: Haskell's `Bits` type class has `rotate`, which +enables both left and right rotations. 
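For reference, this is how the same operations surface in GHC today (ordinary `Data.Bits` over a fixed-width `Word8`; this block is prior art only, not the proposed on-chain API):

```haskell
import Data.Bits ((.&.), (.|.), xor, complement, shiftL, rotate)
import Data.Word (Word8)

-- The four 'baseline' logical operations, plus a shift and a rotation, on Word8.
bitsPriorArt :: [Word8]
bitsPriorArt =
  [ 0x0F .&. 0x55      -- 0x05: bitwise AND
  , 0x0F .|. 0x55      -- 0x5F: bitwise IOR
  , 0x0F `xor` 0x55    -- 0x5A: bitwise XOR
  , complement 0x0F    -- 0xF0: bitwise complement
  , 0x0F `shiftL` 2    -- 0x3C: shift left by two bit positions
  , 0x0F `rotate` 6    -- 0xC3: rotate left by six bit positions
  ]
```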
-While C and Forth both have shifts (denoted with `<<` and `>>` in -C, and `LSHIFT` and `RSHIFT` in Forth), they don't have rotations; -however, many higher-level languages do: Haskell's `Bits` type class has -`rotate`, which enables both left and right rotations. - -While `popCountByteString` could in theory be simulated using -`testBitByteString` and a fold, this is quite inefficient: the best way -to simulate this operation would involve using something similar to the -Harley-Seal algorithm, which requires a large lookup table, making it +While `popCountByteString` could in theory be simulated using +`testBitByteString` and a fold, this is quite inefficient: the best way to +simulate this operation would involve using something similar to the +Harley-Seal algorithm, which requires a large lookup table, making it impractical on-chain. Furthermore, population counting is important for several classes of succinct data structure (particularly rank-select dictionaries and -bitmaps), and is in fact provided as part of the `SSE4.2` x86 instruction -set as a primitive `POPCNT`. +bitmaps), and is in fact provided as part of the `SSE4.2` x86 instruction set +as a primitive named `POPCNT`. In order to usefully manipulate individual bits, both `testBitByteString` -and `writeBitByteString` are needed. They can also be used as part of +and `writeBitByteString` are needed. They can also be used as part of specifying, and verifying, that other bitwise operations, both primitive and non-primitive, are behaving correctly. They are also particularly essential for binary encodings. @@ -670,49 +674,30 @@ binary encodings. data structures: both Roaring Bitmaps and rank-select dictionaries rely on it being efficient for much of their usefulness. Furthermore, this operation is provided in hardware by several instruction sets: on x86, there exist (at least) -`BSF`, `BSR`, `LZCNT` and `TZCNT`, which allow -finding both the first **and** last set bits, while on ARM, there exists -`CLZ`, which can be used to simulate finding the first set bit. The -instruction also exists in higher-level languages: for example, GHC's -`FiniteBits` type class has `countTrailingZeros` and -`countLeadingZeros`. The main reason we propose taking *finding the first set bit* as primitive, rather than *counting leading zeroes* or *counting trailing zeroes* is that finding the first set bit is required specifically for -several succinct data structures. +`BSF`, `BSR`, `LZCNT` and `TZCNT`, which allow finding both the first *and* +last set bits, while on ARM, there exists `CLZ`, which can be used to simulate +finding the first set bit. The instruction also exists in higher-level +languages: for example, GHC's `FiniteBits` type class has `countTrailingZeros` +and `countLeadingZeros`. The main reason we propose taking +'finding the first set bit' as primitive, rather than 'counting leading +zeroes' or 'counting trailing zeroes' is that finding the first set bit is +required specifically for several succinct data structures. +# Backwards compatibility -## On-chain vectors - -For linear structures on-chain, we are currently limited to `BuiltinList` -and `BuiltinMap`, which don't allow constant-time indexing. This is a -significant restriction, especially when many data structures and algorithms -rely on the broad availability of a constant-time-indexable linear structure, -such as a C array or Haskell `Vector`. 
While we could introduce a -primitive of this sort, this is a significant undertaking, and would require -both implementing and costing a possibly large API. - -While for variable-length data, we don't have any alternatives if constant-time -indexing is a goal, for fixed-length (or limited-length at least) data, there is -a possibility, based on a similar approach taken by the `finitary` -library. Essentially, given finitary data, we can transform any item into a -numerical index, which is then stored by embedding into a byte array. As the -indexes are of a fixed maximum size, this can be done efficiently, but only if -there is a way of converting indices into bitstrings, and vice versa. Such a -construction would allow using a (wrapper around) `BuiltinByteString` as -a constant-time indexable structure of any finitary type. This is not much of a -restriction in practice, as on-chain, fixed-width or size-bounded types are -preferable due to the on-chain size limit. - -Currently, all the pieces to make this work already exist: the only missing -piece is the ability to convert indices (which would have to be -`BuiltinInteger`s) into bit strings (which would have to be -`BuiltinByteString`s) and back again. With this capability, it would be -possible to use these techniques to implement something like an array or vector -without new primitive data types. - +At the Plutus Core level, implementing this proposal introduces no +backwards-incompatibility: the proposed new primitives do not break any existing +functionality or affect any other builtins. Likewise, at levels above Plutus +Core (such as `PlutusTx`), no existing functionality should be affected. -# Backwards compatibility +On-chain, this requires a hard fork, as this introduces new primitives. # Path to Active +MLabs will implement these primitives, as well as tests for these. Costing will +have to be done after this is complete, but must be done by the Plutus Core +team, due to limitations in how costing is performed. + # Copyright -Apache-2.0 +This CIP is licensed under Apache-2.0. From 1ea10623d22ba74e6859a26475366b7cb61cd86e Mon Sep 17 00:00:00 2001 From: Koz Ross Date: Thu, 23 Jun 2022 12:56:21 +1200 Subject: [PATCH 03/11] Fix heading levels on use cases --- CIP-?/README.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/CIP-?/README.md b/CIP-?/README.md index d4fef1eddf..653f1e6dab 100644 --- a/CIP-?/README.md +++ b/CIP-?/README.md @@ -38,7 +38,7 @@ We provide a range of applications that could be useful or beneficial on-chain, but are difficult or impossible to implement without some, or all, of the primitives we propose. -## Succinct data structures +### Succinct data structures Due to the on-chain size limit, many data structures become impractical or impossible, as they require too much space either for their elements, or their @@ -104,7 +104,7 @@ In order to make such techniques viable, bitwise primitives are mandatory. Furthermore, succinct data structures are not limited to sets of integers, but *all* require bitwise operations to be implementable. -## Binary representations and encodings +### Binary representations and encodings On-chain, space is at a premium. One way that space can be saved is with binary representations, which can potentially represent something much closer to the @@ -120,7 +120,7 @@ where complex structures or values are represented using fixed-size implemented more efficiently than currently possible, as there exist numerous bitwise techniques for this. 
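As a rough off-chain sketch of the kind of fixed-width encoder these primitives make cheap to express (plain GHC `Data.Bits`; the field layout is invented purely for illustration):

```haskell
import Data.Bits ((.&.), (.|.), testBit)
import Data.Word (Word8)

-- A toy binary encoding: a boolean flag packed into bit 7, and a counter in
-- the range 0-127 packed into bits 0-6, giving a one-byte representation.
pack :: Bool -> Word8 -> Word8
pack flag counter = (if flag then 0x80 else 0x00) .|. (counter .&. 0x7F)

unpack :: Word8 -> (Bool, Word8)
unpack b = (testBit b 7, b .&. 0x7F)
```

The same idea, carried out with the proposed primitives, extends to multi-byte layouts over `BuiltinByteString`s.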
-## On-chain vectors +### On-chain vectors For linear structures on-chain, we are currently limited to `BuiltinList` and `BuiltinMap`, which don't allow constant-time indexing. This is a From ea8fe5c02b5ec2b40cf1381f499dbca73b862c0f Mon Sep 17 00:00:00 2001 From: Koz Ross Date: Thu, 23 Jun 2022 14:17:45 +1200 Subject: [PATCH 04/11] Rewrite how shifts work --- CIP-?/README.md | 92 +++++++++++++++++++++++++++++++++++++------------ 1 file changed, 70 insertions(+), 22 deletions(-) diff --git a/CIP-?/README.md b/CIP-?/README.md index 653f1e6dab..6e3751ee27 100644 --- a/CIP-?/README.md +++ b/CIP-?/README.md @@ -303,17 +303,17 @@ Lastly, we define the following additional operations: ```haskell shiftByteString :: BuiltinByteString -> BuiltinInteger -> BuiltinByteString ``` -Performs a bitwise shift of the first argument by the -absolute value of the second argument, with zero padding, the direction being -indicated by the sign of the second argument. +Performs a bitwise shift of the first argument by a number of bit positions +equal to the absolute value of the second argument, the direction of the shift +being indicated by the sign of the second argument. --- ```haskell rotateByteString :: BuiltinByteString -> BuiltinInteger -> BuiltinByteString ``` -Performs a bitwise rotation of the first argument by -the absolute value of the second argument, the direction being indicated by -the sign of the second argument. +Performs a bitwise rotation of the first argument by a number of bit positions +equal to the absolute value of the second argument, the direction being +indicated by the sign of the second argument. --- ```haskell @@ -419,7 +419,6 @@ and integerToByteString . byteStringToInteger = id ``` - ### Bitwise logical operations on `BuiltinByteString` Throughout, let $s = s_n s_{n-1} \ldots s_0$ and @@ -475,16 +474,46 @@ complementByteString (iorByteString b b') = andByteString (complementByteString ### Mixed operations -Throughout this section, let $s = s_n s_{n-1} \ldots s_0$ and -$t = t_m t_{m - 1} \ldots t_0$ be byte sequences, and let $i \in \mathbb{Z}$. +Throughout this section, let $s = s_n s_{n-1} \ldots s_0$ be a byte sequence, +and let $i \in \mathbb{Z}$. + +We describe the semantics of `shiftByteString` and `rotateByteString`. +Informally, both of these are 'index modifiers' for bit sequences: given a +positive $i$, the index of a bit in $s$ 'increases' in the result; given a +negative $i$, the index of a bit in $s$ 'decreases' in the result. This can mean +that for some indexes in the result, there are no corresponding bits in $s$ by +the previous definition: we term these *missing indexes*. Additionally, by such +calculations, bits at some indexes in $s$ may be projected to negative indexes, +or indexes over $n$, in the result; we term these *out-of-bounds indexes*. How +we handle missing and out-of-bounds indexes is what distinguishes +`shiftByteString` and `rotateByteString`: + +* `shiftByteString` sets any missing index to $0$ and ignores any data at + out-of-bounds indexes. +* `rotateByteString` uses out-of-bounds indexes as sources for missing indexes + by 'wraparound'. + +We describe the semantics of `shiftByteString` precisely. Given arguments $s$ +and $i$, the result of `shiftByteString` is the byte sequence +$u = u_n u_{n - 1} \ldots u_0$, such that for all $j \in \\{0, 1, \ldots, n \\}$, we have +$u_j = s_{j - i}$ if $j - i \in \\{0, 1, \ldots, n \\}$, and $0$ otherwise. For +example, let $t = 01011110$ and $k = 2$. 
If we perform `shiftByteString` with +$t$ and $k$ as arguments, the result will be -We describe the semantics of `shiftByteString`. Informally, these are logical -shifts, with negative shifts moving *away* from bit index $0$, and positive -shifts moving *towards* bit index $0$. More precisely, given the arguments -$s$ and $i$, the result of `shiftByteString` is the byte sequence -$u = u_n u_{n - 1} \ldots u_0$, such that for all -$j \in \\{0, 1, \ldots, n \\}$, we have $u_j = s_{j + i}$ if -$j - i \in \\{0, 1, \ldots, n \\}$, and $0$ otherwise. +$$ +u = t_{(8 - 2)}t_{(7 - 2)}t_{(6 - 2)}t_{(5 - 2)}t_{(4 - 2)}t_{(3 - 2)}t_{(2 - 2)}00 + = t_6t_5t_4t_3t_2t_1t_000 + = 01111000 +$$ + +If instead we perform `shiftByteString` with $t$ and +$-k$ as arguments, the result will be + +$$ +u = 00t_{(6 + 2}t_{5 + 2}t_{(4 + 2)}t_{(3 + 2)}t_{(2 + 2)}t_{(1 + 2)}t_{(0 + 2)} + = 00t_8t_7t_6t_5t_4t_3t_2 + = 00010111 +$$ Let $k, \ell \in \mathbb{Z}$ such that either @@ -497,12 +526,31 @@ We observe that, for any `bs`, we have shiftByteString (shiftBytestring bs k) l = shiftByteString bs (k + l) ``` -We now describe `rotateByteString`, assuming the same inputs as the description -of `shiftByteString` above. Informally, the 'direction' of rotations matches -that of `shiftByteString` above. More precisely, given then arguments $s$ and -$i$, the result of `rotateByteString` is the byte sequence -$u = u_n u_{n - 1} \ldots u_0$ such that for all $j \in \\{0, 1, \ldots, n\\}$, -we have $u_j = s_{j + i \mod (n + 1)}$. We observe that for any $k, \ell$, and any +We now describe the semantics of `rotateByteString` precisely; we assume the +same arguments as for `shiftByteString` above. The result of `rotateByteString` +is the byte sequence $u = u_n u_{n + 1} \ldots u_0$ such that for all +$j \in \\{0, 1, \ldots, n\\}$, we have $u_j = s_{n + 1 + j - i \mod (n + 1)}$. For +example, let $t = 01011110$ and $k = 2$. If we perform `rotateByteString` with +$t$ and $k$ as arguments, the result will be + +$$ +u = t_{(15 \mod 9)}t_{(14 \mod 9)}t_{(13 \mod 9)}t_{(12 \mod 9)}t_{(11 \mod 9)}t_{(10 \mod 9)}t_{(9 \mod 9)}t_{(8 +\mod 9)}t_{(7 \mod 9)} + = t_6t_5t_4t_3t_2t_1t_0t_8t_7 + = 01111001 +$$ + +If instead we perform `rotateByteString` with $t$ and +$-k$ as arguments, the result will be + +$$ +u = t_{(19 \mod 9)}t_{(18 \mod 9)}t_{(17 \mod 9)}t_{(16 \mod 9)}t_{(15 \mod 9)}t_{(14 \mod 9)}t_{(13 \mod 9)}t_{(12 +\mod 9)}t_{(11 \mod 9)} + = t_1t_0t_8t_7t_6t_5t_4t_3t_2 + = 10010111 +$$ + +We observe that for any $k, \ell$, and any `bs`, we have ```haskell From 115bfb12c453f3c172e408e036470121fbe2f7a8 Mon Sep 17 00:00:00 2001 From: Koz Ross Date: Thu, 23 Jun 2022 14:34:35 +1200 Subject: [PATCH 05/11] Clarify findFirstSet description --- CIP-?/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/CIP-?/README.md b/CIP-?/README.md index 6e3751ee27..a2fdbb5120 100644 --- a/CIP-?/README.md +++ b/CIP-?/README.md @@ -598,7 +598,7 @@ For either `testBitByteString` or `writeBitByteString`, if $i < 0$ or $i > n$, the result is an out-of-bounds error. Lastly, we describe the semantics of `findFirstSetByteString`. 
Given the -argument $s$, if for any $j \in \\{0, 1, \ldots, n \\}$, $s_j = 0$, the result +argument $s$, if for all $j \in \\{0, 1, \ldots, n \\}$, $s_j = 0$, the result is $-1$; otherwise, the result is $k$ such that all of the following hold: - $k \in \\{0, 1, \ldots, n\\}$; From 0b990c6e01bd9c8059e8d78d06a744826115f736 Mon Sep 17 00:00:00 2001 From: Koz Ross Date: Thu, 23 Jun 2022 14:58:53 +1200 Subject: [PATCH 06/11] Clarify representation --- CIP-?/README.md | 60 +++++++++++++++++++++++++++++++++++++------------ 1 file changed, 46 insertions(+), 14 deletions(-) diff --git a/CIP-?/README.md b/CIP-?/README.md index a2fdbb5120..de67ae269d 100644 --- a/CIP-?/README.md +++ b/CIP-?/README.md @@ -356,10 +356,11 @@ $s = s_n s_{n-1} \ldots s_0$ is a *byte sequence* if $n = 8k - 1$ for some $k \in \mathbb{N}$. We denote the *empty bit sequence* (and, indeed, empty byte sequence as well) by $\emptyset$. -We assume that `BuiltinByteString`s represent byte sequences, with the sequence -of bits being exactly as the description above. For example, given the byte -sequence `0110111100001100`, the corresponding `BuiltinByteString` would be -`"o\f"`. +We assume that `BuiltinByteString`s represent byte sequences, with the indexes +of the represented byte sequence being treated in little-endian, +least-significant-bit-first encoding. For example, consider the byte sequence +$s = 110110011100000$; the `BuiltinByteString` literal corresponding to this would +be `"\217\224"`. Let $i \in \mathbb{N}^{+}$. We define the sequence $\texttt{binary}(i) = (d_0, m_0), (d_1, m_1), \ldots$ as @@ -397,6 +398,27 @@ represent any $i \in \\{ x \in \mathbb{Z} \mid x < 0 \\}$ as the twos-complement of the representation of its additive inverse. We observe that any such sequence is by definition a byte sequence. +For example, consider the representation of $23$. We note that + +$$ +\texttt{binary}(23) = (11, 1), (5, 1), (2, 1), (1, 0), (0, 1), (0, 0), (0, 0), +(0, 0), \ldots +$$ + +The representation of $23$ as a byte sequence is + +$$ +s = s7s6s5s4s3s2s1s0 + = 00010111 +$$ + +If we instead consider $-23$, its representation would instead be + +$$ +t = t7t6t5t4t3t2t1t0 + = 11101001 +$$ + To interpret a byte sequence $s = s_n s_{n - 1} \ldots s_0$ as a `BuiltinInteger`, we use the following process: @@ -405,6 +427,16 @@ To interpret a byte sequence $s = s_n s_{n - 1} \ldots s_0$ as a the result is the additive inverse of the result of interpreting $s^{\prime}$. - Otherwise, the result is $\sum_{i \in \\{0, 1, \ldots, n\\}} s_i \cdot 2^i$. +Going by our previous example, for the sequence $s = 00010111$ as above, as +$s_7 = 0$, we have + +$$ +\sum_{i \in \\{0, 1, \ldots, 7\\}} s_i \cdot 2^i = +2^4 + 2^2 + 2^1 + 2^0 = +16 + 4 + 2 + 1 = +23 +$$ + We implement the above as the `byteStringToInteger` primitive. We observe that `byteStringToInteger` and `integerToByteString` form an isomorphism. More specifically, we have: @@ -501,8 +533,8 @@ example, let $t = 01011110$ and $k = 2$. 
If we perform `shiftByteString` with $t$ and $k$ as arguments, the result will be $$ -u = t_{(8 - 2)}t_{(7 - 2)}t_{(6 - 2)}t_{(5 - 2)}t_{(4 - 2)}t_{(3 - 2)}t_{(2 - 2)}00 - = t_6t_5t_4t_3t_2t_1t_000 +u = t_{(7 - 2)}t_{(6 - 2)}t_{(5 - 2)}t_{(4 - 2)}t_{(3 - 2)}t_{(2 - 2)}00 + = t_5t_4t_3t_2t_1t_000 = 01111000 $$ @@ -510,8 +542,8 @@ If instead we perform `shiftByteString` with $t$ and $-k$ as arguments, the result will be $$ -u = 00t_{(6 + 2}t_{5 + 2}t_{(4 + 2)}t_{(3 + 2)}t_{(2 + 2)}t_{(1 + 2)}t_{(0 + 2)} - = 00t_8t_7t_6t_5t_4t_3t_2 +u = 00t_{(5 + 2)}t_{(4 + 2)}t_{(3 + 2)}t_{(2 + 2)}t_{(1 + 2)}t_{(0 + 2)} + = 00t_7t_6t_5t_4t_3t_2 = 00010111 $$ @@ -534,9 +566,9 @@ example, let $t = 01011110$ and $k = 2$. If we perform `rotateByteString` with $t$ and $k$ as arguments, the result will be $$ -u = t_{(15 \mod 9)}t_{(14 \mod 9)}t_{(13 \mod 9)}t_{(12 \mod 9)}t_{(11 \mod 9)}t_{(10 \mod 9)}t_{(9 \mod 9)}t_{(8 -\mod 9)}t_{(7 \mod 9)} - = t_6t_5t_4t_3t_2t_1t_0t_8t_7 +u = t_{(13 \mod 8)}t_{(12 \mod 8)}t_{(11 \mod 8)}t_{(10 \mod 8)}t_{(9 \mod 8)}t_{(8 \mod 8)}t_{(7 \mod 8)}t_{(6 +\mod 8)} + = t_5t_4t_3t_2t_1t_0t_7t_6 = 01111001 $$ @@ -544,9 +576,9 @@ If instead we perform `rotateByteString` with $t$ and $-k$ as arguments, the result will be $$ -u = t_{(19 \mod 9)}t_{(18 \mod 9)}t_{(17 \mod 9)}t_{(16 \mod 9)}t_{(15 \mod 9)}t_{(14 \mod 9)}t_{(13 \mod 9)}t_{(12 -\mod 9)}t_{(11 \mod 9)} - = t_1t_0t_8t_7t_6t_5t_4t_3t_2 +u = t_{(17 \mod 8)}t_{(16 \mod 8)}t_{(15 \mod 8)}t_{(14 \mod 8)}t_{(13 \mod 8)}t_{(12 \mod 8)}t_{(11 \mod 8)}t_{(10 +\mod 8)} + = t_1t_0t_7t_6t_5t_4t_3t_2 = 10010111 $$ From 0766678431a379d0236be0f38a7bcebe1c89883e Mon Sep 17 00:00:00 2001 From: Koz Ross Date: Mon, 27 Jun 2022 14:22:22 +1200 Subject: [PATCH 07/11] Clarifications --- CIP-?/README.md | 153 ++++++++++++++++++++++++++++-------------------- 1 file changed, 89 insertions(+), 64 deletions(-) diff --git a/CIP-?/README.md b/CIP-?/README.md index de67ae269d..a5d6294f8b 100644 --- a/CIP-?/README.md +++ b/CIP-?/README.md @@ -11,7 +11,8 @@ License: Apache-2.0 # Abstract -Add primitives for bitwise operations, based on `BuiltinByteString`, without requiring new data types. +Add primitives for bitwise operations, based on `BuiltinByteString`, without +requiring new data types. # Motivation @@ -43,10 +44,12 @@ primitives we propose. Due to the on-chain size limit, many data structures become impractical or impossible, as they require too much space either for their elements, or their overheads, to allow them to fit alongside the operations we want to perform on -them. Succinct data structures could serve as a solution to this, as they -represent data in an amount of space much closer to the entropy limit and ensure -only constant overheads. There are several examples of these, and all rely on -bitwise operations for their implementations. +them. [Succinct data +structures](https://en.wikipedia.org/wiki/Succinct_data_structure) could serve +as a solution to this, as they represent data in an amount of space much +closer to the entropy limit and ensure only constant overheads. There are +several examples of these, and all rely on bitwise operations for their +implementations. For example, consider wanting to store a set of `BuiltinInteger`s on-chain. Given current on-chain primitives, the most viable option involves @@ -66,7 +69,8 @@ becomes intolerable quickly, especially when taking into account the need to also store the operations manipulating such a structure on-chain with the script where the set is being used. 
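For a concrete sense of scale (the numbers are purely illustrative): taking $k = 2^{20}$ and a cons-cell overhead of $c = 64$ bits, each stored `BuiltinInteger` costs $\left\lceil \frac{20}{64} \right\rceil \cdot 64 + 64 = 128$ bits, so $n = 50000$ entries already require $6.4 \times 10^{6}$ bits before any of the code manipulating them is counted.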
-If we instead represented the same set as a bitmap based on +If we instead represented the same set as a +[bitmap](https://en.wikipedia.org/wiki/Bit_array) based on `BuiltinByteString`, the amount of space required would instead be $$\left\lceil \frac{k}{8} \right\rceil \cdot 8 + \left\lceil @@ -79,7 +83,8 @@ as instead of having to crawl through a cons-like structure, we can implement set operations on a memory-contiguous byte string: - The cardinality of the set can be computed as a population count. This -can have terrifyingly efficient implementations: the Muła-Kurz-Lemire +can have terrifyingly efficient implementations: the +[Muła-Kurz-Lemire](https://lemire.me/en/publication/arxiv161107612/) algorithm (the current state of the art) can process four kilobytes per loop iteration, which amounts to over four thousand potential stored integers. - Insertion or removal is a bit set or bit clear respectively. @@ -90,14 +95,17 @@ iteration, which amounts to over four thousand potential stored integers. - Set symmetric difference is bitwise exclusive or. A potential implementation could use a range of techniques to make these -operations extremely efficient, by relying on SWAR (SIMD-within-a-register) -techniques if portability is desired, and SIMD instructions for maximum speed. -This would allow both potentially large integer sets to be represented on-chain -without breaking the size limit, and nodes to efficiently compute with such, -reducing the usage of resources by the chain. Lastly, in practice, if -compression techniques are used (which also rely on bitwise operations!), the -number of required bits can be reduced considerably in most cases without -compromising performance: the current state-of-the-art (Roaring Bitmaps) can be +operations extremely efficient, by relying on +[SWAR](https://en.wikipedia.org/wiki/SWAR) +techniques if portability is desired, and +[SIMD](https://en.wikipedia.org/wiki/Single_instruction,_multiple_data) +instructions for maximum speed. This would allow both potentially large +integer sets to be represented on-chain without breaking the size limit, and +nodes to efficiently compute with such, reducing the usage of resources by the +chain. Lastly, in practice, if compression techniques are used (which also +rely on bitwise operations!), the number of required bits can be reduced +considerably in most cases without compromising performance: the current +state-of-the-art ([Roaring Bitmaps](https://roaringbitmap.org/)) can be used as an example of the possible gains. In order to make such techniques viable, bitwise primitives are mandatory. @@ -132,7 +140,8 @@ require both implementing and costing a large API. While for variable-length data, we don't have any alternatives if constant-time indexing is a goal, for fixed-length (or limited-length at least) data, there is -a possibility, based on a similar approach taken by the `finitary` +a possibility, based on a similar approach taken by the +[`finitary`](https://hackage.haskell.org/package/finitary) library. Essentially, given finitary data, we can transform any item into a numerical index, which is then stored by embedding into a byte array. As the indexes are of a fixed maximum size, this can be done efficiently, but only if @@ -162,14 +171,15 @@ enable as much as possible to be implemented. 
### Maintaining as many algebraic laws as possible -Bitwise operations, via Boolean algebras, have a long and storied history of -algebraic laws, dating back to important results by the like of de Morgan, Post -and many others. These algebraic laws are useful for a range of reasons: they -guide implementations, enable easier testing (especially property testing) and -in some cases much more efficient implementations. To some extent, they also -formalize our intuition about how these operations 'should work'. Thus, -maintaining as many of these laws in our implementation, and being clear about -them, is important. +Bitwise operations, via [Boolean +algebras](https://en.wikipedia.org/wiki/Boolean_algebra_(structure)), have a +long and storied history of algebraic laws, dating back to important results +by the like of de Morgan, Post and many others. These algebraic laws are +useful for a range of reasons: they guide implementations, enable easier +testing (especially property testing) and in some cases much more efficient +implementations. To some extent, they also formalize our intuition about how +these operations 'should work'. Thus, maintaining as many of these laws in our +implementation as possible, and being clear about them, is important. ### Allowing efficient, portable implementations @@ -196,12 +206,15 @@ We also specify some specific non-goals of this proposal. A widespread legacy of C is the mixing of treatment of numbers and blobs of bits: specifically, the allowing of logical operations on representations of numbers. This applies to Haskell as much as any other language: according to the -Haskell Report, it is in fact *required* that any type implementing -`Bits` implement `Num` first. While GHC Haskell only mandates -`Eq`, it still defines `Bits` instances for types clearly meant to +[Haskell +Report](https://www.haskell.org/onlinereport/haskell2010/haskellch15.html#x23-20800015), +it is in fact *required* that any type implementing +`Bits` implement `Num` first. While GHC Haskell [only mandates +`Eq`](https://hackage.haskell.org/package/base-4.16.1.0/docs/Data-Bits.html#t:Bits), +it still defines `Bits` instances for types clearly meant to represent numbers. This is a bad choice, as it creates complex situations and -partiality in several cases, for arguably no real gain other than C-like bit -twiddling code. +partiality in several cases, for arguably no real gain other than easier +translation of bit twiddling code originally written in C. Even if two types share a representation, their type distinctness is meant to be a semantic or abstraction boundary: just because a number is represented as a @@ -259,13 +272,13 @@ inter-conversion between `BuiltinByteString` and `BuiltinInteger`: ```haskell integerToByteString :: BuiltinInteger -> BuiltinByteString ``` -Convert a number to a bitwise representation. +Convert a number to its bitwise representation. --- ```haskell byteStringToInteger :: BuiltinByteString -> BuiltinInteger ``` -Reinterpret a bitwise representation as a number. +Reinterpret a bitwise representation to the corresponding number. --- We also propose several logical operations on `BuiltinByteString`s: @@ -304,16 +317,18 @@ Lastly, we define the following additional operations: shiftByteString :: BuiltinByteString -> BuiltinInteger -> BuiltinByteString ``` Performs a bitwise shift of the first argument by a number of bit positions -equal to the absolute value of the second argument, the direction of the shift -being indicated by the sign of the second argument. 
+equal to the absolute value of the second argument. A positive second argument +indicates a shift towards higher bit indexes; a negative second argument +indicates a shift towards lower bit indexes. --- ```haskell rotateByteString :: BuiltinByteString -> BuiltinInteger -> BuiltinByteString ``` Performs a bitwise rotation of the first argument by a number of bit positions -equal to the absolute value of the second argument, the direction being -indicated by the sign of the second argument. +equal to the absolute value of the second argument. A positive second argument +indicates a rotation towards higher bit indexes; a negative second argument +indicates a rotation towards lower bit indexes. --- ```haskell @@ -333,7 +348,7 @@ position is $1$, return `True`, and `False` otherwise. ```haskell writeBitByteString :: BuiltinByteString -> BuiltinInteger -> BuiltinBool -> BuiltinByteString ``` -If the position given by the second argument is not in bound for the first +If the position given by the second argument is not in bounds for the first argument, error; otherwise, set the bit given by that position to $1$ if the third argument is `True`, and $0$ otherwise. @@ -352,15 +367,14 @@ We define $\mathbb{N}^{+} = \\{ x \in \mathbb{N} \mid x \neq 0 \\}$. We assume that `BuiltinInteger` is a faithful representation of $\mathbb{Z}$. A *bit sequence* $s = s_n s_{n-1} \ldots s_0$ is a sequence such that for all $i \in \\{ 0,1,\ldots,n \\}$, $s_i \in \\{ 0, 1 \\}$. A bit sequence -$s = s_n s_{n-1} \ldots s_0$ is a *byte sequence* if $n = 8k - 1$ for some -$k \in \mathbb{N}$. We denote the *empty bit sequence* (and, indeed, empty byte -sequence as well) by $\emptyset$. +$s = s_n s_{n-1} \ldots s_0$ is a *byte sequence* if: -We assume that `BuiltinByteString`s represent byte sequences, with the indexes -of the represented byte sequence being treated in little-endian, -least-significant-bit-first encoding. For example, consider the byte sequence -$s = 110110011100000$; the `BuiltinByteString` literal corresponding to this would -be `"\217\224"`. +- Either $s$ is empty (that is, contains no bits); or +- $n = 8k - 1$ for some $k \in \mathbb{N}^{+}$. + +We assume that `BuiltinByteString`s represent byte sequences, such that the +lowest bit indexes are at the end of the representation; that is, bit $0$ is the +least-significant bit in the highest-index byte. Let $i \in \mathbb{N}^{+}$. We define the sequence $\texttt{binary}(i) = (d_0, m_0), (d_1, m_1), \ldots$ as @@ -380,11 +394,13 @@ Some examples follow. ### Representation of `BuiltinInteger` as `BuiltinByteString` and conversions -We describe the translation of `BuiltinInteger` into `BuiltinByteString` which +We describe the translation of `BuiltinInteger` into `BuiltinByteString`, which is implemented as the `integerToByteString` primitive. Informally, we represent -`BuiltinInteger`s with the least significant bit at bit position $0$, using a -twos-complement representation. More precisely, let $i \in \mathbb{N}^{+}$. We -represent $i$ as the bit sequence $s = s_n s_{n-1} \ldots s_0$, such that: +`BuiltinInteger`s as [little +endian](https://en.wikipedia.org/wiki/Endianness#Little), with the least +significant bit at bit index $0$, using a [two's-complement](https://en.wikipedia.org/wiki/Two%27s_complement) +representation. More precisely, let $i \in \mathbb{N}^{+}$. We represent $i$ as the bit sequence +$s = s_n s_{n-1} \ldots s_0$, such that: - $\sum_{j \in \\{0, 1, \ldots, n\\}} s_j \cdot 2^j = i$; and - $s_n = 0$. 
@@ -395,8 +411,9 @@ represent $i$ as the bit sequence $s = s_n s_{n-1} \ldots s_0$, such that: For $0$, we represent it as the sequence `00000000` (one zero byte). We represent any $i \in \\{ x \in \mathbb{Z} \mid x < 0 \\}$ as the -twos-complement of the representation of its additive inverse. We observe that -any such sequence is by definition a byte sequence. +[two's-complement](https://en.wikipedia.org/wiki/Two%27s_complement) of +the representation of its additive inverse. We observe that any such +sequence is by definition a byte sequence. For example, consider the representation of $23$. We note that @@ -422,9 +439,8 @@ $$ To interpret a byte sequence $s = s_n s_{n - 1} \ldots s_0$ as a `BuiltinInteger`, we use the following process: -- If $s$ is `00000000`, then the result is $0$. -- Otherwise, if $s_n = 1$, let $s^{\prime}$ be the twos-complement of $s$. Then - the result is the additive inverse of the result of interpreting $s^{\prime}$. +- If $s_n = 1$, let $s^{\prime}$ be the two's-complement of $s$. Then the result + is the additive inverse of the result of interpreting $s^{\prime}$. - Otherwise, the result is $\sum_{i \in \\{0, 1, \ldots, n\\}} s_i \cdot 2^i$. Going by our previous example, for the sequence $s = 00010111$ as above, as @@ -438,18 +454,19 @@ $$ $$ We implement the above as the `byteStringToInteger` primitive. We observe that -`byteStringToInteger` and `integerToByteString` form an isomorphism. More -specifically, we have: +`integerToByteString` and `byteStringToInteger` round-trip; mor especifically, +we have: ```haskell byteStringToInteger . integerToByteString = id ``` -and - -```haskell -integerToByteString . byteStringToInteger = id -``` +The other direction does not necessarily hold: informally, this is due to +'trailing zeroes' not contributing to a numerical value. More precisely, +consider the `BuiltinByteString` consisting of two zero bytes. If we convert +this `BuiltinByteString` to a `BuiltinInteger` using `byteStringToInteger`, we +would get $0$; however, if we convert $0$ to a `BuiltinByteString`, it would +consist of _one_ zero byte. ### Bitwise logical operations on `BuiltinByteString` @@ -494,7 +511,8 @@ $i \in \{0, 1, \ldots, n\}$ we have $u_i = 0$ if $s_i = 1$, and $1$ otherwise. We observe that `complementByteString` is self-inverting. We also note the following equivalences hold assuming `b` and `b'` have the -same length; these are the DeMorgan laws: +same length; these are [De Morgan's +laws](https://en.wikipedia.org/wiki/De_Morgan%27s_laws): ```haskell complementByteString (andByteString b b') = iorByteString (complementByteString b) (complementByteString b') @@ -595,6 +613,12 @@ We also note that rotateByteString bs 0 = shiftByteString bs 0 = bs ``` +Lastly, we note that + +```haskell +rotateByteString bs k = rotateByteString bs (k `remInteger` (lengthByteString bs * 8)) +``` + For `popCountByteString` with argument $s$, the result is $$\sum_{j \in \\{0, 1, \ldots, n\\}} s_j$$ @@ -663,9 +687,10 @@ Primitive | Linear in There needs to be a well-defined interface between the 'world' of `BuiltinInteger` and `BuiltinByteString`. To provide this, we require -`integerToByteString` and `byteStringToInteger`, which are designed to roundtrip -(that is, describe two halves of an isomorphism). Furthermore, by spelling out -a precise description of the conversions, we make this predictable and portable. +`integerToByteString` and `byteStringToInteger`; we require that +`integerToByteString` and `byteStringToInteger` roundtrip. 
Furthermore, +by spelling out a precise description of the conversions, we make this +predictable and portable. Our choice of logical AND, IOR, XOR and complement as the primary logical operations is driven by a mixture of prior art, utility and convenience. These From 76ab90e5839824ea646b44f9569b03e639125ba1 Mon Sep 17 00:00:00 2001 From: Koz Ross Date: Wed, 14 Sep 2022 13:45:00 +1200 Subject: [PATCH 08/11] Replace FFS with TZCNT and LZCNT, add finite field arithmetic example --- CIP-?/README.md | 74 +++++++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 68 insertions(+), 6 deletions(-) diff --git a/CIP-?/README.md b/CIP-?/README.md index a5d6294f8b..d865d335b3 100644 --- a/CIP-?/README.md +++ b/CIP-?/README.md @@ -39,6 +39,39 @@ We provide a range of applications that could be useful or beneficial on-chain, but are difficult or impossible to implement without some, or all, of the primitives we propose. +### Finite field arithmetic + +[Finite field arithmetic](https://en.wikipedia.org/wiki/Finite_field_arithmetic) +is an area with many applications, ranging from [linear block +codes](https://en.wikipedia.org/wiki/Block_code) to [zero-knowledge +proofs](https://en.wikipedia.org/wiki/Zero-knowledge_proof) to scheduling and +experimental design. Having such capabilities on-chain is useful in for a wide +range of applications. + +A good example is multiplication over the [Goldilocks +field](https://blog.polygon.technology/introducing-plonky2) (with characteristic +$2^64 - 2^32 + 1$). To perform this operation requires 'slicing' the +representation being worked with into 32-bit chunks. As finite field +reprentations are some kind of unsigned integer in every implementation, in +Plutus, this would correspond to `Integer`s, but currently, there is no way to +perform this kind of 'slicing' on an `Integer` on-chain. + +Furthermore, finite field arithmetic can gain significant performance +optimizations with the use of bitwise primitive operations. Two good examples +are power-of-two division and computing inverses. The first of these (useful +even in `Integer` arithmetic) replaces a division by a power of 2 with a shift; +the second uses a count trailing zeroes operation to compute a multiplicative +finite field inverse. While some of these operations could theoretically be done +by other means, their performance is far from guaranteed. For example, GHC does +not convert a power-of-two division or multiplication to a shift, even if the +divisor or multiplier is statically-known. Given the restrictions on computation +resources on-chain, any gains are significant. + +Having bitwise primitives, as well as the ability to convert `Integer`s into a +form amenable to this kind of work, would allow efficient finite field +arithmetic on-chain. This could enable a range of new uses without being +inefficient or difficult to port. + ### Succinct data structures Due to the on-chain size limit, many data structures become impractical or @@ -88,7 +121,8 @@ can have terrifyingly efficient implementations: the algorithm (the current state of the art) can process four kilobytes per loop iteration, which amounts to over four thousand potential stored integers. - Insertion or removal is a bit set or bit clear respectively. -- Finding the smallest element is a find-first-one. +- Finding the smallest element uses a count leading zeroes. +- Finding the last element uses a count trailing zeroes. - Testing for membership is a check to see if the bit is set. - Set intersection is bitwise and. 
- Set union is bitwise inclusive or. @@ -272,13 +306,16 @@ inter-conversion between `BuiltinByteString` and `BuiltinInteger`: ```haskell integerToByteString :: BuiltinInteger -> BuiltinByteString ``` -Convert a number to its bitwise representation. + +Convert a non-negative number to its bitwise representation, erroring if given a +negative number. --- ```haskell byteStringToInteger :: BuiltinByteString -> BuiltinInteger ``` -Reinterpret a bitwise representation to the corresponding number. + +Reinterpret a bitwise representation to its corresponding non-negative number. --- We also propose several logical operations on `BuiltinByteString`s: @@ -354,10 +391,19 @@ third argument is `True`, and $0$ otherwise. --- ```haskell -findFirstSetByteString :: BuiltinByteString -> BuiltinInteger +countLeadingZeroesByteString :: BuiltinByteString -> BuiltinInteger ``` -Return the lowest index such that `testBitByteString` with the first argument -and that index would be `True`. If no such index exists, return $-1$ instead. + +Counts the initial sequence of 0 bits in the argument (that is, starting from +index 0). If the argument is empty, this returns 0. + +--- +```haskell +countTrailingZeroesByteString :: BuiltinByteString -> BuiltinInteger +``` + +Counts the final sequence of 0 bits in the argument (that is, starting from the +1 bit with the highest index). If the argument is empty, this returns 0. ## Semantics @@ -653,6 +699,22 @@ for all $j \in \{0, 1, \ldots, n\}$, we have: For either `testBitByteString` or `writeBitByteString`, if $i < 0$ or $i > n$, the result is an out-of-bounds error. +Lastly, we describe the semantics of `countLeadingZeroesByteString` and +`countTrailingZeroesByteString. Given the argument $s$, +`countLeadingZeroesByteString` produces the answer $\ell$ such that all of the +following hold: + +- $0 \leq \ell < n$; +- For all $0 \leq i < \ell$, $s_i = 0$; and +- If $s$ is not empty, $s_{\ell} = 1$. + +`countTrailingZeroesByteString` instead produces the answer $\ell$ such that all +of the following hold: + +- $0 \leq \ell < n$; +- For all $\ell \leq i < n$, $s_i = 0$; and +- If $s$ is not empty, $s_{\ell} = 1$. + Lastly, we describe the semantics of `findFirstSetByteString`. Given the argument $s$, if for all $j \in \\{0, 1, \ldots, n \\}$, $s_j = 0$, the result is $-1$; otherwise, the result is $k$ such that all of the following hold: From fd308ab95ab9a4fee15324133d4d972d2b049c47 Mon Sep 17 00:00:00 2001 From: Koz Ross Date: Thu, 15 Sep 2022 11:06:51 +1200 Subject: [PATCH 09/11] Rewrite --- CIP-?/README.md | 408 ++++++++++++++++++++++-------------------------- 1 file changed, 185 insertions(+), 223 deletions(-) diff --git a/CIP-?/README.md b/CIP-?/README.md index d865d335b3..edbc357c44 100644 --- a/CIP-?/README.md +++ b/CIP-?/README.md @@ -148,7 +148,7 @@ Furthermore, succinct data structures are not limited to sets of integers, but ### Binary representations and encodings -On-chain, space is at a premium. One way that space can be saved is with binary +On-chain, space comes at a premium. One way that space can be saved is with binary representations, which can potentially represent something much closer to the entropy limit, especially if the structure or value being represented has significant redundant structure. While some possibilities for a more efficient @@ -410,150 +410,134 @@ Counts the final sequence of 0 bits in the argument (that is, starting from the ### Preliminaries We define $\mathbb{N}^{+} = \\{ x \in \mathbb{N} \mid x \neq 0 \\}$. 
We assume -that `BuiltinInteger` is a faithful representation of $\mathbb{Z}$. A -*bit sequence* $s = s_n s_{n-1} \ldots s_0$ is a sequence such that for -all $i \in \\{ 0,1,\ldots,n \\}$, $s_i \in \\{ 0, 1 \\}$. A bit sequence -$s = s_n s_{n-1} \ldots s_0$ is a *byte sequence* if: - -- Either $s$ is empty (that is, contains no bits); or -- $n = 8k - 1$ for some $k \in \mathbb{N}^{+}$. - -We assume that `BuiltinByteString`s represent byte sequences, such that the -lowest bit indexes are at the end of the representation; that is, bit $0$ is the -least-significant bit in the highest-index byte. +that `BuiltinInteger` is a faithful representation of $\mathbb{Z}$, and will +refer to them (and their elements) interchangeably. A *byte* is some $x \in +\\{0,1,\ldots,255\\}$. + +We observe that, given some *base* $b \in \mathbb{N}^{+}$, any $n \in +\mathbb{N}$ can be viewed as a sequence of values in $\\{0,1,\ldots, b - 1\\}$. +We refer to any such sequence as a *base $b$ sequence*. In such a 'view', given +a base $b$ sequence $S = s_0 s_1 \ldots s_k$, we can compute its corresponding +$m \in \mathbb{N}^+$ as + +\[ +\sum_{i \in \\{0,1,\ldots,k\\}} b^{k - i} * s_i +\] + +If $b > 1$ and $Z$ is a base $b$ sequence consisting only of zeroes, we observe +that for any other base $b$ sequence $S$, $Z \cdot S$ and $S$ correspond to the +same number. + +We use *bit sequence* to refer to a base 2 sequence, and *byte sequence* to +refer to a base 256 sequence. For a bit sequence $S = b_0 b_1 \ldots b_n$, we +refer to $\\{0,1,\ldots,n \\}$ as the *valid bit indices* of $S$; analogously, +for a byte sequence $T = y_0 y_1 \ldots y_m$, we refer to $\\{0,1,\ldots,m\\}$ +as the *valid byte indices* of $T$. We observe that the length of $S$ is $n + 1$ +and the length of $T$ is $m + 1$; we refer to these as the *bit length* of $S$ +and the *byte length* of $T$ for clarity. We write $S[i]$ and $T[j]$ to +represent $b_i$ and $y_j$ for valid bit index $i$ and valid byte index $j$ +respectively. -Let $i \in \mathbb{N}^{+}$. -We define the sequence $\texttt{binary}(i) = (d_0, m_0), (d_1, m_1), \ldots$ as +We describe a 'view' of bytes as bit sequences. Let $y$ be a byte; its +corresponding bit sequence is $S_y = y_0 y_1 y_2 y_3 y_4 y_5 y_6 y_7$ such that -- $m_0 = i \mod 2$, - $d_0 = \frac{i}{2}$ if $i$ is even, - and $\frac{i - 1}{2}$ otherwise. -- $m_j = d_{j - 1} \mod 2$, - $d_j = \frac{d_{j-1}}{2}$ if $d_j$ is even, - and $\frac{d_{j-1} - 1}{2}$ if it is odd. +\[ +\sum_{i \in \\{0,1,\ldots,7\\}} 2^{7 - i} * y_i = y +\] -Some examples follow. +For example, the byte $55$ has the corresponding byte sequence $00110111$. For +any byte, its corresponding byte sequence is unique. We use this to extend our +'view' to byte sequences as bit sequences. Specifically, let $T = y_0 y_1 \ldots +y_m$ be a byte sequence. Its corresponding bit sequence $S = b_0b_1 \ldots b_m +b_{m + 1} \ldots b_{8(m + 1) - 1}$ such that for any valid bit index $j$ of $S$, +$b_j = 1$ if and only if $T[j / 8][j `mod` 8] = 1$, and is $0$ otherwise. -- $\texttt{binary}(4) = (2, 0), (1, 0), (0, 1), (0, 0), (0, 0), \ldots$ -- $\texttt{binary}(17) = (8, 1), (4, 0), (2, 0), (1, 0), (0, 1), (0, 0), (0, 0), \ldots$ -- $\texttt{binary}(553) = (276, 1), (138, 0), (69, 0), (34, 1), (17, 0), (8, 1), (4, 0), (2, 0), (1, 0), (0, 1), (0, 0), (0, 0), \ldots$ +Based on the above, we observe that any `BuiltinByteString` can be a bit +sequence or a byte sequence. Furthermore, we assume that `indexByteString` and +`sliceByteString` 'agree' with valid byte indices. 
More precisely, suppose +`bs` represents a byte sequence $T$; then `indexByteString bs i` is seen as +equivalent to $T[\mathtt{i}]$; we extend this notion to `sliceByteString` +analogously. Throughout, we will refer to `BuiltinByteString`s and their 'views' +as bit or byte sequences interchangeably. ### Representation of `BuiltinInteger` as `BuiltinByteString` and conversions We describe the translation of `BuiltinInteger` into `BuiltinByteString`, which -is implemented as the `integerToByteString` primitive. Informally, we represent -`BuiltinInteger`s as [little -endian](https://en.wikipedia.org/wiki/Endianness#Little), with the least -significant bit at bit index $0$, using a [two's-complement](https://en.wikipedia.org/wiki/Two%27s_complement) -representation. More precisely, let $i \in \mathbb{N}^{+}$. We represent $i$ as the bit sequence -$s = s_n s_{n-1} \ldots s_0$, such that: - -- $\sum_{j \in \\{0, 1, \ldots, n\\}} s_j \cdot 2^j = i$; and -- $s_n = 0$. -- Let $\mathtt{binary}(i) = (d_0, m_0), (d_1, m_1), \ldots$. - For any $j \in \\{0, 1, \ldots, n - 1\\}$, $s_j = m_j$; and -- $n + 1 = 8k$ for the smallest $k \in \mathbb{N}^{+}$ consistent with the - previous requirements. - -For $0$, we represent it as the sequence `00000000` (one zero byte). We -represent any $i \in \\{ x \in \mathbb{Z} \mid x < 0 \\}$ as the -[two's-complement](https://en.wikipedia.org/wiki/Two%27s_complement) of -the representation of its additive inverse. We observe that any such -sequence is by definition a byte sequence. - -For example, consider the representation of $23$. We note that - -$$ -\texttt{binary}(23) = (11, 1), (5, 1), (2, 1), (1, 0), (0, 1), (0, 0), (0, 0), -(0, 0), \ldots -$$ - -The representation of $23$ as a byte sequence is +is implemented as the `integerToByteString` primitive. Let $i$ be the argument +`BuiltinInteger`; if this is negative, we produce an error, specifying at least +the following: -$$ -s = s7s6s5s4s3s2s1s0 - = 00010111 -$$ - -If we instead consider $-23$, its representation would instead be - -$$ -t = t7t6t5t4t3t2t1t0 - = 11101001 -$$ +- The fact that specifically the `integerToByteString` operation failed; +- The reason (given a negative number); and +- What exact number was given as an argument. -To interpret a byte sequence $s = s_n s_{n - 1} \ldots s_0$ as a -`BuiltinInteger`, we use the following process: +Otherwise, we produce the `BuiltinByteString` corresponding to the base 256 +sequence which represents $i$. -- If $s_n = 1$, let $s^{\prime}$ be the two's-complement of $s$. Then the result - is the additive inverse of the result of interpreting $s^{\prime}$. -- Otherwise, the result is $\sum_{i \in \\{0, 1, \ldots, n\\}} s_i \cdot 2^i$. +We now describe the reverse operation, implemented as the 'byteStringToInteger` +primitive. This treats its argument `BuiltinByteString` as a base 256 sequence, +and produces its corresponding number as a `BuiltinInteger`. We note that this +is necessarily non-negative. -Going by our previous example, for the sequence $s = 00010111$ as above, as -$s_7 = 0$, we have - -$$ -\sum_{i \in \\{0, 1, \ldots, 7\\}} s_i \cdot 2^i = -2^4 + 2^2 + 2^1 + 2^0 = -16 + 4 + 2 + 1 = -23 -$$ - -We implement the above as the `byteStringToInteger` primitive. We observe that -`integerToByteString` and `byteStringToInteger` round-trip; mor especifically, -we have: +We observe that `byteStringToInteger` 'undoes' `integerToByteString`: ```haskell byteStringToInteger . 
integerToByteString = id ``` -The other direction does not necessarily hold: informally, this is due to -'trailing zeroes' not contributing to a numerical value. More precisely, -consider the `BuiltinByteString` consisting of two zero bytes. If we convert -this `BuiltinByteString` to a `BuiltinInteger` using `byteStringToInteger`, we -would get $0$; however, if we convert $0$ to a `BuiltinByteString`, it would -consist of _one_ zero byte. +The other direction does not necessarily hold: if the argument to +`byteStringToInteger` contains a prefix consisting only of zeroes, and we +convert the resulting `BuiltinInteger` `i` back to a `BuiltinByteString` using +`integerToByteString`, that prefix will be lost. ### Bitwise logical operations on `BuiltinByteString` -Throughout, let $s = s_n s_{n-1} \ldots s_0$ and -$t = t_m t_{m - 1} \ldots t_0$ be two byte sequences. Whenever we specify a -*mismatched length error* result, its error message must contain at least the -following information: +Throughout, let $S = s_0 s_1 \ldots s_n$ and $T = t_0 t_1 \ldots t_n$ be byte +sequences, and let $S^{\prime}$ and $T^{\prime}$ be their corresponding bit +sequences, with bit lengths $n^{\prime} + 1$ and $m^{\prime} + 1$ respectively. +Whenever we specify a *mismatched length error* result, its error message +must contain at least the following information: + +- The name of the failed operation; +- The reason (mismatched lengths); and +- The byte lengths of the arguments. + +For any of `andByteString`, `iorByteString` and `xorByteString`, given inputs +$S$ and $T$, if $n \neq m$, the result is an error which must contain at least +the following information: - The name of the failed operation; - The reason (mismatched lengths); and -- The lengths of the arguments. - -We describe the semantics of `andByteString`. For inputs $s$ and $t$, if -$n \neq m$, the result is a mismatched length error. Otherwise, the result is -the byte sequence $u = u_n u_{n - 1} \ldots, u_0$ such that for all -$i \in \\{0, 1, \ldots, n\\}$ we have $u_i = 1$ if $s_i = t_i = 1$, and $0$ -otherwise. - -For `iorByteString`, for inputs $s$ and $t$, if $n \neq m$, the result is a -mismatched length error. Otherwise, the result is the byte sequence -$u = u_n u_{n - 1} \ldots u_0$ such that for all $i \in \\{0, 1, \ldots, n\\}$ -we have $u_i = 1$ if at least one of $s_i, t_i$ is $1$, and $0$ otherwise. - -For `xorByteString`, for inputs $s$ and $t$, if $n \neq m$, the result is a -mismatched length error. Otherwise, the result is the byte sequence -$u = u_n u_{n-1} \ldots u_0$ such that for all $i \in \\{0, 1, \ldots, n\\}$ -we have $u_i = 0$ if $s_i = t_i$, and $1$ otherwise. - -We observe that, for length-matched arguments, each of `andByteString`, -`iorByteString` and `xorByteString` describes a commutative and associative -operation. Furthermore, for any given length $k$, each of these operations has -an identity element: for `iorByteString`, this is the bit sequence of length -$k$ where each element is $0$, and for `andByteString` and `xorByteString`, -this is the bit sequence of length $k$ where each element is $1$. Lastly, for -any length $k$, the bit sequence of length $k$ where each element is $0$ is an -absorbing element for `andByteString`, and the bit sequence of length $k$ -where each element is $1$ is an absorbing element for `iorByteString`. - -We now describe the semantics of `complementByteString`. 
For input $s$, -the result is the byte sequence $u = u_n u_{n - 1} \ldots u_0$ such that for all -$i \in \{0, 1, \ldots, n\}$ we have $u_i = 0$ if $s_i = 1$, and $1$ otherwise. +- The byte lengths of the arguments. + +If $n = m$, the result of each of these operations is the bit sequence $U = u_0 +u_1 \ldots u_{n^{\prime}}$, such that for all $i \in \\{0, 1, \ldots, n^{\prime}\\}$, +$U[i]$ is $1$ under the following conditions: + +- For `andByteString`, when $S^{\prime}[i] = T^{\prime}[i] = 1$; +- For `iorByteString`, when at least one of $S^{\prime}[i], T^{\prime}[i]$ is + $1$; +- For `xorByteString`, when $S^{\prime}[i] \neq T^{\prime}[i]$. + +Otherwise, $U[i] = 0$. + +We observe that, for length-matched arguments, each of these operations +describes a commutative and associative operation. Furthermore, for any given +byte length $k$, each of these operations has an identity element: + +- For `andByteString` and `xorByteString`, the byte sequence of length $k$ where + each element is zero; and +- For `iorByteString`, the byte sequence of length $k$ where each element is + 255. + +Lastly, `andByteString` and `iorByteString` have an absorbing element for each +byte length $k$, which is the byte sequence of length $k$ where each element is +zero and 255 respectively. + +We now describe the semantics of `complementByteString`. For input $S$, the +result is the bit sequence $U = u_0 u_1 \ldots u_{n^{\prime}}$ such that for all +$i \in \{0, 1, \ldots, n^{\prime}\}$, we have $U[i] = 0$ if $S^{\prime}[i] = 1$ +and $1$ otherwise. We observe that `complementByteString` is self-inverting. We also note the following equivalences hold assuming `b` and `b'` have the @@ -570,46 +554,30 @@ complementByteString (iorByteString b b') = andByteString (complementByteString ### Mixed operations -Throughout this section, let $s = s_n s_{n-1} \ldots s_0$ be a byte sequence, -and let $i \in \mathbb{Z}$. +Throughout, let $S = s_0 s_1 \ldots s_n$ be a byte sequence, and let +$S^{\prime}$ be its corresponding bit sequence with bit length $n^{\prime} + 1$. We describe the semantics of `shiftByteString` and `rotateByteString`. -Informally, both of these are 'index modifiers' for bit sequences: given a -positive $i$, the index of a bit in $s$ 'increases' in the result; given a -negative $i$, the index of a bit in $s$ 'decreases' in the result. This can mean -that for some indexes in the result, there are no corresponding bits in $s$ by -the previous definition: we term these *missing indexes*. Additionally, by such -calculations, bits at some indexes in $s$ may be projected to negative indexes, -or indexes over $n$, in the result; we term these *out-of-bounds indexes*. How -we handle missing and out-of-bounds indexes is what distinguishes -`shiftByteString` and `rotateByteString`: +Informally, bot hof these are 'bit index modifiers': given a positive $i$, the +index of a bit in the result 'increases' relative the argument, and given a +negative $i$, the index of a bit in the result 'decreases' relative the +argument. This can mean that for some bit indexes in the result, there is no +corresponding bit in the argument: we term these *missing indexes*. +Additionally, by such calculations, a bit index in the argument may be projected +to a negative index in the result: we term these *out-of-bounds indexes*. How we +handle missing and out-of-bounds indexes is what distinguishes `shiftByteString` +and `rotateByteString`: * `shiftByteString` sets any missing index to $0$ and ignores any data at out-of-bounds indexes. 
* `rotateByteString` uses out-of-bounds indexes as sources for missing indexes by 'wraparound'. -We describe the semantics of `shiftByteString` precisely. Given arguments $s$ -and $i$, the result of `shiftByteString` is the byte sequence -$u = u_n u_{n - 1} \ldots u_0$, such that for all $j \in \\{0, 1, \ldots, n \\}$, we have -$u_j = s_{j - i}$ if $j - i \in \\{0, 1, \ldots, n \\}$, and $0$ otherwise. For -example, let $t = 01011110$ and $k = 2$. If we perform `shiftByteString` with -$t$ and $k$ as arguments, the result will be - -$$ -u = t_{(7 - 2)}t_{(6 - 2)}t_{(5 - 2)}t_{(4 - 2)}t_{(3 - 2)}t_{(2 - 2)}00 - = t_5t_4t_3t_2t_1t_000 - = 01111000 -$$ - -If instead we perform `shiftByteString` with $t$ and -$-k$ as arguments, the result will be - -$$ -u = 00t_{(5 + 2)}t_{(4 + 2)}t_{(3 + 2)}t_{(2 + 2)}t_{(1 + 2)}t_{(0 + 2)} - = 00t_7t_6t_5t_4t_3t_2 - = 00010111 -$$ +We describe the semantics of `shiftByteString` precisely. Given arguments $S$ +and some $i \in \mathbb{Z}$, the result is the bit sequence +$U = u_0 u_1 \ldots u_{n^{\prime}} such that for all $j \in \\{0, 1, \ldots, +n^{\prime}\\}$, we have $U[j] = S^{\prime}[j - i]$ if $j - i$ is a valid bit +index for $S^{\prime}$ and $0$ otherwise. Let $k, \ell \in \mathbb{Z}$ such that either @@ -623,28 +591,9 @@ shiftByteString (shiftBytestring bs k) l = shiftByteString bs (k + l) ``` We now describe the semantics of `rotateByteString` precisely; we assume the -same arguments as for `shiftByteString` above. The result of `rotateByteString` -is the byte sequence $u = u_n u_{n + 1} \ldots u_0$ such that for all -$j \in \\{0, 1, \ldots, n\\}$, we have $u_j = s_{n + 1 + j - i \mod (n + 1)}$. For -example, let $t = 01011110$ and $k = 2$. If we perform `rotateByteString` with -$t$ and $k$ as arguments, the result will be - -$$ -u = t_{(13 \mod 8)}t_{(12 \mod 8)}t_{(11 \mod 8)}t_{(10 \mod 8)}t_{(9 \mod 8)}t_{(8 \mod 8)}t_{(7 \mod 8)}t_{(6 -\mod 8)} - = t_5t_4t_3t_2t_1t_0t_7t_6 - = 01111001 -$$ - -If instead we perform `rotateByteString` with $t$ and -$-k$ as arguments, the result will be - -$$ -u = t_{(17 \mod 8)}t_{(16 \mod 8)}t_{(15 \mod 8)}t_{(14 \mod 8)}t_{(13 \mod 8)}t_{(12 \mod 8)}t_{(11 \mod 8)}t_{(10 -\mod 8)} - = t_1t_0t_7t_6t_5t_4t_3t_2 - = 10010111 -$$ +same arguments as for `shiftByteString` above. The result is the bit sequence $U += u_0 u_1 \ldots u_{n^{\prime}}$ such that for all $j \in \\{0, 1, \ldots, +n^{\prime}\\}$, we have $U[j] = S^{\prime}[n^{\prime} + j - i \mod n^{\prime}]$. We observe that for any $k, \ell$, and any `bs`, we have @@ -665,9 +614,9 @@ Lastly, we note that rotateByteString bs k = rotateByteString bs (k `remInteger` (lengthByteString bs * 8)) ``` -For `popCountByteString` with argument $s$, the result is +For `popCountByteString` with argument $S$, the result is -$$\sum_{j \in \\{0, 1, \ldots, n\\}} s_j$$ +$$\sum_{j \in \\{0, 1, \ldots, n^{\prime}\\} S^{\prime}[j]$$ Informally, this is just the total count of $1$ bits. We observe that for any `bs` and `bs'`, we have @@ -685,43 +634,56 @@ message must contain at least the following information: - What index was accessed out-of-bounds; and - The valid range of indexes. -For `testBitByteString` with arguments $s$ and $i$, if $0 \leq i \leq n$, then -the result is `True` if $s_i = 1$, and `False` if $s_i = 0$; otherwise, the -result is an out-of-bounds error. 
Let `b :: BuiltinBool`; for
-`writeBitByteString` with arguments $s$, $i$ and `b`, if $0 \leq i \leq n$,
-then the result is the byte sequence $u = u_n u_{n - 1} \ldots u_0$ such that
-for all $j \in \{0, 1, \ldots, n\}$, we have:
+For `testBitByteString` with arguments $S$ and some $i \in \mathbb{Z}$, if $i$
+is a valid bit index of $S^{\prime}$, the result is `True` if $S^{\prime}[i] =
+1$, and `False` if $S^{\prime}[i] = 0$. If $i$ is not a valid bit index of
+$S^{\prime}$, the result is an out-of-bounds error.

-- $u_j = 1$ when $i = j$ and `b == True`;
-- $u_j = 0$ when $i = j$ and `b == False`;
-- $u_j = s_j$ otherwise.
+For `writeBitByteString` with arguments $S$, some $i \in \mathbb{Z}$ and some
+`BuiltinBool` $b$, if $i$ is not a valid bit index for $S^{\prime}$, the result
+is an out-of-bounds error. Otherwise, the result is the bit sequence $U = u_0
+u_1 \ldots u_{n^{\prime}}$ such that for all $j \in \\{0, 1, \ldots, n\\}$, we
+have:

-For either `testBitByteString` or `writeBitByteString`, if $i < 0$ or $i > n$,
-the result is an out-of-bounds error.
+- $U[j] = 1$ when $i = j$ and $b$ is `True`;
+- $U[j] = 0$ when $i = j$ and $b$ is `False`;
+- $U[j] = S^{\prime}[j]$ otherwise.

 Lastly, we describe the semantics of `countLeadingZeroesByteString` and
-`countTrailingZeroesByteString. Given the argument $s$,
-`countLeadingZeroesByteString` produces the answer $\ell$ such that all of the
+`countTrailingZeroesByteString. Given the argument $S$,
+`countLeadingZeroesByteString` gives the result $k$ such that all of the
 following hold:

-- $0 \leq \ell < n$;
-- For all $0 \leq i < \ell$, $s_i = 0$; and
-- If $s$ is not empty, $s_{\ell} = 1$.
+- $0 \leq k \leq n^{\prime} + 1$;
+- For all $0 \leq i < k$, $S^{\prime}[i] = 0$; and
+- If $k \neq n^{\prime} + 1$, then $S^{\prime}[k] = 1$.
+
+Given the same argument, `countTrailingZeroesByteString` instead gives the
+result $k$ such that all of the following hold:
+
+- $0 \leq k \leq n^{\prime} + 1$;
+- For all $n^{\prime} - k < i \leq n^{\prime}$, $S^{\prime}[i] = 0$; and
+- If $k \neq n^{\prime} + 1$, then $S^{\prime}[n^{\prime} - k] = 1$.

-`countTrailingZeroesByteString` instead produces the answer $\ell$ such that all
-of the following hold:
+Let `zeroes` be a `BuiltinByteString` of length `len`, consisting only of zero
+bytes. We observe that

-- $0 \leq \ell < n$;
-- For all $\ell \leq i < n$, $s_i = 0$; and
-- If $s$ is not empty, $s_{\ell} = 1$.
+```haskell
+countLeadingZeroesByteString zeroes = len * 8
+
+countTrailingZeroesByteString zeroes = len * 8
+```
+
+Furthermore, for any two `BuiltinByteString`s `bs` and `bs'` of the same
+length, we have

-Lastly, we describe the semantics of `findFirstSetByteString`. Given the
-argument $s$, if for all $j \in \\{0, 1, \ldots, n \\}$, $s_j = 0$, the result
-is $-1$; otherwise, the result is $k$ such that all of the following hold:
+```haskell
+countLeadingZeroesByteString (iorByteString bs bs') =
+  min (countLeadingZeroesByteString bs) (countLeadingZeroesByteString bs')
+
+countTrailingZeroesByteString (iorByteString bs bs') =
+  min (countTrailingZeroesByteString bs) (countTrailingZeroesByteString bs')
+```

-- $k \in \\{0, 1, \ldots, n\\}$;
-- $s_k = 1$; and
-- For all $0 \leq k^{\prime} < k$, $s_{k^{\prime}} = 0$.
+where `min` is the minimum value function.
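+
+To make the counting semantics above easier to check, here is a small
+executable model. It is illustrative only and not part of this proposal: it
+works over a list of bits (`Bool`) rather than a `BuiltinByteString`, and the
+`...Model` names are placeholders.
+
+```haskell
+-- A bit sequence is modelled as [Bool]: True is a 1 bit, and the head of
+-- the list is bit index 0.
+countLeadingZeroesModel :: [Bool] -> Integer
+countLeadingZeroesModel = fromIntegral . length . takeWhile not
+
+countTrailingZeroesModel :: [Bool] -> Integer
+countTrailingZeroesModel = countLeadingZeroesModel . reverse
+
+-- The minimum law above, restated for equal-length bit lists, with
+-- iorByteString modelled as pointwise (||):
+--   countLeadingZeroesModel (zipWith (||) xs ys)
+--     == min (countLeadingZeroesModel xs) (countLeadingZeroesModel ys)
+```
+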
### Costing

@@ -741,18 +703,19 @@ Primitive | Linear in
 `popCountByteString` | Argument (only one)
 `testBitByteString` | `BuiltinByteString` argument
 `writeBitByteString` | `BuiltinByteString` argument
-`findFirstSetByteString` | Argument (only one)
+`countLeadingZeroesByteString` | Argument (only one)
+`countTrailingZeroesByteString` | Argument (only one)

 # Rationale

 ## Why these operations?

-There needs to be a well-defined interface between the 'world' of
-`BuiltinInteger` and `BuiltinByteString`. To provide this, we require
-`integerToByteString` and `byteStringToInteger`; we require that
-`integerToByteString` and `byteStringToInteger` roundtrip. Furthermore,
-by spelling out a precise description of the conversions, we make this
-predictable and portable.
+For work in finite field arithmetic (and the areas it enables), we frequently
+need to move between the 'worlds' of `BuiltinInteger` and `BuiltinByteString`.
+This needs to be consistent, and to allow round-trips. We simplify this by only
+requiring that conversions work on non-negative integers: this means that the
+translations can be simpler and more efficient, and also avoids representational
+questions for negative numbers.

 Our choice of logical AND, IOR, XOR and complement as the primary logical
 operations is driven by a mixture of prior art, utility and convenience. These
@@ -837,18 +800,17 @@ specifying, and verifying, that other bitwise operations, both primitive and
 non-primitive, are behaving correctly. They are also particularly essential for
 binary encodings.

-`findFirstSetByteString` is an essential primitive for several succinct
-data structures: both Roaring Bitmaps and rank-select dictionaries rely on it
-being efficient for much of their usefulness. Furthermore, this operation is
-provided in hardware by several instruction sets: on x86, there exist (at least)
-`BSF`, `BSR`, `LZCNT` and `TZCNT`, which allow finding both the first *and*
-last set bits, while on ARM, there exists `CLZ`, which can be used to simulate
-finding the first set bit. The instruction also exists in higher-level
-languages: for example, GHC's `FiniteBits` type class has `countTrailingZeros`
-and `countLeadingZeros`. The main reason we propose taking
-'finding the first set bit' as primitive, rather than 'counting leading
-zeroes' or 'counting trailing zeroes' is that finding the first set bit is
-required specifically for several succinct data structures.
+`countLeadingZeroesByteString` and `countTrailingZeroesByteString` are
+essential primitives for several succinct data structures: both Roaring Bitmaps
+and rank-select dictionaries rely on them for much of their usefulness. For
+finite field arithmetic, these operations are also beneficial to have
+available as efficiently as possible. Furthermore, these operations are provided
+in hardware by several instruction sets:
+on x86, there exist (at least) `BSF`, `BSR`, `LZCNT` and `TZCNT`, while on ARM,
+we have `CLZ` for counting leading zeroes. These instructions also exist in higher-level
+languages: for example, GHC's `FiniteBits` type class has `countTrailingZeros`
+and `countLeadingZeros`. Lastly, while they can be emulated by
+`testBitByteString`, this is tedious, error-prone and extremely slow.
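+
+As an informal illustration, the sketch below counts leading and trailing zero
+bits a byte at a time using GHC's `Data.Bits`. It is illustrative only: the
+names `clzBytes` and `ctzBytes` are ours, the input is a list of bytes rather
+than a packed `BuiltinByteString`, and a real implementation would work over
+contiguous memory.
+
+```haskell
+import Data.Bits (countLeadingZeros, countTrailingZeros)
+import Data.Word (Word8)
+
+-- Count leading zero bits a byte at a time, under the MSB-first bit
+-- ordering defined earlier in this document.
+clzBytes :: [Word8] -> Int
+clzBytes = go 0
+  where
+    go acc []       = acc
+    go acc (b : bs)
+      | b == 0      = go (acc + 8) bs
+      | otherwise   = acc + countLeadingZeros b
+
+-- Count trailing zero bits by walking the bytes from the end.
+ctzBytes :: [Word8] -> Int
+ctzBytes = go 0 . reverse
+  where
+    go acc []       = acc
+    go acc (b : bs)
+      | b == 0      = go (acc + 8) bs
+      | otherwise   = acc + countTrailingZeros b
+```
+
+By contrast, emulating either count with `testBitByteString` would require one
+builtin call per bit examined.
+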
# Backwards compatibility From a1b9ff0190ad9f3e51ae23e85c7a8f29583278f0 Mon Sep 17 00:00:00 2001 From: Koz Ross Date: Thu, 15 Sep 2022 11:09:59 +1200 Subject: [PATCH 10/11] Fix formatting --- CIP-?/README.md | 60 +++++++++++++++++++++++-------------------------- 1 file changed, 28 insertions(+), 32 deletions(-) diff --git a/CIP-?/README.md b/CIP-?/README.md index edbc357c44..0813947027 100644 --- a/CIP-?/README.md +++ b/CIP-?/README.md @@ -411,22 +411,20 @@ Counts the final sequence of 0 bits in the argument (that is, starting from the We define $\mathbb{N}^{+} = \\{ x \in \mathbb{N} \mid x \neq 0 \\}$. We assume that `BuiltinInteger` is a faithful representation of $\mathbb{Z}$, and will -refer to them (and their elements) interchangeably. A *byte* is some $x \in -\\{0,1,\ldots,255\\}$. +refer to them (and their elements) interchangeably. A *byte* is some +$x \in \\{ 0,1,\ldots,255 \\}$. -We observe that, given some *base* $b \in \mathbb{N}^{+}$, any $n \in -\mathbb{N}$ can be viewed as a sequence of values in $\\{0,1,\ldots, b - 1\\}$. +We observe that, given some *base* $b \in \mathbb{N}^{+}$, any +$n \in \mathbb{N}$ can be viewed as a sequence of values in $\\{0,1,\ldots, b - 1\\}$. We refer to any such sequence as a *base $b$ sequence*. In such a 'view', given a base $b$ sequence $S = s_0 s_1 \ldots s_k$, we can compute its corresponding $m \in \mathbb{N}^+$ as -\[ -\sum_{i \in \\{0,1,\ldots,k\\}} b^{k - i} * s_i -\] +$$\sum_{i \in \\{0,1,\ldots,k\\}} b^{k - i} \cdot s_i$$ If $b > 1$ and $Z$ is a base $b$ sequence consisting only of zeroes, we observe that for any other base $b$ sequence $S$, $Z \cdot S$ and $S$ correspond to the -same number. +same number, where $\cdot$ is sequence concatenation. We use *bit sequence* to refer to a base 2 sequence, and *byte sequence* to refer to a base 256 sequence. For a bit sequence $S = b_0 b_1 \ldots b_n$, we @@ -441,16 +439,14 @@ respectively. We describe a 'view' of bytes as bit sequences. Let $y$ be a byte; its corresponding bit sequence is $S_y = y_0 y_1 y_2 y_3 y_4 y_5 y_6 y_7$ such that -\[ -\sum_{i \in \\{0,1,\ldots,7\\}} 2^{7 - i} * y_i = y -\] +$$\sum_{i \in \\{0,1,\ldots,7\\}} 2^{7 - i} \cdot y_i = y$$ For example, the byte $55$ has the corresponding byte sequence $00110111$. For any byte, its corresponding byte sequence is unique. We use this to extend our -'view' to byte sequences as bit sequences. Specifically, let $T = y_0 y_1 \ldots -y_m$ be a byte sequence. Its corresponding bit sequence $S = b_0b_1 \ldots b_m -b_{m + 1} \ldots b_{8(m + 1) - 1}$ such that for any valid bit index $j$ of $S$, -$b_j = 1$ if and only if $T[j / 8][j `mod` 8] = 1$, and is $0$ otherwise. +'view' to byte sequences as bit sequences. Specifically, let +$T = y_0 y_1 \ldots y_m$ be a byte sequence. Its corresponding bit sequence +$S = b_0b_1 \ldots b_m b_{m + 1} \ldots b_{8(m + 1) - 1}$ such that for any valid bit index $j$ of $S$, +$b_j = 1$ if and only if $T[j / 8][j \mod 8] = 1$, and is $0$ otherwise. Based on the above, we observe that any `BuiltinByteString` can be a bit sequence or a byte sequence. Furthermore, we assume that `indexByteString` and @@ -474,7 +470,7 @@ the following: Otherwise, we produce the `BuiltinByteString` corresponding to the base 256 sequence which represents $i$. -We now describe the reverse operation, implemented as the 'byteStringToInteger` +We now describe the reverse operation, implemented as the `byteStringToInteger` primitive. 
This treats its argument `BuiltinByteString` as a base 256 sequence, and produces its corresponding number as a `BuiltinInteger`. We note that this is necessarily non-negative. @@ -510,9 +506,9 @@ the following information: - The reason (mismatched lengths); and - The byte lengths of the arguments. -If $n = m$, the result of each of these operations is the bit sequence $U = u_0 -u_1 \ldots u_{n^{\prime}}$, such that for all $i \in \\{0, 1, \ldots, n^{\prime}\\}$, -$U[i]$ is $1$ under the following conditions: +If $n = m$, the result of each of these operations is the bit sequence +$U = u_0u_1 \ldots u_{n^{\prime}}$, such that for all $i \in \\{0, 1, \ldots, n^{\prime}\\}$, +$U[i] = 1$ under the following conditions: - For `andByteString`, when $S^{\prime}[i] = T^{\prime}[i] = 1$; - For `iorByteString`, when at least one of $S^{\prime}[i], T^{\prime}[i]$ is @@ -575,9 +571,9 @@ and `rotateByteString`: We describe the semantics of `shiftByteString` precisely. Given arguments $S$ and some $i \in \mathbb{Z}$, the result is the bit sequence -$U = u_0 u_1 \ldots u_{n^{\prime}} such that for all $j \in \\{0, 1, \ldots, -n^{\prime}\\}$, we have $U[j] = S^{\prime}[j - i]$ if $j - i$ is a valid bit -index for $S^{\prime}$ and $0$ otherwise. +$U = u_0 u_1 \ldots u_{n^{\prime}}$ such that for all +$j \in \\{0, 1, \ldots, n^{\prime}\\}$, we have $U[j] = S^{\prime}[j - i]$ if +$j - i$ is a valid bit index for $S^{\prime}$ and $0$ otherwise. Let $k, \ell \in \mathbb{Z}$ such that either @@ -591,9 +587,9 @@ shiftByteString (shiftBytestring bs k) l = shiftByteString bs (k + l) ``` We now describe the semantics of `rotateByteString` precisely; we assume the -same arguments as for `shiftByteString` above. The result is the bit sequence $U -= u_0 u_1 \ldots u_{n^{\prime}}$ such that for all $j \in \\{0, 1, \ldots, -n^{\prime}\\}$, we have $U[j] = S^{\prime}[n^{\prime} + j - i \mod n^{\prime}]$. +same arguments as for `shiftByteString` above. The result is the bit sequence +$U = u_0 u_1 \ldots u_{n^{\prime}}$ such that for all +$j \in \\{0, 1, \ldots, n^{\prime}\\}$, we have $U[j] = S^{\prime}[n^{\prime} + j - i \mod n^{\prime}]$. We observe that for any $k, \ell$, and any `bs`, we have @@ -616,7 +612,7 @@ rotateByteString bs k = rotateByteString bs (k `remInteger` (lengthByteString bs For `popCountByteString` with argument $S$, the result is -$$\sum_{j \in \\{0, 1, \ldots, n^{\prime}\\} S^{\prime}[j]$$ +$$\sum_{j \in \\{0, 1, \ldots, n^{\prime}\\}} S^{\prime}[j]$$ Informally, this is just the total count of $1$ bits. We observe that for any `bs` and `bs'`, we have @@ -635,14 +631,14 @@ message must contain at least the following information: - The valid range of indexes. For `testBitByteString` with arguments $S$ and some $i \in \mathbb{Z}$, if $i$ -is a valid bit index of $S^{\prime}$, the result is `True` if $S^{\prime}[i] = -1$, and `False` if $S^{\prime}[i] = 0$. If $i$ is not a valid bit index of -$S^{\prime}$, the result is an out-of-bounds error. +is a valid bit index of $S^{\prime}$, the result is `True` if +$S^{\prime}[i] = 1$, and `False` if $S^{\prime}[i] = 0$. If $i$ is not a valid +bit index of $S^{\prime}$, the result is an out-of-bounds error. For `writeBitByteString` with arguments $S$, some $i \in \mathbb{Z}$ and some `BuiltinBool` $b$, if $i$ is not a valid bit index for $S^{\prime}$, the result -is an out-of-bounds error. Otherwise, the result is the bit sequence $U = u_0 -u_1 \ldots u_{n^{\prime}}$ such that for all $j \in \\{0, 1, \ldots, n\\}$, we +is an out-of-bounds error. 
Otherwise, the result is the bit sequence +$U = u_0 u_1 \ldots u_{n^{\prime}}$ such that for all $j \in \\{0, 1, \ldots, n\\}$, we have: - $U[j] = 1$ when $i = j$ and $b$ is `True`; @@ -650,7 +646,7 @@ have: - $U[j] = S^{\prime}[j]$ otherwise. Lastly, we describe the semantics of `countLeadingZeroesByteString` and -`countTrailingZeroesByteString. Given the argument $S$, +`countTrailingZeroesByteString`. Given the argument $S$, `countLeadingZeroesByteString` gives the result $k$ such that all of the following hold: From e86732f3d046ee438996d9823c911f731eeef161 Mon Sep 17 00:00:00 2001 From: Las Safin Date: Wed, 22 Feb 2023 20:21:04 +0000 Subject: [PATCH 11/11] Apply suggestions from code review Co-authored-by: Matthias Benkort <5680256+KtorZ@users.noreply.github.com> --- CIP-?/README.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/CIP-?/README.md b/CIP-?/README.md index 0813947027..6b039909ab 100644 --- a/CIP-?/README.md +++ b/CIP-?/README.md @@ -1,9 +1,9 @@ --- -CIP: \? +CIP: 58 Title: Bitwise primitives Author: Koz Ross , Maximilian König Comments-URI: https://github.com/cardano-foundation/CIPs/wiki/Comments:CIP-\? -Status: Draft +Status: Proposed Type: Standards Track Created: 2022-05-27 License: Apache-2.0 @@ -52,7 +52,7 @@ A good example is multiplication over the [Goldilocks field](https://blog.polygon.technology/introducing-plonky2) (with characteristic $2^64 - 2^32 + 1$). To perform this operation requires 'slicing' the representation being worked with into 32-bit chunks. As finite field -reprentations are some kind of unsigned integer in every implementation, in +representations are some kind of unsigned integer in every implementation, in Plutus, this would correspond to `Integer`s, but currently, there is no way to perform this kind of 'slicing' on an `Integer` on-chain. @@ -416,7 +416,7 @@ $x \in \\{ 0,1,\ldots,255 \\}$. We observe that, given some *base* $b \in \mathbb{N}^{+}$, any $n \in \mathbb{N}$ can be viewed as a sequence of values in $\\{0,1,\ldots, b - 1\\}$. -We refer to any such sequence as a *base $b$ sequence*. In such a 'view', given +We refer to any such sequence as a *base* $b$ *sequence*. In such a 'view', given a base $b$ sequence $S = s_0 s_1 \ldots s_k$, we can compute its corresponding $m \in \mathbb{N}^+$ as @@ -554,9 +554,9 @@ Throughout, let $S = s_0 s_1 \ldots s_n$ be a byte sequence, and let $S^{\prime}$ be its corresponding bit sequence with bit length $n^{\prime} + 1$. We describe the semantics of `shiftByteString` and `rotateByteString`. -Informally, bot hof these are 'bit index modifiers': given a positive $i$, the -index of a bit in the result 'increases' relative the argument, and given a -negative $i$, the index of a bit in the result 'decreases' relative the +Informally, both of these are 'bit index modifiers': given a positive $i$, the +index of a bit in the result 'increases' relative to the argument, and given a +negative $i$, the index of a bit in the result 'decreases' relative to the argument. This can mean that for some bit indexes in the result, there is no corresponding bit in the argument: we term these *missing indexes*. Additionally, by such calculations, a bit index in the argument may be projected