refactor(experimental): add addCodecSentinel to @solana/codecs-core
lorisleiva committed Apr 4, 2024
1 parent f43a2f5 commit d9c019d
Showing 12 changed files with 351 additions and 2 deletions.
16 changes: 16 additions & 0 deletions .changeset/gorgeous-gorillas-sniff.md
@@ -0,0 +1,16 @@
---
'@solana/codecs-core': patch
'@solana/errors': patch
---

Added new `addCodecSentinel` primitive

The `addCodecSentinel` function provides a new way of delimiting the size of a codec. It allows us to add a sentinel to the end of the encoded data and to read until that sentinel is found when decoding. It accepts any codec and a `Uint8Array` sentinel responsible for delimiting the encoded data.

```ts
const codec = addCodecSentinel(getUtf8Codec(), new Uint8Array([255, 255]));
codec.encode('hello');
// 0x68656c6c6fffff
//   |         └-- Our sentinel.
//   └-- Our encoded string.
```
29 changes: 29 additions & 0 deletions packages/codecs-core/README.md
@@ -385,6 +385,35 @@ const getU32Base58Decoder = () => addDecoderSizePrefix(getBase58Decoder(), getU3
const getU32Base58Codec = () => combineCodec(getU32Base58Encoder(), getU32Base58Decoder());
```

## Adding sentinels to codecs

Another way of delimiting the size of a codec is to use sentinels. The `addCodecSentinel` function allows us to add a sentinel to the end of the encoded data and to read until that sentinel is found when decoding. It accepts any codec and a `Uint8Array` sentinel responsible for delimiting the encoded data.

```ts
const codec = addCodecSentinel(getUtf8Codec(), new Uint8Array([255, 255]));
codec.encode('hello');
// 0x68656c6c6fffff
//   |         └-- Our sentinel.
//   └-- Our encoded string.
```
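
To make the byte layout concrete, here is a minimal sketch of the same idea in plain TypeScript. It is not the library's implementation: `encodeWithSentinel`, `indexOfBytes` and `decodeUntilSentinel` are hypothetical names introduced only for illustration, and the sketch assumes a runtime that provides `TextEncoder` and `TextDecoder`.

```ts
// Hypothetical helpers for illustration only; not part of @solana/codecs-core.

// Appends the sentinel right after the UTF-8 encoded content.
function encodeWithSentinel(value: string, sentinel: Uint8Array): Uint8Array {
    const content = new TextEncoder().encode(value);
    const out = new Uint8Array(content.length + sentinel.length);
    out.set(content, 0);
    out.set(sentinel, content.length);
    return out;
}

// Returns the index of the first occurrence of `needle` in `bytes`, or -1.
function indexOfBytes(bytes: Uint8Array, needle: Uint8Array): number {
    for (let i = 0; i + needle.length <= bytes.length; i++) {
        let match = true;
        for (let j = 0; j < needle.length; j++) {
            if (bytes[i + j] !== needle[j]) {
                match = false;
                break;
            }
        }
        if (match) return i;
    }
    return -1;
}

// Reads everything before the first occurrence of the sentinel.
function decodeUntilSentinel(bytes: Uint8Array, sentinel: Uint8Array): string {
    const end = indexOfBytes(bytes, sentinel);
    if (end === -1) throw new Error('sentinel not found');
    return new TextDecoder().decode(bytes.slice(0, end));
}

const sentinel = new Uint8Array([255, 255]);
const encoded = encodeWithSentinel('hello', sentinel); // 68 65 6c 6c 6f ff ff
const decoded = decodeUntilSentinel(encoded, sentinel); // 'hello'
```

The decoded value never includes the sentinel itself, and any bytes after the sentinel are left untouched for whatever is read next.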

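Note that the sentinel _must not_ be present in the encoded data and _must_ be present in the bytes being decoded for this to work. Otherwise, a dedicated error is thrown: `SOLANA_ERROR__CODECS__ENCODED_BYTES_MUST_NOT_INCLUDE_SENTINEL` when encoding, or `SOLANA_ERROR__CODECS__SENTINEL_MISSING_IN_DECODED_BYTES` when decoding.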

```ts
const sentinel = new Uint8Array([108, 108]); // 'll'
const codec = addCodecSentinel(getUtf8Codec(), sentinel);

codec.encode('hello'); // Throws: sentinel is in encoded data.
codec.decode(new Uint8Array([1, 2, 3])); // Throws: sentinel missing in decoded data.
```
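
The two conditions above boil down to a byte search on each side. The sketch below reproduces both checks in plain TypeScript, under the assumption that a simple first-occurrence search is enough to illustrate them; `indexOfBytes` is a hypothetical helper and not the library's API.

```ts
// Hypothetical helper for illustration only: first index of `needle` in `bytes`, or -1.
function indexOfBytes(bytes: Uint8Array, needle: Uint8Array): number {
    for (let i = 0; i + needle.length <= bytes.length; i++) {
        let match = true;
        for (let j = 0; j < needle.length; j++) {
            if (bytes[i + j] !== needle[j]) {
                match = false;
                break;
            }
        }
        if (match) return i;
    }
    return -1;
}

const sentinel = new Uint8Array([108, 108]); // 'll'

// Encoding side: the encoded content must not already contain the sentinel.
const content = new TextEncoder().encode('hello'); // 68 65 6c 6c 6f
const contentHasSentinel = indexOfBytes(content, sentinel) >= 0; // 'll' starts at index 2

// Decoding side: the sentinel must appear somewhere in the bytes to decode.
const input = new Uint8Array([1, 2, 3]);
const inputHasSentinel = indexOfBytes(input, sentinel) >= 0;
```

If the first check were skipped, a decoder that stops at the first occurrence of the sentinel would read back only `'he'`, which is why encoding is rejected in that case.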

Separate `addEncoderSentinel` and `addDecoderSentinel` functions are also available.

```ts
const bytes = addEncoderSentinel(getUtf8Encoder(), sentinel).encode('hello');
const value = addDecoderSentinel(getUtf8Decoder(), sentinel).decode(bytes);
```
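
Because a sentinel decoder reports the offset just past its sentinel, several sentinel-delimited values can be read back to back. The sketch below walks through the bytes `4a6f686eff446f65ff2a` (two strings followed by one byte) in plain TypeScript. `readUntilSentinel` is a hypothetical helper that only handles a single-byte sentinel; it is not how the library is implemented.

```ts
// Hypothetical reader for illustration only: decodes UTF-8 text up to a
// single-byte sentinel and returns the value with the offset just past it.
function readUntilSentinel(bytes: Uint8Array, offset: number, sentinel: number): [string, number] {
    const end = bytes.indexOf(sentinel, offset);
    if (end === -1) throw new Error('sentinel not found');
    return [new TextDecoder().decode(bytes.slice(offset, end)), end + 1];
}

// 'John' + ff + 'Doe' + ff + 42
const bytes = new Uint8Array([0x4a, 0x6f, 0x68, 0x6e, 0xff, 0x44, 0x6f, 0x65, 0xff, 0x2a]);

const [firstname, afterFirstname] = readUntilSentinel(bytes, 0, 0xff);
const [lastname, afterLastname] = readUntilSentinel(bytes, afterFirstname, 0xff);
const age = bytes[afterLastname]; // the byte that follows the second sentinel
```

The byte `0xff` never occurs in valid UTF-8, so it cannot collide with the encoded strings in this example.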

## Adjusting the size of codecs

The `resizeCodec` helper re-defines the size of a given codec by accepting a function that takes the current size of the codec and returns a new size. This works for both fixed-size and variable-size codecs.
78 changes: 78 additions & 0 deletions packages/codecs-core/src/__tests__/add-codec-sentinel-test.ts
@@ -0,0 +1,78 @@
import {
    SOLANA_ERROR__CODECS__ENCODED_BYTES_MUST_NOT_INCLUDE_SENTINEL,
    SOLANA_ERROR__CODECS__SENTINEL_MISSING_IN_DECODED_BYTES,
    SolanaError,
} from '@solana/errors';

import { addCodecSentinel } from '../add-codec-sentinel';
import { b, getMockCodec } from './__setup__';

describe('addCodecSentinel', () => {
    it('encodes the sentinel after the main content', () => {
        const mockCodec = getMockCodec();
        mockCodec.getSizeFromValue.mockReturnValue(10);
        mockCodec.write.mockImplementation((_, bytes, offset) => {
            bytes.set(b('68656c6c6f776f726c64'), offset);
            return offset + 10;
        });
        const codec = addCodecSentinel(mockCodec, b('ff'));

        expect(codec.encode('helloworld')).toStrictEqual(b('68656c6c6f776f726c64ff'));
        expect(mockCodec.write).toHaveBeenCalledWith('helloworld', expect.any(Uint8Array), 0);
    });

    it('decodes until the first occurrence of the sentinel is found', () => {
        const mockCodec = getMockCodec();
        mockCodec.read.mockReturnValue(['helloworld', 10]);
        const codec = addCodecSentinel(mockCodec, b('ff'));

        expect(codec.decode(b('68656c6c6f776f726c64ff0000'))).toBe('helloworld');
        expect(mockCodec.read).toHaveBeenCalledWith(b('68656c6c6f776f726c64'), 0);
    });

    it('fails if the encoded bytes contain the sentinel', () => {
        const mockCodec = getMockCodec();
        mockCodec.getSizeFromValue.mockReturnValue(10);
        mockCodec.write.mockImplementation((_, bytes, offset) => {
            bytes.set(b('68656c6c6f776f726cff'), offset);
            return offset + 10;
        });
        const codec = addCodecSentinel(mockCodec, b('ff'));

        expect(() => codec.encode('helloworld')).toThrow(
            new SolanaError(SOLANA_ERROR__CODECS__ENCODED_BYTES_MUST_NOT_INCLUDE_SENTINEL, {
                encodedBytes: b('68656c6c6f776f726cff'),
                hexEncodedBytes: '68656c6c6f776f726cff',
                hexSentinel: 'ff',
                sentinel: b('ff'),
            }),
        );
    });

    it('fails if the decoded bytes do not contain the sentinel', () => {
        const mockCodec = getMockCodec();
        const codec = addCodecSentinel(mockCodec, b('ff'));

        expect(() => codec.decode(b('68656c6c6f776f726c64000000'))).toThrow(
            new SolanaError(SOLANA_ERROR__CODECS__SENTINEL_MISSING_IN_DECODED_BYTES, {
                decodedBytes: b('68656c6c6f776f726c64000000'),
                hexDecodedBytes: '68656c6c6f776f726c64000000',
                hexSentinel: 'ff',
                sentinel: b('ff'),
            }),
        );
    });

    it('returns the correct fixed size', () => {
        const mockCodec = getMockCodec({ size: 10 });
        const codec = addCodecSentinel(mockCodec, b('ffff'));
        expect(codec.fixedSize).toBe(12);
    });

    it('returns the correct variable size', () => {
        const mockCodec = getMockCodec();
        mockCodec.getSizeFromValue.mockReturnValueOnce(10);
        const codec = addCodecSentinel(mockCodec, b('ffff'));
        expect(codec.getSizeFromValue('helloworld')).toBe(12);
    });
});
@@ -0,0 +1,35 @@
import { addCodecSentinel, addDecoderSentinel, addEncoderSentinel } from '../add-codec-sentinel';
import {
    Codec,
    Decoder,
    Encoder,
    FixedSizeCodec,
    FixedSizeDecoder,
    FixedSizeEncoder,
    VariableSizeCodec,
    VariableSizeDecoder,
    VariableSizeEncoder,
} from '../codec';

const sentinel = {} as Uint8Array;

{
    // [addEncoderSentinel]: It knows if the encoder is fixed size or variable size.
    addEncoderSentinel({} as FixedSizeEncoder<string>, sentinel) satisfies FixedSizeEncoder<string>;
    addEncoderSentinel({} as VariableSizeEncoder<string>, sentinel) satisfies VariableSizeEncoder<string>;
    addEncoderSentinel({} as Encoder<string>, sentinel) satisfies VariableSizeEncoder<string>;
}

{
    // [addDecoderSentinel]: It knows if the decoder is fixed size or variable size.
    addDecoderSentinel({} as FixedSizeDecoder<string>, sentinel) satisfies FixedSizeDecoder<string>;
    addDecoderSentinel({} as VariableSizeDecoder<string>, sentinel) satisfies VariableSizeDecoder<string>;
    addDecoderSentinel({} as Decoder<string>, sentinel) satisfies VariableSizeDecoder<string>;
}

{
    // [addCodecSentinel]: It knows if the codec is fixed size or variable size.
    addCodecSentinel({} as FixedSizeCodec<string>, sentinel) satisfies FixedSizeCodec<string>;
    addCodecSentinel({} as VariableSizeCodec<string>, sentinel) satisfies VariableSizeCodec<string>;
    addCodecSentinel({} as Codec<string>, sentinel) satisfies VariableSizeCodec<string>;
}
143 changes: 143 additions & 0 deletions packages/codecs-core/src/add-codec-sentinel.ts
@@ -0,0 +1,143 @@
import {
    SOLANA_ERROR__CODECS__ENCODED_BYTES_MUST_NOT_INCLUDE_SENTINEL,
    SOLANA_ERROR__CODECS__SENTINEL_MISSING_IN_DECODED_BYTES,
    SolanaError,
} from '@solana/errors';

import { containsBytes } from './bytes';
import {
    Codec,
    createDecoder,
    createEncoder,
    Decoder,
    Encoder,
    FixedSizeCodec,
    FixedSizeDecoder,
    FixedSizeEncoder,
    isFixedSize,
    VariableSizeCodec,
    VariableSizeDecoder,
    VariableSizeEncoder,
} from './codec';
import { combineCodec } from './combine-codec';
import { ReadonlyUint8Array } from './readonly-uint8array';

/**
 * Creates an encoder that writes a `Uint8Array` sentinel after the encoded value.
 * This is useful to delimit the encoded value when being read by a decoder.
 *
 * Note that, if the sentinel is found in the encoded value, an error is thrown.
 */
export function addEncoderSentinel<TFrom>(
    encoder: FixedSizeEncoder<TFrom>,
    sentinel: ReadonlyUint8Array,
): FixedSizeEncoder<TFrom>;
export function addEncoderSentinel<TFrom>(
    encoder: Encoder<TFrom>,
    sentinel: ReadonlyUint8Array,
): VariableSizeEncoder<TFrom>;
export function addEncoderSentinel<TFrom>(encoder: Encoder<TFrom>, sentinel: ReadonlyUint8Array): Encoder<TFrom> {
    const write = ((value, bytes, offset) => {
        // Here we exceptionally use the `encode` function instead of the `write`
        // function to contain the content of the encoder within its own bounds
        // and to avoid writing the sentinel as part of the encoded value.
        const encoderBytes = encoder.encode(value);
        if (findSentinelIndex(encoderBytes, sentinel) >= 0) {
            throw new SolanaError(SOLANA_ERROR__CODECS__ENCODED_BYTES_MUST_NOT_INCLUDE_SENTINEL, {
                encodedBytes: encoderBytes,
                hexEncodedBytes: hexBytes(encoderBytes),
                hexSentinel: hexBytes(sentinel),
                sentinel,
            });
        }
        bytes.set(encoderBytes, offset);
        offset += encoderBytes.length;
        bytes.set(sentinel, offset);
        offset += sentinel.length;
        return offset;
    }) as Encoder<TFrom>['write'];

    if (isFixedSize(encoder)) {
        return createEncoder({ ...encoder, fixedSize: encoder.fixedSize + sentinel.length, write });
    }

    return createEncoder({
        ...encoder,
        ...(encoder.maxSize != null ? { maxSize: encoder.maxSize + sentinel.length } : {}),
        getSizeFromValue: value => encoder.getSizeFromValue(value) + sentinel.length,
        write,
    });
}

/**
 * Creates a decoder that continues reading until a `Uint8Array` sentinel is found.
 *
 * If the sentinel is not found in the byte array to decode, an error is thrown.
 */
export function addDecoderSentinel<TTo>(
    decoder: FixedSizeDecoder<TTo>,
    sentinel: ReadonlyUint8Array,
): FixedSizeDecoder<TTo>;
export function addDecoderSentinel<TTo>(decoder: Decoder<TTo>, sentinel: ReadonlyUint8Array): VariableSizeDecoder<TTo>;
export function addDecoderSentinel<TTo>(decoder: Decoder<TTo>, sentinel: ReadonlyUint8Array): Decoder<TTo> {
    const read = ((bytes, offset) => {
        const candidateBytes = offset === 0 ? bytes : bytes.slice(offset);
        const sentinelIndex = findSentinelIndex(candidateBytes, sentinel);
        if (sentinelIndex === -1) {
            throw new SolanaError(SOLANA_ERROR__CODECS__SENTINEL_MISSING_IN_DECODED_BYTES, {
                decodedBytes: candidateBytes,
                hexDecodedBytes: hexBytes(candidateBytes),
                hexSentinel: hexBytes(sentinel),
                sentinel,
            });
        }
        const preSentinelBytes = candidateBytes.slice(0, sentinelIndex);
        // Here we exceptionally use the `decode` function instead of the `read`
        // function to contain the content of the decoder within its own bounds
        // and ensure that the sentinel is not part of the decoded value.
        return [decoder.decode(preSentinelBytes), offset + preSentinelBytes.length + sentinel.length];
    }) as Decoder<TTo>['read'];

    if (isFixedSize(decoder)) {
        return createDecoder({ ...decoder, fixedSize: decoder.fixedSize + sentinel.length, read });
    }

    return createDecoder({
        ...decoder,
        ...(decoder.maxSize != null ? { maxSize: decoder.maxSize + sentinel.length } : {}),
        read,
    });
}

/**
 * Creates a Codec that writes a `Uint8Array` sentinel after the encoded
 * value and, when decoding, continues reading until the sentinel is found.
 *
 * Note that, if the sentinel is found in the encoded value
 * or not found in the byte array to decode, an error is thrown.
 */
export function addCodecSentinel<TFrom, TTo extends TFrom>(
    codec: FixedSizeCodec<TFrom, TTo>,
    sentinel: ReadonlyUint8Array,
): FixedSizeCodec<TFrom, TTo>;
export function addCodecSentinel<TFrom, TTo extends TFrom>(
    codec: Codec<TFrom, TTo>,
    sentinel: ReadonlyUint8Array,
): VariableSizeCodec<TFrom, TTo>;
export function addCodecSentinel<TFrom, TTo extends TFrom>(
    codec: Codec<TFrom, TTo>,
    sentinel: ReadonlyUint8Array,
): Codec<TFrom, TTo> {
    return combineCodec(addEncoderSentinel(codec, sentinel), addDecoderSentinel(codec, sentinel));
}

function findSentinelIndex(bytes: ReadonlyUint8Array, sentinel: ReadonlyUint8Array) {
    return bytes.findIndex((byte, index, arr) => {
        if (sentinel.length === 1) return byte === sentinel[0];
        return containsBytes(arr, sentinel, index);
    });
}

function hexBytes(bytes: ReadonlyUint8Array): string {
    return bytes.reduce((str, byte) => str + byte.toString(16).padStart(2, '0'), '');
}
1 change: 1 addition & 0 deletions packages/codecs-core/src/index.ts
@@ -1,4 +1,5 @@
export * from './add-codec-size-prefix';
export * from './add-codec-sentinel';
export * from './assertions';
export * from './bytes';
export * from './codec';
13 changes: 12 additions & 1 deletion packages/codecs-data-structures/src/__tests__/struct-test.ts
@@ -1,4 +1,4 @@
import { addCodecSizePrefix, fixCodecSize, offsetCodec, resizeCodec } from '@solana/codecs-core';
import { addCodecSentinel, addCodecSizePrefix, fixCodecSize, offsetCodec, resizeCodec } from '@solana/codecs-core';
import { getU8Codec, getU32Codec, getU64Codec } from '@solana/codecs-numbers';
import { getUtf8Codec } from '@solana/codecs-strings';

@@ -77,4 +77,15 @@ describe('getStructCodec', () => {
        expect(person.read(b('416c6963650000000000000020000000'), 0)).toStrictEqual([alice, 16]);
        expect(person.read(b('ff416c6963650000000000000020000000'), 1)).toStrictEqual([alice, 17]);
    });

    it('can chain sentinel codecs', () => {
        const person = struct([
            ['firstname', addCodecSentinel(getUtf8Codec(), b('ff'))],
            ['lastname', addCodecSentinel(getUtf8Codec(), b('ff'))],
            ['age', u8()],
        ]);
        const john = { age: 42, firstname: 'John', lastname: 'Doe' };
        expect(person.encode(john)).toStrictEqual(b('4a6f686eff446f65ff2a'));
        expect(person.decode(b('4a6f686eff446f65ff2a'))).toStrictEqual(john);
    });
});
13 changes: 12 additions & 1 deletion packages/codecs-data-structures/src/__tests__/tuple-test.ts
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
import { addCodecSizePrefix, fixCodecSize, offsetCodec } from '@solana/codecs-core';
import { addCodecSentinel, addCodecSizePrefix, fixCodecSize, offsetCodec } from '@solana/codecs-core';
import { getI16Codec, getU8Codec, getU32Codec, getU64Codec } from '@solana/codecs-numbers';
import { getUtf8Codec } from '@solana/codecs-strings';
import { SOLANA_ERROR__CODECS__INVALID_NUMBER_OF_ITEMS, SolanaError } from '@solana/errors';
@@ -72,4 +72,15 @@ describe('getTupleCodec', () => {
        expect(person.read(b('2000000000000000416c696365000000'), 0)).toStrictEqual([['Alice', 32n], 16]);
        expect(person.read(b('ff2000000000000000416c696365000000'), 1)).toStrictEqual([['Alice', 32n], 17]);
    });

    it('can chain sentinel codecs', () => {
        const person = tuple([
            addCodecSentinel(getUtf8Codec(), b('ff')),
            addCodecSentinel(getUtf8Codec(), b('ff')),
            u8(),
        ]);
        const john = ['John', 'Doe', 42] as const;
        expect(person.encode(john)).toStrictEqual(b('4a6f686eff446f65ff2a'));
        expect(person.decode(b('4a6f686eff446f65ff2a'))).toStrictEqual(john);
    });
});
1 change: 1 addition & 0 deletions packages/codecs/README.md
@@ -58,6 +58,7 @@ The `@solana/codecs` package is composed of several smaller packages, each with
- [Mapping codecs](https://github.com/solana-labs/solana-web3.js/tree/master/packages/codecs-core#mapping-codecs).
- [Fixing the size of codecs](https://github.com/solana-labs/solana-web3.js/tree/master/packages/codecs-core#fixing-the-size-of-codecs).
- [Prefixing codecs with their size](https://github.com/solana-labs/solana-web3.js/tree/master/packages/codecs-core#prefixing-codecs-with-their-size).
- [Adding sentinels to codecs](https://github.com/solana-labs/solana-web3.js/tree/master/packages/codecs-core#adding-sentinels-to-codecs).
- [Adjusting the size of codecs](https://github.com/solana-labs/solana-web3.js/tree/master/packages/codecs-core#adjusting-the-size-of-codecs).
- [Offsetting codecs](https://github.com/solana-labs/solana-web3.js/tree/master/packages/codecs-core#offsetting-codecs).
- [Padding codecs](https://github.com/solana-labs/solana-web3.js/tree/master/packages/codecs-core#padding-codecs).
4 changes: 4 additions & 0 deletions packages/errors/src/codes.ts
@@ -265,6 +265,8 @@ export const SOLANA_ERROR__CODECS__LITERAL_UNION_DISCRIMINATOR_OUT_OF_RANGE = 80
export const SOLANA_ERROR__CODECS__UNION_VARIANT_OUT_OF_RANGE = 8078017 as const;
export const SOLANA_ERROR__CODECS__INVALID_CONSTANT = 8078018 as const;
export const SOLANA_ERROR__CODECS__EXPECTED_ZERO_VALUE_TO_MATCH_ITEM_FIXED_SIZE = 8078019 as const;
export const SOLANA_ERROR__CODECS__ENCODED_BYTES_MUST_NOT_INCLUDE_SENTINEL = 8078020 as const;
export const SOLANA_ERROR__CODECS__SENTINEL_MISSING_IN_DECODED_BYTES = 8078021 as const;

// RPC-related errors.
// Reserve error codes in the range [8100000-8100999].
@@ -325,6 +327,7 @@ export type SolanaErrorCode =
    | typeof SOLANA_ERROR__BLOCK_HEIGHT_EXCEEDED
    | typeof SOLANA_ERROR__BLOCKHASH_STRING_LENGTH_OUT_OF_RANGE
    | typeof SOLANA_ERROR__CODECS__CANNOT_DECODE_EMPTY_BYTE_ARRAY
    | typeof SOLANA_ERROR__CODECS__ENCODED_BYTES_MUST_NOT_INCLUDE_SENTINEL
    | typeof SOLANA_ERROR__CODECS__ENCODER_DECODER_FIXED_SIZE_MISMATCH
    | typeof SOLANA_ERROR__CODECS__ENCODER_DECODER_MAX_SIZE_MISMATCH
    | typeof SOLANA_ERROR__CODECS__ENCODER_DECODER_SIZE_COMPATIBILITY_MISMATCH
@@ -343,6 +346,7 @@ export type SolanaErrorCode =
    | typeof SOLANA_ERROR__CODECS__LITERAL_UNION_DISCRIMINATOR_OUT_OF_RANGE
    | typeof SOLANA_ERROR__CODECS__NUMBER_OUT_OF_RANGE
    | typeof SOLANA_ERROR__CODECS__OFFSET_OUT_OF_RANGE
    | typeof SOLANA_ERROR__CODECS__SENTINEL_MISSING_IN_DECODED_BYTES
    | typeof SOLANA_ERROR__CODECS__UNION_VARIANT_OUT_OF_RANGE
    | typeof SOLANA_ERROR__CRYPTO__RANDOM_VALUES_FUNCTION_UNIMPLEMENTED
    | typeof SOLANA_ERROR__INSTRUCTION__EXPECTED_TO_HAVE_ACCOUNTS