Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(experimental): add addCodecSentinel to @solana/codecs-core #2419

Merged
merged 1 commit into from
Apr 4, 2024

Conversation

lorisleiva
Copy link
Contributor

@lorisleiva lorisleiva commented Apr 3, 2024

This PR adds a new addCodecSentinel primitive.

The addCodecSentinel function provides a new way of delimiting the size of a codec. It allows us to add a sentinel to the end of the encoded data and to read until that sentinel is found when decoding. It accepts any codec and a Uint8Array sentinel responsible for delimiting the encoded data.

const codec = addCodecSentinel(getUtf8Codec(), new Uint8Array([255, 255]));
codec.encode('hello');
// 0x68656c6c6fffff
//   |        └-- Our sentinel.
//   └-- Our encoded string.

Copy link

changeset-bot bot commented Apr 3, 2024

🦋 Changeset detected

Latest commit: d9c019d

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 35 packages
Name Type
@solana/codecs-core Patch
@solana/errors Patch
@solana/accounts Patch
@solana/addresses Patch
@solana/codecs-data-structures Patch
@solana/codecs-numbers Patch
@solana/codecs-strings Patch
@solana/codecs Patch
@solana/keys Patch
@solana/options Patch
@solana/rpc-api Patch
@solana/rpc-types Patch
@solana/transactions Patch
@solana/assertions Patch
@solana/compat Patch
@solana/instructions Patch
@solana/web3.js-experimental Patch
@solana/rpc-spec Patch
@solana/rpc-subscriptions-spec Patch
@solana/rpc-subscriptions-transport-websocket Patch
@solana/rpc-subscriptions Patch
@solana/rpc-transport-http Patch
@solana/rpc Patch
@solana/signers Patch
@solana/sysvars Patch
@solana/transaction-confirmation Patch
@solana/transaction-messages Patch
@solana/programs Patch
@solana/rpc-graphql Patch
@solana/rpc-parsed-types Patch
@solana/rpc-subscriptions-api Patch
@solana/rpc-transformers Patch
@solana/functional Patch
@solana/rpc-spec-types Patch
@solana/webcrypto-ed25519-polyfill Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@lorisleiva lorisleiva marked this pull request as ready for review April 3, 2024 00:43
@lorisleiva lorisleiva self-assigned this Apr 3, 2024
Copy link
Contributor

@buffalojoec buffalojoec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! How might one chain sentinel codecs together?

Also, how come this is named in reverse order of the other codecs, like ArrayCodec and MapCodec. Here it's reversed (CodecSentinel). Assuming because of the nature of the sentinel coming after the codec bit? Since this still produces a valid codec, it did throw me off a bit.

@lorisleiva lorisleiva force-pushed the loris/add-codec-sentinel branch 2 times, most recently from 7aec5eb to 22826af Compare April 3, 2024 08:59
@lorisleiva
Copy link
Contributor Author

lorisleiva commented Apr 3, 2024

Hey @buffalojoec,

Regarding chaining sentinels

You can just use any other codec to chain them such as getStructCodec of getTupleCodec.

const sentinel = new Uint8Array([255]);
const person = getStructCodec([
    ['firstname', addCodecSentinel(getUtf8Codec(), sentinel)],
    ['lastname', addCodecSentinel(getUtf8Codec(), sentinel)],
    ['age', getU8Codec()],
]);

person.encode({ age: 42, firstname: 'John', lastname: 'Doe' });
// 0x4a6f686eff446f65ff2a
//   |         |       └-- Age.
//   |         └-- Lastname.
//   └-- Firstname.

I added a couple of tests to illustrate that.

Regarding codec naming conventions

There is a difference between primitive functions like addCodecSentinel and other type codecs such as getArrayCodec. Let me explain in a bit more detail.

Whilst all codec functions start with a verb, there are two categories of codec functions: the codec primitives and the codec type helpers.

Codec primitives

These functions are lower-level helper and live under the @solana/codecs-core package. They allow us to manipulate codecs without worrying about their concrete types. For instance, the fixCodecSize primitive transforms a VariableSizeCodec<T> into a FixedSizeCodec<T> regardless of what T might be.

The naming convention for codec primitives is to describe that codec manipulation by starting with a verb. For instance:

  • mapCodec (soon to be transformCodec) maps (or transforms) a Codec<T> into a Codec<U> by accepting T => U and U => T functions.
  • reverseCodec reverses the bytes of a codec.
  • offsetCodec offsets the reading and writing cursors of a given codec.
  • fixCodecSize fixes the size of a codec by accepting a byte length.
  • addCodecSizePrefix adds a size prefix to the provided codec.
  • addCodecSentinel adds a sentinel at the end of the provided codec to identify its end.
  • etc.

Codec type helpers

On the other hand, codec type helpers are meant to provide codecs for specific types and live under the other codec packages such as @solana/codecs-numbers, @solana/codecs-strings, etc. For instance getU32Codec will provide a Codec<number> and getBase58Codec will provide a Codec<string>. These can then be composed with codec primitives to create more complex serialisation strategies.

The naming convention for codec type helpers is "get" + Type + "Codec". For instance:

  • getU16Codec gets a number codec using 2 bytes.
  • getBase58Codec gets a base58 encoded string codec.
  • getArrayCodec gets an Array<T> codec from a given Codec<T>.
  • getStructCodec gets an object codec from provided field codecs.
  • etc.

Now, with this in mind, does addCodecSentinel still feel out of place?


EDIT: Re-reading this message, I now realised it could be interpreted as a very cold response with a mean rhetorical question at the end. Please know that it wasn't meant that way and that final question is genuine and written with care not anger. I sincerely apologise for the lack of tone in my reply which makes it possible to interpret it that way.

@buffalojoec
Copy link
Contributor

Now, with this in mind, does addCodecSentinel still feel out of place?

Sorry, just circling back to this. Yeah, I think with that provided context this does seem like a proper name. Maybe it's because I didn't know about addCodecSizePrefix that I was a bit confused when I saw add + Codec + T? Either way, thanks for the insight into the patterns. Ship it!

@lorisleiva lorisleiva force-pushed the loris/add-codec-sentinel branch from 22826af to d9c019d Compare April 4, 2024 14:06
Copy link
Contributor

@mcintyre94 mcintyre94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Copy link
Contributor Author

lorisleiva commented Apr 4, 2024

Merge activity

@lorisleiva lorisleiva merged commit 89f399d into master Apr 4, 2024
8 checks passed
@lorisleiva lorisleiva deleted the loris/add-codec-sentinel branch April 4, 2024 16:47
Copy link
Contributor

🎉 This PR is included in version 1.91.5 🎉

The release is available on:

Your semantic-release bot 📦🚀

Copy link
Contributor

github-actions bot commented May 1, 2024

Because there has been no activity on this PR for 14 days since it was merged, it has been automatically locked. Please open a new issue if it requires a follow up.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 1, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants