Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat!(NODE-4802): Refactor BSON to work with cross platform JS APIs #518

Merged
merged 22 commits into from
Nov 7, 2022

Conversation

nbbeeken
Copy link
Contributor

@nbbeeken nbbeeken commented Oct 21, 2022

Description

NODE-3555 / NODE-4802

What is changing?

Refactors BSON to rely on common JS globals and runs tests for both node and non node global scenarios.

ByteUtils

A new collection of functions that change their implementation based on the environment the BSON library is imported into. If Buffer is available the ByteUtils wraps the same Buffer functions we have been using in BSON thus far. If Buffer is not available ByteUtils uses a combination of existing JS TypedArray APIs as well as some original implementations to get the needed format translations BSON requires.

Using VM context

In order to test how the BSON library behaves when Buffer exists or not, we use vm.runInContext and read in the umd bundle as a string. We can then "run" the library with our own definition of a global object. All BSON tests are run both with and without Buffer being defined. The only globals required are: TextEncoder, TextDecoder, atob, btoa.

buffer.copy

Buffer.copy calls have been translated to dst.set(src.subarray(), offset) calls. As seen here: https://github.com/nodejs/node/blob/main/lib/buffer.js?rgh-link-date=2022-10-26T15%3A06%3A09Z#L247-L262 this is the same approach taken internally by node we just have inputs that we can trust are the correct type.

isArray flag

The isArray flag internal to the the serializer has been removed, it didn't provide functionality other than switching the key encoding to ASCII, but that doesn't actually change the output for a stringified number which is always "ascii". Additionally, we were mistakenly setting this flag with the wrong variable serializeFunctions, which would cause functions with names beyond the ascii range to be serialized with broken keys. It has to be fixed in this PR b/c of typescript, but I've filed NODE-4771 so we can make a separate entry in the changelog about this.

chai .throws

When running the BSON library from another context, instanceof checks on errors will fail so I've enhanced the throws assertion to permit errors whose .name property match the expected error name.

ObjectId.toString / UUID.toString / Binary.toString

The above methods all supported passing an encoding through to the Buffer.toString implementation, but the various encodings supported are unlikely to be useful and would be out of scope to maintain for this library. The underlying buffer can be fetched and passed to the Node.js Buffer .toString implementation if so desired. However, we will continue to support a common subset of the encodings that are likely to be useful, and the TS asserts that the encodings are only the ones we do accept now. (base64/utf8/hex)

What is the motivation for this change?

Maintainability

Double check the following

  • Ran npm run lint script
  • Self-review completed using the steps outlined here
  • PR title follows the correct format: <type>(NODE-xxxx)<!>: <description>
  • Changes are covered by tests
  • New TODOs have a related JIRA ticket

src/parser/serializer.ts Outdated Show resolved Hide resolved
src/parser/serializer.ts Outdated Show resolved Hide resolved
src/parser/serializer.ts Outdated Show resolved Hide resolved
src/binary.ts Outdated Show resolved Hide resolved
src/objectid.ts Outdated Show resolved Hide resolved
src/objectid.ts Outdated Show resolved Hide resolved
test/node/bson_test.js Outdated Show resolved Hide resolved
test/register-bson.js Outdated Show resolved Hide resolved
@nbbeeken nbbeeken marked this pull request as ready for review October 25, 2022 15:31
src/binary.ts Outdated Show resolved Hide resolved
src/binary.ts Show resolved Hide resolved
src/binary.ts Show resolved Hide resolved
src/bson.ts Outdated Show resolved Hide resolved
src/objectid.ts Outdated Show resolved Hide resolved
src/parser/serializer.ts Outdated Show resolved Hide resolved
src/utils/byte_utils.ts Outdated Show resolved Hide resolved
src/utils/node_byte_utils.ts Outdated Show resolved Hide resolved
src/utils/node_byte_utils.ts Outdated Show resolved Hide resolved
src/utils/web_byte_utils.ts Outdated Show resolved Hide resolved
@dariakp dariakp added the Primary Review In Review with primary reviewer, not yet ready for team's eyes label Oct 25, 2022
src/binary.ts Outdated Show resolved Hide resolved
src/parser/serializer.ts Outdated Show resolved Hide resolved
src/utils/node_byte_utils.ts Outdated Show resolved Hide resolved
src/utils/node_byte_utils.ts Outdated Show resolved Hide resolved
@dariakp dariakp added Team Review Needs review from team and removed Primary Review In Review with primary reviewer, not yet ready for team's eyes labels Oct 31, 2022
Copy link
Contributor

@dariakp dariakp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments for everything but the byte utils test

src/utils/byte_utils.ts Outdated Show resolved Hide resolved
src/utils/node_byte_utils.ts Outdated Show resolved Hide resolved
src/utils/node_byte_utils.ts Show resolved Hide resolved
src/utils/web_byte_utils.ts Outdated Show resolved Hide resolved
src/utils/web_byte_utils.ts Outdated Show resolved Hide resolved
src/utils/web_byte_utils.ts Show resolved Hide resolved
test/node/to_bson_test.js Outdated Show resolved Hide resolved
test/node/object_id_tests.js Outdated Show resolved Hide resolved
test/register-bson.js Show resolved Hide resolved
test/register-bson.js Outdated Show resolved Hide resolved
Copy link
Contributor

@dariakp dariakp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test review

test/node/byte_utils.test.ts Outdated Show resolved Hide resolved
test/node/byte_utils.test.ts Show resolved Hide resolved
test/node/byte_utils.test.ts Show resolved Hide resolved
test/node/byte_utils.test.ts Show resolved Hide resolved
test/node/byte_utils.test.ts Outdated Show resolved Hide resolved
@nbbeeken nbbeeken requested a review from dariakp October 31, 2022 22:45
src/utils/node_byte_utils.ts Show resolved Hide resolved
test/node/byte_utils.test.ts Outdated Show resolved Hide resolved

> **TL;DR**: Web environments return Uint8Array; Node.js environments return Buffer

For those that use the BSON library on Node.js, there is no change the BSON APIs will still return and accept instances of Node.js Buffer. Since we no longer depend on the Buffer web shim for compatibility with browsers, in non-Node.js environments a Uint8Array will be returned instead.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we mention which APIs this actually affects? and can we say in the summary instead of what? (i.e. <what API is affected> in web environments return Uint8Array instead of <...>

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://jira.mongodb.org/browse/NODE-4794

That list would be looking at all the public APIs that changed in this PR and listing them out, in the interest of moving this along we will be cleaning up this guide when we reach the end of the EPIC and the diff of this PR will still be available to us then for reference. So let's defer to NODE-4794, I've added a note about creating a list of APIs changed.

Copy link
Contributor

@dariakp dariakp Nov 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this isn't just about making sure we have a well documented list, it's about knowing what these specific code changes actually affect, so I don't think it's out of scope of this PR to identify what will be changing. This work will still need to be done, and it is easier to do it piecemeal with the corresponding code changes than in one big block, where we might miss something that was affected due to the sheer volume of changes.

docs/upgrade-to-v5.md Outdated Show resolved Hide resolved
docs/upgrade-to-v5.md Outdated Show resolved Hide resolved
docs/upgrade-to-v5.md Outdated Show resolved Hide resolved
@@ -15,19 +15,28 @@ for nodejs and web platforms.

> **TL;DR**: Web environments return Uint8Array; Node.js environments return Buffer

For those that use the BSON library on Node.js, there is no change the BSON APIs will still return and accept instances of Node.js Buffer. Since we no longer depend on the Buffer web shim for compatibility with browsers, in non-Node.js environments a Uint8Array will be returned instead.
For those that use the BSON library on Node.js, there is no change - the BSON APIs will still return and accept instances of Node.js Buffer. Since we no longer depend on the Buffer web shim for compatibility with browsers, in non-Node.js environments a Uint8Array will be returned instead.

This allows the BSON library to be better at platform independence while keeping its behavior consistent cross platform. The Buffer shim served the library well but brought in more than was necessary for the concerns of the code here.

### `ObjectId.toString` / `UUID.toString` / `Binary.toString`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
### `ObjectId.toString` / `UUID.toString` / `Binary.toString`
### Restrict supported encodings in `ObjectId.toString` / `UUID.toString` / `Binary.toString`

@@ -15,19 +15,28 @@ for nodejs and web platforms.

> **TL;DR**: Web environments return Uint8Array; Node.js environments return Buffer

For those that use the BSON library on Node.js, there is no change the BSON APIs will still return and accept instances of Node.js Buffer. Since we no longer depend on the Buffer web shim for compatibility with browsers, in non-Node.js environments a Uint8Array will be returned instead.
For those that use the BSON library on Node.js, there is no change - the BSON APIs will still return and accept instances of Node.js Buffer. Since we no longer depend on the Buffer web shim for compatibility with browsers, in non-Node.js environments a Uint8Array will be returned instead.

This allows the BSON library to be better at platform independence while keeping its behavior consistent cross platform. The Buffer shim served the library well but brought in more than was necessary for the concerns of the code here.

### `ObjectId.toString` / `UUID.toString` / `Binary.toString`

> **TL;DR**: These `toString` methods only support the following encodings: 'hex', 'base64', 'utf8'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the 'utf8' encoding isn't supported on ObjectId.toString or UUID.toString, then this summary is incorrect

I actually think that what you currently have as a "tl;dr" can be simply the sub-heading for the corresponding set of changes. I think it would also make the table of contents easier to generate. In the case of this TL;DR, it can be removed entirely if you update L22 the way I am suggesting; then the paragraph underneath can elaborate on the specifics

Then the TL;DR of the section above (L16) can be promoted to a subheading, with both of these sections being under the title "Remove reliance on Node.js buffer", which would be the grouping heading.

So in this guide, the grouping headings will indicate the themes of the changes, and the subheadings in the groups will be the actual breaking change from the user's perspective. (What I mean is, "remove reliance on nodejs buffer" isn't the breaking change, it's the cause of the breaking changes whose summaries follow)

As for the 3rd example, I think we can group it under "Other" for now

Copy link

@lucsoft lucsoft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope this is not off-topic, but as a third party maintainer of the deno_mongo project (also web_bson), this is awesome!

@dariakp dariakp changed the title feat!(NODE-3555): Refactor BSON to work with cross platform JS APIs feat!(NODE-4802): Refactor BSON to work with cross platform JS APIs Nov 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Team Review Needs review from team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants