Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding: use AliasedUint32Array for encodeInto results #46658

Closed
wants to merge 5 commits into from

Conversation

joyeecheung
Copy link
Member

The first commits comes from #46620

benchmark: stablize encode benchmark

  • Increase the number of iteration to 1e6 to reduce flakes. 1e4
    can introduce flakes even when comparing the main branch
    against itself
  • Replace the 1024 * 32 length test with 1024 * 8 since it would
    otherwise take too long to complete.
  • Check the results of the encoding methods at the end.

src: move encoding bindings to a new binding

Move the bindings used by TextEncoder to a new binding for
more self-contained code.

encoding: use AliasedUint32Array for encodeInto results

Getting the buffer from a TypedArray created from the JS land
incurs a copy. For encodeInto() results we can just use an
AliasedArray and let the binding always own the store.

                                                                               confidence improvement accuracy (*)   (**)  (***)
util/text-encoder.js op='encodeInto' type='ascii' n=1000000 len=1024                           0.41 %       ±1.33% ±1.78% ±2.33%
util/text-encoder.js op='encodeInto' type='ascii' n=1000000 len=16                    ***      3.66 %       ±1.35% ±1.79% ±2.33%
util/text-encoder.js op='encodeInto' type='ascii' n=1000000 len=256                            0.29 %       ±0.88% ±1.17% ±1.53%
util/text-encoder.js op='encodeInto' type='ascii' n=1000000 len=32                    ***      4.01 %       ±1.34% ±1.79% ±2.34%
util/text-encoder.js op='encodeInto' type='ascii' n=1000000 len=8192                          -0.32 %       ±0.66% ±0.88% ±1.15%
util/text-encoder.js op='encodeInto' type='one-byte-string' n=1000000 len=1024          *     -0.48 %       ±0.37% ±0.50% ±0.65%
util/text-encoder.js op='encodeInto' type='one-byte-string' n=1000000 len=16          ***      5.39 %       ±2.49% ±3.34% ±4.41%
util/text-encoder.js op='encodeInto' type='one-byte-string' n=1000000 len=256         ***      1.44 %       ±0.48% ±0.63% ±0.83%
util/text-encoder.js op='encodeInto' type='one-byte-string' n=1000000 len=32          ***      4.67 %       ±1.01% ±1.34% ±1.74%
util/text-encoder.js op='encodeInto' type='one-byte-string' n=1000000 len=8192                 0.15 %       ±0.24% ±0.32% ±0.42%
util/text-encoder.js op='encodeInto' type='two-byte-string' n=1000000 len=1024        ***      1.29 %       ±0.49% ±0.66% ±0.86%
util/text-encoder.js op='encodeInto' type='two-byte-string' n=1000000 len=16                   2.23 %       ±2.90% ±3.90% ±5.16%
util/text-encoder.js op='encodeInto' type='two-byte-string' n=1000000 len=256         ***      1.10 %       ±0.48% ±0.64% ±0.84%
util/text-encoder.js op='encodeInto' type='two-byte-string' n=1000000 len=32          ***      6.31 %       ±1.60% ±2.14% ±2.82%
util/text-encoder.js op='encodeInto' type='two-byte-string' n=1000000 len=8192                 0.19 %       ±0.21% ±0.28% ±0.37%

@nodejs-github-bot
Copy link
Collaborator

Review requested:

  • @nodejs/gyp
  • @nodejs/http
  • @nodejs/http2
  • @nodejs/net
  • @nodejs/startup

@nodejs-github-bot nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. lib / src Issues and PRs related to general changes in the lib or src directory. needs-ci PRs that need a full CI run. labels Feb 14, 2023
@anonrig anonrig added the performance Issues and PRs related to the performance of Node.js. label Feb 14, 2023
src/encoding.h Outdated Show resolved Hide resolved
src/node_binding.cc Outdated Show resolved Hide resolved
lib/internal/encoding.js Show resolved Hide resolved
src/env-inl.h Outdated Show resolved Hide resolved
@nodejs-github-bot
Copy link
Collaborator

- Increase the number of iteration to 1e6 to reduce flakes. 1e4
  can introduce flakes even when comparing the main branch
  against itself
- Replace the 1024 * 32 length test with 1024 * 8 since it would
  otherwise take too long to complete. Remove the 16 length test
  since it's not too different from 32.
- Check the results of the encoding methods at the end.
Move the bindings used by TextEncoder to a new binding for
more self-contained code.
Getting the buffer from a TypedArray created from the JS land
incurs a copy. For encodeInto() results we can just use an
AliasedArray and let the binding always own the store.
@joyeecheung
Copy link
Member Author

Rebased to fix merge conflict

@nodejs-github-bot
Copy link
Collaborator

@joyeecheung
Copy link
Member Author

joyeecheung commented Mar 3, 2023

cc @nodejs/buffer @nodejs/cpp-reviewers can I have some reviews please? Thanks.

Copy link
Contributor

@RaisinTen RaisinTen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code change LGTM (% lint fixes) but I'm curious about this part:

Getting the buffer from a TypedArray created from the JS land
incurs a copy.

Where does the copy take place?

@joyeecheung
Copy link
Member Author

Where does the copy take place?

v8::ArrayBufferView::Buffer().

@joyeecheung
Copy link
Member Author

Fixed linter error (looks like the bot is not working again: https://ci.nodejs.org/job/node-test-pull-request/50249/)

@nodejs-github-bot
Copy link
Collaborator

joyeecheung added a commit that referenced this pull request Mar 7, 2023
- Increase the number of iteration to 1e6 to reduce flakes. 1e4
  can introduce flakes even when comparing the main branch
  against itself
- Replace the 1024 * 32 length test with 1024 * 8 since it would
  otherwise take too long to complete. Remove the 16 length test
  since it's not too different from 32.
- Check the results of the encoding methods at the end.

PR-URL: #46658
Reviewed-By: Darshan Sen <raisinten@gmail.com>
joyeecheung added a commit that referenced this pull request Mar 7, 2023
Move the bindings used by TextEncoder to a new binding for
more self-contained code.

PR-URL: #46658
Reviewed-By: Darshan Sen <raisinten@gmail.com>
joyeecheung added a commit that referenced this pull request Mar 7, 2023
Getting the buffer from a TypedArray created from the JS land
incurs a copy. For encodeInto() results we can just use an
AliasedArray and let the binding always own the store.

PR-URL: #46658
Reviewed-By: Darshan Sen <raisinten@gmail.com>
@joyeecheung
Copy link
Member Author

Landed in 3b0c047...e5933c8

@MoLow
Copy link
Member

MoLow commented Mar 8, 2023

this commit has landed without a passing CI/GH actions, introducing a lint error: #47007

@cjihrig
Copy link
Contributor

cjihrig commented Mar 8, 2023

Linter issue is addressed in #47003.

@joyeecheung
Copy link
Member Author

joyeecheung commented Mar 8, 2023

The CI was actually passing in https://ci.nodejs.org/job/node-test-pull-request/50258/, which included the fixup commit that should remove the unused using statement causing the lint error: 4ea7f2e. But somehow git was only able to apply that fixup commit partially during landing, and I didn't see any warnings?

targos pushed a commit that referenced this pull request Mar 13, 2023
- Increase the number of iteration to 1e6 to reduce flakes. 1e4
  can introduce flakes even when comparing the main branch
  against itself
- Replace the 1024 * 32 length test with 1024 * 8 since it would
  otherwise take too long to complete. Remove the 16 length test
  since it's not too different from 32.
- Check the results of the encoding methods at the end.

PR-URL: #46658
Reviewed-By: Darshan Sen <raisinten@gmail.com>
targos pushed a commit that referenced this pull request Mar 13, 2023
Move the bindings used by TextEncoder to a new binding for
more self-contained code.

PR-URL: #46658
Reviewed-By: Darshan Sen <raisinten@gmail.com>
targos pushed a commit that referenced this pull request Mar 13, 2023
Getting the buffer from a TypedArray created from the JS land
incurs a copy. For encodeInto() results we can just use an
AliasedArray and let the binding always own the store.

PR-URL: #46658
Reviewed-By: Darshan Sen <raisinten@gmail.com>
targos pushed a commit that referenced this pull request Mar 14, 2023
- Increase the number of iteration to 1e6 to reduce flakes. 1e4
  can introduce flakes even when comparing the main branch
  against itself
- Replace the 1024 * 32 length test with 1024 * 8 since it would
  otherwise take too long to complete. Remove the 16 length test
  since it's not too different from 32.
- Check the results of the encoding methods at the end.

PR-URL: #46658
Reviewed-By: Darshan Sen <raisinten@gmail.com>
targos pushed a commit that referenced this pull request Mar 14, 2023
Move the bindings used by TextEncoder to a new binding for
more self-contained code.

PR-URL: #46658
Reviewed-By: Darshan Sen <raisinten@gmail.com>
targos pushed a commit that referenced this pull request Mar 14, 2023
Getting the buffer from a TypedArray created from the JS land
incurs a copy. For encodeInto() results we can just use an
AliasedArray and let the binding always own the store.

PR-URL: #46658
Reviewed-By: Darshan Sen <raisinten@gmail.com>
@danielleadams
Copy link
Contributor

danielleadams commented Apr 3, 2023

@joyeecheung This breaks the build when landing in v18.x - do you mind opening a backport PR?

@danielleadams danielleadams added the backport-requested-v18.x PRs awaiting manual backport to the v18.x-staging branch. label Apr 3, 2023
CHECK_NOT_NULL(binding);
}

void BindingData::EncodeInto(const FunctionCallbackInfo<Value>& args) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity I exploring places where fast API could be applied, came across this function and PR, I experimented a bit it seems v8::FastApiTypedArray<uint8_t>& is indeed supported and it does get caught in fast call, but I am not sure how one could extract ->Buffer() from v8::FastApiTypedArray<uint8_t>&, is that even possible? just wondering do let me know your thoughts cc @anonrig @joyeecheung

Thank You!

targos pushed a commit that referenced this pull request Nov 10, 2023
- Increase the number of iteration to 1e6 to reduce flakes. 1e4
  can introduce flakes even when comparing the main branch
  against itself
- Replace the 1024 * 32 length test with 1024 * 8 since it would
  otherwise take too long to complete. Remove the 16 length test
  since it's not too different from 32.
- Check the results of the encoding methods at the end.

PR-URL: #46658
Reviewed-By: Darshan Sen <raisinten@gmail.com>
sercher added a commit to sercher/graaljs that referenced this pull request Apr 25, 2024
- Increase the number of iteration to 1e6 to reduce flakes. 1e4
  can introduce flakes even when comparing the main branch
  against itself
- Replace the 1024 * 32 length test with 1024 * 8 since it would
  otherwise take too long to complete. Remove the 16 length test
  since it's not too different from 32.
- Check the results of the encoding methods at the end.

PR-URL: nodejs/node#46658
Reviewed-By: Darshan Sen <raisinten@gmail.com>
sercher added a commit to sercher/graaljs that referenced this pull request Apr 25, 2024
- Increase the number of iteration to 1e6 to reduce flakes. 1e4
  can introduce flakes even when comparing the main branch
  against itself
- Replace the 1024 * 32 length test with 1024 * 8 since it would
  otherwise take too long to complete. Remove the 16 length test
  since it's not too different from 32.
- Check the results of the encoding methods at the end.

PR-URL: nodejs/node#46658
Reviewed-By: Darshan Sen <raisinten@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-requested-v18.x PRs awaiting manual backport to the v18.x-staging branch. c++ Issues and PRs that require attention from people who are familiar with C++. lib / src Issues and PRs related to general changes in the lib or src directory. needs-ci PRs that need a full CI run. performance Issues and PRs related to the performance of Node.js.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants