buffer: optimize writing short strings #54310

Draft
wants to merge 1 commit into
base: main

ronag
Member

@ronag ronag commented Aug 10, 2024

                                                                    confidence improvement accuracy (*)   (**)  (***)
buffers/buffer-write-string.js n=1000000 len=0 args='' encoding=''         ***    562.28 %       ±2.36% ±3.17% ±4.18%
buffers/buffer-write-string.js n=1000000 len=1 args='' encoding=''         ***    549.74 %       ±3.34% ±4.46% ±5.83%
buffers/buffer-write-string.js n=1000000 len=16 args='' encoding=''        ***     35.43 %       ±2.85% ±3.81% ±4.97%
buffers/buffer-write-string.js n=1000000 len=32 args='' encoding=''         **     -2.57 %       ±1.79% ±2.38% ±3.11%
buffers/buffer-write-string.js n=1000000 len=8 args='' encoding=''         ***    209.32 %       ±4.42% ±5.93% ±7.84%

@ronag ronag requested a review from anonrig August 10, 2024 18:51
@ronag
Member Author

ronag commented Aug 10, 2024

@nodejs/buffer

@ronag ronag requested a review from mcollina August 10, 2024 18:52
@nodejs-github-bot nodejs-github-bot added buffer Issues and PRs related to the buffer subsystem. needs-ci PRs that need a full CI run. labels Aug 10, 2024
@ronag ronag force-pushed the write-short-string branch from e470948 to 82438ad Compare August 10, 2024 18:54
@avivkeller avivkeller added the performance Issues and PRs related to the performance of Node.js. label Aug 10, 2024
@ronag ronag force-pushed the write-short-string branch from 82438ad to 27a11a3 Compare August 10, 2024 18:54
@ronag ronag added the request-ci Add this label to start a Jenkins CI on a PR. label Aug 10, 2024
@github-actions github-actions bot added request-ci-failed An error occurred while starting CI via request-ci label, and manual intervention is needed. and removed request-ci Add this label to start a Jenkins CI on a PR. labels Aug 10, 2024
Contributor

Failed to start CI
   ⚠  Something was pushed to the Pull Request branch since the last approving review.
   ✘  Refusing to run CI on potentially unsafe PR
https://github.com/nodejs/node/actions/runs/10333845691

@ronag ronag force-pushed the write-short-string branch 4 times, most recently from 644b157 to ebbe73f Compare August 10, 2024 19:26
@ronag
Member Author

ronag commented Aug 10, 2024

@avivkeller avivkeller added needs-benchmark-ci PR that need a benchmark CI run. and removed request-ci-failed An error occurred while starting CI via request-ci label, and manual intervention is needed. labels Aug 10, 2024
@ronag ronag force-pushed the write-short-string branch 2 times, most recently from d8eb307 to 67b80d8 Compare August 10, 2024 19:30
ronag added a commit to nxtedition/node that referenced this pull request Aug 10, 2024
@ronag ronag force-pushed the write-short-string branch from 67b80d8 to a9fd9ac Compare August 10, 2024 19:35

codecov bot commented Aug 10, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 87.09%. Comparing base (298ff4f) to head (0261644).
Report is 5 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #54310      +/-   ##
==========================================
- Coverage   87.09%   87.09%   -0.01%     
==========================================
  Files         647      647              
  Lines      181816   181837      +21     
  Branches    34884    34893       +9     
==========================================
+ Hits       158360   158377      +17     
+ Misses      16764    16760       -4     
- Partials     6692     6700       +8     
Files Coverage Δ
lib/buffer.js 96.48% <100.00%> (+0.05%) ⬆️

... and 26 files with indirect coverage changes

while (true) {
  const code = StringPrototypeCharCodeAt(string, n);
  if (code >= 128) {
    break;

Member

@lpinca lpinca Aug 11, 2024

I think that buffer.write('aaaaa€') now returns 3 instead of 8.

Edit: I think it is just slower now.
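
For reference, a minimal sketch of what `buf.write()` reports for the string discussed above. `buf.write()` returns the number of bytes written, not the number of characters. The 16-byte buffer size is an arbitrary assumption chosen so the whole string fits.

```javascript
const { Buffer } = require('node:buffer');

// Assumption: a 16-byte buffer, large enough for the whole string.
const buf = Buffer.alloc(16);

// Default encoding is 'utf8'. The five 'a' characters take one byte each
// and '€' (U+20AC) takes three bytes in UTF-8.
const written = buf.write('aaaaa€');
const expected = Buffer.byteLength('aaaaa€', 'utf8');

console.log(written, expected); // prints "8 8"
```

Any fast path has to keep returning the same byte count as the general UTF-8 writer for mixed input like this.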

Member Author

Based on?

buffers/buffer-write-string.js n=1000000 len=1 args='' encoding=''         ***    526.57 %      ±36.19% ±51.92% ±76.21%
buffers/buffer-write-string.js n=1000000 len=16 args='' encoding=''        ***     52.39 %      ±18.12% ±25.97% ±38.06%
buffers/buffer-write-string.js n=1000000 len=32 args='' encoding=''                -3.44 %      ±10.55% ±14.63% ±20.33%
buffers/buffer-write-string.js n=1000000 len=8 args='' encoding=''         ***    196.38 %       ±3.09%  ±4.43%  ±6.49%

Member

@lpinca lpinca Aug 11, 2024

It seems that the benchmark does not use a string with mixed single-byte and multibyte characters. I think the worst case would be something like 'a'.repeat(15) + '€'. In this case the optimization is only overhead.
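
To make the worst case concrete, here is a rough sketch of an ASCII fast path with a fallback. It is not the code from this PR: `writeShort` is a hypothetical helper, the 16-character threshold is taken from the discussion, and the public `buf.write()` stands in for whatever internal UTF-8 writer the real patch calls.

```javascript
const { Buffer } = require('node:buffer');

// Hypothetical helper (an illustration, not the PR's implementation).
// Copies char codes by hand while they are ASCII; otherwise falls back
// to the general UTF-8 writer.
function writeShort(buf, string, offset = 0) {
  const len = string.length;
  if (len <= 16 && len <= buf.length - offset) {
    let n = 0;
    while (n < len) {
      const code = string.charCodeAt(n);
      if (code >= 128) break; // non-ASCII: the scan so far is wasted work
      buf[offset + n] = code;
      n++;
    }
    if (n === len) return len; // the whole string was ASCII
  }
  // Fallback: redo the write from the start with the general path.
  return buf.write(string, offset, 'utf8');
}

const buf = Buffer.alloc(32);
const fast = writeShort(buf, 'a'.repeat(16));       // stays on the fast path
const slow = writeShort(buf, 'a'.repeat(15) + '€'); // scans 15 chars, then falls back
console.log(fast, slow); // prints "16 18"
```

With 'a'.repeat(15) + '€' the loop touches every character except the last before giving up, so the string is effectively processed twice.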

Member Author

Yes. We're trying to optimize for the most common case.

Member

@lpinca lpinca Aug 11, 2024

Is an ASCII-only string whose length is less than or equal to 16 characters the most common case?

Member Author

The regression is minimal for the non-one-byte case:

buffers/buffer-write-string-short.js n=1000000 char='' len=0         ***    606.09 %      ±26.40% ±36.19% ±49.34%
buffers/buffer-write-string-short.js n=1000000 char='' len=1                 -0.45 %      ±13.79% ±19.68% ±28.63%
buffers/buffer-write-string-short.js n=1000000 char='' len=16                -0.76 %       ±5.97%  ±8.50% ±12.32%
buffers/buffer-write-string-short.js n=1000000 char='' len=32                 0.51 %      ±11.84% ±16.61% ±23.50%
buffers/buffer-write-string-short.js n=1000000 char='' len=8           *     -8.12 %       ±6.90%  ±9.87% ±14.43%
buffers/buffer-write-string-short.js n=1000000 char='a' len=0         ***    543.24 %      ±12.04% ±17.27% ±25.35%
buffers/buffer-write-string-short.js n=1000000 char='a' len=1         ***    478.53 %      ±31.70% ±45.53% ±66.96%
buffers/buffer-write-string-short.js n=1000000 char='a' len=16        ***    112.38 %       ±6.11%  ±8.72% ±12.69%
buffers/buffer-write-string-short.js n=1000000 char='a' len=32         **     -5.18 %       ±3.21%  ±4.57%  ±6.64%
buffers/buffer-write-string-short.js n=1000000 char='a' len=8         ***    218.82 %      ±11.24% ±15.64% ±21.85%

Member Author

@ronag ronag Aug 11, 2024

Better comparison:

                                                                  confidence improvement accuracy (*)     (**)    (***)
buffers/buffer-write-string-short.js n=1000000 fallback=0 len=0         ***    555.83 %      ±10.01%  ±14.34%  ±21.00%
buffers/buffer-write-string-short.js n=1000000 fallback=0 len=1         ***    447.19 %     ±126.93% ±182.34% ±268.25%
buffers/buffer-write-string-short.js n=1000000 fallback=0 len=16        ***    118.07 %      ±18.44%  ±26.17%  ±37.74%
buffers/buffer-write-string-short.js n=1000000 fallback=0 len=32                -1.22 %       ±5.77%   ±8.22%  ±11.92%
buffers/buffer-write-string-short.js n=1000000 fallback=0 len=8         ***    192.61 %      ±38.32%  ±54.53%  ±78.97%
buffers/buffer-write-string-short.js n=1000000 fallback=1 len=0         ***    522.42 %     ±106.34% ±152.67% ±224.38%
buffers/buffer-write-string-short.js n=1000000 fallback=1 len=1          **     -4.23 %       ±3.03%   ±4.21%   ±5.85%
buffers/buffer-write-string-short.js n=1000000 fallback=1 len=16        ***    -24.35 %       ±9.58%  ±13.18%  ±18.07%
buffers/buffer-write-string-short.js n=1000000 fallback=1 len=32                -4.96 %      ±12.35%  ±17.52%  ±25.26%
buffers/buffer-write-string-short.js n=1000000 fallback=1 len=8         ***    -22.52 %       ±7.75%  ±11.02%  ±15.95%

@mcollina @anonrig wdyt?
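
A small sketch for reading the improvement column above, assuming the percentage is the relative change in mean operations per second between the old and new binaries. `throughputRatio` is a hypothetical helper name used only for illustration.

```javascript
// Hypothetical helper: convert an "improvement" percentage into a
// throughput ratio (new ops/sec divided by old ops/sec).
// Assumption: improvement = (new - old) / old * 100.
function throughputRatio(improvementPct) {
  return 1 + improvementPct / 100;
}

// fallback=0 len=0: roughly 6.56x the old throughput.
console.log(throughputRatio(555.83).toFixed(2));
// fallback=1 len=16: roughly 0.76x the old throughput.
console.log(throughputRatio(-24.35).toFixed(2));
```

Under that assumption, the fallback=1 rows for len=8 and len=16 mean mixed short strings run at about three quarters of their previous speed.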

Member

It does not make much sense to benchmark lengths 0 and 1 imho. I'm also not convinced that length <= 16 is a common case, but I have no proof (so far, there is only one comment with fewer than 16 bytes in the PR comments).

Member Author

We use it a lot for primary keys, which are usually less than 16 bytes. Also, building e.g. JSON responses requires writing lots of small strings.

@ronag ronag force-pushed the write-short-string branch from a9fd9ac to af328a2 Compare August 11, 2024 07:38
ronag added a commit to nxtedition/node that referenced this pull request Aug 11, 2024
function main({ len, n, fallback }) {
  const buf = Buffer.allocUnsafe(len);
  const string = fallback && len > 0
    ? Buffer.from('a'.repeat(len - 1) + '€').toString()

Member

Can you fix the formatting issues?
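
For anyone who wants to try the two input shapes outside the Node.js benchmark harness, a standalone sketch follows. `makeInput` and `run` are hypothetical names, timing uses `process.hrtime.bigint()` instead of the harness, and the buffer is sized with `Buffer.byteLength` (an assumption made here) so the multibyte character always fits.

```javascript
const { Buffer } = require('node:buffer');

// Hypothetical helper: fallback=true yields an ASCII prefix followed by one
// multibyte character; fallback=false yields an ASCII-only string.
function makeInput(len, fallback) {
  return fallback && len > 0 ? 'a'.repeat(len - 1) + '€' : 'a'.repeat(len);
}

// Hypothetical runner: writes the same string n times and reports the bytes
// written by the last call plus a rough operations-per-second figure.
function run({ len, n, fallback }) {
  const string = makeInput(len, fallback);
  const buf = Buffer.allocUnsafe(Buffer.byteLength(string));
  let written = 0;
  const start = process.hrtime.bigint();
  for (let i = 0; i < n; i++) {
    written = buf.write(string);
  }
  const elapsedNs = Number(process.hrtime.bigint() - start);
  return { written, opsPerSec: n / (elapsedNs / 1e9) };
}

for (const fallback of [false, true]) {
  for (const len of [0, 1, 8, 16, 32]) {
    const { written, opsPerSec } = run({ len, n: 1e5, fallback });
    console.log(`fallback=${fallback} len=${len} bytes=${written} ops/sec=${opsPerSec.toFixed(0)}`);
  }
}
```

A single timed loop like this is noisy, so the numbers are only a rough indication and not comparable to the statistically tested tables in this thread.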

ronag added a commit to nxtedition/node that referenced this pull request Aug 11, 2024
@ronag ronag marked this pull request as draft August 11, 2024 15:51
@ronag
Member Author

ronag commented Aug 11, 2024

Putting this on hold until after #54311 lands.

ronag added a commit to nxtedition/node that referenced this pull request Aug 11, 2024
@avivkeller
Member

Putting this on hold until after #54311 lands.

I've added the "blocked" label, as this PR is on hold until another is resolved. Feel free to undo.

@avivkeller avivkeller added the blocked PRs that are blocked by other issues or PRs. label Aug 12, 2024
ronag added a commit to nxtedition/node that referenced this pull request Aug 13, 2024
ronag added a commit to nxtedition/node that referenced this pull request Aug 14, 2024
nodejs-github-bot pushed a commit that referenced this pull request Aug 15, 2024
PR-URL: #54310
PR-URL: #54311
Reviewed-By: Michaël Zasso <targos@protonmail.com>
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
Reviewed-By: Santiago Gimeno <santiago.gimeno@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
RafaelGSS pushed a commit that referenced this pull request Aug 19, 2024
RafaelGSS pushed a commit that referenced this pull request Aug 21, 2024
@avivkeller avivkeller removed the blocked PRs that are blocked by other issues or PRs. label Sep 8, 2024
Labels
buffer Issues and PRs related to the buffer subsystem. needs-benchmark-ci PR that need a benchmark CI run. needs-ci PRs that need a full CI run. performance Issues and PRs related to the performance of Node.js.

5 participants