fix(NODE-3451): fix performance regression from v1 #451

dariakp · 2021-08-16T23:01:08Z

Description

NODE-3451 documents a performance regression in Node driver v4 that is actually caused by a regression in the js-bson v4 deserialization method (compared to v1).

The notable culprits were:

- Mandatory rewrapping of all input types, even Buffer instances, into a Buffer via ensureBuffer
- Looping over all string bytes via validateUtf8
- Suboptimal loops over object keys to identify and handle DBRefs

What changed?

- The entry point deserialize method now checks instanceof Buffer and skips the rewrapping when the input is already a Buffer. This is a temporary measure that only addresses performance for Node.js buffers.
  - NOTE: deserializeStream was left untouched for scope reasons
- The deserializeObject method now checks for potential DBRef keys as it iterates, which removes the negative performance impact for any object that contains no DBRef keys. Further optimization could eliminate the isDBRefLike check altogether, but since we expect DBRefs to be pretty rare, optimizing that edge case didn't seem worthwhile.
- The validateUtf8 method now runs only if the \uFFFD character is present in the decoded string. Technically, this makes performance worse for strings that do contain that character; for all other strings, however, looping over the resulting string with charCodeAt is faster than validating every byte. Unfortunately, there is not much else that can be done to optimize string deserialization without losing the validation (short of doing our own decoding).
  - NOTE: the validateUtf8 call for the DBPOINTER type was left untouched for scope reasons
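
As a rough illustration of the first two changes, here is a minimal sketch, not the code in this PR. The names toBufferFast, collectFields, and DBREF_KEYS are hypothetical, and the input is assumed to be either a Buffer or a Uint8Array. It shows a fast path that returns Buffer inputs untouched, and a key loop that records whether an object could be a DBRef while the fields are being collected, so no second pass over the keys is needed.

```typescript
import { Buffer } from "buffer";

// Hypothetical helper (not the library's ensureBuffer): returns the input
// unchanged when it is already a Node.js Buffer and only wraps other inputs.
function toBufferFast(input: Buffer | Uint8Array): Buffer {
  if (input instanceof Buffer) {
    return input; // fast path: no copy and no new view
  }
  // Create a Buffer view over the same memory as the typed array.
  return Buffer.from(input.buffer, input.byteOffset, input.byteLength);
}

// Assumption for this sketch: these are the only keys a DBRef may carry
// among its "$"-prefixed keys.
const DBREF_KEYS = new Set(["$ref", "$id", "$db"]);

// Hypothetical stand-in for the key loop of an object deserializer.
// It tracks DBRef candidacy while copying fields, in a single pass.
function collectFields(entries: Array<[string, unknown]>): {
  doc: Record<string, unknown>;
  maybeDBRef: boolean;
} {
  const doc: Record<string, unknown> = {};
  let sawDollarKey = false;
  let onlyDBRefKeys = true;
  for (const [key, value] of entries) {
    // 36 is the char code of "$"; keys without it cost one comparison.
    if (key.charCodeAt(0) === 36) {
      sawDollarKey = true;
      if (!DBREF_KEYS.has(key)) {
        onlyDBRefKeys = false;
      }
    }
    doc[key] = value;
  }
  // A true flag only means "worth a full DBRef check", not "is a DBRef".
  return { doc, maybeDBRef: sawDollarKey && onlyDBRefKeys };
}
```

With this shape, objects without any "$"-prefixed key never reach the more expensive DBRef check, which matches the intent described above; a full check is still needed when maybeDBRef is true.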

After these changes, there may still be a residual 5% performance degradation for the typical use case relative to v1, which can be attributed to the remaining buffer and string validation.
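
To show where the remaining string validation cost comes from, here is a minimal sketch of the gating idea, under stated assumptions: decodeString and strictDecoder are hypothetical names, and a fatal TextDecoder stands in for the library's byte-level validateUtf8. The string is decoded first, then scanned with charCodeAt, and the strict check runs only when U+FFFD appears in the result.

```typescript
import { Buffer } from "buffer";
import { TextDecoder } from "util";

// Stand-in for a strict byte-level validator (an assumption of this sketch):
// a fatal TextDecoder throws on malformed UTF-8 instead of substituting U+FFFD.
const strictDecoder = new TextDecoder("utf-8", { fatal: true });

// Hypothetical helper: decode bytes [start, end) and validate only when needed.
function decodeString(buf: Buffer, start: number, end: number, validate: boolean): string {
  const value = buf.toString("utf8", start, end);
  if (!validate) {
    return value;
  }
  // This scan over the decoded string is the cost every validated string still pays.
  for (let i = 0; i < value.length; i++) {
    if (value.charCodeAt(i) === 0xfffd) {
      // U+FFFD is either a substitution for invalid bytes or a character that
      // was legitimately encoded; only the strict check can tell them apart.
      try {
        strictDecoder.decode(buf.subarray(start, end));
      } catch (_err) {
        throw new Error("Invalid UTF-8 string");
      }
      break;
    }
  }
  return value;
}
```

Strings that legitimately contain U+FFFD are scanned and then validated byte by byte, which is the slower case the description mentions; all other strings pay only for the charCodeAt loop.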

src/bson.ts

src/parser/deserializer.ts

dariakp added 5 commits August 16, 2021 14:22

chore: fix format

481ed16

fix: do not rewrap buffer instances when deserializing

283cddc

fix: optimize DBref check in deserialization

d3e32fe

fix: try to optimize utf8 validation

cc0f32d

refactor: slight optimization

0de1578

dariakp changed the title ~~fix(NODE-3451): partially fix performance regression from v1~~ fix(NODE-3451): fix performance regression from v1 Aug 16, 2021

dariakp requested a review from nbbeeken August 17, 2021 15:22

dariakp assigned nbbeeken Aug 17, 2021

dariakp added the Primary Review In Review with primary reviewer, not yet ready for team's eyes label Aug 17, 2021

nbbeeken requested changes Aug 17, 2021

src/bson.ts (resolved)

src/parser/deserializer.ts (resolved)

src/parser/deserializer.ts (resolved)

nbbeeken approved these changes Aug 17, 2021

nbbeeken requested a review from emadum August 17, 2021 18:01

nbbeeken added Team Review Needs review from team and removed Primary Review In Review with primary reviewer, not yet ready for team's eyes labels Aug 17, 2021

nbbeeken marked this pull request as ready for review August 17, 2021 18:01

emadum approved these changes Aug 17, 2021

dariakp merged commit 2330ab1 into master Aug 18, 2021

dariakp deleted the NODE-3451/fix-performance-regression branch August 18, 2021 15:12

github-actions bot mentioned this pull request May 7, 2024

chore(main): release 7.0.0 [skip-ci] #687

Closed
