Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Perf: Don't allocate as much in Decoder::string #168

Merged
merged 1 commit into from
Oct 7, 2020
Merged

Conversation

khyperia
Copy link
Collaborator

@khyperia khyperia commented Oct 5, 2020

On a sample module, time taken just in the Decoder::string method was 15% of the entire parse time, a significant majority of which is in the extend call reallocating over and over. This is obviously a hot spot, so it should be optimized a little more.

So, instead of consuming words and building up a buffer, do a scan over the bytes, extract the string, and advance the proper length. This does only one allocation, the .to_string at the end, as opposed to the log(n) allocations before.

This new method is only 5% of the entire parse time. Parse time for the module went from 14.866765ms to 12.630514ms, a 15% improvement. 1% of that 5% time is scanning for the null byte, another 1% for utf8 validation, and 3% for the to_string allocation/copy.

(Unfortunately there is no std function for null byte scan and utf8 validation at the same time, which would speed this up even more 😢. Also, there's an optimized memchr function in std, but frustratingly, it's not exposed anywhere - it's so close right here, but they discard the position :( https://github.com/rust-lang/rust/blob/efbaa413061c2a6e52f06f00a60ee7830fcf3ea5/library/core/src/slice/cmp.rs#L273)

.position(|&c| c == 0)
.ok_or_else(|| match self.limit {
Some(_) => Error::LimitReached(self.offset + slice.len()),
None => Error::StreamExpected(self.offset),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this also be self.offset + slice.len() since previously this would only happen scanning one of the last words in the slice?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, tests assert this behavior. I don't know why, but, they do.

@khyperia khyperia merged commit db903b0 into master Oct 7, 2020
@khyperia khyperia deleted the string-perf branch October 7, 2020 08:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants