Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce allocations #29

Merged
merged 7 commits into from
Aug 13, 2024
Merged

Reduce allocations #29

merged 7 commits into from
Aug 13, 2024

Conversation

clipperhouse
Copy link
Contributor

@clipperhouse clipperhouse commented Aug 13, 2024

Remove second return parameter suffixRunes from many functions, as it’s not used by every caller, and requires an allocation. In many cases, just its length is needed, so use utf8.RuneCountInString. In cases where suffixRunes is needed, create it locally.

Before:

Benchmark_Stem-8   	      50950417 ns/op	   4.41 MB/s	38536284 B/op	 2066112 allocs/op

After:

Benchmark_Stem-8   	      25477538 ns/op	   8.82 MB/s	 1377490 B/op	   58406 allocs/op

~2x throughput improvement and ~30x allocation reduction.

Allocs down to 75, ~double throughput?
@kljensen kljensen self-assigned this Aug 13, 2024
@kljensen
Copy link
Owner

Thanks so much! I'll take a look shortly.

@clipperhouse
Copy link
Contributor Author

Great! I suspect the failures above have to do with old Go versions. I updated go.mod and test.yml for Go 1.19 and above (and got rid of the cache), let’s see if that fixes it.

(If you’re not ready to update Go version, that’s fine, but at least we’ll have some troubleshooting.)

@clipperhouse
Copy link
Contributor Author

Looks good over here: https://github.com/clipperhouse/snowball/actions

@kljensen
Copy link
Owner

Thanks @clipperhouse. Three quick questions. (I haven't looked at this code in a while, so apologies if these are non-sensical.)

  1. It seems like there were a lot of white space changes. Can you help me understand those?
  2. How do you feel about using suffixLength or suffixLen instead of sLen throughout?
  3. I like your benchmark above. Is there a way to incorporate that into the automated tests?

@clipperhouse
Copy link
Contributor Author

Cheers @kljensen

  1. I didn’t do any intentional whitespace changes, but the Go formatter seemed to. In particular it removed a lot of empty // comment lines. Here’s some explanation. On GitHub, you can “Hide whitespace changes” in the PR, under the little settings wheel icon at the top. I can go through and try to undo those formats if it’s important.
  2. Done!
  3. We could change the test.yml action to go test ./... -benchmem -bench=., which will run (log) the benchmarks in addition to the tests. In terms of doing anything with benchmark results, like tracking over time, I don’t know a good way to do that. I’m sure something exists.

@kljensen
Copy link
Owner

Super. I suspect I'll squash merge. Would you add yourself to the list of contributors?

@clipperhouse
Copy link
Contributor Author

Done (I assume you mean contributors in the README?) and thanks! I usually squash PRs as well.

@kljensen kljensen merged commit 2569472 into kljensen:master Aug 13, 2024
@kljensen
Copy link
Owner

Thanks @clipperhouse merged and released v0.10.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants