
Hyperfine benchmarks #163

Merged

2 commits merged into tealdeer-rs:master from the hyperfine branch on Jul 30, 2021

Conversation

niklasmohrin (Collaborator)

Closes #129

I started putting together a Dockerfile that fetches all the clients and benchmarks them with hyperfine. I decided to do it all in Docker because I don't want to install all the clients on my system, and because it makes it easy for third parties to verify the results that our benchmarks produce.

At the time of writing, only tealdeer and the C client are compared, but I will add more clients.

Eventually, we could think about running benchmarks in CI, but this is not the focus of this PR.
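To give an idea of the setup, here is a heavily trimmed sketch of such a Dockerfile; the base image, package names, client list, paths, and the final hyperfine invocation are illustrative, not the actual file from this PR:

```dockerfile
FROM debian:bullseye-slim

# Tools for fetching and building the clients, plus hyperfine itself
# (package availability in this base image is assumed).
RUN apt-get update && apt-get install -y git build-essential hyperfine

# One build step per client; the real Dockerfile repeats this for every client.
RUN git clone https://github.com/tldr-pages/tldr-c-client.git /clients/tldr-c \
    && make -C /clients/tldr-c

# Run the same query through every client under identical conditions.
CMD ["hyperfine", "--warmup", "3", \
     "/clients/tldr-c/tldr tar", \
     "/clients/tealdeer/tldr tar"]
```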

niklasmohrin marked this pull request as draft on February 4, 2021 at 15:44
@niklasmohrin (Collaborator, Author)

I noticed that when using fast-tldr (or tldr-hs), the output still contains the {{ and }} markers, which is not what the client specification intends. I am not sure how to compensate for these differences fairly. After all, fast-tldr runs considerably faster than tealdeer in hyperfine.

@dbrgn dbrgn left a comment (Collaborator)

Ah, great idea with Docker! This way the benchmarks are reproducible. And if something (e.g. compile flags) is not fair, people can correct it.

Yeah, the feature sets of the various clients differ. Every new feature costs a little bit of performance. Tealdeer without custom formatting and without support for a config file is surely faster than the version we've arrived at by now. fast-tldr states "We strive to be the fastest tldr-client with no compromises regarding features". However, speed and features are always a trade-off 🙂

I tried to test this, but right now fast-tldr complains about a missing glibc:

./fast-tldr: /lib/x86_64-linux-gnu/libm.so.6: version `GLIBC_2.29' not found (required by ./fast-tldr)

PS: Make sure to initialize all caches, and ensure that the clients don't clash (e.g. if they use the same config path or cache path). If possible, use a client-local cache directory to ensure proper separation.
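One way to get that separation, sketched with hypothetical paths and assuming the clients honor the XDG base directory spec for their cache location:

```sh
# Give each client its own cache directory so runs cannot interfere.
export XDG_CACHE_HOME=/bench/cache/tealdeer
tldr --update   # populate the cache once, before any timing run
tldr tar        # subsequent runs are served from this isolated cache
```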

@dbrgn (Collaborator) commented May 9, 2021

Just so I'm up to date: @niklasmohrin, this is still WIP, right? (I'll hold off on another review until you think it's ready.)

@niklasmohrin (Collaborator, Author)

@dbrgn Yeah, I am a bit out of the loop myself; I will have to come back to this, along with the Zig build, some time soon. I will request a review once it's ready.

About the results: I think it would be fairest if we included some evaluation alongside the numbers, for example noting that fast-tldr doesn't really parse the files and just outputs variables as they are read; otherwise we are just breaking our own legs here.

We have some performance PRs on the way; I am excited about #187. Is there any news on #108, or another way to get rid of docopt?

@dbrgn (Collaborator) commented May 9, 2021

Yeah, I agree, clients that simply stream a file to stdout aren't really a fair comparison.

@dbrgn (Collaborator) commented May 9, 2021

I'm working on #108 right now. It would be kind of nice to have benchmarks from before and after, but that can also be done once the PR is merged.

@niklasmohrin (Collaborator, Author)

Oh, before-and-after numbers seem like a nice thing to have too; well, then I can't rest on that argument anymore 😆

@dbrgn (Collaborator) commented May 9, 2021

Yeah, but as I mentioned, we can simply backport the hyperfine docker script to an old version of the repository. That's no problem 🙂

niklasmohrin force-pushed the hyperfine branch 2 times, most recently from 7026b2d to a44815e on June 2, 2021 at 22:35
@niklasmohrin (Collaborator, Author)

@dbrgn I think I am satisfied with the current state of the Dockerfile. I found a way to drop the disk caches from within the container, so that we can test with cold caches (which is closer to the actual use case).
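For reference, the mechanism looks roughly like this; it requires a privileged container, because /proc/sys is read-only otherwise (image name hypothetical):

```sh
# The drop_caches knob affects the host kernel, so the container
# must run with --privileged for the write to succeed.
docker run --privileged tldr-benchmarks \
    sh -c 'sync; echo 3 > /proc/sys/vm/drop_caches; hyperfine "tldr tar"'
```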

Unless you have something else to add, the only thing left is getting outfieldr to build; I don't know when the dev will respond (I only messaged them earlier).

With #187 merged, we are also standing pretty strong against fast-tldr again, only falling 5-10% behind. I am wondering if we can reproduce the 30x speedup that outfieldr claims in its README.

Nevertheless, I think having all the results in a table that outlines the trade-offs is fair to everyone.

niklasmohrin force-pushed the hyperfine branch 2 times, most recently from 9ac45da to 32148c5 on June 5, 2021 at 16:08
niklasmohrin requested a review from dbrgn on June 5, 2021 at 16:17
@dbrgn dbrgn left a comment (Collaborator)

Really cool, thanks @niklasmohrin!

I think the Node and Python clients should be listed in the comparison as well, because both are quite commonly used (the Python client is the one packaged in the official Arch repos) and it's important to show the difference in invocation time between interpreted and compiled languages. However, feel free to merge the current state as-is; we can always improve it later on.

Here are my numbers for the benchmarks on a recent Ryzen 5900X system:

tealdeer 14.0 ms ± 0.8 ms
tldr-c 29.5 ms ± 1.4 ms
tldr-hs 19.6 ms ± 0.6 ms
fast-tldr 14.3 ms ± 0.7 ms
tldr-bash 25.8 ms ± 0.6 ms
outfieldr 6.1 ms ± 0.5 ms
tldr-node 403.8 ms ± 9.0 ms

It's interesting that the tealdeer numbers are better than fast-tldr's here; I wonder why that's the case? (Feel free to use those numbers if you want to make us "the second-fastest client on my system" 😜)

PS: I'm wondering whether, in the long run, the benchmarking should be moved to a separate repository to simplify maintenance (the benchmarks could be interesting as a commonly maintained reference for all tldr clients), but for now we can keep it inside the tealdeer repository, as long as we don't get a "please add/update this implementation in the benchmark" request every week 🙂

@niklasmohrin (Collaborator, Author)

From this perspective, having Node and Python in the list makes sense, yeah. I didn't want to run the Node app 500 times, but I have two ideas for how to present that:

  1. We list Node in the table, but with an asterisk noting that the mean is only from 50 runs, or
  2. we use two columns, one for a benchmark with 500 runs and one for a benchmark with only 50 runs; Node just wouldn't appear in the 500 column.

I think I'd prefer the first option, because it puts fewer (redundant) numbers on the screen and still presents the main point that Node is an order of magnitude slower. What do you think?


Having this code in a separate repo means it is its own thing, whereas here it's just documentation for how the numbers in the readme were generated. I would like to keep the impression of the latter for as long as possible. I was wondering if we should set up a GitHub Action to run the benchmarks (I think you can run privileged Docker containers inside the VM), but the CI runners are, afaik, quite the opposite of a cold machine, so I'm not sure (it's more work too, of course). It would add credibility though, since the numbers wouldn't just come from a single person.

@dbrgn (Collaborator) commented Jun 6, 2021

50 runs are perfectly fine. Btw, why don't we simply omit the --runs 400 argument? Hyperfine will try to determine on its own how many runs are required to achieve statistical significance. The number of runs is not really relevant to the user, as long as the numbers are repeatable.

@niklasmohrin (Collaborator, Author)

Without it, hyperfine did too few runs, leading to too much variance. I also tried --min-runs, but that just meant every client ran about that number of times anyway, so I concluded that I could simply fix the number of runs 🤷‍♂️ I'll see how reliable 50 runs are for me and check back in with you.
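For reference, the three variants under discussion (page name illustrative):

```sh
hyperfine 'tldr tar'                 # hyperfine chooses the run count (at least 10 by default)
hyperfine --min-runs 400 'tldr tar'  # at least 400 runs, more if needed
hyperfine --runs 400 'tldr tar'      # exactly 400 runs for every benchmarked command
```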

@dbrgn (Collaborator) commented Jun 10, 2021

Doesn't hyperfine take the variance into account? I assumed so, but maybe I'm wrong...

@niklasmohrin (Collaborator, Author)

I am not sure; maybe it does to a degree, but with a small number of runs it was still somewhat unstable when I tested it.

@dbrgn (Collaborator) commented Jun 11, 2021

Ok, then let's not hold up this PR any more; we can make further improvements later. Feel free to merge!

@niklasmohrin (Collaborator, Author)

(I gotta add Python first, though)

@niklasmohrin (Collaborator, Author)

@dbrgn I think everything is ready

@dbrgn (Collaborator) commented Jul 30, 2021

Can you build the Dockerfile? On my machine, tldr-hs fails every time (even with --no-cache).

Step 10/39 : RUN git clone https://github.com/psibi/tldr-hs.git         && cd tldr-hs         && stack build --install-ghc
 ---> Running in ea506c72af75
Cloning into 'tldr-hs'...
Downloading lts-18.0 build plan ...
RedownloadInvalidResponse Request {
  host                 = "raw.githubusercontent.com"
  port                 = 443
  secure               = True
  requestHeaders       = []
  path                 = "/fpco/lts-haskell/master//lts-18.0.yaml"
  queryString          = ""
  method               = "GET"
  proxy                = Nothing
  rawBody              = False
  redirectCount        = 10
  responseTimeout      = ResponseTimeoutDefault
  requestVersion       = HTTP/1.1
}
 "/root/.stack/build-plan/lts-18.0.yaml" (Response {responseStatus = Status {statusCode = 404, statusMessage = "Not Found"}, responseVersion = HTTP/1.1, responseHeaders = [("Connection","keep-alive"),("Content-Length","14"),("Content-Security-Policy","default-src 'none'; style-src 'unsafe-inline'; sandbox"),("Strict-Transport-Security","max-age=31536000"),("X-Content-Type-Options","nosniff"),("X-Frame-Options","deny"),("X-XSS-Protection","1; mode=block"),("Content-Type","text/plain; charset=utf-8"),("X-GitHub-Request-Id","C986:10058:47268:7522A:6103AE47"),("Accept-Ranges","bytes"),("Date","Fri, 30 Jul 2021 07:46:15 GMT"),("Via","1.1 varnish"),("X-Served-By","cache-fra19120-FRA"),("X-Cache","MISS"),("X-Cache-Hits","0"),("X-Timer","S1627631175.106098,VS0,VE149"),("Vary","Authorization,Accept-Encoding"),("Access-Control-Allow-Origin","*"),("X-Fastly-Request-ID","0468e71bfb3d2f26e693cecec5a72f9a0374bb0b"),("Expires","Fri, 30 Jul 2021 07:51:15 GMT"),("Source-Age","0")], responseBody = (), responseCookieJar = CJ {expose = []}, responseClose' = ResponseClose})
The command '/bin/sh -c git clone https://github.com/psibi/tldr-hs.git         && cd tldr-hs         && stack build --install-ghc' returned a non-zero code: 1
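A full rebuild that bypasses the Docker layer cache, for anyone trying to reproduce this (image tag illustrative):

```sh
docker build --no-cache -t tldr-benchmarks benchmarks/
```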

@niklasmohrin (Collaborator, Author)

Ah, I think it's the haskell:8.6 image; I reverted too far while trying to avoid building half the Haskell ecosystem... I tested a no-cache rebuild (when I still had plain haskell) and it worked there. I'm gonna build again now before pushing :D

@dbrgn (Collaborator) commented Jul 30, 2021

That works, although I'd pin the version to haskell:9.0 instead. Building works for me with that version.

@dbrgn (Collaborator) commented Jul 30, 2021

It's interesting how big the effect of the disk cache is! After running sync; echo 3 | tee /proc/sys/vm/drop_caches, a tealdeer run takes around 14 ms. For subsequent calls, it's only 2-4 ms.

Here are my numbers:

(screenshot of hyperfine benchmark results)
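To keep the cache cold across runs, hyperfine can execute the cache-dropping command before every timed run via its --prepare option; a sketch, with the page name illustrative:

```sh
# Drop the page cache before each timed run so every run starts cold.
hyperfine --prepare 'sync; echo 3 | tee /proc/sys/vm/drop_caches' 'tldr tar'
```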

@niklasmohrin (Collaborator, Author)

That works, although I'd pin the version to haskell:9.0 instead. Building works for me with that version.

Prior to today, I had it pinned to 8.6, but the build would fail. I think that --install-ghc always figures out the needed version, which is why I just had it use the latest

I think running on a cold cache is most realistic, what do you think?

@dbrgn (Collaborator) commented Jul 30, 2021

which is why I just had it use the latest

Ok, I don't know about that, you might be right. (I personally prefer to pin versions to avoid unexpected incompatibilities when a newer version of the image is pushed.)

I think running on a cold cache is most realistic, what do you think?

Yes, I agree!

@dbrgn dbrgn left a comment (Collaborator)

Don't forget to squash 🙂

@niklasmohrin (Collaborator, Author)

whoops, apparently --autosquash does nothing without --interactive 🤔
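For reference, the invocation that actually applies fixup commits (base branch illustrative):

```sh
# --autosquash only takes effect during an interactive rebase.
git rebase --interactive --autosquash master
```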

I think I would separate the Docker and readme changes, because we may want to track the table in the readme separately from the benchmark setup

@dbrgn (Collaborator) commented Jul 30, 2021

whoops, apparently --autosquash does nothing without --interactive 🤔

Yep, that's the case.

I think I would separate the Docker and readme changes, because we may want to track the table in the readme separately from the benchmark setup

Yeah, that makes sense. I was referring to the fixup commit 🙂 Feel free to merge now.

niklasmohrin merged commit 123c809 into tealdeer-rs:master on Jul 30, 2021