Skip to content

Latest commit

 

History

History
163 lines (145 loc) · 8.48 KB

performance.org

File metadata and controls

163 lines (145 loc) · 8.48 KB

Genie - Performance

Measurements

Some simple and repeated tests were done with the ‘time’ command. Also some network measurements were taken with Wireshark. This gives some basic ideas about which parts of the process take the most time. Be aware that this is not a rigorous performance analysis.

Loading Babashka and the genie.clj client

Some tests with different scripts and options show the following:

testmsecnotes
bb minimal-script21-25minimal babashka script, baseline
bb genie.clj -h51-53Loading script, some output, no network
–list-sessions52-54Loading script, minor network calls
test-add-numbers310-320No logging, just the output.
test-add-numers -v client324-327With logging, not much extra overhead

The genie.clj script is quite large (about 800 lines), and takes about 30 msec extra to load, compared to a minimal script. Executing a script takes a lot more time, see below for futher (network) analysis.

Full session, including waiting on closed

A network measurement with Wireshark showed the following packets and timestamps (all msec) on the ‘local-wire’:

timediffnotes
216TCP connect
2171c:clone
2203s:session
26848c:exec
29123s:answer
33241s:nil
37644c:close, no FIN
477101s:closed, no FIN-ACK
4847c:FIN-ACK
4840s:FIN-ACK

Total execution time here was 316 msec. Network time is 484-216=268 msec. So client overhead = 316 - 268 = 48 msec.

The main delays:

  • prepare the script by the client and send request - 48 msec
  • execute the script - 23 msec
  • return the nil answer at the end - 41 msec
  • prepare the close operation by the client and send request - 44 msec
  • close the session and send reply - 101 msec

Full session, with close, without waiting on closed

Without waiting for the closed-reply, it is faster:

timediffnotes
287TCP connect
2870c:clone
2892s:session
33647c:exec
35923s:answer
40041s:nil
4066c:close, incl FIN
4060s:FIN-ACK

Total execution here was 165 msec. Network time is 406-287=119 msec. So client overhead = 165 - 119 = 46 msec.

We do need to send the close operation, otherwise the session will be kept alive.

Full session with Tcl script and rep binary

Testing with a Tcl script calling the rep binary of course has more process overhead. But the network session is shorter:

timediffnotes
601TCP connect
6010c:clone
6043s:session
6040c:exec
62521s:answer
66843s:nil (included with answer)
6680c:close, no FIN
769101s:closed, no FIN-ACK
7690c:FIN-ACK
7701s:FIN-ACK

The only real delays (>5 msec) are now from the server/daemon.

Comparison table

The following table shows how long each part takes in Babashka compared to the Rep binary:

actiondiff-bbdiff-repnotes
TCP connect
c:clone10
s:session33
c:exec480bb takes time, rep does not
s:answer2321similar in both cases
s:nil4143similar too
c:close, no FIN440bb takes time, rep does not
s:closed, no FIN-ACK101101closing takes a long time
c:FIN-ACK70
s:FIN-ACK01

The cause of the slowness in Babashka (bb) can of course lie both in the Babashka binary and the Genie.clj script. We need to do some further investigation here.

Test with rep binary

Another baseline is direct execution using the rep binary. Also tested on Ubuntu 18.04.

$ time rep --line='FILE:LINE:COLUMN' --no-print=value -p 7888 "(genied.client/exec-script \"/home/nico/cljlib/genie/test/test_add_numbers.clj\" 'test-add-numbers/main {:cwd \"/home/nico/cljlib/genie/client\" :script \"/home/nico/cljlib/genie/test/test_add_numbers.clj\" :opt {:test \"0\" :noreload \"0\" :main \"\" :port \"7888\" :verbose \"1\" :nocheckserver \"0\" :log-level \"info\"} :client-version \"0.1.0\" :protocol-version \"0.1.0\"} [\"1\" \"2\" \"3\"])"
The sum of [1 2 3] is 6

real	0m0,172s
user	0m0,003s
sys	0m0,001s

Also measured with Wireshark:

timediffnotes
676TCP connect
6760c:clone
6804s:session
6800c:exec
70424s:answer
74541s:nil
7450c:close, no FIN
847102s:closed, no FIN-ACK
8470c:FIN-ACK
8470s:FIN-ACK

So we see no delays caused by the client side here, only on the server side:

  • creating the session - 4 msec
  • loading the script and executing it - 24 msec
  • wait after the final output - 41 msec
  • closing the session - 102 msec

Client cmdline options

The following command line options can help to improve performance, by doing less:

--noload                              Do not load libraries and scripts
--nocheckdaemon                       Do not perform daemon checks on errors
--nosetloader                         Do not set dynamic classloader
--nomain                              Do not call main function after loading
--nonormalize                         Do not normalize parameters to script (rel. paths)

The –nomain option could be useful if you want to preload scripts, without executing them immediately.

Reading and passing stdin lines

A test with reading a file, with max-lines parameter versus total running time:

max-linestime (s)
130.6
215.5
310.5
56.5
103.5
202.1
501.11
1000.84
10000.5

A small –max-lines value causes many more round-trips between client and daemon. Even on a local connection this takes time. The default value is 1024 lines.

Further optimisations

Tests and analysis above suggest the following further optimisations:

  • Test a Tcl client with direct socket connection with bencode. Also close the session from the client but do not wait for the response. And freewraptclsh might help too.
  • Check the 40 msec delay between sending the final output and the nil-response.
  • Check the Babashka client for the cause of the 40 msec delays from the client side. Also check if these are related to the 40 msec server side delay.
  • Create a native binary of the Babashka client with GraalVM.
  • Maybe named pipes, see todo.org. But nRepl does not yet support this, and keeping track of sessions might be an issue.
  • Reuse sessions - only small improvement expected. Might be useful for keeping state.
  • Other clients based on e.g. Lua, C (compare rep binary), Go or Rust might help.
  • nRepl with TTY protocol?

Further testing

Test memory usage/leaks

Create a script that use a lot of memory. Run it a few times, check memory. Maybe also script that creates a lot of classes, they used to be harder to GC.

Profiling

Both on server/daemon side with a JVM profiler and dtrace or similar for the client(s).