Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

blocksync: Some performance improvements on top of #3957 #3985

Merged
merged 3 commits into from
Sep 2, 2020

Conversation

cdecker
Copy link
Member

@cdecker cdecker commented Aug 26, 2020

Inspired by #3957 I took some time to benchmark and improve the block sync
time as well. It's not as impressive as the 30x that #3957 achieved but it has
some incremental gains. The benchmarks use the following code, with a local
bitcoind and the default bcli backend:

from pyln.client import LightningRpc
import time
import subprocess

startheight = 600000
numblocks = 1000
targetheight = startheight + numblocks
proc = subprocess.Popen([
    'lightningd/lightningd',
    '--network=bitcoin',
    '--log-level=debug',
    '--rescan=-{}'.format(startheight),
    '--lightning-dir=/tmp/ll'
])
time.sleep(5)
rpc = LightningRpc('/tmp/ll/bitcoin/lightning-rpc')

def height():
    return rpc.getinfo()['blockheight']

def rate(first, last):
    if first == last:
        return 0
    return (last[1] - first[1]) / (last[0] - first[0])

start = (time.time(), height())
heights = []
while True:
    heights.append((time.time(), height()))
    if len(heights) > 30:
        heights = heights[-30:]
    first = heights[0]
    last = heights[-1]

    print("Speed: {:2f} blocks / s, overall {:2f} blocks / s".format(
        rate(first, last),
        rate(start, last)
    ))
    time.sleep(1)

    if last[1] >= targetheight:
        break

rpc.stop()
proc.wait()

The following are the performance gains achieved by individual commits (and compared to #3957):

Scenario TTS 1000 blocks Blocks / second Speedup
pre-3957 (bf5e994) 8:01 2.073498 -
post-3957 (128adf0) 7:28 2.224472 7%
commit 1 (68f3f30) 5:42 2.923359 41%
commit 2 (dc0859e) 5:00 3.321625 60%
commit 3 (0e4c6b6) 3:07 5.357775 158%

And here are the flamegraphs that I used to guide the changes:
flamegraph-pre-3957
flamegraph-post-3957
flamegraph-68f3f30
flamegraph-dc0859e
flamegraph-0e4c6b6

@cdecker cdecker added this to the v0.9.1 milestone Aug 26, 2020
@cdecker cdecker requested a review from rustyrussell August 26, 2020 16:53
lightningd/plugin.c Show resolved Hide resolved
@cdecker
Copy link
Member Author

cdecker commented Aug 27, 2020

Fixup switches from searching \n\n to searching }\n which makes no additional assumptions on the behavior of the plugins, instead using the existing "JSON object terminated by a newline" semantics.

@cdecker
Copy link
Member Author

cdecker commented Aug 27, 2020

Performance remains the same for block sync, if not better:

flamegraph-newsep

Scenario TTS 1000 blocks Blocks / second Speedup
pre-3957 (bf5e994) 8:01 2.073498 -
post-3957 (128adf0) 7:28 2.224472 7%
commit 1 (68f3f30) 5:42 2.923359 41%
commit 2 (dc0859e) 5:00 3.321625 60%
commit 3 (0e4c6b6) 3:07 5.357775 158%
new separator (1634f45) 3:01 5.609808 170%

@cdecker cdecker force-pushed the perf branch 2 times, most recently from 1d60831 to 92ab81a Compare August 28, 2020 11:03
@cdecker cdecker requested a review from rustyrussell August 28, 2020 11:37
lightningd/plugin.c Show resolved Hide resolved
@rustyrussell
Copy link
Contributor

rustyrussell commented Aug 30, 2020

Hmm, my response to this got lost.

Anyway, I can't find any requirement on json.org or jsonrpc.org that a JSON RPC end with '\n'. Even if it said that, my ignorance suggests it's dangerous to rely on it.

We can, however, insist that we have just read a '}'. In the normal case, this is harmless, but in the case here "huge token", it's an effective hack. Needs a huge comment though:

    /* Our JSON parser is pretty good at incremental parsing, but `getrawblock` gives a giant 2MB token, which forces it to re-parse 
    every time until we have all of it.  However, we can't complete a JSON object without a '}', so we do a cheaper check here. */

@cdecker
Copy link
Member Author

cdecker commented Sep 1, 2020

Hmm, my response to this got lost.

Anyway, I can't find any requirement on json.org or jsonrpc.org that a JSON RPC end with '\n'. Even if it said that, my ignorance suggests it's dangerous to rely on it.

That is because the JSON-RPC spec is (mostly) transport agnostic, and the usual suspects (HTTP/HTTPS/HTTP2.0) provide built-in framing support, i.e., either the header contains the number of bytes to read, or the connection is closed after submitting the payload. In our case we don't have that which required us to come up with a different end-of-message marker, and so we did with a very inefficient end-of-json = end-of-message which requires either re-parsing the message on every increment, or having a resumable parser like jsmn. Notice that resumable parsers are very rare, and we got lucky. For consumers such as the pyln-client library we do provide stronger guarantees like the \n\n message separator, but I think we should slowly move to consolidate the different interfaces, and start requiring \n\n over time (gated behind a COMPAT flag).

I'm pretty sure that requiring a newline after the message is safe, even if not explicitly documented right now, but happy to start enforcing that only in conjunction with a COMPAT flag as well.

We can, however, insist that we have just read a '}'. In the normal case, this is harmless, but in the case here "huge token", it's an effective hack. Needs a huge comment though:

    /* Our JSON parser is pretty good at incremental parsing, but `getrawblock` gives a giant 2MB token, which forces it to re-parse 
    every time until we have all of it.  However, we can't complete a JSON object without a '}', so we do a cheaper check here. */

I went ahead and implemented the single } check:

Scenario TTS 1000 blocks Blocks / second Speedup
pre-3957 (bf5e994) 8:01 2.073498 -
post-3957 (128adf0) 7:28 2.224472 7%
commit 1 (68f3f30) 5:42 2.923359 41%
commit 2 (dc0859e) 5:00 3.321625 60%
commit 3 (0e4c6b6) 3:07 5.357775 158%
new separator (1634f45) 3:01 5.609808 170%
single } sep (eea01b0) 2:46 5.966785 188%

flamegraph-parensep

So it seems like checking for } is just as good as the longer separator. At least we don't need to be more restrictive in what we accept yet 😉

I added the details on why we filter to the comment, and if this looks good I can squash it all into place.

@cdecker cdecker requested a review from rustyrussell September 1, 2020 10:45
* `getrawblock` gives a giant 2MB token, which forces it to re-parse
* every time until we have all of it. However, we can't complete a
* JSON object without a '}', so we do a cheaper check here.
*/
have_full = memmem(plugin->buffer + plugin->used - offset,
plugin->len_read + offset, "}", 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be memchr :)

And we no longer need offset. I'll add a fixup (and seems like a file needs regen, too...)

@rustyrussell
Copy link
Contributor

Changed to a straight memchr, squashed fixups and rebased on (Travis-fixed) master...

@rustyrussell
Copy link
Contributor

Ack e45b22d

We were using `tal_fmt` to truncate off the last byte of the
response (newline), which required an allocation, a call to `vsnprintf` and a
copy of the block contents. This being >2MB in size on mainnet was rather
expensive.

We now just signal the end of the string by overwriting the trailing newline,
and stealing directly onto the result.
We're not going to mutate transactions in a block, so computing the txids
every time we need them is a waste, let's compute them upfront and use them
afterwards.
PR ElementsProject#3957 improved performance considerably, however we still look over the
entire message for the message separator. If instead we just look in the
incrementally read data, we remove the quadratic behavior for large messages.

This is safe since we then loop over the messages which would drain any
message separator from the buffer before we attempt the next read.

Changelog-Fixed: bcli: Significant speedups for block synchronization
@cdecker
Copy link
Member Author

cdecker commented Sep 2, 2020

ACK 64f2dcb

@cdecker cdecker merged commit d7ff74e into ElementsProject:master Sep 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants