
net, config: optimize read and write packets #382

Merged
merged 2 commits into from
Oct 18, 2023

Conversation

djshow832 (Collaborator) commented Oct 18, 2023

What problem does this PR solve?

Issue Number: close #381

Problem Summary:
WritePacket and ReadPacket become the hot path when many rows are returned, so this code should be optimized:

  • WriteOnePacket calls NewReader twice, which allocates memory twice.
  • ReadPacket grows the slice with data = append(data, buf...), which is unnecessary when there is only one packet.
  • var header [4]byte escapes to the heap because header is passed as a parameter to other functions.
  • The read/write buffer size is 16K, so returning 1000 rows requires around 8 read/write syscalls.
  • io.ReadFull contains unnecessary boundary checks, function calls, and interface conversions (a sketch of this pattern follows the list).
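For context, a minimal sketch of the read-path shape the list above describes, assuming standard MySQL packet framing (3-byte little-endian length plus a sequence byte). readOnePacketOld and its layout are illustrative, not the actual TiProxy code:

```go
package pktio

import (
	"bufio"
	"io"
)

// readOnePacketOld mirrors the pre-optimization pattern described above
// (hypothetical name): the local header array is heap-allocated because
// header[:] flows into io.ReadFull through an io.Reader interface, and the
// payload is copied into data with append even when there is only one packet.
func readOnePacketOld(r *bufio.Reader) ([]byte, error) {
	var header [4]byte // escapes to the heap via the io.ReadFull call below
	if _, err := io.ReadFull(r, header[:]); err != nil {
		return nil, err
	}
	length := int(header[0]) | int(header[1])<<8 | int(header[2])<<16
	buf := make([]byte, length)
	if _, err := io.ReadFull(r, buf); err != nil {
		return nil, err
	}
	var data []byte
	data = append(data, buf...) // extra copy on the common single-packet path
	return data, nil
}
```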

What is changed and how it works:
Optimization:

  • Replace io.Copy with readWriter.Write to remove NewReader.
  • Do not call data = append(data, buf...) for the first packet.
  • Move header into PacketIO/compressedReadWriter as a member.
  • Make conn-buffer-size configurable and default it to 32K. It can't be too large in case there are tens of thousands of idle connections.
  • Replace io.ReadFull with a customized ReadFull (see the sketch after this list).
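A minimal sketch of the optimized shape under the same MySQL-framing assumption; packetIO, newPacketIO, readFull, and readPacket are illustrative names, not the exact TiProxy implementation:

```go
package pktio

import (
	"bufio"
	"io"
)

// packetIO keeps the 4-byte header as a struct member so it is not
// heap-allocated on every read (illustrative sketch, not the actual type).
type packetIO struct {
	buf    *bufio.Reader
	header [4]byte
}

// newPacketIO wires the reader with a configurable buffer size; the PR makes
// conn-buffer-size configurable and defaults it to 32K.
func newPacketIO(rd io.Reader, bufSize int) *packetIO {
	return &packetIO{buf: bufio.NewReaderSize(rd, bufSize)}
}

// readFull is a trimmed-down replacement for io.ReadFull: it fills b from the
// buffered reader without going through the generic helper.
func (p *packetIO) readFull(b []byte) error {
	for len(b) > 0 {
		n, err := p.buf.Read(b)
		b = b[n:]
		if err != nil {
			if len(b) == 0 {
				return nil
			}
			if err == io.EOF {
				return io.ErrUnexpectedEOF
			}
			return err
		}
	}
	return nil
}

// readPacket returns the first payload directly and only appends when the
// payload spans multiple 16 MB protocol packets.
func (p *packetIO) readPacket() ([]byte, error) {
	if err := p.readFull(p.header[:]); err != nil {
		return nil, err
	}
	length := int(p.header[0]) | int(p.header[1])<<8 | int(p.header[2])<<16
	data := make([]byte, length)
	if err := p.readFull(data); err != nil {
		return nil, err
	}
	for length == 0xffffff { // payload continues in the next packet
		if err := p.readFull(p.header[:]); err != nil {
			return nil, err
		}
		length = int(p.header[0]) | int(p.header[1])<<8 | int(p.header[2])<<16
		buf := make([]byte, length)
		if err := p.readFull(buf); err != nil {
			return nil, err
		}
		data = append(data, buf...)
	}
	return data, nil
}
```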

Others:

  • Add DirectWrite to compressedReadWriter because it was missing, although it doesn't matter in practice.
  • Add benchmarks for compression.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Benchmark Result

Before this PR:
BenchmarkWritePacket-8 2721517 607.5 ns/op 328 B/op 6 allocs/op
BenchmarkReadPacket-8 2829 413287 ns/op 2097727 B/op 6 allocs/op

After this PR:
BenchmarkWritePacket-8 10362842 116.3 ns/op 112 B/op 1 allocs/op
BenchmarkReadPacket-8 5344 232485 ns/op 1049217 B/op 1 allocs/op
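For reference, a read-packet benchmark along these lines could be written against the packetIO sketch above (hypothetical, assuming a 1 MB payload parsed from memory; the PR's actual benchmark may differ):

```go
package pktio

import (
	"bytes"
	"testing"
)

// BenchmarkReadPacketSketch parses a pre-built packet carrying a 1 MB payload
// on every iteration and reports allocations, similar in spirit to the
// BenchmarkReadPacket numbers quoted above (names here are illustrative).
func BenchmarkReadPacketSketch(b *testing.B) {
	payload := make([]byte, 1<<20)
	length := len(payload)
	// 3-byte little-endian length, a sequence byte, then the payload.
	packet := append([]byte{byte(length), byte(length >> 8), byte(length >> 16), 0}, payload...)
	b.ReportAllocs()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		p := newPacketIO(bytes.NewReader(packet), 32*1024)
		if _, err := p.readPacket(); err != nil {
			b.Fatal(err)
		}
	}
}
```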

Sysbench Result

Before this PR:

Range size    QPS      TiProxy CPU    QPS per 100% CPU
10            30955    180%           17200
100           14655    190%           7710
1000          2112     200%           1060
10000         221      200%           110

After this PR with conn-buffer-size=131072:

Range size    QPS      TiProxy CPU    QPS per 100% CPU
10            32818    150%           21878
100           22392    190%           11785
1000          4093     200%           2046
10000         449      200%           224

Flame Graph

Before this PR: [flame graph image]

After this PR: [flame graph image]

Notable changes

  • Has configuration change
  • Has HTTP API interfaces change
  • Has tiproxyctl change
  • Other user behavior changes

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

@ti-chi-bot ti-chi-bot bot requested review from bb7133 and xhebox October 18, 2023 07:17
@ti-chi-bot ti-chi-bot bot added the size/XL label Oct 18, 2023
@ti-chi-bot ti-chi-bot bot added the lgtm label Oct 18, 2023
ti-chi-bot bot commented Oct 18, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: xhebox

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

ti-chi-bot bot commented Oct 18, 2023

[LGTM Timeline notifier]

Timeline:

  • 2023-10-18 07:40:51.781857437 +0000 UTC m=+1815649.368967583: ☑️ agreed by xhebox.

@ti-chi-bot ti-chi-bot bot added the approved label Oct 18, 2023
@ti-chi-bot ti-chi-bot bot merged commit 35f3c79 into pingcap:main Oct 18, 2023
@djshow832 djshow832 deleted the optimize_readwrite branch October 18, 2023 07:42
Development

Successfully merging this pull request may close these issues.

Performance gap between TiProxy and HAProxy grows as the dataset size grows