statistics: add bucket ndv for index histogram #20580

Merged
merged 23 commits into from
Jan 13, 2021

@winoros winoros commented Oct 22, 2020

What problem does this PR solve?

Issue Number: close #xxx

Problem Summary:

Add a per-bucket NDV (number of distinct values) to the index histogram to improve its estimation accuracy.

What is changed and how it works?

Proposal: xxx

What's Changed:

Add a new stats version number to solve the compatibility problem. If the histogram has the old version number, we estimate the row count the old way.
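A minimal sketch of what such a version gate could look like. The constant names, the `estimator` type, and `pickEstimator` are hypothetical and exist only for illustration; they are not taken from the TiDB code base.

```go
package main

// Hypothetical version numbers for this sketch; the real constants may differ.
const (
	statsVerLegacy    = 0 // histograms without per-bucket NDV
	statsVerBucketNDV = 1 // histograms that record per-bucket NDV
)

// estimator returns an estimated row count for one value in a bucket.
type estimator func(bucketCount, bucketNDV int64) float64

// pickEstimator keeps old-version histograms on the legacy path and only
// uses the NDV-aware path when the histogram was built with the new version.
func pickEstimator(statsVer int, legacy, withNDV estimator) estimator {
	if statsVer >= statsVerBucketNDV {
		return withNDV
	}
	return legacy
}
```

Calling `pickEstimator(statsVerLegacy, legacy, withNDV)` returns `legacy` unchanged, so statistics collected before the upgrade behave exactly as before.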

When building the index histogram, record the NDV of each bucket, and update the NDV when merging buckets.
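The idea can be sketched on a simplified histogram over sorted `int64` values. The `bucket` type, `buildBuckets`, and `mergeAdjacent` are hypothetical names for this illustration, and the sketch assumes that a run of equal values is never split across buckets, which makes per-bucket NDVs additive on merge.

```go
package main

// bucket is a simplified, hypothetical histogram bucket (not TiDB's type).
type bucket struct {
	lower, upper int64 // inclusive bounds
	count        int64 // rows in this bucket (not cumulative)
	ndv          int64 // distinct values in this bucket
}

// buildBuckets scans sorted values and fills buckets of roughly perBucket
// rows. Repeats of the current upper bound always stay in the same bucket,
// so a value never straddles two buckets.
func buildBuckets(sorted []int64, perBucket int64) []bucket {
	var bs []bucket
	for _, v := range sorted {
		n := len(bs)
		switch {
		case n > 0 && bs[n-1].upper == v:
			// Same value as the last one seen: more rows, same NDV.
			bs[n-1].count++
		case n > 0 && bs[n-1].count < perBucket:
			// New value, bucket still has room: extend it and bump NDV.
			bs[n-1].upper = v
			bs[n-1].count++
			bs[n-1].ndv++
		default:
			// Start a new bucket with this value.
			bs = append(bs, bucket{lower: v, upper: v, count: 1, ndv: 1})
		}
	}
	return bs
}

// mergeAdjacent merges neighbouring buckets pairwise. Because no value
// spans two buckets (assumption above), the merged NDV is the plain sum.
func mergeAdjacent(bs []bucket) []bucket {
	merged := make([]bucket, 0, (len(bs)+1)/2)
	for i := 0; i < len(bs); i += 2 {
		b := bs[i]
		if i+1 < len(bs) {
			b.upper = bs[i+1].upper
			b.count += bs[i+1].count
			b.ndv += bs[i+1].ndv
		}
		merged = append(merged, b)
	}
	return merged
}
```

If values could be shared between neighbouring buckets, the merge step would need to subtract the overlap instead of summing directly.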

When estimating row counts, if the bucket NDV is zero, fall back to the CM sketch for now. (After @qw4990 decouples the CM sketch and the NDV, this part will be changed to use TopN for the estimate.)
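A rough sketch of the fallback for an equality predicate. It assumes rows are spread uniformly over the distinct values in a bucket, so the estimate is `count / ndv`; that formula, the `bucket` type, and the `fallback` callback (standing in for the CM sketch lookup) are assumptions of this example, not the PR's actual code.

```go
package main

// bucket is a simplified, hypothetical histogram bucket (not TiDB's type).
type bucket struct {
	lower, upper int64 // inclusive bounds
	count        int64 // rows in this bucket
	ndv          int64 // distinct values; 0 means "not recorded"
}

// estimateEqual estimates how many rows equal v. When the covering bucket
// has no NDV recorded, it defers to the caller-supplied fallback.
func estimateEqual(bs []bucket, v int64, fallback func(int64) float64) float64 {
	for _, b := range bs {
		if v < b.lower || v > b.upper {
			continue
		}
		if b.ndv == 0 {
			return fallback(v)
		}
		// Uniformity assumption: each distinct value holds count/ndv rows.
		return float64(b.count) / float64(b.ndv)
	}
	return 0 // v lies outside every bucket
}
```

For a bucket holding 100 rows and 4 distinct values, the estimate is 25 rows; a bucket with `ndv == 0` returns whatever the fallback reports.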

When updating feedback, collect NDV information at the coprocessor's reader and use it to update the current histogram.

When dumping feedback, since the estimation logic has changed, just dump the index's feedback instead of using the original way.

How it Works:

Check List

Tests

  • Unit tests

Side effects

  • Performance regression
    • Consumes more CPU
    • Consumes more MEM

Release note

  • No release note

@winoros winoros requested review from a team as code owners October 22, 2020 05:42
@winoros winoros requested review from wshwsh12 and eurekaka and removed request for a team October 22, 2020 05:42
sre-bot commented Oct 22, 2020

Please follow PR Title Format:

  • pkg [, pkg2, pkg3]: what's changed

Or, if more than 3 packages are mainly changed, use

  • *: what's changed

@winoros winoros changed the title from "add bucket ndv for index histogram" to "statistics: add bucket ndv for index histogram" Oct 22, 2020
@github-actions github-actions bot added the sig/execution SIG execution label Oct 22, 2020
@qw4990 qw4990 self-requested a review October 22, 2020 08:39
@zz-jason
Member

@winoros could you share some testing results about the selectivity estimation accuracy improvements?

@qw4990 qw4990 requested review from time-and-fate and removed request for wshwsh12 December 14, 2020 03:02
@qw4990 qw4990 added the type/enhancement The issue or PR belongs to an enhancement. label Dec 17, 2020
Contributor

@qw4990 qw4990 left a comment

LGTM

@ti-srebot ti-srebot added the status/LGT1 Indicates that a PR has LGTM 1. label Dec 30, 2020
Member

@time-and-fate time-and-fate left a comment

LGTM

@ti-srebot ti-srebot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Dec 30, 2020
ti-srebot previously approved these changes Dec 30, 2020
qw4990 previously approved these changes Jan 5, 2021
@github-actions github-actions bot added the sig/sql-infra SIG: SQL Infra label Jan 7, 2021
Member

@time-and-fate time-and-fate left a comment

LGTM

@winoros
Member Author

winoros commented Jan 7, 2021

/merge

@ti-srebot ti-srebot added the status/can-merge Indicates a PR has been approved by a committer. label Jan 7, 2021
@ti-srebot
Contributor

Your auto merge job has been accepted, waiting for:

  • 22165

@ti-srebot
Contributor

/run-all-tests

@ti-srebot
Contributor

@winoros merge failed.

@winoros
Member Author

winoros commented Jan 13, 2021

/run-all-tests tidb-test=pr/1139

@winoros
Member Author

winoros commented Jan 13, 2021

/run-tics-test

@winoros winoros merged commit 3dd842f into pingcap:master Jan 13, 2021
@winoros winoros deleted the bucket-ndv-index-hist branch January 13, 2021 10:24
Labels
component/statistics sig/execution SIG execution sig/planner SIG: Planner sig/sql-infra SIG: SQL Infra status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2. type/enhancement The issue or PR belongs to an enhancement.

6 participants