Fix forever hanging when HashAgg is called by apply (#12760) #12766

Contributor

@SunRunAway SunRunAway commented Oct 16, 2019

Automated cherry pick of #12760 on release-3.0.


What problem does this PR solve?

Fix #12759

What is changed and how it works?

The parent executors (apply) of HashAgg may call HashAgg.Next one more time to confirm that the returned chunk has size 0.
HashAgg.Next handles these calls incorrectly after all of its final workers have exited.

On each call of HashAgg.Next after the job is done, it returns a zero-size chunk but still puts a chk into finalInputCh. Each apply executor calls HashAgg.Next once more, so when the number of parent executors exceeds the channel buffer size, the query hangs forever.

To explain the example in #12759:

  1. The size of the channel buffer is 4.
  2. The first call of HashAgg.Next returns a one-row chunk to its parent and sends the chunk into the buffer (this is valid).
  3. Each of the three apply executors then calls HashAgg.Next once; each call returns a zero-row chunk to its parent and sends the chunk into the buffer (this should not happen).
  4. Now the channel is full.
  5. One more call of HashAgg.Next, made by the TiDB session, tries to send its chunk into the full buffer and hangs forever instead of returning a zero-row chunk.

Check List

Tests

  • Unit test
  • Integration test

Code changes

  • None

Side effects

  • None

Related changes

  • None

Release note

  • Fix forever hanging when HashAgg is called by apply.

@SunRunAway SunRunAway added sig/execution SIG execution type/3.0 cherry-pick type/bugfix This PR fixes a bug. labels Oct 16, 2019
@SunRunAway
Contributor Author

I've dropped the unit test because this branch does not support the SQL hint for HashAgg.

@SunRunAway
Contributor Author

/run-unit-test

Member

@zz-jason zz-jason left a comment

LGTM

@zz-jason zz-jason added the status/LGT1 Indicates that a PR has LGTM 1. label Oct 16, 2019
Contributor

@XuHuaiyu XuHuaiyu left a comment

lgtm

@XuHuaiyu XuHuaiyu added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Oct 17, 2019
@eurekaka eurekaka added the status/can-merge Indicates a PR has been approved by a committer. label Oct 17, 2019
@sre-bot
Contributor

sre-bot commented Oct 17, 2019

Your auto merge job has been accepted, waiting for 12765

@sre-bot
Contributor

sre-bot commented Oct 17, 2019

/run-all-tests

@sre-bot sre-bot merged commit 8485ccd into pingcap:release-3.0 Oct 17, 2019
@SunRunAway SunRunAway deleted the automated-cherry-pick-of-#12760-upstream-release-3.0 branch October 17, 2019 05:41