Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix pd control bugs #245

Merged
merged 6 commits into from
Dec 25, 2018
Merged

Conversation

xiaojingchen
Copy link
Contributor

@xiaojingchen xiaojingchen commented Dec 21, 2018

this PR fixes #238 #239
But this is a temporary solution, the fundamental solution is making PD API returns error code 404 when no found member.
@gregwebs @weekface @tennix PTAL

@@ -316,7 +330,7 @@ func (pc *pdClient) DeleteStore(storeID uint64) error {
defer DeferClose(res.Body, &err)

// Remove an offline store should returns http.StatusOK
if res.StatusCode == http.StatusOK {
if res.StatusCode == http.StatusOK || res.StatusCode == http.StatusNotFound {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does DeleteStore request's response will return http.StatusNotFound ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the response return 500 code too, this is no use now. It's just a standard way.

@@ -338,14 +366,28 @@ func (pc *pdClient) DeleteMemberByID(memberID uint64) error {
return err
}
defer DeferClose(res.Body, &err)
if res.StatusCode == http.StatusOK {
if res.StatusCode == http.StatusOK || res.StatusCode == http.StatusNotFound {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

pkg/manager/member/pd_failover.go Show resolved Hide resolved
@xiaojingchen
Copy link
Contributor Author

/run-e2e-tests

Copy link
Member

@tennix tennix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@tennix
Copy link
Member

tennix commented Dec 21, 2018

/run-e2e-tests

@gregwebs
Copy link
Contributor

gregwebs commented Dec 21, 2018

LGTM. There is still a race condition between get and delete so we need to make sure our delete method is retried once. Fixing this upstream in PD itself will make things much smoother in the long-term.

@xiaojingchen
Copy link
Contributor Author

@gregwebs
I am not clear what the race condition means,
Is it means two or more threads call the deleteMember method to delete one member at the same time, may cause no found error?

@xiaojingchen
Copy link
Contributor Author

/run-e2e-tests

@gregwebs
Copy link
Contributor

@xiaojingchen yes, its possible between get and delete that it is deleted (by a different program than tidb-operator).

@xiaojingchen
Copy link
Contributor Author

xiaojingchen commented Dec 24, 2018

@gregwebs but that error will only happen once, and the caller will try again without any problem.

@xiaojingchen
Copy link
Contributor Author

/run-e2e-tests

@xiaojingchen
Copy link
Contributor Author

/run-e2e-tests

@xiaojingchen xiaojingchen merged commit ebd499e into pingcap:master Dec 25, 2018
yahonda pushed a commit that referenced this pull request Dec 27, 2021
* add pd-recover doc

* update for how to get max alloc id

* query prometheus to get max alloc id

* Apply suggestions from code review

Co-authored-by: Lilian Lee <lilin@pingcap.com>

* update format

* zh: udpate format and optimize steps

Co-authored-by: Lilian Lee <lilin@pingcap.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Parse pd member id failed
4 participants