Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix pd_control: not rely on error text #197

Merged
merged 3 commits into from
Nov 29, 2018
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
56 changes: 38 additions & 18 deletions pkg/controller/pd_control.go
Original file line number Diff line number Diff line change
Expand Up @@ -33,10 +33,6 @@ import (

const (
timeout = 5 * time.Second

// https://github.com/pingcap/pd/blob/master/server/coordinator.go#L42-L45
schedulerExisted = "scheduler existed"
schedulerNotFound = "scheduler not found"
)

// PDControlInterface is an interface that knows how to manage and get tidb cluster's PD client
Expand Down Expand Up @@ -328,13 +324,6 @@ func (pc *pdClient) DeleteStore(storeID uint64) error {
return err
}

// FIXME: We should not rely on error text
// Remove a tombstone store should returns "store has been removed"
bodyStr := string(body)
if strings.Contains(bodyStr, "store has been removed") {
return nil
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PD now actually returns a structured json response with an error code and an error code as a header.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but the older version don't have the error code and header.

Return the error is ok for this situation. The DeleteStore function is reentrant.

return fmt.Errorf("failed to delete store %d: %v", storeID, string(body))
}

Expand Down Expand Up @@ -407,15 +396,31 @@ func (pc *pdClient) BeginEvictLeader(storeID uint64) error {
if res.StatusCode == http.StatusOK {
return nil
}
err2 := readErrorBody(res.Body)
if strings.Contains(err2.Error(), schedulerExisted) {
return nil

// pd will return an error with the body contains "scheduler existed" if the scheduler already exists
// this is not the standard response.
// so these lines are just a workaround for now:
// - make a new request to get all schedulers
// - return nil if the scheduler already exists
//
// when PD returns standard json response, we should get rid of this verbose code.
evictLeaderSchedulers, err := pc.GetEvictLeaderSchedulers()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be exec before create evict leader, if the scheduler leader exists, then not need to post sheduler leader

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This replaces of strings.Contains, placed behind evicting leader is ok.

if err != nil {
return err
}
for _, s := range evictLeaderSchedulers {
if s == getLeaderEvictSchedulerStr(storeID) {
return nil
}
}

err2 := readErrorBody(res.Body)
return fmt.Errorf("failed %v to begin evict leader of store:[%d],error: %v", res.StatusCode, storeID, err2)
}

func (pc *pdClient) EndEvictLeader(storeID uint64) error {
apiURL := fmt.Sprintf("%s/%s/%s", pc.url, schedulersPrefix, getLeaderEvictSchedulerStr(storeID))
sName := getLeaderEvictSchedulerStr(storeID)
apiURL := fmt.Sprintf("%s/%s/%s", pc.url, schedulersPrefix, sName)
req, err := http.NewRequest("DELETE", apiURL, nil)
if err != nil {
return err
Expand All @@ -428,11 +433,26 @@ func (pc *pdClient) EndEvictLeader(storeID uint64) error {
if res.StatusCode == http.StatusOK || res.StatusCode == http.StatusNotFound {
return nil
}

// pd will return an error with the body contains "scheduler not found" if the scheduler is not found
// this is not the standard response.
// so these lines are just a workaround for now:
// - make a new request to get all schedulers
// - return nil if the scheduler is not found
//
// when PD returns standard json response, we should get rid of this verbose code.
err2 := readErrorBody(res.Body)
if strings.Contains(err2.Error(), schedulerNotFound) {
return nil
evictLeaderSchedulers, err := pc.GetEvictLeaderSchedulers()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This replaces of strings.Contains, placed behind evicting leader is ok.

if err != nil {
return err
}
for _, s := range evictLeaderSchedulers {
if s == sName {
return fmt.Errorf("failed %v to end leader evict scheduler of store [%d], error: %v", res.StatusCode, storeID, err2)
}
}
return fmt.Errorf("failed %v to end leader evict scheduler of store [%d],error:%v", res.StatusCode, storeID, err2)

return nil
}

func (pc *pdClient) GetEvictLeaderSchedulers() ([]string, error) {
Expand Down
1 change: 1 addition & 0 deletions pkg/controller/tidb_control.go
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,7 @@ import (
)

const (
// https://github.com/pingcap/tidb/blob/master/owner/manager.go#L183
// NotDDLOwnerError is the error message which was returned when the tidb node is not a ddl owner
NotDDLOwnerError = "This node is not a ddl owner, can't be resigned."
)
Expand Down