Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compare k8s pod running info with pd client health info , improve inspection mechanism #1484

Merged
merged 11 commits into from
Jan 8, 2020

Conversation

mikechengwei
Copy link
Contributor

@mikechengwei mikechengwei commented Jan 5, 2020

What problem does this PR solve?

#1370

What is changed and how does it work?

Compare k8s pod running info with pd client health info , if the pod run but not join the tidb cluster,
take pod info into NotJoinClusterMembers type.

Check List

Tests

  • Unit test

Code changes

  • Has Go code change

Side effects

None

Related changes

None

Add `unjoinedMembers` of PD in the status of `TidbCluster` custom resource

@mikechengwei
Copy link
Contributor Author

run-e2e-test

@aylei
Copy link
Contributor

aylei commented Jan 5, 2020

/ok-to-test

Copy link
Contributor

@aylei aylei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rest LGTM, thanks!

@@ -528,6 +529,13 @@ type PDFailureMember struct {
CreatedAt metav1.Time `json:"createdAt,omitempty"`
}

// PDFailureMember is the pd failure member information
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correct the comment

Members map[string]PDMember `json:"members,omitempty"`
Leader PDMember `json:"leader,omitempty"`
FailureMembers map[string]PDFailureMember `json:"failureMembers,omitempty"`
NotJoinClusterMembers map[string]NotJoinClusterMember `json:"notJoinClusterMembers,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend to name it UnjoinedClusterMember

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Edit: simply UnjoinedMember

@@ -756,6 +762,52 @@ func getPDConfigMap(tc *v1alpha1.TidbCluster) (*corev1.ConfigMap, error) {
return cm, nil
}

func (pmm *pdMemberManager) checkNotJoinClusterNode(tc *v1alpha1.TidbCluster, set *apps.StatefulSet, pdStatus map[string]v1alpha1.PDMember) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not mean to get into naming issues, but checkSomething usually does not mutate the input pointers, and Node is ambiguous with Kubernetes Node, maybe collectUnjoinedMembers could be more accurate

}},
err: false,
expectTidbClusterFn: func(g *GomegaWithT, tc *v1alpha1.TidbCluster) {
g.Expect(tc.Status.PD.NotJoinClusterMembers["test-pd-2"]).NotTo(Equal(nil))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
g.Expect(tc.Status.PD.NotJoinClusterMembers["test-pd-2"]).NotTo(Equal(nil))
g.Expect(tc.Status.PD.NotJoinClusterMembers["test-pd-2"]).NotTo(BeNil())

}},
err: false,
expectTidbClusterFn: func(g *GomegaWithT, tc *v1alpha1.TidbCluster) {
g.Expect(len(tc.Status.PD.NotJoinClusterMembers)).To(Equal(0))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
g.Expect(len(tc.Status.PD.NotJoinClusterMembers)).To(Equal(0))
g.Expect(tc.Status.PD.NotJoinClusterMembers).To(BeEmpty())

Use more descriptive ginkgo matchers could improve clarity

@aylei aylei self-assigned this Jan 5, 2020
@mikechengwei
Copy link
Contributor Author

run-e2e-test

@mikechengwei mikechengwei requested a review from aylei January 6, 2020 02:42
@aylei
Copy link
Contributor

aylei commented Jan 6, 2020

LGTM, please add a proper release note

@tennix tennix requested a review from weekface January 6, 2020 03:30
Members map[string]PDMember `json:"members,omitempty"`
Leader PDMember `json:"leader,omitempty"`
FailureMembers map[string]PDFailureMember `json:"failureMembers,omitempty"`
UnjoinedMembers map[string]UnjoinedMember `json:"unJoinedMembers,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
UnjoinedMembers map[string]UnjoinedMember `json:"unJoinedMembers,omitempty"`
UnjoinedMembers map[string]UnjoinedMember `json:"unjoinedMembers,omitempty"`

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

var joined = false
for podName := range pdStatus {
if strings.EqualFold(pod.Name, podName) {
joined = true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
joined = true
joined = true
break

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -528,6 +529,13 @@ type PDFailureMember struct {
CreatedAt metav1.Time `json:"createdAt,omitempty"`
}

// UnjoinedMember is the pd unjoin cluster member information
type UnjoinedMember struct {
Copy link
Contributor

@weekface weekface Jan 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of add an attribute to TidbCluster, does just emit a UnjoinedMember and JoinedMember events to TidbCluster make sense?

User can use kubectl describe to get these messages easily. ref: #1466

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think event flow can't express my meaning in a concise manner , as a user, I need to understand and integrate the event flow and sort out each step. Use status type display current status is a concise way.

@mikechengwei mikechengwei requested a review from weekface January 6, 2020 06:17
Copy link
Contributor

@weekface weekface left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@aylei aylei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@aylei
Copy link
Contributor

aylei commented Jan 6, 2020

once again😁, thanks for your contribution!

@aylei
Copy link
Contributor

aylei commented Jan 6, 2020

/merge

@sre-bot
Copy link
Contributor

sre-bot commented Jan 6, 2020

Your auto merge job has been accepted, waiting for 1486

@sre-bot
Copy link
Contributor

sre-bot commented Jan 6, 2020

/run-all-tests

@sre-bot
Copy link
Contributor

sre-bot commented Jan 6, 2020

@mikechengwei merge failed.

@aylei
Copy link
Contributor

aylei commented Jan 8, 2020

/merge

@sre-bot
Copy link
Contributor

sre-bot commented Jan 8, 2020

Your auto merge job has been accepted, waiting for 1493

@sre-bot
Copy link
Contributor

sre-bot commented Jan 8, 2020

/run-all-tests

@sre-bot sre-bot merged commit d11a844 into pingcap:master Jan 8, 2020
yahonda pushed a commit that referenced this pull request Dec 27, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants