-
Notifications
You must be signed in to change notification settings - Fork 500
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Compare k8s pod running info with pd client health info , improve inspection mechanism #1484
Conversation
run-e2e-test |
/ok-to-test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The rest LGTM, thanks!
pkg/apis/pingcap/v1alpha1/types.go
Outdated
@@ -528,6 +529,13 @@ type PDFailureMember struct { | |||
CreatedAt metav1.Time `json:"createdAt,omitempty"` | |||
} | |||
|
|||
// PDFailureMember is the pd failure member information |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
correct the comment
pkg/apis/pingcap/v1alpha1/types.go
Outdated
Members map[string]PDMember `json:"members,omitempty"` | ||
Leader PDMember `json:"leader,omitempty"` | ||
FailureMembers map[string]PDFailureMember `json:"failureMembers,omitempty"` | ||
NotJoinClusterMembers map[string]NotJoinClusterMember `json:"notJoinClusterMembers,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recommend to name it UnjoinedClusterMember
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Edit: simply UnjoinedMember
@@ -756,6 +762,52 @@ func getPDConfigMap(tc *v1alpha1.TidbCluster) (*corev1.ConfigMap, error) { | |||
return cm, nil | |||
} | |||
|
|||
func (pmm *pdMemberManager) checkNotJoinClusterNode(tc *v1alpha1.TidbCluster, set *apps.StatefulSet, pdStatus map[string]v1alpha1.PDMember) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not mean to get into naming issues, but checkSomething
usually does not mutate the input pointers, and Node
is ambiguous with Kubernetes Node, maybe collectUnjoinedMembers
could be more accurate
}}, | ||
err: false, | ||
expectTidbClusterFn: func(g *GomegaWithT, tc *v1alpha1.TidbCluster) { | ||
g.Expect(tc.Status.PD.NotJoinClusterMembers["test-pd-2"]).NotTo(Equal(nil)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
g.Expect(tc.Status.PD.NotJoinClusterMembers["test-pd-2"]).NotTo(Equal(nil)) | |
g.Expect(tc.Status.PD.NotJoinClusterMembers["test-pd-2"]).NotTo(BeNil()) |
}}, | ||
err: false, | ||
expectTidbClusterFn: func(g *GomegaWithT, tc *v1alpha1.TidbCluster) { | ||
g.Expect(len(tc.Status.PD.NotJoinClusterMembers)).To(Equal(0)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
g.Expect(len(tc.Status.PD.NotJoinClusterMembers)).To(Equal(0)) | |
g.Expect(tc.Status.PD.NotJoinClusterMembers).To(BeEmpty()) |
Use more descriptive ginkgo matchers could improve clarity
run-e2e-test |
LGTM, please add a proper release note |
pkg/apis/pingcap/v1alpha1/types.go
Outdated
Members map[string]PDMember `json:"members,omitempty"` | ||
Leader PDMember `json:"leader,omitempty"` | ||
FailureMembers map[string]PDFailureMember `json:"failureMembers,omitempty"` | ||
UnjoinedMembers map[string]UnjoinedMember `json:"unJoinedMembers,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UnjoinedMembers map[string]UnjoinedMember `json:"unJoinedMembers,omitempty"` | |
UnjoinedMembers map[string]UnjoinedMember `json:"unjoinedMembers,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
var joined = false | ||
for podName := range pdStatus { | ||
if strings.EqualFold(pod.Name, podName) { | ||
joined = true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
joined = true | |
joined = true | |
break |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -528,6 +529,13 @@ type PDFailureMember struct { | |||
CreatedAt metav1.Time `json:"createdAt,omitempty"` | |||
} | |||
|
|||
// UnjoinedMember is the pd unjoin cluster member information | |||
type UnjoinedMember struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of add an attribute to TidbCluster
, does just emit a UnjoinedMember
and JoinedMember
events to TidbCluster
make sense?
User can use kubectl describe
to get these messages easily. ref: #1466
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think event flow can't express my meaning in a concise manner , as a user, I need to understand and integrate the event flow and sort out each step. Use status
type display current status is a concise way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
once again😁, thanks for your contribution! |
/merge |
Your auto merge job has been accepted, waiting for 1486 |
/run-all-tests |
@mikechengwei merge failed. |
/merge |
Your auto merge job has been accepted, waiting for 1493 |
/run-all-tests |
What problem does this PR solve?
#1370
What is changed and how does it work?
Compare k8s pod running info with pd client health info , if the pod run but not join the tidb cluster,
take pod info into
NotJoinClusterMembers
type.Check List
Tests
Code changes
Side effects
None
Related changes
None