add agent for testing pod networking #1448
test/agent/README.md
Outdated
@@ -0,0 +1,71 @@
###Test Agent
Good to see this approach for testing 👍
secondaryRouteTableIndex := make(map[int]bool)

// For each Pod validate the Pod networking
for _, pod := range podNetworkingValidationInput.PodList {
Instead of passing the list of pods from the caller, consider running this as a daemon on every node. Each daemon would discover the pods running on its own node, run all validation checks for each pod, and report the results. That way the framework scales to a large number of nodes and pods.
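To make the suggestion concrete, here is a minimal sketch of the per-node loop. Everything in it is hypothetical and not taken from this PR: the `Pod` type, the `PodLister` interface, the `Check` signature, and the `NODE_NAME` environment variable (assumed to be injected into the daemon's pod spec). A real agent would back `PodLister` with a Kubernetes API query filtered to the local node; the in-memory lister below only exists so the sketch runs on its own.

```go
package main

import (
	"fmt"
	"os"
)

// Pod is a hypothetical, minimal stand-in for the pod data a validator needs.
type Pod struct {
	Name string
	IP   string
}

// PodLister is a hypothetical interface. A real agent would implement it
// with a Kubernetes API query restricted to the local node.
type PodLister interface {
	PodsOnNode(nodeName string) ([]Pod, error)
}

// Check is one validation run against a single pod.
type Check func(Pod) error

// ValidateNode lists the pods on nodeName, runs every check against each
// pod, and returns the failures keyed by pod name.
func ValidateNode(nodeName string, lister PodLister, checks []Check) (map[string][]error, error) {
	pods, err := lister.PodsOnNode(nodeName)
	if err != nil {
		return nil, fmt.Errorf("listing pods on %s: %w", nodeName, err)
	}
	failures := make(map[string][]error)
	for _, pod := range pods {
		for _, check := range checks {
			if err := check(pod); err != nil {
				failures[pod.Name] = append(failures[pod.Name], err)
			}
		}
	}
	return failures, nil
}

// staticLister is an in-memory fake used only for this demonstration.
type staticLister map[string][]Pod

func (s staticLister) PodsOnNode(node string) ([]Pod, error) { return s[node], nil }

// hasIP is a placeholder check; real checks would inspect routes, rules, etc.
func hasIP(p Pod) error {
	if p.IP == "" {
		return fmt.Errorf("pod %s has no IP", p.Name)
	}
	return nil
}

func main() {
	// NODE_NAME is an assumed variable name, with a fallback for local runs.
	node := os.Getenv("NODE_NAME")
	if node == "" {
		node = "node-a"
	}
	lister := staticLister{"node-a": {{Name: "web-1", IP: "10.0.64.10"}, {Name: "web-2"}}}

	failures, err := ValidateNode(node, lister, []Check{hasIP})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for name, errs := range failures {
		fmt.Printf("%s: %d failed check(s): %v\n", name, len(errs), errs)
	}
}
```

Keeping discovery behind an interface means the validation loop can be unit tested without a cluster, and the caller no longer has to assemble and ship a pod list to every node.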
log.Printf("validated route table for secondary ENI %d has right routes", index)
}

return validationErrors
It would be nice to have a struct that records the pod and its failure type(s); that will make debugging easier.
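One possible shape for such a struct, as a sketch only. The names `FailureType`, `PodValidationError`, and `CountByType`, and the two failure constants, are made up for illustration and are not part of this PR. The sketch assumes the validator keeps returning a slice of `error` values, so the struct implements the `error` interface and existing callers would keep working.

```go
package main

import (
	"errors"
	"fmt"
)

// FailureType classifies what went wrong. The values below are examples.
type FailureType string

const (
	FailureRouteMissing FailureType = "RouteMissing"
	FailureVethMissing  FailureType = "VethMissing"
)

// PodValidationError ties a failure to the pod it was observed on.
type PodValidationError struct {
	PodName string
	PodIP   string
	Type    FailureType
	Err     error // underlying cause, if any
}

// Error implements the error interface.
func (e *PodValidationError) Error() string {
	return fmt.Sprintf("pod %s (%s): %s: %v", e.PodName, e.PodIP, e.Type, e.Err)
}

// Unwrap exposes the underlying cause to errors.Is and errors.As.
func (e *PodValidationError) Unwrap() error { return e.Err }

// CountByType tallies the typed failures in errs; untyped errors are skipped.
func CountByType(errs []error) map[FailureType]int {
	counts := make(map[FailureType]int)
	for _, err := range errs {
		var pve *PodValidationError
		if errors.As(err, &pve) {
			counts[pve.Type]++
		}
	}
	return counts
}

func main() {
	validationErrors := []error{
		&PodValidationError{PodName: "web-1", PodIP: "10.0.64.10", Type: FailureRouteMissing, Err: errors.New("no route in main table")},
		&PodValidationError{PodName: "web-2", PodIP: "10.0.64.11", Type: FailureVethMissing, Err: errors.New("host veth not found")},
	}
	for _, err := range validationErrors {
		fmt.Println(err)
	}
	fmt.Println(CountByType(validationErrors))
}
```

With the pod and failure type carried as fields, a test can assert on a specific failure class instead of matching error strings.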

var connectivityMetric []input.TestStatus

// metric server stores metrics from test client and returns the aggregated metrics to the
It'd be nice to expose different types of metrics (failures, successes), add labels to each metric, and additionally expose them in Prometheus format. For long-running validation and some other use cases, this will come in handy.
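As a rough illustration of what labelled metrics in the Prometheus text exposition format look like, here is a hand-rolled renderer using only the standard library. It is a sketch, not the Prometheus client library and not code from this PR; the `Sample` type, `RenderCounter`, and the metric and label names are hypothetical. In practice the official Go client would likely be the better choice.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Sample is one labelled value of a metric (hypothetical type).
type Sample struct {
	Labels map[string]string
	Value  float64
}

// labelEscaper escapes backslash, double quote, and newline in label values,
// as the text exposition format requires.
var labelEscaper = strings.NewReplacer(`\`, `\\`, `"`, `\"`, "\n", `\n`)

// formatLabels renders labels as {k1="v1",k2="v2"} with keys sorted so the
// output is deterministic. It returns "" when there are no labels.
func formatLabels(labels map[string]string) string {
	if len(labels) == 0 {
		return ""
	}
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	parts := make([]string, 0, len(keys))
	for _, k := range keys {
		parts = append(parts, fmt.Sprintf(`%s="%s"`, k, labelEscaper.Replace(labels[k])))
	}
	return "{" + strings.Join(parts, ",") + "}"
}

// RenderCounter renders one counter metric family with its HELP and TYPE lines.
func RenderCounter(name, help string, samples []Sample) string {
	var b strings.Builder
	fmt.Fprintf(&b, "# HELP %s %s\n", name, help)
	fmt.Fprintf(&b, "# TYPE %s counter\n", name)
	for _, s := range samples {
		fmt.Fprintf(&b, "%s%s %g\n", name, formatLabels(s.Labels), s.Value)
	}
	return b.String()
}

func main() {
	// Metric and label names are illustrative only.
	samples := []Sample{
		{Labels: map[string]string{"server_ip": "10.0.64.125", "result": "failure"}, Value: 3},
		{Labels: map[string]string{"server_ip": "10.0.64.10", "result": "success"}, Value: 3},
	}
	fmt.Print(RenderCounter("pod_connectivity_checks_total", "Connectivity checks by server IP and result.", samples))
}
```

Labels such as the server IP and the result would let a long-running validation be graphed and alerted on per target rather than only as one aggregate.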
test/agent/README.md
Outdated
###Test Agent
The test agent contains multiple binaries that are used by Ginkgo Automation tests.

###List of Go Binaries in the Agent
###List of Go Binaries in the Agent
should be
### List of Go Binaries in the Agent
(with a space after the hashes) so it renders correctly. Same for all the other headings.
What type of PR is this?
Add the test agent. The test agent Docker image provides testing utilities that the automation test suite calls to do test verification.
For instance, to verify the networking setup of all pods on a host, the test suite can use the agent, which relies on the netlink package to inspect the networking setup on that host.
Server Deployment and Pods
Client Job and pods
Metric Server deployment and pod
Output of metric aggregator!
Here the server on 10.0.64.125 is not running, so we see failures across all the client pods for that IP.
The README.md has the complete description of the purpose of this change.
Which issue does this PR fix:
Automation Test
What does this PR do / Why do we need it:
Automation Test
If an issue # is not available please add repro steps and logs from IPAMD/CNI showing the issue:
NA
Testing done on this change:
Yes, tested locally.
Automation added to e2e:
Yes
Will this break upgrades or downgrades. Has updating a running cluster been tested?:
No
Does this change require updates to the CNI daemonset config files to work?:
No
Does this PR introduce any user-facing change?:
No
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.