-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implementation of theia and policy recommendation CLI #25
Conversation
e9dc141
to
c05ef81
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initial review
we can continue to deploy other necessary components of policy recommendation feature: | ||
|
||
We need to install the [Kubernetes Operator for Apache Spark](https://github.com/GoogleCloudPlatform/spark-on-k8s-operator) | ||
in the cluster. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yanjunz97 - Can you assess how difficult would it to add this to helm charts as well?
If not possible, we might want to document that the Theia helm chart does not yet suppport Spark Operator deployment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should not be difficult to add it. I would take a look at it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@salv-orlando This support is added in PR #31
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then we should just document that way (assuming it is simpler). Again, consider create a separate PR just for documentation.
pkg/theiactl/commands/root.go
Outdated
Use: "theiactl", | ||
Short: "theiactl is the command line tool for Theia", | ||
Long: `theiactl is the command line tool for Theia that supports the | ||
policy recommendation feature`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the time being I'd keep "short" and "long" description the same or change it to
theiactl is the command line tool for Theia which provides access to some network flow visibility capabilities
I don't think we want to expand the long description for every feature we add.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, changed.
pkg/theiactl/commands/start.go
Outdated
'YYYY-MM-DD hh:mm:ss' format, for example: 2006-01-02 15:04:05`, err) | ||
} | ||
recoJobArgs = append(recoJobArgs, "--end_time", endTime) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add a simple check to make sure that if both start and end time are specified start_time < end_time ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, added
pkg/theiactl/commands/start.go
Outdated
err := json.Unmarshal([]byte(nsAllowList), &parsedNsAllowList) | ||
if err != nil { | ||
return fmt.Errorf(`parsing ns_allow_list: %v, ns_allow_list should | ||
be a list of namespace string, for example: '["kube-system","flow-aggregator","flow-visibility"]'`, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be too difficult for the CLI to parse a list of comma-separated strings? The JSON notation is not super user-friendly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood. Since in CLI we don't actually use the parsed list of namespaces. Here is just a pre-check for the format of the input is valid and pass the input string to the PySpark directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few nits while looking through the doc
I added a commit to implement the two CLI commands to
Currently Updated: Unit tests for the two commands are added. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More comments about check and result commands
pkg/theiactl/commands/check.go
Outdated
}, | ||
} | ||
|
||
func getPolicyRecommendationProgress(id string) (string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is the 'id' passed to this function the same as the SparkAppId that would be retrieved from the monitoring service?
If so, do we actually need to pass it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
id is different from SparkAppId. But it's true that we do not need this parameter in this function any more. Removed it now, thanks!
## Command-line Usage | ||
|
||
Currently, the network policy recommendation feature only supports command line interaction through Theiactl. | ||
Theiactl is the Theia command-line tool. To get more information about Theiactl, please refer to this [doc](./theiactl.md). We have 3 Theiactl commands for the policy recommendation feature: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Following antctl
docs convention, I think we should use lowercase for theiactl
unless beginning of sentence?
https://github.com/antrea-io/antrea/blob/main/docs/antctl.md
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, addressed
c58e8b9
to
daa942a
Compare
I notice that |
/theia-test-e2e |
819f75c
to
6a9fc60
Compare
54f6ade
to
9a8b26e
Compare
|
||
// policyRecommendationRetrieveCmd represents the policy-recommendation retrieve command | ||
var policyRecommendationRetrieveCmd = &cobra.Command{ | ||
Use: "retrieve", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally prefer "results" over "retrieve". But no strong opinion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Jianjun, I find sonobuoy have retrieve
command to get results and another results
command to inspect results. I prefer we could keep retrieve
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. Let us if others have an opinion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not have a strong opinion, both terms are good for me.
Use: "retrieve", | ||
Short: "Get the recommendation result of a policy recommendation Spark job", | ||
Long: `Get the recommendation result of a policy recommendation Spark job by ID. | ||
It will return the recommended network policies described in yaml.`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NetworkPolicies
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, addressed.
Use: "run", | ||
Short: "Run a new policy recommendation Spark job", | ||
Long: `Run a new policy recommendation Spark job. | ||
Network policies will be recommended based on the flow records sent by Flow Aggregator. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NetworkPolicies
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel we can remove this sentence (need not to talk about implementation in CLI help message).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, removed this sentence.
Long: `Run a new policy recommendation Spark job. | ||
Network policies will be recommended based on the flow records sent by Flow Aggregator. | ||
Must finish the deployment of Theia first, please follow the steps in | ||
https://github.com/antrea-io/theia/blob/main/docs/network-policy-recommendation.md`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can remove "please follow the steps..." too. Needs not to be in CLI help messages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rigth. Also because the link on main branch might become invalid or the content might change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, removed this sentence.
"use-cluster-ip", | ||
false, | ||
`Enable this option will use ClusterIP instead of port forwarding when connecting to the ClickHouse Service. | ||
It can only be used when running in cluster. (Only works when wait is enabled)`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does "(Only works when wait is enabled)" mean?
Should we check "wait" is enabled and say: "It can only be used when running in cluster and when wait is enabled"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks addressed.
"file", | ||
"f", | ||
"", | ||
"The file path where you want to save the results. (Only works when wait is enabled)", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You used "result" in other places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, change to use result
.
policyRecommendationRunCmd.Flags().String( | ||
"clickhouse-endpoint", | ||
"", | ||
"The ClickHouse Service endpoint. (Only works when wait is enabled)", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks addressed.
$ theia policy-recommendation status --id e998433e-accb-4888-9fc8-06563f073e86 | ||
Or | ||
$ theia policy-recommendation status e998433e-accb-4888-9fc8-06563f073e86 | ||
Use Cluster IP when checking the current status of job with ID e998433e-accb-4888-9fc8-06563f073e86 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ClusterIP
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks addressed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few more minor comments
Long: `Run a new policy recommendation Spark job. | ||
Network policies will be recommended based on the flow records sent by Flow Aggregator. | ||
Must finish the deployment of Theia first, please follow the steps in | ||
https://github.com/antrea-io/theia/blob/main/docs/network-policy-recommendation.md`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rigth. Also because the link on main branch might become invalid or the content might change
} | ||
matchResult, err := regexp.MatchString(k8sQuantitiesReg, driverCoreRequest) | ||
if err != nil || !matchResult { | ||
return fmt.Errorf("driver-core-request should conform to the Kubernetes convention") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pedant comment: Kubernetes resource quantity convention
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks addressed.
pkg/theia/commands/utils.go
Outdated
LabelSelector: "app.kubernetes.io/instance=policy-reco,app.kubernetes.io/name=spark-operator", | ||
}) | ||
if err != nil { | ||
return fmt.Errorf("error %v when finding the policy-reco-spark-operator Pod, please check the deployment of the Spark Operator", err) | ||
} | ||
if len(pods.Items) < 1 { | ||
return fmt.Errorf("can't find the policy-reco-spark-operator Pod, please check the deployment of the Spark Operator") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also replace policy-reco-
by policy-recommendation
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I will leave the Labels as it is for now before #31 is merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have any additional comment. PR looks good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For UT, we can consider to cover more cases later.
username, password, err := getClickHouseSecret(fakeClientset) | ||
assert.NoError(t, err) | ||
assert.Equal(t, "clickhouse_operator", string(username)) | ||
assert.Equal(t, "clickhouse_operator_password", string(password)) | ||
} | ||
|
||
func TestGetResultFromClickHouse(t *testing.T) { | ||
recoID := "db2134ea-7169-46f8-b56d-d643d4751d1d" | ||
expectedResult := "recommend-allow-acnp-kube-system-rpeal" | ||
db, mock, err := sqlmock.New(sqlmock.QueryMatcherOption(sqlmock.QueryMatcherEqual)) | ||
assert.NoError(t, err) | ||
defer db.Close() | ||
resultRow := sqlmock.NewRows([]string{"yamls"}).AddRow(expectedResult) | ||
mock.ExpectQuery("SELECT yamls FROM recommendations WHERE id = (?);").WithArgs(recoID).WillReturnRows(resultRow) | ||
result, err := getResultFromClickHouse(db, recoID) | ||
assert.NoError(t, err) | ||
assert.Equal(t, expectedResult, result) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For UT, only covering happy path may be not enough, we may also consider some negative cases later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some failure cases in the second commit, we could add more cases after this PR merged.
6806eab
to
96b630b
Compare
In this commit, we add the implementation of theia, the command-line tool for Theia based on cobra framework. Policy recommendation CLI is also included. It consists of 3 commands which let users could run a new policy recommendation job, check the current status of a previous job, and retrieve the recommendation result of a previous job respectively. In order to access theia services externally, theia can make use of k8s port forwarder functionality. Port forwarder provides tooling to start/stop port forwarding channel for eather Pod+Port or Service+Port. The code is loosely based on corresponding code in kubectl. Document of theia and policy recommendation will be added in a seperate commit later. Signed-off-by: Yongming Ding <dyongming@vmware.com> Co-authored-by: Yanjun Zhou <zhouya@vmware.com> Co-authored-by: Anna Khmelnitsky <akhmelnitsky@vmware.com>
const ( | ||
flowVisibilityNS = "flow-visibility" | ||
k8sQuantitiesReg = "^([+-]?[0-9.]+)([eEinumkKMGTP]*[-+]?[0-9]*)$" | ||
sparkImage = "antrea/theia-policy-recommendation:latest" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it necessary to change to "projects.registry.vmware.com/antrea/theia-policy-recommendation:latest" before getting merged?
Signed-off-by: Yongming Ding <dyongming@vmware.com>
In this commit, we add the implementation of theia, the command-line
tool for Theia based on cobra framework.
Policy recommendation CLI is also included. It consists of 3 commands
which let users could run a new policy recommendation job, check the
current status of a previous job, and retrieve the recommendation result of a
previous job respectively.