refactor, chore: refactor charm to use Deployment for workload, also bumps training-operator 1.7->1.8
#167
Merged
5bb0371 to 32ecee6
* pin integration test dependencies, refactor constants in tests (#155)
  Pins dependencies in the integration tests to their corresponding channels for this release.
  Ref: canonical/bundle-kubeflow#866
  Co-authored-by: Andrew Scribner <ca.scribner+1@gmail.com>
* refactor: deploy the training-operator with kubernetes resources
  This commit refactors the way the training-operator is deployed: instead of using a sidecar container that runs the workload, the charm now applies the Deployment and all the Kubernetes resources that the training-operator controller requires to manage training resources. This change prepares for the upcoming 1.8 version, which introduces a hard dependency on a Kubernetes Secret mounted in a volume for the training-operator workload to start. For more details, please refer to #159.
Build charmed-kubeflow-chisme for requirements-integration.txt. Part of charmed-kubeflow-chisme#104
…173) * refactor: apply a workload Service instead of using juju created one
  To avoid inconsistent behaviour, it is preferable to apply and use a Service owned by the charm, so that it can be rendered as needed by the controller.
This commit introduces the following changes:
* The charm now renders and applies a ValidatingWebhookConfiguration resource for training-operator CRDs.
* The charm will render the Service to also serve on port 9443 for the webhook service.
* The oci-image is updated to v1.8 of the training-operator.
* The training-operator Deployment now has a volume mount for mounting the secret that is used by the cert-controller to generate and rotate certificates for the ValidatingWebhookConfiguration.
* The training-operator Deployment will now take an argument so the webhook service can use the training-operator workload's Service instead of the default.
* Updates the examples directory with examples from kubeflow/training-operator v1.8-branch.
Fixes #159
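The Service and volume-mount changes listed above can be pictured with a minimal sketch. It is not the charm's actual code or manifests: the function names, the `metrics` port and its 8080 default, the `app` selector label, and the volume name `cert` are all assumptions made for illustration. Only port 9443 and the idea of mounting a Secret into the Deployment come from the commit message.

```python
# Illustrative sketch only; names and defaults are assumptions, not the charm's manifests.
WEBHOOK_PORT = 9443  # the webhook port named in the commit message


def render_service(name: str, namespace: str, metrics_port: int = 8080) -> dict:
    """Build a Service manifest that also serves the webhook port."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "selector": {"app": name},  # assumed label
            "ports": [
                {"name": "metrics", "port": metrics_port, "targetPort": metrics_port},
                {"name": "webhook", "port": WEBHOOK_PORT, "targetPort": WEBHOOK_PORT},
            ],
        },
    }


def render_cert_volume(secret_name: str, mount_path: str) -> tuple:
    """Build the (volume, volumeMount) pair that exposes a Secret to the workload."""
    volume = {"name": "cert", "secret": {"secretName": secret_name}}
    mount = {"name": "cert", "mountPath": mount_path, "readOnly": True}
    return volume, mount
```

The volume goes under the pod spec's `volumes` and the mount under the container's `volumeMounts`; the two are tied together by sharing the same `name`.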
350b34e to 8381b07
* feat: relate to dashboard and add documentation link CKF 1.9
NohaIhab approved these changes Jul 9, 2024
LGTM
I have tested the upgrade path we'll have to go with, as described in #170, by:
- deploying training-operator 1.7/stable
- waiting until it is active, then removing the charm
- redeploying with the channel from this PR
- running a training job from the /examples directory
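The steps above can be sketched as a list of Juju CLI invocations. This is an assumption about how the test might be reproduced, not the reviewer's actual commands: the `--trust` flag and the `new_channel` parameter are hypothetical, and the channel built from this PR is not stated in the thread. The sketch only builds the commands; it does not run them.

```python
import shlex

APP = "training-operator"


def upgrade_commands(new_channel: str) -> list:
    """Return the remove-and-redeploy sequence as argument lists."""
    return [
        # Step 1: deploy the old release (--trust is an assumption).
        ["juju", "deploy", APP, "--channel", "1.7/stable", "--trust"],
        # Step 2: once the unit is active (check with `juju status`), remove it.
        ["juju", "remove-application", APP],
        # Step 3: redeploy from the channel under test (hypothetical value).
        ["juju", "deploy", APP, "--channel", new_channel, "--trust"],
        # Step 4, running a training job from /examples, is left out because
        # the thread does not say which example was used.
    ]


if __name__ == "__main__":
    for cmd in upgrade_commands("<channel-from-this-pr>"):
        print(shlex.join(cmd))
```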
This PR merges the changes in the KF-5692-1.8-dev-branch into main.
training-operator cannot be upgraded from 1.7/stable to recent version #170 (tests: skip test_upgrade due to #170 #171)
Fixes #159