W&B Controller Manager mandatory access to W&B endpoints #20

Open
vijay-wandb opened this issue Sep 3, 2024 · 1 comment

@vijay-wandb

Description

@platform-delivery-tooling-support folks, here are the steps to reproduce an outstanding issue with the operator in air-gapped environments.
In short, if the Controller Manager can’t reach the W&B endpoint, the entire stack fails to deploy.
This is a blocker.

  1. Deploy the operator using the Helm chart:

helm upgrade --install -n wandb operator wandb/operator

  2. Cut the cluster nodes' access to the internet (this step was used only for the repro).

  3. Apply the WeightsAndBiases custom resource (CR):

kubectl apply -f wandb.yaml

At this point, nothing happens. Not a single container is created.

  4. Check the Controller Manager logs:
{"level":"dpanic","ts":"2024-08-07T13:33:40Z","msg":"non-string key argument passed to logging, ignoring all later arguments","controller":"weightsandbiases","controllerGroup":"apps.wandb.com","controllerKind":"WeightsAndBiases","WeightsAndBiases":{"name":"wandb","namespace":"default"},"namespace":"default","name":"wandb","reconcileID":"08ea8f72-2387-4b9b-b445-0afc2b2caa58","invalid key":"Secret \"wandb-latest-cached-release\" not found","stacktrace":"github.com/wandb/operator/controllers.(*WeightsAndBiasesReconciler).Reconcile\n\t/workspace/controllers/weightsandbiases_controller.go:135\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.1/pkg/internal/controller/controller.go:122\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.1/pkg/internal/controller/controller.go:323\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.1/pkg/internal/controller/controller.go:274\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.1/pkg/internal/controller/controller.go:235"}
{"level":"error","ts":"2024-08-07T13:33:40Z","msg":"No cached release found for deployer spec","controller":"weightsandbiases","controllerGroup":"apps.wandb.com","controllerKind":"WeightsAndBiases","WeightsAndBiases":{"name":"wandb","namespace":"default"},"namespace":"default","name":"wandb","reconcileID":"08ea8f72-2387-4b9b-b445-0afc2b2caa58","error":"Secret \"wandb-latest-cached-release\" not...
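
To make the log check in the last step easier to repeat, here is a minimal sketch of a filter that keeps only the log lines mentioning the missing Secret. The function name `find_missing_release_errors` is hypothetical and is not part of the operator or its chart. The `kubectl` invocations in the comments are also assumptions: the Controller Manager deployment name is left as a placeholder, and the Secret is assumed to live in the namespace of the custom resource (`default` in the logs above).

```shell
#!/bin/sh
# Hypothetical helper: read Controller Manager JSON log lines on stdin and
# print only those that mention the cached-release Secret being not found.
# Exit status is 0 when at least one such line is present, non-zero otherwise.
find_missing_release_errors() {
  grep -F 'wandb-latest-cached-release' | grep -F 'not found'
}

# Possible usage against a live cluster (assumptions, adjust to your setup):
#   kubectl logs -n wandb deploy/<controller-manager-deployment> | find_missing_release_errors
#   kubectl get secret wandb-latest-cached-release -n default
#   kubectl get pods -n default
```

If the filter prints matching lines and the `kubectl get secret` call reports that the Secret does not exist, that would be consistent with the behaviour described in this issue: the operator has no cached release to fall back on when it cannot reach the W&B endpoint.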

Issue created in Slack from a [message](https://weightsandbiases.slack.com/archives/C06CKRTPKDF/p1723039059602709?thread_ts=1723039059.602709&cid=C06CKRTPKDF).
@abhinavg6

@flamarion - As discussed in the meeting, could you please share the exact repro steps here, including the exact commands and values.yaml?
