flux bootstrap failing | context deadline exceeded #3974

Closed
1 task done
IamEarlJohn opened this issue Jun 14, 2023 · 19 comments

Comments

@IamEarlJohn

IamEarlJohn commented Jun 14, 2023

Describe the bug

Whenever I run flux bootstrap github, it always ends with: client rate limiter Wait returned an error: context deadline exceeded

Steps to reproduce

  1. Install flux
  2. Bootstrap github controller

Expected behavior

Flux should bootstrap against GitHub properly and with no issues, as in the steps provided on fluxcd.io.

Screenshots and recordings

:~$ flux bootstrap github --owner=$GITHUB_USER --repository=fleet-infra --branch=main --path=./clusters/my-cluster --personal --timeout 10m
► connecting to github.com
► cloning branch "main" from Git repository "https://github.com/IamEarlJohn/fleet-infra.git"
✔ cloned repository
► generating component manifests
# Warning: 'patchesJson6902' is deprecated. Please use 'patches' instead. Run 'kustomize edit fix' to update your Kustomization automatically.
✔ generated component manifests
✔ committed sync manifests to "main" ("b871f2aa54076cf6c129f383cecc8c8c182bc527")
► pushing component manifests to "https://github.com/IamEarlJohn/fleet-infra.git"
► installing components in "flux-system" namespace
✔ installed components
✔ reconciled components
► determining if source secret "flux-system/flux-system" exists
✔ source secret up to date
► generating sync manifests
✔ generated sync manifests
✔ sync manifests are up to date
► applying sync manifests
✔ reconciled sync configuration
◎ waiting for Kustomization "flux-system/flux-system" to be reconciled
✗ client rate limiter Wait returned an error: context deadline exceeded
► confirming components are healthy
✔ helm-controller: deployment ready
✔ kustomize-controller: deployment ready
✔ notification-controller: deployment ready
✔ source-controller: deployment ready
✔ all components are healthy
✗ bootstrap failed with 1 health check failure(s)


:~$ kubectl -n flux-system get pods
NAME                                      READY   STATUS    RESTARTS        AGE
helm-controller-c8466f78b-rthtf           1/1     Running   1 (3h31m ago)   6d22h
kustomize-controller-666f8f4b5f-588n2     1/1     Running   1 (3h31m ago)   6d22h
notification-controller-55d78c78c-9wqcn   1/1     Running   1 (3h31m ago)   6d22h
source-controller-557989894-mphsv         1/1     Running   1 (3h31m ago)   6d22h

OS / Distro

Ubuntu 20.04

Flux version

flux version 2.0.0-rc.5

Flux check

:~$ flux check
► checking prerequisites
✔ Kubernetes 1.26.3 >=1.20.6-0
► checking controllers
✔ helm-controller: deployment ready
► ghcr.io/fluxcd/helm-controller:v0.34.1
✔ kustomize-controller: deployment ready
► ghcr.io/fluxcd/kustomize-controller:v1.0.0-rc.4
✔ notification-controller: deployment ready
► ghcr.io/fluxcd/notification-controller:v1.0.0-rc.4
✔ source-controller: deployment ready
► ghcr.io/fluxcd/source-controller:v1.0.0-rc.5
► checking crds
✔ alerts.notification.toolkit.fluxcd.io/v1beta2
✔ buckets.source.toolkit.fluxcd.io/v1beta2
✔ gitrepositories.source.toolkit.fluxcd.io/v1
✔ helmcharts.source.toolkit.fluxcd.io/v1beta2
✔ helmreleases.helm.toolkit.fluxcd.io/v2beta1
✔ helmrepositories.source.toolkit.fluxcd.io/v1beta2
✔ kustomizations.kustomize.toolkit.fluxcd.io/v1
✔ ocirepositories.source.toolkit.fluxcd.io/v1beta2
✔ providers.notification.toolkit.fluxcd.io/v1beta2
✔ receivers.notification.toolkit.fluxcd.io/v1
✔ all checks passed

Git provider

Github (Personal account)

Container Registry provider

No response

Additional context

:~$ flux logs
2023-06-14T01:31:52.220Z info Kustomization/flux-system.flux-system - Source is not ready, artifact not found
2023-06-14T01:41:52.219Z info Kustomization/flux-system.flux-system - Source is not ready, artifact not found
2023-06-14T01:51:52.213Z info Kustomization/flux-system.flux-system - Source is not ready, artifact not found
2023-06-14T02:01:52.212Z info Kustomization/flux-system.flux-system - Source is not ready, artifact not found
2023-06-14T02:11:52.213Z info Kustomization/flux-system.flux-system - Source is not ready, artifact not found
2023-06-14T02:21:52.213Z info Kustomization/flux-system.flux-system - Source is not ready, artifact not found
2023-06-14T02:31:52.207Z info Kustomization/flux-system.flux-system - Source is not ready, artifact not found

Code of Conduct

  • I agree to follow this project's Code of Conduct
@somtochiama
Member

what does flux get source git flux-system say?

@IamEarlJohn
Author

what does flux get source git flux-system say?

This is the output:

:~$ flux get source git flux-system
NAME         REVISION  SUSPENDED  READY  MESSAGE
flux-system            False      False  failed to checkout and determine revision: unable to clone 'ssh://git@github.com/IamEarlJohn/fleet-infra': ssh: handshake failed: knownhosts: key is unknown

@IamEarlJohn
Author

image

@somtochiama
Member

► determining if source secret "flux-system/flux-system" exists
✔ source secret up to date

It seems you had the source secret on the cluster already. Can you try deleting it and running the same bootstrap command again so that it recreates it?

kubectl delete -n flux-system secret flux-system
flux bootstrap ....

@IamEarlJohn
Author

► determining if source secret "flux-system/flux-system" exists
✔ source secret up to date

It seems you had the source secret on the cluster already. Can you try deleting it and running the same bootstrap command again so that it recreates it?

kubectl delete -n flux-system secret flux-system
flux bootstrap ....

I deleted the flux-system secret and re-ran the flux bootstrap command. It now works, as the screenshot below shows:

image

@somtochiama
Member

@IamEarlJohn that's great. I will close this issue now.

@IamEarlJohn
Author

IamEarlJohn commented Jun 14, 2023

Sorry, but I still need some help. Please don't close it yet.

@somtochiama somtochiama reopened this Jun 14, 2023
@IamEarlJohn
Author

IamEarlJohn commented Jun 14, 2023

On the other hand, our private repo only supports SSH cloning, and there I'm currently encountering this:

~ # flux get source git flux-system
NAME         REVISION  SUSPENDED  READY  MESSAGE
flux-system            False      False  failed to checkout and determine revision: unable to clone 'ssh://git@ssh.com:7999/earl/blue-green-status-check.git': dial tcp 3.19.108.9:7999: connect: connection timed out

@IamEarlJohn
Author

IamEarlJohn commented Jun 14, 2023

This is the command I ran:

flux bootstrap git --url=ssh://git@ssh.com:7999/earl/blue-green-status-check.git --branch=main --path=clusters/my-cluster

@somtochiama
Member

You have to give it your SSH key:

# Run bootstrap for a Git repository with a passwordless private key
  flux bootstrap git --url=ssh://git@example.com/repository.git --private-key-file=<path/to/private.key> --path=clusters/my-cluster

Add the --password flag if the private key is password protected.

@IamEarlJohn
Author

I did try providing the private key, but I still got the 'context deadline exceeded' error:

~ # flux get source git flux-system
NAME         REVISION  SUSPENDED  READY  MESSAGE
flux-system            False      False  failed to checkout and determine revision: unable to clone 'ssh://git@oxfordssh.com:7999/earl/blue-green-status-check.git': dial tcp 3.19.108.9:7999: connect: connection timed out


◎ waiting for Kustomization "flux-system/flux-system" to be reconciled
✗ client rate limiter Wait returned an error: context deadline exceeded
► confirming components are healthy
✔ helm-controller: deployment ready
✔ kustomize-controller: deployment ready
✔ notification-controller: deployment ready
✔ source-controller: deployment ready
✔ all components are healthy
✗ bootstrap failed with 1 health check failure(s)

@somtochiama
Member

There is a network connectivity issue: the source-controller is unable to reach the Git server. Please make sure that egress from the cluster to the Git server is working.
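
One way to check egress is to probe the Git server's SSH port from a throwaway pod inside the cluster. The following is a minimal sketch, not an official Flux procedure: the helper names git_endpoint and probe_from_cluster, the pod name egress-test, and the busybox image with its nc flags are all assumptions for illustration.

```shell
# Print "host port" for an ssh:// Git URL; the port defaults to 22.
git_endpoint() {
  hostport="${1#*://}"        # drop the scheme (ssh://)
  hostport="${hostport%%/*}"  # drop the repository path
  hostport="${hostport#*@}"   # drop the user (git@), if present
  host="${hostport%%:*}"
  port=22
  if [ "$hostport" != "$host" ]; then
    port="${hostport##*:}"
  fi
  printf '%s %s\n' "$host" "$port"
}

# Hypothetical helper: open a TCP connection to the Git server from a
# temporary pod in the flux-system namespace, then delete the pod.
# Assumes the busybox image is pullable and its nc supports -z, -v and -w.
probe_from_cluster() {
  set -- $(git_endpoint "$1")
  kubectl -n flux-system run egress-test --rm -i --restart=Never \
    --image=busybox -- nc -z -v -w 5 "$1" "$2"
}
```

For example, probe_from_cluster 'ssh://git@ssh.com:7999/earl/blue-green-status-check.git' would test ssh.com on port 7999. If the probe also times out, matching the 'connection timed out' message above, that would point to a firewall, security group, or network policy blocking outbound traffic rather than to the Flux configuration.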

@IamEarlJohn
Author

@somtochiama hi, whenever I run flux uninstall and then flux check, I encounter this weird error:

~$ flux check
► checking prerequisites
✔ Kubernetes 1.27.1 >=1.20.6-0
► checking controllers
✗ no controllers found in the 'flux-system' namespace with the label selector 'app.kubernetes.io/part-of=flux'
► checking crds
✗ no crds found with the label selector 'app.kubernetes.io/part-of=flux'
✗ check failed

Can you please explain what happened, or how it happened?

@hiddeco
Member

hiddeco commented Jun 29, 2023

When you run flux uninstall, all Flux resources are removed from the cluster, which yields that error because they can no longer be found. To just check on the version of the CLI itself without consulting Kubernetes, run flux version --client.

@IamEarlJohn
Author

Also, going back to the 'context deadline exceeded' error:

~ # flux bootstrap github --owner=$GITHUB_USER --repository=fleet-infra --branch=main --path=./clusters/my-cluster --personal
► connecting to github.com
► cloning branch "main" from Git repository "https://github.com/IamEarlJohn/fleet-infra.git"
✔ cloned repository
► generating component manifests

Warning: 'patchesJson6902' is deprecated. Please use 'patches' instead. Run 'kustomize edit fix' to update your Kustomization automatically.

✔ generated component manifests
✔ component manifests are up to date
► installing components in "flux-system" namespace
✔ installed components
✔ reconciled components
► determining if source secret "flux-system/flux-system" exists
► generating source secret
✔ public key: ************
✔ configured deploy key "flux-system-main-flux-system-./clusters/my-cluster" for "https://github.com/IamEarlJohn/fleet-infra"
► applying source secret "flux-system/flux-system"
✔ reconciled source secret
► generating sync manifests
✔ generated sync manifests
✔ sync manifests are up to date
► applying sync manifests
✔ reconciled sync configuration
◎ waiting for Kustomization "flux-system/flux-system" to be reconciled
✗ client rate limiter Wait returned an error: context deadline exceeded
► confirming components are healthy
✔ helm-controller: deployment ready
✔ kustomize-controller: deployment ready
✔ notification-controller: deployment ready
✔ source-controller: deployment ready
✔ all components are healthy
✗ bootstrap failed with 1 health check failure(s)


cchcchdevtstax4@dma.hcat-dev.us-east-1 ~ # flux get source git flux-system
NAME         REVISION  SUSPENDED  READY  MESSAGE
flux-system            False      False  failed to checkout and determine revision: unable to clone 'ssh://git@github.com/IamEarlJohn/fleet-infra': dial tcp 140.82.112.4:22: connect: connection timed out


My question is: even for GitHub, egress must be configured on the cluster in order to connect, right?

@IamEarlJohn
Author

Although I'm using HTTPS with GitHub, the output of flux get source git flux-system shows an SSH URL.
Is that expected?
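
The SSH URL is consistent with the bootstrap log above, which reports a configured deploy key. As far as I know, flux bootstrap github clones over HTTPS from the workstation but points the in-cluster GitRepository at an SSH URL with a deploy key unless told otherwise. A minimal sketch for confirming which transport the cluster uses follows. url_scheme and source_url are hypothetical helpers, and the --token-auth flag should be verified against flux bootstrap github --help for your CLI version.

```shell
# Print the scheme (ssh, https, ...) of a Git URL.
url_scheme() {
  printf '%s\n' "${1%%://*}"
}

# Hypothetical helper: read the URL that source-controller actually uses.
source_url() {
  kubectl -n flux-system get gitrepository flux-system \
    -o jsonpath='{.spec.url}'
}

# Example (requires cluster access):
#   url_scheme "$(source_url)"
#
# To keep the in-cluster source on HTTPS (port 443) instead of SSH (port 22),
# bootstrap can be re-run with token authentication:
#   flux bootstrap github --owner=$GITHUB_USER --repository=fleet-infra \
#     --branch=main --path=./clusters/my-cluster --personal --token-auth
```

With --token-auth the personal access token is stored in the cluster instead of a deploy key, so it is a trade-off. It only helps when outbound port 443 is reachable from the cluster and port 22 is not.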

@IamEarlJohn
Author

When you run flux uninstall, all Flux resources are removed from the cluster, which yields that error because they can no longer be found. To just check on the version of the CLI itself without consulting Kubernetes, run flux version --client.

@hiddeco is there any solution or workaround for this?

@sveinpj

sveinpj commented Aug 11, 2023

I had this issue for some days with no luck resolving it. I must have tried a dozen times with different parameters, until I narrowed it down by cordoning the node where the Flux-related pods were located. I did a 'flux uninstall' and 'flux bootstrap...' again and voilà :-). The cordoned node was running Ubuntu LTS 22.04, while all the others were running Ubuntu (normal) 23.04. I upgraded the cordoned node to the same version as the other nodes and it seems to work now. I also checked that all nodes now have the same DNS config (/etc/netplan/xyz.yaml). Now the Flux pods are balanced between the nodes and 'k get ks -A' shows that everything is fine :-)

@Mihai-CMM

In my similar case, the issue was that I was not bootstrapping with the --cluster-domain argument, and my environment's cluster domain is not cluster.local.
It's worth checking whether you have the same situation.
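
A quick way to check for this is to read the DNS search list that Kubernetes injects into a pod. The sketch below is only an illustration under stated assumptions: cluster_domain_from_resolv and detect_cluster_domain are hypothetical helpers, and the busybox image and the pod name dns-test are placeholders.

```shell
# Read resolv.conf content on stdin and print the cluster domain, taken
# from the "svc.<domain>" entry of the search list.
cluster_domain_from_resolv() {
  awk '$1 == "search" {
    for (i = 2; i <= NF; i++) {
      if ($i ~ /^svc\./) { sub(/^svc\./, "", $i); print $i; exit }
    }
  }'
}

# Hypothetical helper: fetch /etc/resolv.conf from a temporary pod.
detect_cluster_domain() {
  kubectl run dns-test --rm -i --restart=Never --image=busybox \
    -- cat /etc/resolv.conf | cluster_domain_from_resolv
}

# Example (requires cluster access); pass the result to bootstrap when it
# is not cluster.local:
#   flux bootstrap github ... --cluster-domain="$(detect_cluster_domain)"
```

If the printed domain is cluster.local, the default applies and this is probably not the cause.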
