Randomly failing to update existing deployments, failing to create the csql container #622

Open
pocesar opened this issue Sep 25, 2024 · 2 comments
@pocesar

pocesar commented Sep 25, 2024

Expected Behavior

The operator should handle rolling updates without breaking, and it should (re)create the csql container on its own whenever the selector matches or a previous attempt to create it failed.

Actual Behavior

The update fails and the operator never recovers by itself.

Steps to Reproduce the Problem

  1. Create a Deployment with a needs-proxy: "1" label
  2. Create an AuthProxyWorkload manifest that matches kind: Deployment and selector.matchLabels."needs-proxy" = "1"
  3. The first kubectl apply usually works, but updating the Deployment fails about half of the time. The error below precedes the failure, and the workload never recovers by itself; the entire pod has to be deleted:
{
  "textPayload": "2024/09/25 20:04:39 http: TLS handshake error from 192.168.1.3:43058: EOF",
  "resource": {
    "type": "k8s_container",
    "labels": {
      "namespace_name": "cloud-sql-proxy-operator-system",
      "container_name": "manager",
      "pod_name": "cloud-sql-proxy-operator-controller-manager-..."
    }
  },
  "timestamp": "2024-09-25T20:04:39.822373239Z",
  "severity": "ERROR",
  "labels": {
    "k8s-pod/pod-template-hash": "6946569c9b",
    "k8s-pod/control-plane": "controller-manager"
  },
  "logName": "projects/.../logs/stderr",
  "receiveTimestamp": "2024-09-25T20:04:42.868056671Z"
}

It then proceeds to create the Deployment's own container, but because the SQL proxy is not listening on localhost, the newly created pod ends up in an infinite crash loop, since it requires the DB connection.
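
For reference, steps 1 and 2 correspond roughly to manifests like the ones below. This is a minimal sketch rather than the exact files from the affected cluster: the resource names, the image, the connection string, and the port are placeholders, and the AuthProxyWorkload field names assume the operator's cloudsql.cloud.google.com/v1 CRD.

```yaml
# Step 1: a Deployment that carries the needs-proxy label.
# Names and image are hypothetical placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  labels:
    needs-proxy: "1"        # the label the AuthProxyWorkload selects on
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: app
          image: example.com/my-app:latest   # placeholder image
---
# Step 2: an AuthProxyWorkload that selects Deployments by label.
# Field names assume the v1 CRD; check them against the installed version.
apiVersion: cloudsql.cloud.google.com/v1
kind: AuthProxyWorkload
metadata:
  name: my-app-proxy        # hypothetical name
spec:
  workloadSelector:
    kind: Deployment
    selector:
      matchLabels:
        needs-proxy: "1"
  instances:
    - connectionString: "<project>:<region>:<instance>"   # placeholder
      port: 5432            # assumed local port the app connects to
```

With manifests of this shape, the first kubectl apply is the one that usually succeeds; the failure shows up on later updates to the Deployment.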

Specifications

  • Version: 1.5.1
  • Platform: GKE

Side note: it's very hard to read the logs from this operator on GCP. Everything is written to stderr with ERROR severity, and the unstructured payloads are very confusing.

@hessjcg
Collaborator

hessjcg commented Jan 6, 2025

Hello @pocesar,

Can you try upgrading to Proxy Operator v1.6.1? This new version runs the proxy using the new Kubernetes-supported sidecar container feature. With it, the proxy container is guaranteed to be available before Kubernetes starts the main application container.
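
As a rough illustration of what a Kubernetes sidecar container looks like in a pod spec, here is a generic sketch. It is not the operator's actual output: the container names and images are placeholders. The relevant part is that a sidecar is declared as an init container with restartPolicy: Always, which is what makes Kubernetes start it before the main containers and keep it running alongside them.

```yaml
# Illustrative pod spec only; names and images are hypothetical placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: sidecar-example
spec:
  initContainers:
    - name: csql-proxy                              # hypothetical sidecar name
      image: example.com/cloud-sql-proxy:placeholder
      restartPolicy: Always                         # marks this init container as a sidecar
  containers:
    - name: app                                     # starts only after the sidecar has started
      image: example.com/my-app:latest
```

This requires a cluster version where the sidecar containers feature is available, so it is worth checking the GKE version before upgrading the operator.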

@pocesar
Author

pocesar commented Jan 7, 2025

@hessjcg I won't be able to update the version myself, so I won't be able to validate it 😞
