Metric validation_gateway_solo_io_valid_config seems to get stuck sometimes. #8427

Open
Stefan-Balta opened this issue Jun 29, 2023 · 3 comments
Labels
stale (Issues that are stale. These will not be prioritized without further engagement on the issue.)
Type: Bug (Something isn't working)

Stefan-Balta commented Jun 29, 2023

Gloo Edge Version

1.13.x

Kubernetes Version

1.23.x

Describe the bug

When creating or deleting VirtualServices, Upstreams, UpstreamGroups and Services in batch, logs of the gloo pod will contain the following:


{"level":"warn","ts":"2023-06-29T08:39:28.234Z","logger":"gloo.v1.event_loop.setup.gloosnapshot.event_loop.envoyTranslatorSyncer","caller":"syncer/envoy_translator_syncer.go:146","msg":"Proxy had invalid config after xds sanitization","version":"1.13.20","proxy":"name:\"gateway-proxy\"  namespace:\"gloo-system\"","error":"3 errors occurred: 	
        * invalid resource gloo-system.gateway-proxy 	
        * upstream group not found, (Name: demo12, Namespace: pes) 	
        * WARN:    [Route Warning: InvalidDestinationWarning. Reason: *v1.UpstreamGroup { pes.demo12 } not found Route Warning: InvalidDestinationWarning. Reason: *v1.UpstreamGroup { pes.demo12 } not found]  "}

{"level":"warn","ts":"2023-06-29T08:43:42.181Z","logger":"gloo.v1.event_loop.setup.gloosnapshot.event_loop.envoyTranslatorSyncer","caller":"syncer/envoy_translator_syncer.go:146","msg":"Proxy had invalid config after xds sanitization","version":"1.13.20","proxy":"name:\"private-gateway-proxy\"  namespace:\"gloo-system\"","error":"2 errors occurred:
	* invalid resource pes.demo12
	* destination # 1: upstream not found: list did not find upstream pes.demo12-9898
"}
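For anyone triaging these warnings in bulk, here is a minimal sketch of pulling the individual error items out of one such log line. The function name `split_errors` is hypothetical, and the sample is an abbreviated copy of the second entry above. It assumes the entries are JSON objects whose "error" field lists one item per line, each prefixed with "* ".

```python
import json


def split_errors(log_line: str) -> list[str]:
    """Return the individual error items from one gloo warning log entry."""
    # strict=False tolerates raw newlines and tabs inside the "error" string,
    # which is how the entries look when pasted from a terminal.
    record = json.loads(log_line, strict=False)
    items = []
    for line in record.get("error", "").splitlines():
        line = line.strip()
        if line.startswith("* "):
            items.append(line[2:].strip())
    return items


# Abbreviated copy of the second log entry (fields not needed here are dropped).
sample = (
    '{"level":"warn","msg":"Proxy had invalid config after xds sanitization",'
    '"error":"2 errors occurred:\n\t* invalid resource pes.demo12\n'
    '\t* destination # 1: upstream not found: '
    'list did not find upstream pes.demo12-9898\n"}'
)

for item in split_errors(sample):
    print(item)
```

Splitting by line rather than on the `*` character avoids breaking on type names such as `*v1.UpstreamGroup` that appear inside the first entry.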

I believe this is caused by the order of creation and deletion: an Upstream may be deleted before the VirtualService that references it, or a VirtualService may be created before its Upstream.
The problem is that the metric validation_gateway_solo_io_valid_config may then get stuck at value 0.
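To check whether the metric is stuck, one option is to scrape the gloo pod's metrics output and read the value directly. The sketch below only parses Prometheus text-format output that has already been fetched; how it is fetched (port, path, port-forward) is left out because it depends on the installation. The function name `metric_values` and the sample text are hypothetical.

```python
def metric_values(exposition: str, name: str) -> list[float]:
    """Return all sample values for `name` in Prometheus text-format output."""
    values = []
    for line in exposition.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        if not line.startswith(name):
            continue
        tail = line[len(name):]
        if tail.startswith("{"):
            # Drop the label set; the value follows the closing brace.
            tail = tail[tail.rfind("}") + 1:]
        elif not tail.startswith(" "):
            # A different metric that merely shares the prefix.
            continue
        fields = tail.split()
        if fields:
            values.append(float(fields[0]))
    return values


# Hypothetical scrape output, for illustration only.
scrape = """\
# TYPE validation_gateway_solo_io_valid_config gauge
validation_gateway_solo_io_valid_config 0
validation_gateway_solo_io_valid_config_other 5
"""

values = metric_values(scrape, "validation_gateway_solo_io_valid_config")
print("stuck at 0" if values and all(v == 0 for v in values) else "ok")
```

Running this on a schedule after the batch apply/delete finishes would show whether the value ever returns to 1.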

This happens in v1.13.20 and v1.14.9, but not in v1.12.33 and v1.12.56.

glooctl check doesn't report any errors.

Steps to reproduce the bug

Create a Kubernetes manifest with the following resources, in the following order:

  1. Service
  2. Deployment
  3. Upstream
  4. UpstreamGroup
  5. VirtualService
  6. Additional VirtualService

Apply the manifest, then delete it. The behavior is intermittent: the metric may or may not get stuck at value 0.
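The ordering in the steps above can be illustrated with a toy model. This is not Gloo's translation logic; it only assumes that resources are deleted in the same order they appear in the manifest and shows which references dangle after each deletion. The resource names come from the logs where available (`demo12`, `demo12-9898`); `demo12-extra` and the reference layout are assumptions.

```python
# Each entry: (kind, name, [(referenced kind, referenced name), ...]).
# Assumed layout: both VirtualServices route to the UpstreamGroup,
# and the UpstreamGroup points at the Upstream.
MANIFEST = [
    ("Service", "demo12", []),
    ("Deployment", "demo12", []),
    ("Upstream", "demo12-9898", []),
    ("UpstreamGroup", "demo12", [("Upstream", "demo12-9898")]),
    ("VirtualService", "demo12", [("UpstreamGroup", "demo12")]),
    ("VirtualService", "demo12-extra", [("UpstreamGroup", "demo12")]),
]


def dangling_after_each_delete(manifest):
    """Delete resources in manifest order; list dangling references after each step."""
    live = {(kind, name): refs for kind, name, refs in manifest}
    report = []
    for kind, name, _ in manifest:
        del live[(kind, name)]
        broken = sorted(
            f"{k}/{n} -> {rk}/{rn}"
            for (k, n), refs in live.items()
            for rk, rn in refs
            if (rk, rn) not in live
        )
        report.append((f"{kind}/{name}", broken))
    return report


for deleted, broken in dangling_after_each_delete(MANIFEST):
    print(f"deleted {deleted}: {broken if broken else 'no dangling references'}")
```

In this model, the window between deleting the Upstream and deleting the last VirtualService always contains at least one dangling reference, which matches the "not found" warnings in the logs. The final state has none, so a metric that reflects the current state would be expected to return to 1.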

Expected Behavior

I expect the metric to stay at value 1.

Additional Context

No response

Stefan-Balta added the "Type: Bug" label Jun 29, 2023

DraganDjuricOB commented Oct 10, 2023

This issue is still present in both v1.13.27 and v1.14.21.

@DanijelaPet

The issue remains in v1.15.14.

This issue has been marked as stale because of no activity in the last 180 days. It will be closed in the next 180 days unless it is tagged "no stalebot" or other activity occurs.

github-actions bot added the "stale" label Jun 17, 2024