Bulk endpoint failing with 503 #474

Closed
pdonorio opened this issue Feb 4, 2025 · 5 comments
Labels
waiting for feedback Indicates LaunchDarkly is waiting for customer feedback before issue is closed due to staleness.

Comments

@pdonorio

pdonorio commented Feb 4, 2025

Is this a support request?
It seems more of a bug or unclear behavior

Describe the bug
Our relay, installed via the latest Helm chart, constantly logs lines like this in debug mode:
DEBUG: Request: method=POST url=/bulk auth=*321cf status=503 bytes=61

But our apps can reach it, and by following https://support.launchdarkly.com/hc/en-us/articles/18239626123547-Troubleshooting-SDK-connection-to-Relay we were able to get flags correctly from the related pods.

To reproduce
Run ld-relay with no changes to default values in helm chart

Expected behavior
Not seeing errors, or at least understanding what this error means, as there is no explanation anywhere in this repo or in the docs.

Logs
Just that single line above, many times per minute

Relay Version(s)
8.10.5

SDK Names and Version(s)
Go, but curl works as well

OS/platform
Kubernetes, pod runs your docker image.
Our apps run on latest ubuntu.

Additional context
Looking at https://github.com/launchdarkly/ld-relay/blob/v8/docs/endpoints.md, /bulk says "Receives analytics events from SDKs".
What exactly does this mean?
Is it what allows https://docs.launchdarkly.com/home/observability/live-events to work?
I tried setting the "USE_EVENTS" env variable to false but nothing changed.

@keelerm84
Member

Hi there 👋🏼 , happy to help with this issue.

> DEBUG: Request: method=POST url=/bulk auth=*321cf status=503 bytes=61

The relay proxy has middleware that logs a line each time one of its own endpoints is accessed. So in this case, some external SDK is hitting the relay proxy's /bulk endpoint and receiving a 503 as a result.

The most likely reason the relay proxy would return a 503 for the events endpoint is that the specific environment hasn't been configured. The auth=*321cf points to the environment being accessed. Some SDK is configured using that SDK key. Is the same key configured in the relay?
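
One rough way to check which configured key a log line refers to is to compare the masked suffix against the keys you expect. This is a minimal Python sketch, not part of the relay. It assumes the log format matches the DEBUG line quoted in this thread and that the value after `auth=*` is the trailing characters of the SDK key. The function names and the example keys are hypothetical.

```python
import re

# Assumption: the log line looks like the DEBUG line quoted in this thread,
# and the value after "auth=*" is the trailing characters of the SDK key.
AUTH_PATTERN = re.compile(r"\bauth=\*(\w+)")


def masked_suffix(log_line):
    """Return the masked key suffix from a request log line, or None if absent."""
    match = AUTH_PATTERN.search(log_line)
    return match.group(1) if match else None


def keys_matching_log(log_line, configured_keys):
    """Return the names of configured environments whose key ends with the logged suffix."""
    suffix = masked_suffix(log_line)
    if suffix is None:
        return []
    return [name for name, key in configured_keys.items() if key.endswith(suffix)]


if __name__ == "__main__":
    line = "DEBUG: Request: method=POST url=/bulk auth=*321cf status=503 bytes=61"
    # Hypothetical keys, for illustration only.
    keys = {"staging": "sdk-example-321cf", "production": "sdk-example-9f9f9"}
    print(keys_matching_log(line, keys))
```

An empty result would suggest that the SDK is using a key the relay was not configured with.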

> Looking at https://github.com/launchdarkly/ld-relay/blob/v8/docs/endpoints.md, /bulk says "Receives analytics events from SDKs".
> What exactly does this mean?

You can learn more about analytics events and how they are used here.

> I tried setting the "USE_EVENTS" env variable to false but nothing changed.

Setting USE_EVENTS to false stops the relay proxy from forwarding events to LaunchDarkly. The log message still occurs because an SDK is attempting to send events to this relay instance.

Please let me know if you are continuing to experience this issue.

@keelerm84 keelerm84 added the waiting for feedback Indicates LaunchDarkly is waiting for customer feedback before issue is closed due to staleness. label Feb 5, 2025
@pdonorio
Author

pdonorio commented Feb 5, 2025

> The most likely reason the relay proxy would return a 503 for the events endpoint is that the specific environment hasn't been configured. The auth=*321cf points to the environment being accessed. Some SDK is configured using that SDK key. Is the same key configured in the relay?

Yes, the key is the same for the SDK and the relay 🤔
This now also happens in a second environment.

> You can learn more about analytics events and how they are used here.

Oh so this would explain why we do not see "evaluations" in the app UI when enabling relay?

> Please let me know if you are continuing to experience this issue.

We do :/ Are there supposed to be other logs that show the endpoints being called for flag evaluations? In debug we only see "status" (200) and "bulk" (503) calls.

@keelerm84
Member

> Yes, the key is the same for the SDK and the relay 🤔
> This now also happens in a second environment.

Can you provide your relay config (with any key information redacted, of course) and examples of how you're configuring the SDKs themselves (again with any SDK keys redacted)?

> Oh so this would explain why we do not see "evaluations" in the app UI when enabling relay?

Yes, that's right. Those events are how LD services receive feedback about the SDK evaluation usage.

> Are there supposed to be other logs that show the endpoints being called for flag evaluations? In debug we only see "status" (200) and "bulk" (503) calls.

The individual SDKs have logging configurations which should include some information about what they are sending, and when they are failing. I would suggest looking at those.

Also, if you try to hit the /bulk endpoint directly with a curl command, you should be able to read any error response directly from the relay. Something like

curl -i -X POST -d '{}' -H 'Content-Type: application/json' -H "Authorization:YOUR_KEY_HERE" http://your-relay:port/bulk

That may provide some additional information as well.
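
For scripted checks, the same request can be sketched in Python using only the standard library. This is a minimal sketch under assumptions: the relay URL and key are placeholders you must substitute, the helper names are hypothetical, and the body and headers simply mirror the curl command above.

```python
import urllib.error
import urllib.request


def build_bulk_request(relay_url, sdk_key):
    """Build the same POST that the curl command sends to the relay's /bulk endpoint."""
    return urllib.request.Request(
        relay_url.rstrip("/") + "/bulk",
        data=b"{}",  # same placeholder body as the curl example
        headers={"Content-Type": "application/json", "Authorization": sdk_key},
        method="POST",
    )


def probe_bulk(relay_url, sdk_key, timeout=5.0):
    """Send the request and return (status, body), including for error statuses."""
    request = build_bulk_request(relay_url, sdk_key)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the relay's explanation is in the error body.
        return err.code, err.read().decode("utf-8", "replace")
```

Calling `probe_bulk` with your relay's base URL and the SDK key returns the status code and the response body, so the relay's own error message can be read directly.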

@pdonorio
Author

pdonorio commented Feb 6, 2025

> Oh so this would explain why we do not see "evaluations" in the app UI when enabling relay?

> Yes, that's right. Those events are how LD services receive feedback about the SDK evaluation usage.

Ok, finally starting to make sense.

> Also, if you try to hit the /bulk endpoint directly with a curl command, you should be able to read any error response directly from the relay. Something like
>
> curl -i -X POST -d '{}' -H 'Content-Type: application/json' -H "Authorization:YOUR_KEY_HERE" http://your-relay:port/bulk
>
> That may provide some additional information as well.

So I was able to quickly run this and got:

HTTP/1.1 503 Service Unavailable
Date: Thu, 06 Feb 2025 17:14:27 GMT
Content-Length: 61
Content-Type: text/plain; charset=utf-8

{"message":"Event proxy is not enabled for this environment"}

Then I set USE_EVENTS to true and now it's 200!
When I tested earlier without this variable, I might also have used a wrong key.

I will check whether everything now works correctly in both environments.
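
For anyone scripting this check, the response above can be turned into a hint. This is a minimal Python sketch based only on the response seen in this thread. The function name and the hint strings are hypothetical, and other causes of a 503 may exist that this does not cover.

```python
import json

# Message copied from the 503 response body shown in this thread.
EVENTS_DISABLED_MESSAGE = "Event proxy is not enabled for this environment"


def explain_bulk_response(status, body):
    """Map a /bulk response (status code and body text) to a short hint."""
    if 200 <= status < 300:
        return "ok: relay accepted the events request"
    try:
        message = json.loads(body).get("message", "")
    except (ValueError, AttributeError):
        # Body was not a JSON object; fall back to the raw text.
        message = body.strip()
    if status == 503 and message == EVENTS_DISABLED_MESSAGE:
        # In this thread, setting USE_EVENTS to true resolved this case.
        return "events disabled: set USE_EVENTS=true on the relay"
    return f"unexpected: HTTP {status}: {message}"
```

The function takes the status code and body text from any HTTP client, so it works equally well on the output of a curl call.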

@pdonorio
Author

pdonorio commented Feb 7, 2025

I think we can close here, Relay seems to be working on both environments 🎊

Just a thought: it could be useful to add to the docs that, to debug, you can call the failing endpoint directly to get more info.

@pdonorio pdonorio closed this as completed Feb 7, 2025