Intermittent NetworkError #378
Comments
Thanks for the issue @phillipadsmith . We are currently working on making the clients more resilient and pushed out some changes in Ruby and Node this week to address this problem. The client should be able to gracefully handle this by retrying on "open" timeouts. Apologies for the problems this is causing. We can prioritize fixing Python next. |
Are you seeing any other details in the response? I want to make 100% sure we're gonna solve the right problem. |
Hi @bhelx -- thanks for the quick reply. That's all of the information I'm getting back from the exception. If you have a suggestion for how I can get a more verbose error message, I'd be happy to try it out. |
Sure, let me know if you have any more info. We'll keep moving forward with the expected fix in the meantime. |
@phillipadsmith what's the purpose of |
If you're just trying to get the first / only account that matches, this will help you when it lands: #367 |
I was hoping it would shorten the response time to only request one record. And, I'm only searching for one record (and only one should ever be returned).
Ah-ha. Yes, that's what I'm trying to do. I'll keep an eye out for that release. |
Great, just checking. That's an appropriate way to use it. There has been a little confusion over what |
@bhelx What was the fix implemented in the Ruby and Node modules? Is there a similar tactic I could implement in my version of the module, or in the application itself, e.g., catching the exception and trying the request again some limited number of times? |
@bhelx And what is the root cause of the timeouts? Is it the Recurly API not responding? I've never experienced this timeout when using the API endpoints directly with Postman or CURL or similar. |
@phillipadsmith I can't say what the root cause is just yet. At scale though, you're gonna have some random TCP-level blips. Most of the time, it's something that can be recovered from. Some HTTP libraries handle this for you under the hood, and Postman may be doing that as well. We are trying our best to keep these clients dependency-free for security and maintainability reasons, so we are implementing some of this ourselves. |
Again sorry about this, I'll be working on implementing and testing this the next couple days. |
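For anyone who wants a stopgap in the application itself, the "catch the exception and try again a limited number of times" idea asked about above could be sketched roughly as follows. This is only an illustration and not the fix that went into the Ruby or Node clients: `call_with_retries`, its parameters, and the backoff values are all made up for this example, and `ConnectionError` is a stand-in for whatever exception your code actually sees (in this thread, `recurly.NetworkError`).

```python
import time


def call_with_retries(fn, retries=3, delay=0.5, exceptions=(ConnectionError,)):
    """Call fn(); on one of the listed exceptions, wait and try again.

    Hypothetical helper for illustration only. `retries` is the number of
    extra attempts after the first one, so fn runs at most retries + 1 times.
    """
    attempt = 0
    while True:
        try:
            return fn()
        except exceptions:
            if attempt >= retries:
                raise  # out of attempts: let the original error propagate
            attempt += 1
            time.sleep(delay * attempt)  # simple linear backoff
```

To use it, wrap the function that actually performs the request and pass `exceptions=(recurly.NetworkError,)`. It is probably only safe to wrap requests that can be repeated without side effects, such as lookups.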
@bhelx No need to apologize. I appreciate the quick replies and it's helpful just to hear that I'm not crazy and that this is a known issue. :) |
Couple of quick questions, @bhelx:
True enough, but I've only got this on a test server at the moment, so we're not running any traffic against this yet -- it's just me and one other person testing, and the exception happens quite often (almost too often, IMHO, to be a blip -- almost like there's another issue at play).
Yes, interestingly, I've never had this happen when running the app from Flask, and I've never seen it in Postman. And it never happens immediately after I start the app with Gunicorn, only after some time has passed, perhaps hours. Then it happens very frequently and very consistently until I restart the application. Is there anything about running this library in a pre-fork WSGI server that might be introducing this network error? |
This is something I personally see more on the order of 1 out of ~10,000 requests. If you are able to reproduce, could you try to print out some more information from the underlying socket error?
So your webserver is Since forking is copy-on-write, my gut would tell me that you want to create the connection after you fork. You could test this theory by trying to create the client inside of the |
@phillipadsmith I'm not sure if Flask has a special way to deal with this, but also consider using a "post-fork" hook of some kind: https://docs.pylonsproject.org/projects/pyramid_cookbook/en/latest/deployment/forked_threaded_servers.html?highlight=sqlalchemy Perhaps Flask has a way of letting you run some code after a fork happens. |
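One way to test the "create the connection after you fork" theory without relying on any server-specific hook is to build the client lazily and remember which process built it. The sketch below is an assumption-laden illustration, not something from the Recurly library: `get_client` and `factory` are hypothetical names, and `factory` would be whatever function constructs your `recurly.Client`.

```python
import os

# Module-level cache: the client and the PID of the process that created it.
_client = None
_client_pid = None


def get_client(factory):
    """Return a client owned by the current process, creating it if needed.

    If this module was imported before a fork, the child inherits the parent's
    cached client. Comparing PIDs lets the child notice that and build its own
    instead of reusing a connection opened in the parent.
    """
    global _client, _client_pid
    pid = os.getpid()
    if _client is None or _client_pid != pid:
        _client = factory()
        _client_pid = pid
    return _client
```

Request handlers would then call `get_client(...)` instead of using a client created at import time. This version is not thread-safe; with threaded workers it would need a lock around the check.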
Thanks for those thoughts, @bhelx, appreciated.
Oh, yes, definitely able to get it 8/10 times right now.
Is there an easy way to get more verbose output from the exception? My assumption was that it was coming from the Recurly Python client. Many thanks in advance! |
Oh, yeah that suggests a different problem entirely. Although I think the fix I'm working on might help. But I think we need to address the root problem.
A stack trace would be helpful. Let it explode. When I trigger one it looks like this:
|
👍Let me see if I can get a stack trace. |
I've imported |
@bhelx Hey, I'm not getting any output from Any idea what I've got wrong here? |
I'm not really familiar with its usage. What happens if you just remove the whole except block for network exceptions? That's what I did in my script. Python dumps the stack trace for you. |
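If removing the handler is not practical (for example, because the web framework swallows the error), another option is to format the traceback yourself inside the handler and then re-raise. The helper below is a minimal sketch using only the standard library; `format_full_traceback` is a name made up for this example. Whether the underlying socket error shows up depends on whether the library raised its own error while handling it, in which case Python keeps it as a chained exception and includes it in the output.

```python
import traceback


def format_full_traceback(exc):
    """Return the traceback for exc as text, including chained exceptions.

    traceback.format_exception follows __cause__ and __context__ by default,
    so a lower-level error that led to exc is printed as well.
    """
    lines = traceback.format_exception(type(exc), exc, exc.__traceback__)
    return "".join(lines)
```

Inside an existing `except` block you could write the result to your log or to stderr and then use a bare `raise` so the error still propagates.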
🤦♂ Didn't even think of that. |
@phillipadsmith did you look into my earlier comment about creating the client after fork? I'm reading some people recommend you use the
I'm not sure how to reproduce your problem so I can't test this assumption on your app. |
I will read up on those and give them a try also. I'm still trying to get a useful stack trace! Thanks @bhelx, appreciated. |
Thanks for your help, @bhelx. I've made an update to the app so that Gunicorn will receive a I haven't tested extensively, but I'm currently not able to trigger the error. I'll keep you updated. |
Hey @bhelx! Quick update: So the I'm now trying the approach of creating the client inside the I'll read up on the 🤞 |
Happy Saturday @bhelx, Just a quick update on this:
As you mention, this is not an ideal way to have something set up for a production application, and yet this is a very low-use application, so I believe we're going to try it out in production. I was never able to trigger a useful stack trace, in part because Flask doesn't surface its errors up to Gunicorn, and the network errors only happened when running the application using Gunicorn, so it was a bit of a rabbit hole trying to figure out how to work around that while also hoping for the error to surface again. All that said, I hope it's useful to know that there appears to be some kind of intermittent network error being thrown by the Recurly client in this rather common configuration of Flask + Gunicorn. Let me know if you have any questions. Now that we've got a workaround that I can deploy for this project, I'm happy to go back and re-test things in my development environment. Many thanks for your help & have a great weekend, |
@phillipadsmith I appreciate you digging so deeply into it. It does sound like there is some kind of issue with Gunicorn and the open connection. I'd like to keep this issue open until we fix it. When my team gets a free moment, we'll try to make a minimal working example and see if we can reproduce. If it does turn out that there is a change that people will need to make in their Gunicorn apps, then we'll document it. |
I encountered this same problem. In my case, the problem was using a single recurly.Client instance for multiple connections. Probably the recurly requests were stomping on each other. Using a recurly.Client instance per connection resolved the "Remote end closed connection without response" problems. |
@ridersofrohan I'm no longer at Recurly. I think the best person to contact would be @douglasmiller . Regarding this code, though: it doesn't use requests, so you can't just set a session anywhere. It uses the native http.client module. I suspect a problem here is that we are creating one connection per client:
It might make more sense to use a connection pool of some kind in the base client rather than a single connection like we do in the other clients. |
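To make the connection-pool idea a bit more concrete, here is a rough sketch of a fixed-size pool built on the standard library. It is not code from the Recurly client: `SimplePool`, `factory`, and `acquire` are hypothetical names, and `factory` stands for anything that creates a connection (for example, a function returning an `http.client.HTTPSConnection`).

```python
import queue
from contextlib import contextmanager


class SimplePool:
    """A tiny fixed-size pool; hypothetical, not part of the Recurly client."""

    def __init__(self, factory, size=4):
        # Create all items up front and keep them in a thread-safe queue.
        self._items = queue.LifoQueue(maxsize=size)
        for _ in range(size):
            self._items.put(factory())

    @contextmanager
    def acquire(self, timeout=None):
        """Borrow one item; raises queue.Empty if none frees up in time."""
        item = self._items.get(timeout=timeout)  # blocks while the pool is empty
        try:
            yield item
        finally:
            self._items.put(item)  # always hand the item back
```

Each request would run inside `with pool.acquire() as conn:`, so two requests never share a connection at the same time. A real pool would also need to detect and replace connections that the remote end has already closed, which this sketch does not attempt.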
Warning: GitHub issues are not an official support channel for Recurly. If you have an urgent request we suggest you contact us through an official channel: support@recurly.com or https://recurly.zendesk.com
Describe the bug
I've built a simple integration using this library and I am seeing an intermittent recurly.NetworkError exception with the message "Remote end closed connection without response".
This appears to be happening most often for this API call: client.list_accounts()
To Reproduce
You can see the call in context here:
https://github.com/TheTyee/manage-account.thetyee.ca/blob/master/app.py#L56-L79
Expected behavior
For it to work consistently! :)
Your Environment
Which version of this library are you using?
Recurly Python client 3.0
Which version of the language are you using?
Python 3.8.2