RC32: Frequent write errors #2849
Comments
@bww RC32 had some significant changes in the write path that might be causing this - we're actively trying to stabilize that new code, so this is really helpful. is this just running on a single node? can you tell us a little more about the average write rate when this happens?
@toddboom Yeah, it's a single node. Write frequency is pretty light: it's collecting metrics sampled every 5 seconds from 5 or 6 different client hosts.
@bww thanks for the quick response. do the errors go away after a while or does it stay in a high-error state once it starts?
#2853 adds more logging to help troubleshoot write errors.
@toddboom Once it gets into a high-error state (which it invariably does) it seems to remain that way. Restarting the service doesn't really seem to do anything to help. However, clearing out the database ( Just in case it's somehow significant, I'll also mention that the service runs on a different volume than its data directory. I'll try the logging in #2853 when I get a minute and see if that yields anything useful.
I am having the same issue for what it is worth. However, more specifically I am seeing the following:
I am running a 3 node cluster.
After upgrading to RC32 I'm seeing write errors ranging between 25-75% of requests for no apparent reason. Sometimes there will be no problem for an hour followed by an extended period with high error rates.
I would expect either all (or essentially all) writes to succeed or all writes to fail if there's some kind of misconfiguration. A huge percentage of writes failing, but not all, is concerning.
There isn't any obvious problem with my setup that I can see and I've had earlier 0.9.x versions running successfully for months on this host.
The service responds with the following opaque error:
I can't find a way to get the service to trace anything more useful than the HTTP request log line, e.g.,
I tried setting write-tracing to true in config to get some more diagnostic information, but as far as I can tell that didn't do anything.
I'm using the RC32 client from this repo in /client to send writes. The following is typical of the input I'm sending:
Server OS: 64-bit Ubuntu 14.04.2 LTS, from .deb package
Client OS: tested with Ubuntu 14.04.2 LTS and OS X 10.10.3
Release: 0.9 RC32