Getting 2 entries per time frame when using count with group by time #321
Comments
Which version of InfluxDB are you running? You could have hit bug #308. Can you log in to the admin interface as root and check the shard IDs under the cluster tab? If you see multiple shards with the same ID, you'll have to nuke your data directory and start fresh. |
I just tried with the latest (RC5) and it still happened; I'm not sure whether that contains the #308 fix. I see 1 server and 125 shards, and I don't see any duplicates. |
Can you take a screenshot of the shards and post it? |
It doesn't fit in a single screen, but here is a text copy: Servers |
Hmm, that all looks good. I'm guessing it has to do with your group by time interval being so large. Do you see dupes if you lower it to something like 24h? |
Yeah, it still happens with 24 hours. Actually, it seems that the more samples there are, the more likely I am to find some with dupes.
I could send you the DB data if that's helpful. |
Yeah, a dump of the entire data directory would definitely help. |
What's the best way to send you that? It's not super sensitive data, but I don't want it public either. (The data folder seems to be about 12 MB.) |
Just shared a Dropbox folder, can you put it in there? |
Thanks, uploaded. |
got it! I'll have a look |
BTW, it's not urgent :) |
@tcolar So, after digging into your data, we discovered that a handful of the data points you wrote in ended up in the wrong shard (based on start and end times), which should theoretically be impossible to do. We're trying to figure out exactly how that might have happened, but in the meantime, would you mind sending us a copy of your config? Also, what version of InfluxDB were you running when you initially wrote the data in? Thanks! |
I see. The Influx config is vanilla; I didn't change anything. It's on a 64-bit Linux machine. My data was inserted with a batch job, so I believe I can easily recreate it from scratch and see if it happens again. Note too that I inserted a lot of points at once (maybe something like 50k using writeSeries with the Go driver). I didn't see any errors, but I'm letting you know. One last thing: I nuked and recreated the DB a few times before I got the data I wanted. I mention it because, if deleting the DB went wrong, that might possibly have been part of the issue? |
@tcolar OK, we figured it out. There was an off-by-one bug that caused write requests containing points split across multiple shards to not be separated out correctly. It's fixed in the rc.5 release, which we're pushing right now. Unfortunately, to fix this you'll need to blow away your data and reload. :( |
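The actual fix is not shown in this thread. As a rough illustration only, the sketch below shows one way a batch of points could be separated by shard when each shard covers a half-open time range `[Start, End)`. The `Shard` and `Point` types and the `SplitByShard` function are hypothetical names invented for this example, not InfluxDB internals. The boundary comparison is the spot where an off-by-one mistake would push a point into a neighboring shard.

```go
package main

import (
	"fmt"
	"sort"
)

// Shard is a hypothetical shard descriptor covering the half-open range [Start, End).
type Shard struct {
	ID         int
	Start, End int64
}

// Point is a hypothetical data point with a timestamp and a value.
type Point struct {
	Time  int64
	Value float64
}

// SplitByShard groups points by the shard whose [Start, End) range contains
// each timestamp. Shards must be sorted by Start and must not overlap.
// Points that fall outside every shard are skipped.
func SplitByShard(shards []Shard, points []Point) map[int][]Point {
	out := make(map[int][]Point)
	for _, p := range points {
		// Find the first shard whose End is strictly after the point's time.
		// Using ">=" here instead of ">" would be an off-by-one at the boundary.
		i := sort.Search(len(shards), func(i int) bool { return shards[i].End > p.Time })
		if i < len(shards) && shards[i].Start <= p.Time {
			out[shards[i].ID] = append(out[shards[i].ID], p)
		}
	}
	return out
}

func main() {
	shards := []Shard{{ID: 1, Start: 0, End: 10}, {ID: 2, Start: 10, End: 20}}
	points := []Point{{Time: 9, Value: 1}, {Time: 10, Value: 1}, {Time: 19, Value: 1}}
	for id, pts := range SplitByShard(shards, points) {
		fmt.Println("shard", id, "points", len(pts))
	}
}
```

In this sketch a point with timestamp 10 belongs to the second shard, because the first shard's range ends just before 10.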
Yeah, no worries about rebuilding the data; I'd rather deal with this now. |
I would expect to get a single value (count) per time interval.
Instead, I'm getting one entry with the count plus another one that always seems to have the value 1; see screenshot.