Getting 2 entries per time frame when using count with group by time #321

tcolar · 2014-03-10T17:06:57Z

I would expect to get a single value (count) per time interval.
Instead I'm getting one with the count plus another one that always seems to have the value 1; see screenshot.

jvshahid · 2014-03-10T17:13:17Z

Which version of InfluxDB are you running? You could have hit bug #308. Can you log in to the admin interface as root and check the shard IDs under the cluster tab? If you see multiple shards with the same ID, then you'll have to nuke your data directory and start fresh.
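For anyone who would rather check a copied shard list programmatically than by eye, here is a minimal sketch of the duplicate-ID check described above. The `Shard` type and the function name are made up for illustration and are not part of InfluxDB; the IDs in `main` are invented sample data.

```go
package main

import (
	"fmt"
	"sort"
)

// Shard is a hypothetical stand-in for one row of the admin
// interface's shard table; it is not a type from InfluxDB itself.
type Shard struct {
	ID int
}

// duplicateShardIDs returns every ID that occurs more than once,
// in ascending order.
func duplicateShardIDs(shards []Shard) []int {
	counts := make(map[int]int)
	for _, s := range shards {
		counts[s.ID]++
	}
	dups := []int{}
	for id, n := range counts {
		if n > 1 {
			dups = append(dups, id)
		}
	}
	sort.Ints(dups)
	return dups
}

func main() {
	// Invented IDs for illustration: 124 is listed twice.
	shards := []Shard{{ID: 125}, {ID: 124}, {ID: 124}, {ID: 123}}
	fmt.Println(duplicateShardIDs(shards)) // prints [124]
}
```

An empty result means no two shards share an ID.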

tcolar · 2014-03-10T17:18:50Z

I just tried with the latest (RC5) and it still happened; not sure whether that contains the #308 fix.

I see 1 server and 125 shards, and don't seem to see duplicates.

pauldix · 2014-03-10T17:48:13Z

Can you take a screenshot of the shards and post it?

tcolar · 2014-03-10T17:52:31Z

It doesn't fit in a single screen, but here is a text copy:

Servers
Id Connection String
1 influx:8099
Shards
Short Term
Id Start Time End Time Servers
125 2014-03-05 16:00:00 2014-03-12 17:00:00 [1]
124 2014-02-26 16:00:00 2014-03-05 16:00:00 [1]
1 2014-02-19 16:00:00 2014-02-26 16:00:00 [1]
123 2014-02-12 16:00:00 2014-02-19 16:00:00 [1]
122 2014-02-05 16:00:00 2014-02-12 16:00:00 [1]
121 2014-01-29 16:00:00 2014-02-05 16:00:00 [1]
120 2014-01-22 16:00:00 2014-01-29 16:00:00 [1]
119 2014-01-15 16:00:00 2014-01-22 16:00:00 [1]
118 2014-01-08 16:00:00 2014-01-15 16:00:00 [1]
117 2014-01-01 16:00:00 2014-01-08 16:00:00 [1]
116 2013-12-25 16:00:00 2014-01-01 16:00:00 [1]
115 2013-12-18 16:00:00 2013-12-25 16:00:00 [1]
114 2013-12-11 16:00:00 2013-12-18 16:00:00 [1]
113 2013-12-04 16:00:00 2013-12-11 16:00:00 [1]
112 2013-11-27 16:00:00 2013-12-04 16:00:00 [1]
111 2013-11-20 16:00:00 2013-11-27 16:00:00 [1]
110 2013-11-13 16:00:00 2013-11-20 16:00:00 [1]
109 2013-11-06 16:00:00 2013-11-13 16:00:00 [1]
108 2013-10-30 17:00:00 2013-11-06 16:00:00 [1]
107 2013-10-23 17:00:00 2013-10-30 17:00:00 [1]
106 2013-10-16 17:00:00 2013-10-23 17:00:00 [1]
105 2013-10-09 17:00:00 2013-10-16 17:00:00 [1]
104 2013-10-02 17:00:00 2013-10-09 17:00:00 [1]
103 2013-09-25 17:00:00 2013-10-02 17:00:00 [1]
102 2013-09-18 17:00:00 2013-09-25 17:00:00 [1]
101 2013-09-11 17:00:00 2013-09-18 17:00:00 [1]
100 2013-09-04 17:00:00 2013-09-11 17:00:00 [1]
99 2013-08-28 17:00:00 2013-09-04 17:00:00 [1]
98 2013-08-21 17:00:00 2013-08-28 17:00:00 [1]
97 2013-08-14 17:00:00 2013-08-21 17:00:00 [1]
96 2013-08-07 17:00:00 2013-08-14 17:00:00 [1]
95 2013-07-31 17:00:00 2013-08-07 17:00:00 [1]
94 2013-07-24 17:00:00 2013-07-31 17:00:00 [1]
93 2013-07-17 17:00:00 2013-07-24 17:00:00 [1]
92 2013-07-10 17:00:00 2013-07-17 17:00:00 [1]
91 2013-07-03 17:00:00 2013-07-10 17:00:00 [1]
90 2013-06-26 17:00:00 2013-07-03 17:00:00 [1]
89 2013-06-19 17:00:00 2013-06-26 17:00:00 [1]
88 2013-06-12 17:00:00 2013-06-19 17:00:00 [1]
87 2013-06-05 17:00:00 2013-06-12 17:00:00 [1]
86 2013-05-29 17:00:00 2013-06-05 17:00:00 [1]
85 2013-05-22 17:00:00 2013-05-29 17:00:00 [1]
84 2013-05-15 17:00:00 2013-05-22 17:00:00 [1]
83 2013-05-08 17:00:00 2013-05-15 17:00:00 [1]
82 2013-05-01 17:00:00 2013-05-08 17:00:00 [1]
81 2013-04-24 17:00:00 2013-05-01 17:00:00 [1]
80 2013-04-17 17:00:00 2013-04-24 17:00:00 [1]
79 2013-04-10 17:00:00 2013-04-17 17:00:00 [1]
78 2013-04-03 17:00:00 2013-04-10 17:00:00 [1]
77 2013-03-27 17:00:00 2013-04-03 17:00:00 [1]
76 2013-03-20 17:00:00 2013-03-27 17:00:00 [1]
75 2013-03-13 17:00:00 2013-03-20 17:00:00 [1]
74 2013-03-06 16:00:00 2013-03-13 17:00:00 [1]
73 2013-02-27 16:00:00 2013-03-06 16:00:00 [1]
72 2013-02-20 16:00:00 2013-02-27 16:00:00 [1]
71 2013-02-13 16:00:00 2013-02-20 16:00:00 [1]
70 2013-02-06 16:00:00 2013-02-13 16:00:00 [1]
69 2013-01-30 16:00:00 2013-02-06 16:00:00 [1]
68 2013-01-23 16:00:00 2013-01-30 16:00:00 [1]
67 2013-01-16 16:00:00 2013-01-23 16:00:00 [1]
66 2013-01-09 16:00:00 2013-01-16 16:00:00 [1]
65 2013-01-02 16:00:00 2013-01-09 16:00:00 [1]
64 2012-12-26 16:00:00 2013-01-02 16:00:00 [1]
63 2012-12-19 16:00:00 2012-12-26 16:00:00 [1]
62 2012-12-12 16:00:00 2012-12-19 16:00:00 [1]
61 2012-12-05 16:00:00 2012-12-12 16:00:00 [1]
60 2012-11-28 16:00:00 2012-12-05 16:00:00 [1]
59 2012-11-21 16:00:00 2012-11-28 16:00:00 [1]
58 2012-11-14 16:00:00 2012-11-21 16:00:00 [1]
57 2012-11-07 16:00:00 2012-11-14 16:00:00 [1]
56 2012-10-31 17:00:00 2012-11-07 16:00:00 [1]
55 2012-10-24 17:00:00 2012-10-31 17:00:00 [1]
54 2012-10-17 17:00:00 2012-10-24 17:00:00 [1]
53 2012-10-10 17:00:00 2012-10-17 17:00:00 [1]
52 2012-10-03 17:00:00 2012-10-10 17:00:00 [1]
51 2012-09-26 17:00:00 2012-10-03 17:00:00 [1]
50 2012-09-19 17:00:00 2012-09-26 17:00:00 [1]
49 2012-09-12 17:00:00 2012-09-19 17:00:00 [1]
48 2012-09-05 17:00:00 2012-09-12 17:00:00 [1]
47 2012-08-29 17:00:00 2012-09-05 17:00:00 [1]
46 2012-08-22 17:00:00 2012-08-29 17:00:00 [1]
45 2012-08-15 17:00:00 2012-08-22 17:00:00 [1]
44 2012-08-08 17:00:00 2012-08-15 17:00:00 [1]
43 2012-08-01 17:00:00 2012-08-08 17:00:00 [1]
42 2012-07-25 17:00:00 2012-08-01 17:00:00 [1]
41 2012-07-18 17:00:00 2012-07-25 17:00:00 [1]
40 2012-07-11 17:00:00 2012-07-18 17:00:00 [1]
39 2012-07-04 17:00:00 2012-07-11 17:00:00 [1]
38 2012-06-27 17:00:00 2012-07-04 17:00:00 [1]
37 2012-06-20 17:00:00 2012-06-27 17:00:00 [1]
36 2012-06-13 17:00:00 2012-06-20 17:00:00 [1]
35 2012-06-06 17:00:00 2012-06-13 17:00:00 [1]
34 2012-05-30 17:00:00 2012-06-06 17:00:00 [1]
33 2012-05-09 17:00:00 2012-05-16 17:00:00 [1]
32 2012-04-04 17:00:00 2012-04-11 17:00:00 [1]
31 2012-03-28 17:00:00 2012-04-04 17:00:00 [1]
30 2012-03-21 17:00:00 2012-03-28 17:00:00 [1]
29 2012-03-14 17:00:00 2012-03-21 17:00:00 [1]
28 2012-03-07 16:00:00 2012-03-14 17:00:00 [1]
27 2012-02-29 16:00:00 2012-03-07 16:00:00 [1]
26 2012-02-22 16:00:00 2012-02-29 16:00:00 [1]
25 2012-02-15 16:00:00 2012-02-22 16:00:00 [1]
24 2012-02-08 16:00:00 2012-02-15 16:00:00 [1]
23 2012-01-25 16:00:00 2012-02-01 16:00:00 [1]
22 2012-01-11 16:00:00 2012-01-18 16:00:00 [1]
21 2012-01-04 16:00:00 2012-01-11 16:00:00 [1]
20 2011-12-28 16:00:00 2012-01-04 16:00:00 [1]
19 2011-12-21 16:00:00 2011-12-28 16:00:00 [1]
18 2011-12-14 16:00:00 2011-12-21 16:00:00 [1]
17 2011-12-07 16:00:00 2011-12-14 16:00:00 [1]
16 2011-11-30 16:00:00 2011-12-07 16:00:00 [1]
15 2011-11-16 16:00:00 2011-11-23 16:00:00 [1]
14 2011-11-09 16:00:00 2011-11-16 16:00:00 [1]
13 2011-11-02 17:00:00 2011-11-09 16:00:00 [1]
12 2011-10-26 17:00:00 2011-11-02 17:00:00 [1]
11 2011-10-19 17:00:00 2011-10-26 17:00:00 [1]
10 2011-10-05 17:00:00 2011-10-12 17:00:00 [1]
9 2011-09-21 17:00:00 2011-09-28 17:00:00 [1]
8 2011-09-14 17:00:00 2011-09-21 17:00:00 [1]
7 2011-08-24 17:00:00 2011-08-31 17:00:00 [1]
6 2011-08-17 17:00:00 2011-08-24 17:00:00 [1]
5 2011-08-10 17:00:00 2011-08-17 17:00:00 [1]
4 2011-08-03 17:00:00 2011-08-10 17:00:00 [1]
3 2011-07-27 17:00:00 2011-08-03 17:00:00 [1]
2 2011-07-20 17:00:00 2011-07-27 17:00:00 [1]

pauldix · 2014-03-10T17:54:51Z

Hmm, that all looks good. I'm guessing it has to do with your group by time interval being so large. Do you see dupes if you lower it to something like 24h?

tcolar · 2014-03-10T18:03:05Z

Yeah, it still happens with 24 hours.
With 720 hours it actually works better (no dups), although that's fewer samples.

Actually, it seems that the more samples there are, the more likely it is to find some with dups.
A couple more remarks:

They don't ALL have dups.
It's always one or two entries with the same timestamp (never seen more than two).

I could send you the DB data if that's helpful.
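To make the symptom easy to spot in a larger result set, something like the following sketch could list the timestamps that come back more than once. The `Row` type is hypothetical (it only mirrors a time/count pair from a `count` query grouped by time), and the values in `main` are invented.

```go
package main

import "fmt"

// Row is a hypothetical representation of one result row from a
// count query grouped by time. Time is in Unix seconds.
type Row struct {
	Time  int64
	Count int64
}

// repeatedTimes returns each timestamp that appears in more than
// one row, in the order the repeats are first encountered.
func repeatedTimes(rows []Row) []int64 {
	seen := make(map[int64]int)
	repeated := []int64{}
	for _, r := range rows {
		seen[r.Time]++
		if seen[r.Time] == 2 { // report each repeated timestamp once
			repeated = append(repeated, r.Time)
		}
	}
	return repeated
}

func main() {
	// Invented sample: the first bucket shows up twice, once with
	// the real count and once with a stray count of 1.
	rows := []Row{
		{Time: 1394409600, Count: 42},
		{Time: 1394409600, Count: 1},
		{Time: 1394323200, Count: 17},
	}
	fmt.Println(repeatedTimes(rows)) // prints [1394409600]
}
```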

pauldix · 2014-03-10T18:36:08Z

yeah, a dump of the entire data directory would definitely help.

tcolar · 2014-03-10T18:44:10Z

What's the best way to send you that? It's not super-sensitive data, but I don't want it public either. (The data folder seems to be about 12MB.)

pauldix · 2014-03-10T18:46:46Z

just shared a dropbox folder, can you put it in there?

tcolar · 2014-03-10T18:50:11Z

Thanks, uploaded.

pauldix · 2014-03-10T18:50:34Z

got it! I'll have a look

tcolar · 2014-03-10T18:51:46Z

BTW, it's not urgent :)

toddboom · 2014-03-11T22:12:37Z

@tcolar So, after digging into your data, we discovered that a handful of the data points you wrote in ended up in the wrong shard (based on start and end times), which should theoretically be impossible to do. We're trying to figure out exactly how that might have happened, but in the meantime, would you mind sending us a copy of your config? Also, what version of InfluxDB were you running when you initially wrote the data in?

Thanks!
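As a rough illustration of what "in the wrong shard" means here, the sketch below flags timestamps that fall outside a shard's time range. It assumes the range is half-open, `[Start, End)`, which matches the way each shard's end time equals the next shard's start time in the list above but is not confirmed anywhere in this thread. The `Shard` type and the sample values are hypothetical.

```go
package main

import "fmt"

// Shard is a hypothetical model of a shard's time range in Unix
// seconds. Treating End as exclusive is an assumption.
type Shard struct {
	ID    int
	Start int64 // inclusive
	End   int64 // exclusive (assumed)
}

// Contains reports whether ts lies in [Start, End).
func (s Shard) Contains(ts int64) bool {
	return ts >= s.Start && ts < s.End
}

// misplaced returns the timestamps that do not belong in s.
func misplaced(s Shard, timestamps []int64) []int64 {
	out := []int64{}
	for _, ts := range timestamps {
		if !s.Contains(ts) {
			out = append(out, ts)
		}
	}
	return out
}

func main() {
	week := int64(7 * 24 * 3600) // 604800 seconds
	s := Shard{ID: 1, Start: 0, End: week}
	// Invented timestamps: the last two are at or past the end.
	fmt.Println(misplaced(s, []int64{10, week - 1, week, week + 5}))
	// prints [604800 604805]
}
```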

tcolar · 2014-03-11T22:58:25Z

I see.
I was using 0.5.0-rc.1 when the data was inserted.

The Influx config is vanilla; I didn't change anything. It's on a 64-bit Linux machine.

My data was inserted with a batch job, so I believe I can easily recreate it from scratch and see if it happens again.

Note too that I did insert a lot of points at once (maybe something like 50k, using writeSeries with the Go driver). I didn't see any errors, but I'm letting you know.

One last thing: I did nuke and recreate the DB a few times before I got the data I wanted. I mention it because, if deleting the DB went wrong, it might have been part of the issue.

pauldix · 2014-03-11T23:16:29Z

@tcolar OK, we figured it out. There was an off-by-one bug that was causing write requests that contained points split across multiple shards to not be separated out correctly. It's fixed in the rc.5 release, which we're pushing right now.

Unfortunately, to fix this you'll need to blow away your data and reload. :(
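For readers curious about this class of bug, here is a minimal sketch of splitting one batch of timestamps across shards. It is not the InfluxDB code and not the actual fix (the thread does not show either); the `Shard` type, the function name, and the half-open `[Start, End)` ranges are all assumptions made for illustration.

```go
package main

import "fmt"

// Shard is a hypothetical time range in Unix seconds, [Start, End).
type Shard struct {
	ID         int
	Start, End int64
}

// splitByShard groups timestamps by the ID of the shard that owns
// them. Timestamps that no shard covers are returned separately.
func splitByShard(shards []Shard, timestamps []int64) (map[int][]int64, []int64) {
	groups := make(map[int][]int64)
	unowned := []int64{}
	for _, ts := range timestamps {
		placed := false
		for _, s := range shards {
			// Half-open check. Writing ts <= s.End instead would be
			// an off-by-one: a point exactly on the boundary could
			// then match the earlier shard.
			if ts >= s.Start && ts < s.End {
				groups[s.ID] = append(groups[s.ID], ts)
				placed = true
				break
			}
		}
		if !placed {
			unowned = append(unowned, ts)
		}
	}
	return groups, unowned
}

func main() {
	// Invented ranges and timestamps.
	shards := []Shard{{ID: 1, Start: 0, End: 100}, {ID: 2, Start: 100, End: 200}}
	groups, unowned := splitByShard(shards, []int64{99, 100, 150, 250})
	fmt.Println(groups[1], groups[2], unowned) // prints [99] [100 150] [250]
}
```

If each shard contributes its own count for a time bucket, a stray point stored in the wrong shard would plausibly show up as an extra row with a count of 1, which is consistent with the symptom reported at the top of this issue.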

tcolar · 2014-03-11T23:32:26Z

Yeah, no worries about rebuilding the data. I'd rather deal with this now than later, and I'm glad this was exposed and fixed early.

pauldix added this to the 0.5.0 milestone Mar 11, 2014

toddboom self-assigned this Mar 11, 2014

jvshahid closed this as completed in 471039a Mar 11, 2014
