[0.14] Time sorting seems broken #6738

Closed
cheribral opened this issue May 27, 2016 · 9 comments

@cheribral

Bug report

System info:
InfluxDB 0.14.0~n201605260800

Steps to reproduce:

  1. Query data

Expected behavior:
Points come back in order

Actual behavior:
2016-05-26T22:40:00Z 1038
2016-05-26T22:50:00Z 962
2016-05-26T23:00:00Z 3483
2016-05-26T22:50:00Z 1896
2016-05-26T23:00:00Z 1754
2016-05-26T22:40:00Z 113
2016-05-26T22:50:00Z 801
2016-05-26T23:00:00Z 12076
2016-05-26T22:50:00Z 837
2016-05-26T23:00:00Z 40338
2016-05-26T23:10:00Z 4938
2016-05-26T22:40:00Z 64
2016-05-26T22:50:00Z 826
2016-05-26T23:00:00Z 110
2016-05-26T23:10:00Z 2250
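
A quick way to confirm the symptom without relying on a graph is to scan the returned rows for timestamps that go backwards. The sketch below runs over the rows pasted above. `parse_rows` and `find_out_of_order` are hypothetical helper names, not part of InfluxDB or any client library, and the check assumes the rows were returned as one ordered result, which the report does not state explicitly.

```python
from datetime import datetime

# Rows copied verbatim from the "Actual behavior" output above.
ROWS = """\
2016-05-26T22:40:00Z 1038
2016-05-26T22:50:00Z 962
2016-05-26T23:00:00Z 3483
2016-05-26T22:50:00Z 1896
2016-05-26T23:00:00Z 1754
2016-05-26T22:40:00Z 113
2016-05-26T22:50:00Z 801
2016-05-26T23:00:00Z 12076
2016-05-26T22:50:00Z 837
2016-05-26T23:00:00Z 40338
2016-05-26T23:10:00Z 4938
2016-05-26T22:40:00Z 64
2016-05-26T22:50:00Z 826
2016-05-26T23:00:00Z 110
2016-05-26T23:10:00Z 2250
"""


def parse_rows(text):
    """Parse 'timestamp value' lines into (datetime, int) tuples."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        ts, value = line.split()
        rows.append((datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ"), int(value)))
    return rows


def find_out_of_order(rows):
    """Return each index i whose timestamp is earlier than that of row i - 1."""
    return [i for i in range(1, len(rows)) if rows[i][0] < rows[i - 1][0]]


rows = parse_rows(ROWS)
bad = find_out_of_order(rows)
print(f"{len(bad)} of {len(rows)} rows go backwards in time, at indexes {bad}")
```

Any non-empty result means the points did not come back in ascending time order.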

Additional info:

We noticed this on one of the servers running the nightly build. While it provided quite a bit of amusement for people passing around Grafana graphs that looked like scribbles, I figured I would report it as a bug.
[Screenshot: screenshot_2016-05-27_17-24-52]

@e-dard
Contributor

e-dard commented May 31, 2016

@cheribral thanks for the report. We're going to need more details. Do you have the raw data available? What was the query you executed? Can you reproduce this with an empty database and some sample data?

@cheribral
Author

cheribral commented Jun 14, 2016

@e-dard Apologies for the delay.

There are 13376 series in that particular measurement, structured like:

batch_trip_writer.time_to_write_trips,customer=xxx,host=xxx,is_completed=True,is_reprocess=False,location=xxx,service=xxx,thread=4,tuple=xxx
show field keys from "batch_trip_writer.time_to_write_trips"
name: batch_trip_writer.time_to_write_trips
fieldKey    fieldType
duration    float
total_points    float

The query is:

SELECT count("duration") FROM "batch_trip_writer.time_to_write_trips" WHERE time > now() - 24h GROUP BY time(10m), "location" fill(null)

If it helps, a git bisect gives me this as the breaking commit:

0b481ff6271a404d440684d3c68ab02515674715 is the first bad commit
commit 0b481ff6271a404d440684d3c68ab02515674715
Author: Jason Wilder <mail@jasonwilder.com>
Date:   Wed May 25 08:55:46 2016 -0600

    Fix pathalogical TSM query case

    This fixes a pathalogical query condition cause by and problematic
    structuring of TSM files based on how points were written.  The
    condition can occur when there are multiple TSM files and a large
    number of points are written into the past.  The earlier existing
    TSM files must also have points in the past and close to the present
    causing their time range to eclipse the later files.

    When this condition occurs, some queries can spend an excessive amount
    of time merge all the overlapping blocks.

    The fix was to constrain the window of overlapping blocks based on
    the first one we ran into.  There was also a simple case in the Merge
    where we could skip the binary search path and just append the two
    inputs.
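
The "simple case in the Merge" that the commit message describes can be illustrated with a small sketch. This is not the InfluxDB code: `merge_blocks` is a hypothetical function, blocks are modelled as time-sorted lists of `(timestamp, value)` pairs with unique timestamps, and the overlapping case uses a plain two-pointer merge rather than the binary search the commit mentions. It also assumes that on equal timestamps the second input wins, which the commit message does not state.

```python
def merge_blocks(a, b):
    """Merge two time-sorted lists of (timestamp, value) pairs.

    Assumption for this sketch: when both inputs hold the same timestamp,
    the pair from b replaces the pair from a.
    """
    if not a:
        return list(b)
    if not b:
        return list(a)

    # Fast path: the time ranges do not overlap, so appending one input
    # to the other already yields a sorted result.
    if a[-1][0] < b[0][0]:
        return a + b
    if b[-1][0] < a[0][0]:
        return b + a

    # Slow path: the ranges overlap, so walk both inputs in step.
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        ta, tb = a[i][0], b[j][0]
        if ta < tb:
            out.append(a[i])
            i += 1
        elif tb < ta:
            out.append(b[j])
            j += 1
        else:
            # Same timestamp in both inputs: keep b's pair only.
            out.append(b[j])
            i += 1
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out
```

If the fast path is taken when the ranges do in fact overlap, the output is neither sorted nor free of duplicate timestamps, which matches the kind of symptom reported in this issue.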

e-dard modified the milestone: 1.0.0 Jun 14, 2016
jwilder added a commit that referenced this issue Jun 22, 2016
If there were blocks in later TSM files that were for overwritten
points or writes into the past, they could be returned more than
once or out of order causing the cursor values to be unsorted.

One effect of this is that graphs in graphana would render with
the line going all over the place in spots.

This might also cause duplicate data to be returned.

Fixes #6738
@jwilder
Contributor

jwilder commented Jun 22, 2016

@cheribral Would you be able to test #6897 to see if it resolves this?

@cheribral
Author

Just to confirm, the issue is fixed.

@cheribral
Author

@jwilder Sorry, I was just tipped off by someone that they are still seeing this problem. I've tested with the latest nightly, and it is indeed still a problem.

[Screenshot: screenshot_2016-07-08_11-30-25]

@cheribral
Author

@e-dard Would you like a new issue for this, or should this one be opened again? The data is coming back jumbled, and the problem still bisects to that same commit.

Also, would having some backup files to test with be helpful?

@jwilder
Contributor

jwilder commented Jul 13, 2016

@cheribral Yes, if you could share your shard data for this that would be very helpful.

@cheribral
Author

@jwilder Is there an influxdata email address I can use to send the access information?

@e-dard
Contributor

e-dard commented Jul 14, 2016

@cheribral our email addresses are our first names @influxdb.com, e.g., edd and jason.
