Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce syntax for marking a partial response with chunking #7368

Merged
merged 1 commit into from
Nov 29, 2016

Conversation

jsternberg
Copy link
Contributor

The partial tag has been added to the JSON response of a series and
the result so that a client knows when more of the series or result will
be sent in a future JSON chunk.

This helps interactive clients who don't want to wait for all of the
data to know if it is done processing the current series or the current
result. Previously, the client had to guess if the next chunk would
refer to the same result or a new result and it had to match the name
and tags of the two series to know if they were the same series. Now,
the client just needs to check the partial field included with the
response to know if it should expect more.

@mention-bot
Copy link

@jsternberg, thanks for your PR! By analyzing the annotation information on this pull request, we identified @e-dard, @gunnaraasen and @joelegasse to be potential reviewers

@jsternberg jsternberg force-pushed the js-partial-in-json-response branch 3 times, most recently from deb8dfe to 79c5a3b Compare September 30, 2016 17:27
@jsternberg jsternberg force-pushed the js-partial-in-json-response branch 2 times, most recently from 706b504 to 871bb3f Compare October 25, 2016 15:51
@jwilder jwilder added this to the 1.1.0 milestone Oct 25, 2016
Copy link
Contributor

@jwilder jwilder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. @joelegasse @corylanou Can you one of you take a look as well?

@jsternberg jsternberg force-pushed the js-partial-in-json-response branch from 871bb3f to aa84c08 Compare October 25, 2016 21:26
@jwilder
Copy link
Contributor

jwilder commented Oct 26, 2016

--- FAIL: TestServer_Query_ImplicitEndTime (2.01s)
    server_test.go:7039: aggregate query: unexpected results
        query:  SELECT mean(value) FROM cpu WHERE time > now() - 1m GROUP BY time(1m) FILL(none)
        params:  map[db:[db0]]
        exp:    {"results":[{"statement_id":0,"series":[{"name":"cpu","columns":["time","mean"],"values":[["2016-10-25T21:33:00Z",1]]}]}]}
        actual: {"results":[{"statement_id":0,"series":[{"name":"cpu","columns":["time","mean"],"values":[["2016-10-25T21:32:00Z",1]]}]}]}

@jsternberg
Copy link
Contributor Author

jsternberg commented Oct 26, 2016

Ah crap. It's a flaky test from something I just merged. I'll fix it.

@jsternberg jsternberg force-pushed the js-partial-in-json-response branch from aa84c08 to 9c7b4a5 Compare October 26, 2016 14:24
@jsternberg
Copy link
Contributor Author

I think I fixed the flaky test.

Copy link
Contributor

@corylanou corylanou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only concern is on the changing of the result struct. Otherwise looks good.

@@ -19,13 +19,13 @@ func init() {
&Query{
name: "create database should succeed",
command: `CREATE DATABASE db0`,
exp: `{"results":[{}]}`,
exp: `{"results":[{"statement_id":0}]}`,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like a pretty big breaking change. I imagine a lot of scripts are written to expect the empty result. I'm not sure how we would be affecting the user by adding this new field in the actual result itself.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the statement_id is a generated ID, and doesn't matter if there are no results, the field should probably have omitempty added to the struct tag as well. That would preserve the existing "empty object" result, and still allow for tracking partial series results

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the statement_id is pretty important. It tells the client which statement that empty response is for so it knows it ran. Otherwise, there's no way to know which statement that empty response references.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Statement ID's would have to start at 1, not 0 then, as 0 can be a valid statement ID it looks like. I would agree that I would like to see it omitted as it creates a lot of noise when there is no response to be had.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is statement ID important for an empty result? I don't understand the logic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It tells the client that the statement was executed. How would you know what the empty response was for without the statement id?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it didn't execute you get an error. That's how it works today right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you would. But the point of this PR is to also give you information about what is being run interactively for chunking.

A different question. What benefit is there for statement ID being omitted from these results? It seems like an unnecessary complication to omit the statement id specifically for those results. Clients are supposed to check for error, not a blank JSON response. It looks like influxdb-python checks for the error field.

@jsternberg
Copy link
Contributor Author

/cc @aviau, @toddboom, and @dmke since these are the primary maintainers behind influxdb-python and influxdb-ruby and will be the ones most affected by this change.

Can you confirm if this change will be fine and if it will help in the future for client library development? Thanks.

dmke added a commit to InfluxCommunity/influxdb-ruby that referenced this pull request Oct 26, 2016
dmke added a commit to InfluxCommunity/influxdb-ruby that referenced this pull request Oct 26, 2016
@dmke
Copy link

dmke commented Oct 26, 2016

@jsternberg I believe the Ruby gem can handle this without any modification. Can you verify that a "statement_id" will be included in every (successful) response from the server?

Currently, the high level methods only care about json["results"][i]["series"] values, while
exceptions are generated from json["results"][i]["error"] values; other keys in json["results"][i] are silently ignored.

Some user-facing methods however return the raw HTTP response object which could cause trouble if the user has code validating that the response only contains "series" and/or "error" keys.

I have added some statement ids into the test stubs and the suite still passes (see Travis and diff).

@jsternberg
Copy link
Contributor Author

Thanks. Yes, if we merge this, statement_id will be included in every result. I also noticed in your diff you have a typo for statement_id.

@dmke
Copy link

dmke commented Oct 26, 2016

👍

@jwilder jwilder modified the milestones: 1.2.0, 1.1.0 Oct 27, 2016
@jsternberg jsternberg force-pushed the js-partial-in-json-response branch 4 times, most recently from d9f42db to 82e9192 Compare November 22, 2016 17:14
The `partial` tag has been added to the JSON response of a series and
the result so that a client knows when more of the series or result will
be sent in a future JSON chunk.

This helps interactive clients who don't want to wait for all of the
data to know if it is done processing the current series or the current
result. Previously, the client had to guess if the next chunk would
refer to the same result or a new result and it had to match the name
and tags of the two series to know if they were the same series. Now,
the client just needs to check the `partial` field included with the
response to know if it should expect more.

Fixed `max-row-limit` so it counts rows instead of results and it
truncates the response when the `max-row-limit` is reached.
@jsternberg jsternberg force-pushed the js-partial-in-json-response branch from 82e9192 to b4db76c Compare November 22, 2016 17:16
@jsternberg jsternberg merged commit 9337025 into master Nov 29, 2016
@jsternberg jsternberg deleted the js-partial-in-json-response branch November 29, 2016 17:17
mark-rushakoff added a commit that referenced this pull request Jan 9, 2017
These seem to have been broken in #7368.
@mark-rushakoff mark-rushakoff mentioned this pull request Jan 9, 2017
2 tasks
dmke added a commit to InfluxCommunity/influxdb-ruby that referenced this pull request Jan 26, 2017
@jwilder jwilder mentioned this pull request Mar 14, 2017
3 tasks
jwilder added a commit that referenced this pull request Mar 14, 2017
max-row-limit was set at 10000 since 1.0, but due to a bug it was
effectively 0 (disabled).  1.2 fixed this bug via #7368, but this
caused a breaking change w/ Grafana and any users upgrading from <1.2
who had not disabled the config manually.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants