-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor query engine for distributed query support #3299
Conversation
@@ -2310,12 +2309,6 @@ func TestServer_Query_LimitAndOffset(t *testing.T) { | |||
params: url.Values{"db": []string{"db0"}}, | |||
}, | |||
&Query{ | |||
name: "limit + offset equal to the number of points with group by time", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test appears to have been duped.
"github.com/influxdb/influxdb/influxql" | ||
) | ||
|
||
// RawMapper is for retrieving data, for a query, from a given shard. It implements the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ShardMapper
interface doesn't actually exist, and this comment needs correction.
ae52109
to
aa5a1c6
Compare
I just pushed up a change and it failed, looking into it now. |
With this change, the query engine code gathers information about shards and tagsets by working with individual shards, collating the information, and returning that to the client. It does not assume that any particular shard is local, and accesses all shards through abstracted Mappers, of which there are two types -- a Mapper type for Raw queries and a second type for Aggregate queries. There are corresponding Executors for each type of Mapper, but both types of Executors share the same interface.
Closing in favour of #3320. |
With this change, the query engine code gathers information about shards and tagsets by working with individual shards, collating the information, and returning that to the client. However it does not assume that any particular shard is local, and accesses all shards through abstracted Shard Mappers, of which there are two types -- a type for Raw queries and a second type for Aggregate queries. There are corresponding Executors for each type of Mapper, but both types of Executors share the
same interface.
There is also a new Query Planner, which is much simpler than the previous planner. This planner is now part of the
tsdb
package, and is responsible for determining which shards must be accessed. and accessing those shards either locally, or over the network using the cluster service.Remaining work
The only remaining work is to have the cluster service "wrap" the two new Mappers types, such that remote shards can be mapped as easily as local shards. This should be quite straightforward -- for remote nodes the Co-ordinator node will hit (through the cluster service) a remote endpoint, which will in turn instantiate a suitable Mapper. This Mapper will communicate with the Co-ordinator node over a persistent TCP connection, allowing state to be maintained between the Co-ordinator node and remote node. The new Mappers should require no changes to support this.
The split between Raw and Aggregate queries was put in place because shards are driven significantly differently in each case, and making this separation explicit makes the code easier to develop and understand. However as a result there is some code duplication between the new Mappers and Executors, and further refactoring will take place. This will occur after the distributed query support has been fully completed.
This code is ready for merging as is, and no functional difference should be visible to users of the system. The next patch will then clearly show the changes for network support. Once that change is, distributed query support will be complete, and the current restrictions on replication factor can be removed.
Testing
All pre-existing query tests pass, and some new tests have been added for the Mapeprs. Where changes have been made, it was for reasons such as error message changes. Where the changes to some tests may be questioned, in-line notes have been added.