Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pre-create shard groups #1897

Merged
merged 7 commits into from
Mar 10, 2015
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
24 changes: 19 additions & 5 deletions cmd/influxd/config.go
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,9 @@ const (

// DefaultJoinURLs represents the default URLs for joining a cluster.
DefaultJoinURLs = ""

// DefaultShardGroupPreCreatePeriod
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment does nothing to explain what the setting does.

DefaultShardGroupPreCreatePeriod = 45 * time.Minute
)

// Config represents the configuration format for the influxd binary.
Expand Down Expand Up @@ -85,11 +88,12 @@ type Config struct {
} `toml:"broker"`

Data struct {
Dir string `toml:"dir"`
Port int `toml:"port"`
RetentionAutoCreate bool `toml:"retention-auto-create"`
RetentionCheckEnabled bool `toml:"retention-check-enabled"`
RetentionCheckPeriod Duration `toml:"retention-check-period"`
Dir string `toml:"dir"`
Port int `toml:"port"`
RetentionAutoCreate bool `toml:"retention-auto-create"`
RetentionCheckEnabled bool `toml:"retention-check-enabled"`
RetentionCheckPeriod Duration `toml:"retention-check-period"`
ShardGroupPreCreateCheckPeriod Duration `toml:"shard-group-pre-create-check-period"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a pretty awkwardly-named config variable. :-) How about something as simple as retention-create-period?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Shard groups" is an implementation detail, one of the reasons I don't like seeing it in the key names.

} `toml:"data"`

Cluster struct {
Expand Down Expand Up @@ -148,6 +152,7 @@ func NewConfig() *Config {
c.Data.RetentionAutoCreate = true
c.Data.RetentionCheckEnabled = true
c.Data.RetentionCheckPeriod = Duration(10 * time.Minute)
c.Data.ShardGroupPreCreateCheckPeriod = Duration(DefaultShardGroupPreCreatePeriod)
c.Admin.Enabled = true
c.Admin.Port = 8083
c.ContinuousQuery.RecomputePreviousN = 2
Expand Down Expand Up @@ -229,6 +234,15 @@ func (c *Config) JoinURLs() string {
}
}

// ShardGroupPreCreateCheckPeriod returns the check interval to pre-create shard groups.
// If it was not defined in the config, it defaults to DefaultShardGroupPreCreatePeriod
func (c *Config) ShardGroupPreCreateCheckPeriod() time.Duration {
if c.Data.ShardGroupPreCreateCheckPeriod != 0 {
return time.Duration(c.Data.ShardGroupPreCreateCheckPeriod)
}
return DefaultShardGroupPreCreatePeriod
}

// Size represents a TOML parseable file size.
// Users can specify size using "m" for megabytes and "g" for gigabytes.
type Size int
Expand Down
7 changes: 7 additions & 0 deletions cmd/influxd/run.go
Original file line number Diff line number Diff line change
Expand Up @@ -89,6 +89,13 @@ func Run(config *Config, join, version string, logWriter *os.File) (*messaging.B
log.Printf("broker enforcing retention policies with check interval of %s", interval)
}

// Start shard group pre-create
interval := config.ShardGroupPreCreateCheckPeriod()
if err := s.StartShardGroupsPreCreate(interval); err != nil {
log.Fatalf("shard group pre-create failed: %s", err.Error())
}
log.Printf("shard group pre-create with check interval of %s", interval)

// Start the server handler. Attach to broker if listening on the same port.
if s != nil {
sh := httpd.NewHandler(s, config.Authentication.Enabled, version)
Expand Down
24 changes: 17 additions & 7 deletions commands.go
Original file line number Diff line number Diff line change
Expand Up @@ -28,8 +28,9 @@ const (
deleteUserMessageType = messaging.MessageType(0x32)

// Shard messages
createShardGroupIfNotExistsMessageType = messaging.MessageType(0x40)
deleteShardGroupMessageType = messaging.MessageType(0x41)
createShardGroupIfNotExistsMessageType = messaging.MessageType(0x40)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we pre-create shards, why do we still have the "explicit create shards" message? In case pre-creation had failed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pre-create happens after then initial create. I'm not real happy with the naming. One represents more of the "initial" creation now, and the other represents "ongoing" creation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I see.

Is there some way to collapse both of these operations into 1 operation? Why are they different? Why not move to a design such that when a retention policy is first created, create the initial shard groups then, and wait for that to complete. And from then one, background creation happens.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They take different signatures. The first shard creation happens the first time you write data. Until that point in time, we don't know what to create.

After that, we pre-create based on existing shards. So we have to have some pre-processing to determine time buckets for each one, then once that is known, it shares the same code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, OK, let me re-review.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you're saying about timestamps. But if the code computed the timestamp for the "precreate" message before it sent it out on the cluster -- it can, since it knows the shard -- and send the timestamp along, then the mesage signatures would be identical and you wouldn't need two messages.

Basically, I'm trying to get this implementation as simple as possible, without losing functionality. A "create shard group" message could be just that -- create these shards with these timestamps. Why the message is on the cluster, well, it wouldn't care.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, you should note that since we have a policy regarding shard group duration, you know both the start-time and end-time of a shard group. We can have a single "create shard" message, which has database, retention policy, group ID (which is basically the group ID of the group before the one being created). In fact the message doesn't even need the group ID, the message should just be "create this group of shards with this start-time, this end-time, for this database, and this retention policy".

deleteShardGroupMessageType = messaging.MessageType(0x41)
preCreateShardGroupIfNotExistsMessageType = messaging.MessageType(0x42)

// Series messages
dropSeriesMessageType = messaging.MessageType(0x50)
Expand Down Expand Up @@ -69,11 +70,19 @@ type createShardGroupIfNotExistsCommand struct {
Policy string `json:"policy"`
Timestamp time.Time `json:"timestamp"`
}

type deleteShardGroupCommand struct {
Database string `json:"database"`
Policy string `json:"policy"`
ID uint64 `json:"id"`
}

type preCreateShardGroupIfNotExistsCommand struct {
Database string `json:"database"`
Policy string `json:"policy"`
ID uint64 `json:"id"`
}

type createUserCommand struct {
Username string `json:"username"`
Password string `json:"password"`
Expand All @@ -94,11 +103,12 @@ type setPrivilegeCommand struct {
Database string `json:"database"`
}
type createRetentionPolicyCommand struct {
Database string `json:"database"`
Name string `json:"name"`
Duration time.Duration `json:"duration"`
ReplicaN uint32 `json:"replicaN"`
SplitN uint32 `json:"splitN"`
Database string `json:"database"`
Name string `json:"name"`
Duration time.Duration `json:"duration"`
ShardGroupDuration time.Duration `json:"shardGroupDuration"`
ReplicaN uint32 `json:"replicaN"`
SplitN uint32 `json:"splitN"`
}
type updateRetentionPolicyCommand struct {
Database string `json:"database"`
Expand Down
16 changes: 11 additions & 5 deletions database.go
Original file line number Diff line number Diff line change
Expand Up @@ -1016,6 +1016,9 @@ type RetentionPolicy struct {
// Length of time to keep data around. A zero duration means keep the data forever.
Duration time.Duration `json:"duration"`

// Length of time to create shard groups in.
ShardGroupDuration time.Duration `json:"shardGroupDuration"`

// The number of copies to make of each shard.
ReplicaN uint32 `json:"replicaN"`

Expand Down Expand Up @@ -1112,6 +1115,7 @@ func (rp *RetentionPolicy) MarshalJSON() ([]byte, error) {
var o retentionPolicyJSON
o.Name = rp.Name
o.Duration = rp.Duration
o.ShardGroupDuration = rp.ShardGroupDuration
o.ReplicaN = rp.ReplicaN
for _, g := range rp.shardGroups {
o.ShardGroups = append(o.ShardGroups, g)
Expand All @@ -1131,18 +1135,20 @@ func (rp *RetentionPolicy) UnmarshalJSON(data []byte) error {
rp.Name = o.Name
rp.ReplicaN = o.ReplicaN
rp.Duration = o.Duration
rp.ShardGroupDuration = o.ShardGroupDuration
rp.shardGroups = o.ShardGroups

return nil
}

// retentionPolicyJSON represents an intermediate struct for JSON marshaling.
type retentionPolicyJSON struct {
Name string `json:"name"`
ReplicaN uint32 `json:"replicaN,omitempty"`
SplitN uint32 `json:"splitN,omitempty"`
Duration time.Duration `json:"duration,omitempty"`
ShardGroups []*ShardGroup `json:"shardGroups,omitempty"`
Name string `json:"name"`
ReplicaN uint32 `json:"replicaN,omitempty"`
SplitN uint32 `json:"splitN,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SplitN should no longer be a setting on retention policies. They can't set this themselves as it's determined by the number of data nodes in a cluster and their RF.

Duration time.Duration `json:"duration,omitempty"`
ShardGroupDuration time.Duration `json:"shardGroupDuration"`
ShardGroups []*ShardGroup `json:"shardGroups,omitempty"`
}

// TagFilter represents a tag filter when looking up other tags or measurements.
Expand Down
Loading