Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added support for configurable default retention policy on database creation #4866

Merged
merged 2 commits into from
Dec 2, 2015
Merged

Added support for configurable default retention policy on database creation #4866

merged 2 commits into from
Dec 2, 2015

Conversation

pires
Copy link
Contributor

@pires pires commented Nov 21, 2015

Adds support for:

CREATE DATABASE [IF NOT EXISTS] NOAA_water_database
  [WITH
    [DURATION 24hr]
    [REPLICATION 1]
    [NAME something_other_than_default]
  ]
  • CHANGELOG.md updated
  • Rebased/mergable
  • Tests pass
  • Sign CLA

Fixes #2676

@pires
Copy link
Contributor Author

pires commented Nov 21, 2015

Is the following error related to what I did? Seems not.

--- FAIL: TestStore_PrecreateShardGroup (2.68s)
    store_test.go:548: no leader detected during remoteExec

Also, I think I should add tests to validate the default retention policy is properly set on all shards when we use WITH RETENTION. Please confirm and add more test scenarios I should add.

@otoolep
Copy link
Contributor

otoolep commented Nov 22, 2015

I took a quick look at this. I'm not sure why you followed this approach of modifying the function CreateDatabase. Why not just call CreateDatabase followed by CreateRetentionPolicy? Just make two calls, don't modify existing functions.

@pires
Copy link
Contributor Author

pires commented Nov 22, 2015

@otoolep both have crossed my mind but then I looked at CreateDatabase (meta/store.go) and got mixed feelings. What's preferable?
1 Do what I did
2 After database creation check if default retention policy exists, and if:

  • Yes then update default retention policy
  • No then create default retention policy with specificed retention period

Or am missing something?

@otoolep
Copy link
Contributor

otoolep commented Nov 23, 2015

After creating the database, can we not just call CreateRetentionPolicyIfNotExists() with the correct params? Would that work?

We tend to leave simple functions, like CreateDatabase simple if possible.

@pires
Copy link
Contributor Author

pires commented Nov 23, 2015

@otoolep just to make sure, we're talking about the default retention policy, right?

@otoolep
Copy link
Contributor

otoolep commented Nov 23, 2015

Yes, in the sense the idea is if they supply a retention policy duration with the statement, it's applied to the auto-gen default retention.

I do think some of this is not quite clear in our heads, since there might be some edge cases with the auto-gen functionality that is there now.

@pires
Copy link
Contributor Author

pires commented Nov 23, 2015

@otoolep default retention policy auto-gen happens inside CreateDatabase, hence my mixed feelings. Again, after database creation, should we create/update the default retention policy? If we agree on this, I'll reset those changes and implement the correct behavior.

@otoolep
Copy link
Contributor

otoolep commented Nov 23, 2015

Ideally, since @pauldix prefers a statement-based approach to this, I think he needs to decide if we can break the old behaviour. The answer may be no, so the solution may be messier.

I think CREATE DATABASE ... WITH ..... should probably dictate all params of the policy, not just the duration, which removes any ambiguity as far as I can see. As you can see there is still some feature-design going on here. :-)

@pires
Copy link
Contributor Author

pires commented Nov 23, 2015

@otoolep I'm glad to be involved in the feature-design process.

My (easy) choice is to break the old behavior and make database creation dictate the default retention policy, if any.
But the PR as it is right now was done based on a sounding no to breaking changes.

@otoolep
Copy link
Contributor

otoolep commented Nov 23, 2015

Perhaps just a new call CreateDatabaseWithPolicy()? Would that help? I say this because we tend to do this in other code (add a new function instead of changing signatures of existing functions).

@pires
Copy link
Contributor Author

pires commented Nov 23, 2015

@otoolep yes, it's a prettier solution indeed. Should I wait on more feedback, namely from Paul or get on with it?

@pauldix
Copy link
Member

pauldix commented Nov 23, 2015

I'd say no breaking change. Ideally the statement would include the other part of a retention policy, which is the replication factor.

I think CreateDatabaseWithPolicy is the cleanest.

@pires
Copy link
Contributor Author

pires commented Nov 23, 2015

@otoolep @pauldix will get on to it after work.

@pires
Copy link
Contributor Author

pires commented Nov 25, 2015

@otoolep @pauldix following your feedback, I'm proposing CREATE DATABASE [IF NOT EXISTS] NOAA_water_database [WITH [DURATION 24hr] [REPLICATION 1] ] - duration defaults to 0, replication defaults to 1. Agree?

@corylanou
Copy link
Contributor

We should offer the ability to specify a retention policy name as well, so this could be used beyond just the default policy.

@pires
Copy link
Contributor Author

pires commented Nov 25, 2015

@corylanou but that's what CREATE RETENTION POLICY is meant for, is it not?

@corylanou
Copy link
Contributor

Yes, but since this is very close to that, it would be nice if we supported creating a database and a retention policy, and be able to name them both. Many users may have historical reasons for not wanting to have the default RP named default (typically migrating data, etc). It would allow for the most flexibility this way.

@pires
Copy link
Contributor Author

pires commented Nov 25, 2015

@corylanou how different is having what you suggest and the user running both CREATE DATABASE and CREATE RETENTION POLICY ... ON just_created_database...?

@corylanou
Copy link
Contributor

Maybe I'm missing the point of this PR. You could always run two commands to get the same result as above. For me, this is a single command that combines creating a database and retention policy in one command. I'm not sure why we would limit that to just creating a default RP if we go through all the work.

@pires
Copy link
Contributor Author

pires commented Nov 25, 2015

@corylanou please see #2676. I'm putting this on hold until we all agree on what to deliver.

@corylanou
Copy link
Contributor

So I understand why we originally started this PR based on issue #2676. I always try to make sure to look at the bigger picture when making language changes. Also, since making a change to this particular statement later to support any RP, not just default, could result in a breaking change, any change to the InfluxQL language needs a higher level of scrutiny before changing. Looping in @dgnorton as well as he has done the most with the InfluxQL language.

@otoolep
Copy link
Contributor

otoolep commented Nov 25, 2015

Yeah, reading the ticket is the best way to get up to speed on all the work @pires has being doing for us.

The executive summary:

-- autocreation of a default retention policy is a useful feature.
-- what many people have requested is that the duration of the policy be configurable, since if they don't want Infinite retention, auto-creation isn't useful to them.
-- I suggested we include an option in the config to allow the duration to be overriden.
-- @pauldix isn't a fan of putting stuff in the config file, when it's a source of confusion. What happens if the user modifies the policy's duration, to a value different from the config?
-- @pauldix requested that this feature be implemented as a change to the query language, but in such a way that is optional.
-- In other words, if you want an retention policy created automatically when the database is created, you use a different form of the command.
-- the existing functionality would remain in place, but we'd probably deprecate it over time, in favour of the InfluxQL version.

@corylanou
Copy link
Contributor

So I understand why this PR came into existence, and the history behind it. My point is we are making a change to the query language that can with just a couple modifications support more than just default. My concern in not addressing it before we make this change is that it may not be something we can do in a future PR without it being a breaking change. I may be the only one that sees this as a useful addition to the work already done.

@otoolep
Copy link
Contributor

otoolep commented Nov 25, 2015

I believe I follow you @corylanou

However, I don't consider this a breaking change. How is it breaking? The extra params are optional.

@pauldix
Copy link
Member

pauldix commented Nov 25, 2015

He's saying if we push this as is and then later decide we want to give people the option to add a name for this default retention policy, that later change would be breaking.

I'm in support of having the name be something they can specify. However, @corylanou's concern is covered in his syntax. If we decide to let them specify name later then it would simply add another optional argument name. Thus we can add it later without breaking. e.g:

CREATE DATABASE [IF NOT EXISTS] NOAA_water_database 
  [WITH [DURATION 24hr] [REPLICATION 1] [NAME something_other_than_default] ]

Looking at the code it seems that the support for specifying the replication factor still needs to be added in so the. The function definitions for CreateDatabase should be taking a RetentionPolicy object instead of just the duration number that they're currently taking.

Thanks for all your work @pires. I think your new syntax is a good idea along with an update that I just mentioned. Does all that make sense? @corylanou and @otoolep, you guys on board or are there still concerns?

@corylanou
Copy link
Contributor

My concerns are addressed. Thanks!

@@ -997,7 +996,7 @@ func (s *Store) Databases() (dis []DatabaseInfo, err error) {
}

// CreateDatabase creates a new database in the store.
func (s *Store) CreateDatabase(name string) (*DatabaseInfo, error) {
func (s *Store) CreateDatabase(name string, retentionPeriod time.Duration) (*DatabaseInfo, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to we're clear, @pauldix is saying this should be RetentionPolicyInfo type, not just a duration.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for making sure I got it right ;)

@otoolep
Copy link
Contributor

otoolep commented Nov 25, 2015

I'm fine with the change, as long as the update requested by @pauldix is made.

Thanks @pires !

@pires
Copy link
Contributor Author

pires commented Nov 25, 2015

Thank you @corylanou @otoolep and @pauldix for sharing the discussion that led to the following design decision:

CREATE DATABASE [IF NOT EXISTS] NOAA_water_database
  [WITH
    [DURATION 24hr]
    [REPLICATION 1]
    [NAME something_other_than_default]
  ]

I'm resuming the PR now.

@pires
Copy link
Contributor Author

pires commented Nov 27, 2015

@pauldix @otoolep @corylanou while implementing the parsing stuff I started wondering if NAME is a good keyword for retention policy name. Thoughts?

@pauldix
Copy link
Member

pauldix commented Nov 27, 2015

I think NAME is good. Although I'm flexible if you have another idea you think fits better.

@pires
Copy link
Contributor Author

pires commented Nov 27, 2015

@pauldix I don't. Fixing tests and will update the PR for your review.

@pires
Copy link
Contributor Author

pires commented Nov 27, 2015

@pauldix @otoolep @corylanou please review updated PR.

_, _ = buf.WriteString(s.RetentionPolicyDuration.String())
_, _ = buf.WriteString("REPLICATION ")
_, _ = buf.WriteString(strconv.Itoa(s.RetentionPolicyReplication))
_, _ = buf.WriteString("NAME ")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What will this render if NAME is not set?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will it render a statement that is actually syntactically incorrect?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see, it will set it to "default". OK.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, default.

@otoolep
Copy link
Contributor

otoolep commented Nov 28, 2015

Looks great @pires -- thanks for sticking with us. I'd like one small change made (the variable renamed) but I don't need to review again. +1.

@pires
Copy link
Contributor Author

pires commented Nov 28, 2015

Thank you @otoolep, nits addressed.

@pires
Copy link
Contributor Author

pires commented Dec 2, 2015

Any updates on this?

@pauldix
Copy link
Member

pauldix commented Dec 2, 2015

This is great, thanks @pires!

pauldix added a commit that referenced this pull request Dec 2, 2015
Added support for configurable default retention policy on database creation
@pauldix pauldix merged commit 218ef13 into influxdata:master Dec 2, 2015
@pires
Copy link
Contributor Author

pires commented Dec 2, 2015

@pauldix my pleasure. I've been learning a lot from you, @otoolep and @corylanou.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants