-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
config_wrapper: Add support for named stores #7946
Conversation
Signed-off-by: Ashwanth Goli <iamashwanth@gmail.com>
./tools/diff_coverage.sh ../loki-target-branch/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell. + ingester 0%
+ distributor 0%
+ querier 0%
+ querier/queryrange 0%
+ iter 0%
- storage -1.9%
+ chunkenc 0%
+ logql 0%
+ loki 0% |
./tools/diff_coverage.sh ../loki-target-branch/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell. + ingester 0%
+ distributor 0%
+ querier 0%
+ querier/queryrange 0%
+ iter 0%
- storage -1.9%
+ chunkenc 0%
+ logql 0%
+ loki 0% |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍 and great job updating the _index.md
file with the storage configuration flags 🏆
@@ -217,8 +222,20 @@ var ( | |||
}, | |||
{ | |||
Name: "local_storage_config", | |||
StructType: reflect.TypeOf(local.Config{}), | |||
StructType: reflect.TypeOf(local.FSConfig{}), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch ⭐
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 🌟 thanks for addressing the comments. Just adding some minor comments about the unit tests 🙂
secret_access_key: 789abc` | ||
config, _ := testContext(namedStoresConfig, nil) | ||
|
||
// should be set by common config |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, since we defined storage in 1) common config, and in 2) named storage, the common config will be ignored and the named storage will be the one used, right? Any way we can assert that in the test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
common config won't be ignored as it overrides StorageConfig (if it's not configured). And users can choose to use one or both by setting objectType across periodConfigs.
aws
for using configuration from StorageConfig (which actually comes from common config)
aws.named-store-key-name
for using the named store
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah got it 👍 not necessary for this PR, but maybe it would be useful to add this information to period_config, that one can set it via the StorageConfig or via named_stored
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 🏆 the check-doc
CI step is failing, so you need to re-run make doc
to update the _index.md file
./tools/diff_coverage.sh ../loki-target-branch/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell. + ingester 0%
+ distributor 0%
+ querier 0%
+ querier/queryrange 0%
+ iter 0%
- storage -1.9%
+ chunkenc 0%
+ logql 0%
+ loki 0% |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! I think the tests are insufficient though
pkg/storage/factory.go
Outdated
// extact storeType for named stores | ||
if words := strings.SplitN(name, ".", 2); len(words) == 2 { | ||
storeType = words[0] | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be its own function with a test associated
pkg/storage/factory.go
Outdated
) | ||
|
||
// extact storeType for named stores | ||
if words := strings.SplitN(name, ".", 2); len(words) == 2 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than requiring we use <type>.<name>
as a convention in the configuration, what if we allowed specifying only the store's name and resolved the type from that? As a bonus, this would also work with stores that used the reserved .
characters, such as name: "my.s3.compatible.store"
. I think this would be easier to use and understand (I had to check test cases to realize the convention of putting the .
in the store name to look it up). Note: if we made this change, all the individual types would be reserved store names (for example, you couldn't redefine another store with name aws
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was also a bit worried that this convention might not be the most intuitive.
Just to make sure I understand it correctly: we want to drop the type prefix when referring to the named store, but the way we are configuring names stores can remain the same right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd make it so we can refer to a single store by the name alone and the type could be inferred. Validation would ensure that there are no duplicates across different types and there are no custom stores with legacy provider types (i.e. aws
).
Hopefully this answers your question.
./tools/diff_coverage.sh ../loki-target-branch/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell. + ingester 0%
+ distributor 0%
+ querier 0%
+ querier/queryrange 0%
+ iter 0%
+ storage 0%
+ chunkenc 0%
+ logql 0%
+ loki 0% |
./tools/diff_coverage.sh ../loki-target-branch/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell. + ingester 0%
+ distributor 0%
+ querier 0%
+ querier/queryrange 0%
+ iter 0%
+ storage 0%
+ chunkenc 0%
+ logql 0%
+ loki 0% |
./tools/diff_coverage.sh ../loki-target-branch/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell. + ingester 0%
+ distributor 0%
+ querier 0%
+ querier/queryrange 0.1%
+ iter 0%
- storage -1.4%
+ chunkenc 0%
+ logql 0%
+ loki 0% |
pkg/storage/factory.go
Outdated
return nil | ||
} | ||
|
||
func (ns NamedStores) lookupStoreType(name string) (string, bool) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels a bit weird to me; why are we looking it up in the YAML rather than capturing this type when we instantiate the named store?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NewObjectClient
or NewChunkClient
only accept the named store reference key or one of the predefined stores name (aws, gcs...) but with this name alone we do not know which map to access among the fields of NamedStores.
so we'd have to lookup each of these maps. Instead using a bit of meta programming to reduce the amount of boilerplate code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, I think it'd be far simpler to just keep a mapping in the NamedStores
struct of identifier to types than doing all this fancy footwork.
I don't want to hold up this PR further, so we can leave it as-is, but in general I'd suggest erring on the side of simplicity and readability over conciseness; it'll work out better in the long run (IMHO).
} | ||
|
||
func (ns NamedStores) validate() error { | ||
for name, awsCfg := range ns.AWS { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Howcome we're only validating these 3, out of interest?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only 3 out of the supported named stores have a validation method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, just a suggestion for the future:
It'd be cleaner to iterate over each config and see if it satisfies
interface {
Validate() error
}
than special-casing these 3 stores. It also has the benefit of expressing the intent more clearly (as opposed to this code which seems to only validate these 3 for some unknown reason).
pkg/storage/factory_test.go
Outdated
@@ -50,23 +54,65 @@ func TestFactoryStop(t *testing.T) { | |||
store.Stop() | |||
} | |||
|
|||
func TestCassandraInMultipleSchemas(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious why Cassandra is being tested specifically here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure, I didn't touch this
must have moved it when adding new tests
|
||
store.Stop() | ||
// FSObjectClient creates the configured dir on init, ensure that correct cfg is picked by checking for it. | ||
_, err = os.Stat(cfg.NamedStores.Filesystem["named-store"].Directory) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should first check that this directory does not exist before instantiating the store, to account for false-positives.
pkg/storage/factory_test.go
Outdated
|
||
// dir specified in StorageConfig/FSConfig should not be created as we are not referring to it. | ||
_, err = os.Stat(cfg.FSConfig.Directory) | ||
require.Error(t, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems redundant
require.Error(t, err) |
pkg/storage/factory.go
Outdated
@@ -65,6 +71,98 @@ type NamedStores struct { | |||
Swift map[string]openstack.SwiftConfig `yaml:"swift"` | |||
} | |||
|
|||
func (ns NamedStores) checkForDuplicates() error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please explain the rationale for approaching validation in this way?
Wouldn't it be easier to just retrieve the config and scan through all period configs and find duplicates across store types? Or are you doing it this way to spot duplicates within the same type?
This seems quite complicated, and I'm curious what problems you ran into here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
finding duplicates would require checking the keys from each of the maps defined under NamedStores. Doing it this way to avoid the amount of redundant code needed to lookup each field.
I could drop this for the simpler approach too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simpler is usually better; it's less prone to bugs.
No need to change it now, just a suggestion for the future.
pkg/storage/factory.go
Outdated
config.StorageTypeGCP, config.StorageTypeGCPColumnKey, config.StorageTypeBigTable, config.StorageTypeBigTableHashed, config.StorageTypeGCS, | ||
config.StorageTypeAzure, config.StorageTypeBOS, config.StorageTypeSwift, config.StorageTypeCassandra, | ||
config.StorageTypeFileSystem, config.StorageTypeInMemory, config.StorageTypeGrpc: | ||
return fmt.Errorf("named store %s should not match with the name of a predefined storage type", name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: i'd suggest changing these from %s
to %q
so that error messages will be a little more clear:
named store aws should not match...
vs
named store "aws" should not match...
pkg/storage/factory_test.go
Outdated
newSchemaCfg.From = config.DayTime{Time: model.Now()} | ||
storeType, ok := ns.lookupStoreType("store-1") | ||
assert.True(t, ok) | ||
assert.Equal(t, "gcs", storeType) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use constants for these store type values
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for being patient with the review feedback and simplifying the approach!
./tools/diff_coverage.sh ../loki-target-branch/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell. + ingester 0%
+ distributor 0%
+ querier 0%
- querier/queryrange -0.1%
+ iter 0%
- storage -1.7%
+ chunkenc 0%
+ logql 0%
+ loki 0% |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Nice work!
Signed-off-by: Ashwanth Goli iamashwanth@gmail.com
What this PR does / why we need it:
This PR adds support for named stores which allows users to define multiple object store configurations for a storage type. Named stores can be referred to from other sections of the config similar to how provider type is being used
i.e aws, gcs...
. Named store reference should be of the format:<store_type>.<key_name>
Below is an example configuration with two named stores defined for aws
store-1, store-2
. Schema config is referring to one of them asaws.store-1
Which issue(s) this PR fixes:
Fixes #7276
Special notes for your reviewer:
Configuration defined in common storage config has no effect on the named stores. Common storage config only overrides storageConfig if it's not explicitly set - same behavior as before. This should be a non-intrusive change.
Checklist
CONTRIBUTING.md
guideCHANGELOG.md
updateddocs/sources/upgrading/_index.md