Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Special expansion for env var name #91

Conversation

execjosh
Copy link

@execjosh execjosh commented Oct 28, 2023

Let me first say that I love this package. Thank you so much for your work!

I would really appreciate your having a look at this proposal, and I look forward to discussing it. Thank you very much in advance for your time.

What

Provide a special expansion for the name of the environment variable, including its prefix.

I am uncertain about what to name this magical expansion. I opted for $@, which is what GNU make uses to reference the target, and shell uses to reference all of the parameters. It could really be anything that is not normally a valid environment variable name; I am open to suggestions. I think $_ seems like a safe bet.

Why

It can be advantageous to set the default value to the actual name used, prefix and all.

I will explain a bit about my usecase, and why I think this change would be helpful for other people.

A project I work on, which unfortunately does not use this package, accesses secrets which are set in a secret manager. The names of those secrets are matched with environment variables. If a corresponding environment variable for a field is set, its value is used directly instead of accessing the secret manager. If, however, the corresponding environment variable is not set, then the project accesses the secret manager using the name of the environment variable.

Some new requirements have led to a need to expand on the current implementation, where basically we will have multiple groups of the same set of secrets with different prefixes. Instead of adding a lot of copypasta, I want to migrate to using this package. However, there is no way that I could see to have as a default value the actual name of the environment variable that was used at runtime.

It is of course possible to set the default value directly to the same name as the environment variable. As far as I can tell, though, there is no way to have the prefix, if any, correctly applied.

Here is some sample code which will hopefully illustrate the pain point with prefixes.

// Each field can have a different static default value here.
// The default for ASecret1 != BSecret1 != CSecret1, etc.
type FlatConfig struct {
	ASecret1 string `env:"A_SECRET1,default=sm://A_SECRET1"`
	// ... many different secrets and non-secrets for A

	BSecret1 string `env:"B_SECRET1,default=sm://B_SECRET1"`
	// ... many different fields for B, which mirror those for A

	CSecret1 string `env:"C_SECRET1,default=sm://C_SECRET1"`
	// ... many different fields for C, which mirror those for A and B

	// - Adding in a D or E, which should also mirror A, B, and C, requires lots
	//   of copypasta...
	// - Adding in a new field requires copypasta for each section...
	// - Want to keep this simple and DRY
}

// All fields of SharedConfig get the same default values.
// The default for ConfigA.Secret1 == ConfigB.Secret1 == ConfigC.Secret1, etc.
// It would be nice to add dynamic context to the default so that the default
// for ConfigA.Secret1 != ConfigB.Secret1 != ConfigC.Secret1, etc.
type NestedConfig struct {
	ConfigA SharedConfig `env:",prefix=A_"`
	ConfigB SharedConfig `env:",prefix=B_"`
	ConfigC SharedConfig `env:",prefix=C_"`
	// ... many shared configs with different prefixes
}

type SharedConfig struct {
	Secret1 string `env:"SECRET1,default=sm://needs-to-be-the-resolved-key-but-is-non-contextual"`
	// ... many different fields
}

Other Considerations

There were at least two different implementations considered which could achieve the same desired result (i.e., to set the default value to the properly prefixed key). Perhaps this behavior is a bug similar to #85? However, fixing it could potentially break current clients which depend on or expect k to not have the prefix, and, as alluded to in this comment, would probably have to wait until 1.0.

Only on the User End

I tried implementing this with a client-specific keyword for the default value along with a MutatorFunc. However, the MutatorFunc is not passed the actual key that was used (i.e., the value of parameter k is missing the prefix when there is one). This is why I felt the need for a magic expansion which evaluates to the actual key.

A New Configuration Option

I considered implementing a new configuration option, default_to_key (hopefully with a better name), that takes no value (similar to required) and is mutually exclusive with both required and default.

Something like this:

type MyStruct struct {
  Port int `env:"PORT,default_to_key"`
}

While this would be relatively simple to implement, I could not justify opening the floodgates to having a new configuration option for potentially each and every kind of special default, with this just being the first one.

A Special Keyword

I considered implementing a special keyword for use within default. I had not decided on any particular keyword, but POSIX regular expression character classes will suffice for illustration.

Potentially something like this:

type MyStruct struct {
  Port int `env:"PORT,default=[[:key-goes-here:]]"`
}

This would require extra pre-/post-processing of the default string in addition to the current shell expansion logic. I could not justify adding more complexity when leveraging the shell expansion logic is entirely sufficient.

envconfig.go Outdated Show resolved Hide resolved
Comment on lines +528 to +531
if pl, ok := l.(*prefixLookuper); ok {
return pl.prefix + key
}
return key
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is exactly the same code as above. Perhaps it should be extracted into a small utility function like:

func actualKeyName(l Lookuper, key string) string {
	if pl, ok := l.(*prefixLookuper); ok {
		return pl.prefix + key
	}
	return key
}

Copy link
Contributor

@kinbiko kinbiko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that my approval matters on this repo, just signaling that I've got nothing further after seeing the latest commits/replies. 👍

@sethvargo
Copy link
Owner

Hi @execjosh thank you for opening a PR. I'm having trouble following the use case here. If I understand correctly, you want the default value to have some kind of dependency on a prefix?

The example you provided

type A struct {
  Field string `env:"FOOBAR,default=banana"`
}

will always default the value of Field to "banana", I don't follow what alternative behavior you're expecting. If you want the default value to be the value of the environment variable named banana, then you would use:

type A struct {
  Field string `env:"FOOBAR,default=$banana"`
}

Note the addition of the prefixed $, which instructs envconfig to resolve the name against the lookuper.

Here's an example of a resolver that handles a secret lookups.

@execjosh
Copy link
Author

execjosh commented Oct 30, 2023

Hi @sethvargo. Thanks for having a look. Apologies for my explanation was a bit longwinded...

TL;DR of the usecase: need to execute secondary processing on fields which are actually intended to contain secrets and whose looked up env vars were undefined (i.e., the Lookuper.Lookup returned the empty string and false).

TL;DR magical expansion: MutatorFuncs do not get invoked unless the lookup succeeds or there is a default; and, MutatorFuncs are not provided the actual key used for the lookup; and, defaults currently have expansion implemented; so, leveraging the expansion functionality to inject the actual key into the default value is a small change; and, that could be useful in other usecases 💡

Mutator

I indeed wanted to use a MutatorFunc very similar to the example you have provided. However, the mutators only get invoked when either the lookup succeeds or there is a default value.

Problem

Mutators get the key that is written in the struct tag, not the actual key used in the lookup (this behavior might be considered a bug). Even with a default value indicating that the secret manager should be contacted, the correct secret name (i.e., the actual key used in the lookup) is unavailable.

Possible Solutions

Implement some way to provide the actual key used in the lookup as a default value. This is probably generic enough that it could benefit other usecases not considered here.

Custom Lookuper

Instead of a MutatorFunc, the secret could be looked up using a MultiLookuper with a custom Lookuper, always positioned last in the chain, that contacts the secret manager. The secret Lookuper would be invoked on every field whose env var is undefined.

Problem(s)

There is no point in contacting the secret manager for fields which are not secrets (e.g., ENVIRONMENT, PORT, or whatever). Even if the fields have defaults specified, the secret Lookuper would still be executed every time.

The Lookuper interface seems to have been intended for local lookups only and therefore does not provide a way to relay information about for example network failures (like authentication issues, timeouts, etc.)—an error cannot be returned. So the secret Lookuper would have to log for itself. The boolean return value would gain a small bit of nuance.

Possible Solutions

  1. Env var naming conventions like adding SECRET somewhere in the env var name which the Lookuper can use to decide its course of action
    • No need to change this package
    • Kind of clumsy for the users (lots of reimplementations)
  2. Implement some way in this package to mark a field as being a secret, for example a new configuration option
    • Heavy-handed for such a specific case
    • Would break things, since MutatorFuncs and Lookupers don't have access to the struct tags

@sethvargo
Copy link
Owner

Sorry, I'm still not following. I've done exactly what you're describing using MutatorFuncs and a made-up protocol (prefixing values with sm:// to fake a protocol) without an issue.

There's a lot of context and goals that aren't coming out in the description or contrived examples. This feels like an xyproblem - what problem are you trying to solve? Please be specific and provide concrete examples instead of foobars.

You've also made reference to running Mutators on unfound values - what is the use case for that?

In general, it shouldn't matter what the environment variable keys are for mutators. If you want to control a custom lookup for mutators, you look at the value. If the value for the environment variable matches the format that you expect to use for secret resolution (example), then you pass it off to the secret manager to resolve.

I'm not supportive of relying on environment variable keys as a mechanism for deciding which secret to lookup.

Provide a special expansion for the name of the environment variable, including
its prefix.
@execjosh
Copy link
Author

Thank you for the feedback!

This feels like an xyproblem - what problem are you trying to solve? Please be specific and provide concrete examples instead of foobars.

Indeed 😅 I have prepared an example, which hopefully helps illustrate the problem more concretely.

The Problem

The MutatorFuncs are not given the actual key that was used for the lookup in cases where there is a prefix. The value parameter can be used to decide whether to access an external API, but the correct key is currently unavailable.

Possible Solutions

  1. Treat it as a bugfix and pass the actual key along to the MutatorFuncs
    • Changes current MutatorFunc behavior/expectations
      • Possibly a breaking change, which, going by this comment, would have to wait for 1.0
    • Simple change
  2. Add a magic expansion for use in the default value (this PR)
    • Does not change current MutatorFunc behavior/expectations
    • Simple change
    • Adds some complexity to the logic for default
    • Could be considered too magical

I'm not supportive of relying on environment variable keys as a mechanism for deciding which secret to lookup.

In my particular usecase, the system defaults to retrieving a value from an external API that happens to be a secret manager, but could just as easily be any other external API like a cache or whatever. No env vars need to be set this way unless intentionally overriding what would have been retrieved from the external API.

@execjosh execjosh force-pushed the env-var-name-self-ref-def-val-expansion branch from c94d297 to e6eb364 Compare October 30, 2023 23:05
@sethvargo
Copy link
Owner

Treat it as a bugfix and pass the actual key along to the MutatorFuncs - Possibly a breaking change, which, going by this comment, would have to wait for 1.0

It's probably worth exploring, since I'm not sure I agree this is a bug. Mutators are intended to be written from the viewpoint of the struct, and the struct has no understanding or visibility into what prefixes it might be called with.

I can't fix this in a backwards-incompatible way, but I can do it in a way that will minimize the number of people affected. I think the real solution is to change the type signature of the MutatorFunc to return the originalKey, resolvedKey, originalValue, and currentValue. While we're there, we should probably also add a third return parameter that indicates whether processing in the mutator stack should stop.

I have this working, but it requires changing the type signature from ...MutatorFunc to ...any, and using runtime type assertions. The problem is that such a change would break anyone doing something like this:

var fns []envconfig.MutatorFunc
fns = append(fns, ...)
envconfig.ProcessWith(ctx, ..., fns...)

because the type signature for the variadic would be []any not []MutatorFunc with those changes. However, it wouldn't break people who do this:

envconfig.ProcessWith(ctx, ..., func1, func2)

POC is here. Ideally I'd just drop support for the old mutator signature entirely and maintain type safety, but I have no data on how prevalent the various usage patterns are. Searching OSS repos is a bit of a mixed bag.

Indeed 😅 I have prepared an example, which hopefully helps illustrate the problem more concretely.

The part I'm struggling to follow is why the map is keyed off of the envvar keys instead of the envvar values; that's the metaphorical equivalent of hard-coding something. Presumably you'd want something more like this, which keys off of the secret values themselves?

@execjosh
Copy link
Author

execjosh commented Oct 31, 2023

Mutators are intended to be written from the viewpoint of the struct, and the struct has no understanding or visibility into what prefixes it might be called with.

Interesting! I understand now where you're coming from. My interpretation had been different: that it was from the viewpoint of the entire configuration struct which is passed as i to Process{,With}, not just the specific struct with the actual tag. I probably made that assumption because the list of MutatorFuncs is passed along with i. I viewed the individual structs as implementation details within the entire configuration as a whole.

Your POC would certainly accommodate my usecase. It is unfortunate that it has to resort to using any. It will be interesting to see how Go changes in the future to handle this kind of case. But, I digress.

While we're there, we should probably also add a third return parameter that indicates whether processing in the mutator stack should stop.

Is my understanding correct that the mutator funcs are intended to be more of a middleware-esque chain of, well, mutations? Perhaps there could be a next argument, like is done with many other middleware implementations, which would allow each MutatorFunc in the chain to decide explicitly whether to call next or not. Control of the parameters for the next in the chain would have to be surrendered to the MutatorFuncs themselves. Mutators could end up mutating both the key and value parameters, and could perform pre-next and post-next mutations. Maybe next is not the best idea; but, it might be worth exploring a bit.

The part I'm struggling to follow is why the map is keyed off of the envvar keys instead of the envvar values; that's the metaphorical equivalent of hard-coding something. Presumably you'd want something more like this, which keys off of the secret values themselves?

Yes, keying off of the value is essentially what I initially wanted to do, following the example in the README 🙂 However, it won't work in the prefixing case, because we need more information to choose the correct key to use for the external API call. That is what led me to make this pull request 🙂

For a flat config without dynamic prefixes, it is possible to explicitly set different default values. Ideally, the same thing should be possible with a nested config, albeit dynamically, in order to keep things simple and DRY. Here is an example with some comments to illustrate:

// Each field can have a different static default value here.
// The default for ASecret1 != BSecret1 != CSecret1, etc.
type FlatConfig struct {
	ASecret1 string `env:"A_SECRET1,default=sm://A_SECRET1"`
	// ... many different secrets and non-secrets for A

	BSecret1 string `env:"B_SECRET1,default=sm://B_SECRET1"`
	// ... many different fields for B, which mirror those for A

	CSecret1 string `env:"C_SECRET1,default=sm://C_SECRET1"`
	// ... many different fields for C, which mirror those for A and B

	// - Adding in a D or E, which should also mirror A, B, and C, requires lots
	//   of copypasta...
	// - Adding in a new field requires copypasta for each section...
	// - Want to keep this simple and DRY
}

// All fields of SharedConfig get the same default values.
// The default for ConfigA.Secret1 == ConfigB.Secret1 == ConfigC.Secret1, etc.
// It would be nice to add dynamic context to the default so that the default
// for ConfigA.Secret1 != ConfigB.Secret1 != ConfigC.Secret1, etc.
type NestedConfig struct {
	ConfigA SharedConfig `env:",prefix=A_"`
	ConfigB SharedConfig `env:",prefix=B_"`
	ConfigC SharedConfig `env:",prefix=C_"`
	// ... many shared configs with different prefixes
}

type SharedConfig struct {
	Secret1 string `env:"SECRET1,default=sm://needs-to-be-the-resolved-key-but-is-non-contextual"`
	// ... many different fields
}

In other words: my usecase requires that the default be contextual.

This is what led to my assumption that passing the original instead of the resolved key to the MutatorFuncs was a bug. It so happens that the system I'm working on mirrors the keys for the secrets in the env vars. That is the reason it is advantageous, in my usecase, to reference the resolved key—either directly or indirectly through the default—in the MutatorFunc. A workaround would be to set in the environment the env vars corresponding to each and every single secret to the "correct" defaults; but, that looks an awful lot like FlatConfig. Ideally, the MutatorFunc should just be able use the value and not have to worry about the key at all.

@sethvargo
Copy link
Owner

Is my understanding correct that the mutator funcs are intended to be more of a middleware-esque chain of, well, mutations? Perhaps there could be a next argument, like is done with many other middleware implementations, which would allow each MutatorFunc in the chain to decide explicitly whether to call next or not. Control of the parameters for the next in the chain would have to be surrendered to the MutatorFuncs themselves. Mutators could end up mutating both the key and value parameters, and could perform pre-next and post-next mutations. Maybe next is not the best idea; but, it might be worth exploring a bit.

Yea - this is also an option.

I think I understand the use case a bit more, but I don't see a way to solve this in a backwards-compatible way that doesn't add significant complexity to the existing codebase. I'd rather lump this in with changes I want to make for 1.0 (around December of this year).

@execjosh
Copy link
Author

execjosh commented Nov 1, 2023

I think I understand the use case a bit more

Great!

I don't see a way to solve this in a backwards-compatible way that doesn't add significant complexity to the existing codebase. I'd rather lump this in with changes I want to make for 1.0 (around December of this year).

I'll be looking forward to December and 1.0 😃

@execjosh
Copy link
Author

execjosh commented Nov 1, 2023

FWIW, I separately investigated creating a custom Lookuper to generate the sm:// "URI" using the resolved keys. The URI would either be immediately returned (effectively a dynamic default), or, used to actually perform the external API call within the Lookuper directly (negating the need for a resolver MutatorFunc). This encounters a problem: the Lookuper must decide, based solely on the resolved key, whether that key is indeed one which necessitates contacting the external API. I could think of no "nice" way to do that, since there is not enough information encoded within the env var keys for the Lookuper to make good choices. Moreover, I feel that encoding meaning into env var keys just for the sake of these Lookupers is, as you alluded to before, probably not such a good mechanism.

I came to the conclusion that: external API calls are best implemented in MutatorFuncs, even for common usecases such as grabbing information from the metadata API or a secret manager, since it is trivial to make decisions based on the current value (e.g., checking for fake protocols like metadata://, sm://, etc.).

There is one narrow usecase where it might make sense to have external API calls in a Lookuper, though: all values are in some shared source-of-truth thing to which the Lookuper can blindly call out without making decisions based on the keys. Though since that really loosely expands the definition of "environment", it might not even be worth accommodating. There would need to be some way to propagate from the call to Lookup any errors back to the original call to Process{,With}. Currently, only success/failure can be communicated; so, some sort of out-of-band correlation needs to happen (e.g., separate log entries), which is flimsy. This, though, is probably best left for a separate discussion... 😅

@execjosh execjosh marked this pull request as draft November 1, 2023 10:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants