Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reading variables from file #2171

Open
wants to merge 28 commits into
base: main
Choose a base branch
from
Open

Reading variables from file #2171

wants to merge 28 commits into from

Conversation

ilyakuz-db
Copy link
Contributor

@ilyakuz-db ilyakuz-db commented Jan 16, 2025

Changes

New flag --var-file which accepts JSON file with variables values. Variable values can have any type including complex type

--var-file has default value: .databricks/bundle/<target>/variable-overrides.json, i.e. CLI tries to stat and read that file every time during variable initialisation phase

It is possible to use this flag together with --var flag, priority of the value will be given to --var.

Tests

Acceptance tests

Copy link
Contributor

@pietern pietern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work, thanks!

@@ -6,4 +6,5 @@ import (

func initVariableFlag(cmd *cobra.Command) {
cmd.PersistentFlags().StringSlice("var", []string{}, `set values for variables defined in bundle config. Example: --var="foo=bar"`)
cmd.PersistentFlags().String("vars-file-path", "", `file path to a JSON file containing variables. Example: --vars-file-path="/path/to/vars.json"`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's call this --var-file, shorter and same meaning.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 or --varfile, don't have to remember to put dash.

vars := map[string]any{}
err = json.Unmarshal(f, &vars)
if err != nil {
return diag.FromErr(fmt.Errorf("failed to parse variables file: %w", err))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you pass a file with "foo" in there, or a list, this will return an error without suggesting what the right format is. Could you look into using jsonloader here? It would be nice to return a diag.Error with details suggesting what the correct format is, if the contents doesn't match what we expect.

// Initialize variables by assigning them values passed as command line flags
diags = diags.Extend(configureVariables(cmd, b, variables))
if len(variables) > 0 && variableFilePath != "" {
return b, diag.Errorf("cannot specify both --var and --vars-file-path flags")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems like unnecessary constraint? Why not process one (e.g. file first) and then the other?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC we discussed on meeting that it may cause confusion in some edge cases but with low profit:

  1. is it be obvious to give priority to second arg?
  2. since both flags might be partial - should we merge them or not
  3. non-obvious type mismatch between values of these flags (--var is always string)

Copy link
Contributor

@denik denik Jan 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it be obvious to give priority to second arg?

What do you mean by second? Simple thing would be to have fixed priority: --var takes over --varfile. Complex but better thing would be to take into account order on the command line --varfile file1 --var ... --varfile file2, but we can leave that out of scope.

since both flags might be partial - should we merge them or not

Yes. Same as we do with env and command line parameters currently, we merge them. Command line take priority of both specify the same var.

non-obvious type mismatch between values of these flags (--var is always string)

Does not matter, we do not do any type of merging or other logic based on type. We just select the value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by second?

Yeah I meant order. I'm not sure whether it's clear to user, they should follow documentation to understand this priority

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's not too hard to implement following command-line arguments order that would be great.

Otherwise fixed order is much better that complete disable --var.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So in general my concern is whether it brings value to user because the use case of using both flags might be not so common but it may add some confusion and code becomes more complex

Variable resolution might feel non-obvious already and it will add more uncertainty

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also if we add this feature now it won't be possible to remove later because of backward-compatibility but if we restrict now we can always allow later if customers ask for that

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, passing --var twice for the same argument raises an error: #2176

So to be consistent with that I change my suggestion to raise an error if variable is defined in both --var and --varfile. But I still think it should be per-variable check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's discuss in the slack thread? I don't mind adding merge but at the same time I don't want logic to be too complex because we already have multiple layers of overrides and with files added we will also have some type inconsistency between values

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed offline - we can limit the scope of this PR by erroring in case both --var and --var-file is used and we'll revisit that later.

$CLI bundle validate -o json --vars-file-path=vars.json | jq .resources.jobs.job1.job_clusters[0]

# variable flag
$CLI bundle validate -o json --var="cluster_key=mlops_stacks-cluster" | jq .resources.jobs.job1.job_clusters[0].job_cluster_key
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about testing

  • both --var and --vars-file-path?
  • file not found
  • file cannot be parsed
  • file has variable name that is not defined
  • file has variable name that is complex but default is string and vice versa
  • variable is required but it's not provided in the file

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

file has variable name that is not defined
file has variable name that is complex but default is string and vice versa
variable is required but it's not provided in the file

@denik Not sure about these, they are part of generic variable handling, should they be covered here? It feels like we starting to cover edge cases in place where it shouldn't be tested, WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added tests, but would be nice to hear your opinion on testing edge cases

What I don't like in that approach is that snapshot tests are very easy to add/update but at the same time hard to debug and I've seen many times how such tests became write-only thing when people didn't want to investigate the failure and just updated the snapshot

// Variables can have any type of value, including complex types
func (r *Root) InitializeAnyTypeVariables(vars map[string]any) error {
for name, val := range vars {
if _, ok := r.Variables[name]; !ok {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: instead of assigning to _, let's give it a name and use below to call Set() on it.

assert.Equal(t, "value", (root.Variables["string"].Value))
assert.Equal(t, 1, (root.Variables["int"].Value))
assert.Equal(t, []map[string]int{{"c": 3}}, (root.Variables["complex"].Value))
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does it add on top of acceptance test? I think it can be removed/merged into acceptance test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better to also have simple localised check? It also serve showcase purpose

Acceptance test is nice because it's top-level and is really easy to add but at the same time they are harder to debug in case of failure

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added acceptance tests, let me know what you think of keeping these as well

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If acceptance tests cover the same functionality, I'd remove this ones, because they have non-zero maintenance cost.

assert.Equal(t, []map[string]int{{"c": 3}}, (root.Variables["complex"].Value))
}

func TestInitializeAnyTypeVariablesUndeclared(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, it would be good to see full output, so acceptance test is preferred. That shows whether you use diagnostics correctly.

// a. Variable value obtained from arguments, example: `--var="foo=bar"`
// b. Variable value obtained from the file, example: `--vars-file-path="/path/to/file"`
// 2. Environment variable. eg: BUNDLE_VAR_foo=bar
// 3. Default value as defined in the applicable targets block
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's keep it in simple and with ordered rule:

  1. Command line flag. For example: `--var="foo=bar"-2
  2. Command line flag with json variables: --varfile
  3. Target variable. eg: BUNDLE_VAR_foo=bar and the test
  4. ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's discuss here whether we need to merge values from flags

@@ -6,4 +6,5 @@ import (

func initVariableFlag(cmd *cobra.Command) {
cmd.PersistentFlags().StringSlice("var", []string{}, `set values for variables defined in bundle config. Example: --var="foo=bar"`)
cmd.PersistentFlags().String("vars-file-path", "", `file path to a JSON file containing variables. Example: --vars-file-path="/path/to/vars.json"`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 or --varfile, don't have to remember to put dash.

@ilyakuz-db ilyakuz-db requested a review from denik January 20, 2025 15:35
@ilyakuz-db ilyakuz-db changed the title feat: Read variables from file Reading variables from file Jan 21, 2025
Copy link
Contributor

@pietern pietern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work, thanks.

@@ -6,4 +6,5 @@ import (

func initVariableFlag(cmd *cobra.Command) {
cmd.PersistentFlags().StringSlice("var", []string{}, `set values for variables defined in bundle config. Example: --var="foo=bar"`)
cmd.PersistentFlags().String("var-file", "", `file path to a JSON file containing variables. Example: --var-file="/path/to/vars.json"`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
cmd.PersistentFlags().String("var-file", "", `file path to a JSON file containing variables. Example: --var-file="/path/to/vars.json"`)
cmd.PersistentFlags().String("var-file", "", `path to a JSON file containing variables. Example: --var-file="/path/to/vars.json"`)

trace errcode $CLI bundle validate -o json --var-file=var_files/string_to_complex.json | jq $cluster_expr

# variable is required but it's not provided in the file
trace errcode $CLI bundle validate -o json --target without-defaults --var-file=var_files/without_required.json | jq $cluster_expr
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you include the comments as echo "..." statements so they appear in the output?

That makes reading the output and understanding the expectations easier.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added small helper for nice printing which we can further adjust if needed


>>> errcode $CLI bundle validate -o json --var-file=var_files/wrong_file_structure.json
Error: failed to parse variables file: var_files/wrong_file_structure.json:1:1: expected a map, found a sequence
in var_files/wrong_file_structure.json:1:1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This location is always the same because it is taken from the root value. Can be omitted.

@pietern
Copy link
Contributor

pietern commented Jan 21, 2025

@ilyakuz-db Please update the PR summary as well to reflect the actual flag name.

@ilyakuz-db
Copy link
Contributor Author

@pietern @denik as discussed offline added default path for --var-path

-p, --profile string ~/.databrickscfg profile
-t, --target string bundle target to use (if applicable)
--var strings set values for variables defined in bundle config. Example: --var="foo=bar"
--var-file string path to a JSON file containing variables. Example: --var-file="/path/to/vars.json" (default ".databricks/bundle/<target>/vars.json")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This default message is hardcoded to avoid having invalid path value in variable

@@ -1,4 +1,4 @@
Error: no value assigned to required variable a. Assignment can be done through the "--var" flag or by setting the BUNDLE_VAR_a environment variable
Error: no value assigned to required variable a. Assignment can be done using "--var" or "--var-file", by setting the BUNDLE_VAR_a environment variable, or in .databricks/bundle/<target>/vars.json file
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any other suggestions for good user message are welcome! It became quite complicated

"github.com/spf13/cobra"
)

func GetDefaultVariableFilePath(target string) string {
return ".databricks/bundle/" + target + "/vars.json"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @pietern @fjakobs to confirm the path / filename

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since we will live with that name for centuries :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd use variable-overrides.json.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like mentioning "overrides"

Updated

@ilyakuz-db ilyakuz-db requested review from pietern and removed request for andrewnester and shreyas-goenka January 21, 2025 17:29

=== file cannot be parsed
>>> errcode $CLI bundle validate -o json --var-file=var_files/invalid_json.json
Error: failed to parse variables file: error decoding JSON at :0:0: invalid character 'o' in literal false (expecting 'a')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Error: failed to parse variables file:

would be very helpful to print the file name there

Error: failed to parse variables file var_files/invalid_json.json:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion! Updated

@denik denik self-requested a review January 22, 2025 13:28
without-defaults:

# see .databricks/bundle/default_target/ for variable values
with-default-variable-file:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need a target here? The default variables are read for any target, right? so it's the same as without-defaults.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made separate target to avoid affecting other tests by this default

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E.g. I put .databrick/bundle/without-defaults/variable-overrides.json file and previous test implicitly has this additional variable sources which may lead to confusion if anything goes wrong

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants