The gist of experiments is comparison. We need a diff facility to compare inputs and results across runs.
We don't have to limit this comparison to the outputs defined in the pipeline. Any file that changed between two runs can be diffed.
There can be three types of diffs:
Unstructured diffs: For binary files in formats that we don't recognize. Only the content digest is reported.
Structured diffs: For file formats that we can parse, we can report the individual differences across runs. JSON, YAML, or any other result format that we can parse can be reported as a structured diff.
Text diffs: For source code files that may have led to changes in other files.
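A minimal sketch of how the three diff types could be told apart, in Python for brevity. The function names, the suffix list, and the "valid UTF-8 means text" rule are assumptions made for illustration, and SHA-256 only stands in for whichever digest Xvc is configured to use.

```python
import difflib
import hashlib

# Assumed list of formats we can parse; not taken from Xvc itself.
STRUCTURED_SUFFIXES = (".json", ".yaml", ".yml", ".toml")


def diff_kind(name: str, data: bytes) -> str:
    """Pick one of the three diff types for a file."""
    if name.lower().endswith(STRUCTURED_SUFFIXES):
        return "structured"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "unstructured"
    return "text"


def digest_diff(old: bytes, new: bytes) -> dict:
    """Unstructured diff: report only the content digests."""
    return {
        "old": hashlib.sha256(old).hexdigest(),
        "new": hashlib.sha256(new).hexdigest(),
    }


def text_diff(name: str, old: bytes, new: bytes) -> list:
    """Text diff: unified diff lines, similar to what Git prints."""
    return list(
        difflib.unified_diff(
            old.decode("utf-8").splitlines(),
            new.decode("utf-8").splitlines(),
            fromfile=f"a/{name}",
            tofile=f"b/{name}",
            lineterm="",
        )
    )
```

Deciding by suffix first keeps a malformed params.yaml from silently falling back to a text diff; whether that is the desired behavior is an open question.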
The workflow is as follows:
The user has a bunch of files: source, params, data, model, etc.
The user modifies some of these manually, e.g. by updating the source code.
The user modifies some of these with the xvc exp run --input-param command.
The user runs a command (or pipeline) on the files.
Xvc clones/rechecks/copies the files from the original location to a .xvc/exp/KEYWORD-RANDOMSTRING-TIMESTAMP directory.
Xvc links the original cache.
Xvc creates a .xvc-exp directory to store experiment specific data.
Xvc modifies the files with the given modification option.
--input-param params.yaml params.my-param 123,124,135 creates 3 experiments, each setting params.yaml::params.my-param to one of the given values.
Xvc runs the given command (or pipeline) in the directory.
Xvc stores the updated artifacts in the common cache, symlinking the results.
The user asks for results diffed against the original.
Xvc compares each experiment directory with the original to find the changed files.
Xvc shows the digest strings of unstructured files.
Xvc shows the changed values of structured files.
Xvc shows diffs of text files, similar to Git.
All results must be reported in JSON. Tables may be built from this JSON.
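As an illustration of the structured case, here is a small Python sketch that compares two already-parsed documents and emits the changes as JSON. The record layout (path, change, old, new) is a hypothetical choice, not a settled format, and lists are compared as whole values to keep the sketch short.

```python
import json


def structured_diff(old, new, prefix=""):
    """Return a list of change records, one per differing leaf value."""
    changes = []
    if isinstance(old, dict) and isinstance(new, dict):
        for key in sorted(set(old) | set(new)):
            path = f"{prefix}.{key}" if prefix else str(key)
            if key not in old:
                changes.append({"path": path, "change": "added", "new": new[key]})
            elif key not in new:
                changes.append({"path": path, "change": "removed", "old": old[key]})
            else:
                # Both sides have the key: descend and compare the values.
                changes.extend(structured_diff(old[key], new[key], path))
    elif old != new:
        changes.append({"path": prefix, "change": "modified", "old": old, "new": new})
    return changes


baseline = {"params": {"my-param": 123, "epochs": 10}}
experiment = {"params": {"my-param": 124, "epochs": 10}, "seed": 7}
print(json.dumps(structured_diff(baseline, experiment), indent=2))
```

A table view could then be built by flattening these records into rows keyed by path.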
The second facility xvc exp provides is to modify structured files quickly for each experiment.
xvc exp run --input-param file.yaml dict.key value1,value2,value3 will parse file.yaml, set dict.key to value1 and run an experiment, then set it to value2 and run another, then set it to value3 and run another.
xvc exp run --input-param file.json dict.key '0;5;100' will run experiments with 0,5,10,15,20,...,100 (inclusive).
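The two value syntaxes could be expanded roughly as follows. This Python sketch assumes the semicolon form means start;step;stop with an inclusive stop, which is inferred from the example above rather than specified anywhere; expand_values is a hypothetical name. Decimal is used so that steps like 0.2 don't accumulate floating point error.

```python
from decimal import Decimal


def expand_values(spec: str) -> list:
    """Expand 'a,b,c' into a list, or 'start;step;stop' into an inclusive range."""
    if ";" in spec:
        # Assumed layout: start;step;stop (exactly three fields).
        start, step, stop = (Decimal(part) for part in spec.split(";"))
        if step <= 0:
            raise ValueError("step must be positive")
        values = []
        current = start
        while current <= stop:
            values.append(str(current))
            current += step
        return values
    return spec.split(",")


print(expand_values("value1,value2,value3"))
print(expand_values("0;5;100"))
```

Descending ranges and negative steps are left out on purpose; they would need their own rule for the stop condition.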
Files to be modified are JSON, YAML 1.2 and TOML files. (Anything serde can read/write is possible in theory.)
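Updating a key such as params.my-param amounts to walking the dotted path in the parsed document and replacing the leaf. A Python sketch with JSON only (YAML and TOML would need their own parsers); set_by_path and render_variant are hypothetical names, and refusing to create missing keys is an assumption about the desired behavior.

```python
import json


def set_by_path(doc: dict, dotted_key: str, value) -> None:
    """Replace the value at a dotted key such as 'params.my-param' in place."""
    *parents, leaf = dotted_key.split(".")
    node = doc
    for part in parents:
        node = node[part]  # KeyError if an intermediate key is missing
    if leaf not in node:
        # Assumption: an experiment may only change keys that already exist.
        raise KeyError(dotted_key)
    node[leaf] = value


def render_variant(text: str, dotted_key: str, value) -> str:
    """Parse a JSON document, update one key, and serialize it again."""
    doc = json.loads(text)
    set_by_path(doc, dotted_key, value)
    return json.dumps(doc, indent=2)


original = '{"params": {"my-param": 123, "epochs": 10}}'
print(render_variant(original, "params.my-param", 124))
```

Keys that themselves contain dots, and converting command line strings to numbers or booleans, are not handled here.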
We can extend this functionality to regex. --input-regex file.txt 'my_var = (.*)' '0;0.1;1' replaces the first capture group ($1) of the regex with each of the values.
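The regex variant could work along these lines. In this Python sketch, substitute_group is a hypothetical helper that replaces only the first capture group of the first match; whether all matches should be updated is left open.

```python
import re


def substitute_group(text: str, pattern: str, value: str) -> str:
    """Replace capture group 1 of the first match of pattern with value."""
    match = re.search(pattern, text)
    if match is None:
        raise ValueError(f"pattern {pattern!r} not found")
    start, end = match.span(1)
    return text[:start] + value + text[end:]


source = "my_var = 0.5\nother = 1\n"
print(substitute_group(source, r"my_var = (.*)", "0.1"))
```

Splicing by span, rather than calling re.sub with a replacement string, avoids having to escape backslashes in the inserted value.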
We can also use --command-template for this. xvc exp run --command-template 'python train.py ${{EXP_VALUE}}' '0;0.2;10' will run python train.py with parameters 0, 0.2, 0.4, ... in different experiments.
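Template expansion is plain string substitution. A Python sketch, assuming the placeholder is written literally as ${{EXP_VALUE}} and that values are inserted verbatim with no shell quoting; render_commands is a hypothetical name.

```python
PLACEHOLDER = "${{EXP_VALUE}}"


def render_commands(template: str, values: list) -> list:
    """Produce one command line per value by filling in the placeholder."""
    if PLACEHOLDER not in template:
        raise ValueError(f"template has no {PLACEHOLDER} placeholder")
    return [template.replace(PLACEHOLDER, value) for value in values]


for command in render_commands("python train.py ${{EXP_VALUE}}", ["0", "0.2", "0.4"]):
    print(command)
```

If values may contain spaces or shell metacharacters, they would need quoting (for example with shlex.quote) before substitution.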
If there is more than one --input-param, --input-regex, or --command-template parameter, we build all combinations of their values (the Cartesian product). xvc exp run --input-param file.yaml dict.key 1,2,3 --input-param another.yaml another.key 5,6,7 will run 3 × 3 = 9 experiments.
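Combining several value lists is a Cartesian product, which the following Python sketch builds with itertools.product. The (file, key) tuples used as dictionary keys and the name experiment_grid are only an illustrative representation.

```python
from itertools import product


def experiment_grid(options: dict) -> list:
    """Return one {target: value} mapping per combination of values."""
    targets = list(options)
    value_lists = [options[target] for target in targets]
    return [dict(zip(targets, combo)) for combo in product(*value_lists)]


grid = experiment_grid(
    {
        ("file.yaml", "dict.key"): ["1", "2", "3"],
        ("another.yaml", "another.key"): ["5", "6", "7"],
    }
)
print(len(grid))
```

The number of experiments grows multiplicatively with each added option, so a dry-run listing of the grid before launching may be worth having.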
There may be three subcommands for xvc exp run.
xvc exp run pipeline --name: (xvcerp) Runs a pipeline command with the given parameters. (xvc pipeline run --name)
xvc exp run command 'cmd': Runs a generic command as an experiment.
xvc exp run template 'cmd ${{EXP_VALUE_1}} ${{EXP_VALUE_2}}' 1,2,3 4,5,6: Runs a command by substituting the values into the command string.
The --input-param and --input-regex options are available to all three of these. Maybe instead of --input-param and --input-regex, it's better to use --update-param and --update-regex. Maybe we can merge these, but I don't like to have corner cases.
--keyword sets the KEYWORD portion of experiment names. By default, this is exp. The user may want to set it to a searchable name.
The updated params and the run commands are stored in the .xvc-exp directory. It may contain the exact script that was run.
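For illustration, an experiment directory name in the KEYWORD-RANDOMSTRING-TIMESTAMP form could be generated as in the Python sketch below. The length and alphabet of the random part and the use of Unix seconds for the timestamp are assumptions; neither is specified above.

```python
import secrets
import time


def experiment_dir_name(keyword: str = "exp", now=None) -> str:
    """Build a KEYWORD-RANDOMSTRING-TIMESTAMP name for an experiment directory."""
    random_part = secrets.token_hex(4)  # 8 hex characters, assumed format
    timestamp = int(time.time() if now is None else now)  # Unix seconds, assumed
    return f"{keyword}-{random_part}-{timestamp}"


print(experiment_dir_name())
print(experiment_dir_name(keyword="lr_sweep"))
```

Putting the keyword first keeps related experiments adjacent in a directory listing and easy to match with a glob such as lr_sweep-*.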
This discussion was converted from issue #184 on January 24, 2023 06:57.