Assume I am doing unsupervised clustering of text or images.
Let's say the pipeline has an embedding stage followed by a clustering stage.
Also assume, for simplicity, that I have no targets/labels: I'm data mining.
I want to run multiple variations of the pipeline.
Here is an illustration of the scenario:
In this scenario I care about agreement/disagreement across variations. For example, I only want to spend time inspecting a variation's results if they are sufficiently distinct from the baseline.
What's the best approach for doing this with DVC? The basic constraint I keep bumping into is that the outs of different DVC experiment variations never exist in the workspace at the same time.
The only approach I can think of is to take the outs of the baseline experiment and duplicate them as silver labels. These silver labels would then exist across all variations, and I could compute a metric comparing each variation's outs against them. But this gets out of hand fast: as soon as I find a sufficiently distinct new variation, I will need a second set of silver labels.
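To make the "metric-compare against silver labels" step concrete, here is a minimal sketch of what I have in mind. It assumes each variation's outs can be loaded as a flat list of cluster ids, one per sample and in the same sample order as the baseline. The adjusted Rand index is just one possible agreement metric, and the names `adjusted_rand_index`, `is_distinct`, and the `0.8` threshold are hypothetical, not anything DVC provides.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Pair-counting agreement between two clusterings of the same samples.

    1.0 means identical partitions (cluster ids may be renamed);
    values near 0 mean agreement no better than chance.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both label lists must cover the same samples")

    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # zero or one sample: trivially identical

    # Pairs of samples placed together in both clusterings.
    together_in_both = sum(
        comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values()
    )
    # Pairs of samples placed together within each clustering separately.
    together_in_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_in_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = together_in_a * together_in_b / total_pairs
    maximum = (together_in_a + together_in_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. everything in a single cluster
    return (together_in_both - expected) / (maximum - expected)


def is_distinct(silver_labels, variation_labels, threshold=0.8):
    """Hypothetical gate: flag a variation for manual inspection.

    The threshold is an arbitrary placeholder, not a recommended value.
    """
    return adjusted_rand_index(silver_labels, variation_labels) < threshold
```

With something like this, a variation whose clusters only differ by id renaming would score 1.0 and be skipped, while a variation that regroups the samples would fall below the threshold and get flagged for inspection.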