
Support for Common Hyper-parameter Tuning Libs #60

Closed
ncilfone opened this issue Jun 23, 2021 · 9 comments · Fixed by #62
Assignees: ncilfone
Labels: enhancement (New feature or request)

Comments

@ncilfone (Contributor)

Add some sort of adapter pattern to interface with some of the more common hyper-parameter tuning libraries; a rough sketch of what such an adapter could look like follows the list below.

NNI: https://github.com/microsoft/nni
Optuna: https://github.com/optuna/optuna
Talos: https://github.com/autonomio/talos
HyperOpt: https://github.com/hyperopt/hyperopt

Katib (supports k8s Job and MPIJob CRD): https://github.com/kubeflow/katib
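
For illustration, the adapter idea could be as small as a base class that turns a backend-agnostic range spec into whatever each library natively expects. A minimal sketch, assuming a hypothetical spec layout (none of these names are an existing spock API):

```python
from abc import ABC, abstractmethod
from typing import Any, Dict

# Hypothetical backend-agnostic spec, e.g.
#   {"lr": {"type": "float", "low": 1e-5, "high": 1e-1, "log": True},
#    "solver": {"type": "choice", "choices": ["lbfgs", "saga"]}}
ParamSpec = Dict[str, Dict[str, Any]]


class TunerAdapter(ABC):
    """One adapter per supported backend (Optuna, NNI, HyperOpt, ...)."""

    def __init__(self, param_spec: ParamSpec):
        self.param_spec = param_spec

    @abstractmethod
    def construct(self) -> Any:
        """Return the search-space object the backend natively understands."""
```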

@ncilfone ncilfone added the enhancement New feature or request label Jun 23, 2021
@ncilfone ncilfone self-assigned this Jun 23, 2021
@ncilfone (Contributor, Author) commented Jun 29, 2021

@dorukkilitcioglu would love to hear any feedback/ideas you have on this. Hoping to tackle in the next few weeks.

@ncilfone ncilfone pinned this issue Jun 29, 2021
@dorukkilitcioglu

In terms of black-box optimization, I've been looking into Nevergrad and Ax; both seem pretty well equipped, though I prefer Nevergrad at this point.

In general, are you looking to map a spock config to the parameter classes that these tools use? So you only write your spock config, ask spock to tune it using Optuna (for example), give it a budget, and when you come back you have 100 different runs with their associated spock configs and some objective value.

You could also start small and use random search as the initial POC, to figure out how you want to handle the input/output of this process.
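
A random-search POC along those lines needs nothing beyond the standard library. A minimal sketch, assuming the hypothetical spec format from the adapter sketch above and a user-supplied train_and_eval objective:

```python
import math
import random

# Same hypothetical spec format as the adapter sketch above
search_space = {
    "lr": {"type": "float", "low": 1e-5, "high": 1e-1, "log": True},
    "max_iter": {"type": "int", "low": 50, "high": 500},
    "solver": {"type": "choice", "choices": ["lbfgs", "saga"]},
}


def sample_once(space):
    """Draw one candidate configuration uniformly at random."""
    draw = {}
    for name, cfg in space.items():
        if cfg["type"] == "float":
            low, high = cfg["low"], cfg["high"]
            if cfg.get("log", False):
                draw[name] = 10 ** random.uniform(math.log10(low), math.log10(high))
            else:
                draw[name] = random.uniform(low, high)
        elif cfg["type"] == "int":
            draw[name] = random.randint(cfg["low"], cfg["high"])
        elif cfg["type"] == "choice":
            draw[name] = random.choice(cfg["choices"])
    return draw


best_score, best_params = float("-inf"), None
for _ in range(100):  # the tuning budget
    params = sample_once(search_space)
    score = train_and_eval(params)  # user-supplied objective (assumption)
    if score > best_score:
        best_score, best_params = score, params
```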

@ncilfone (Contributor, Author)

My current thought is to be tool/run agnostic and just provide an adapter to each supported backend that returns whatever structure is needed for the parameter ranges, types, scales, etc. Basically, allow you to specify all the ranges with spock, regardless of backend, using something like a new decorator (e.g. @spockTuner), and then just map that to an output for the user to use downstream in whatever form the library wants...

The only thing this kind of breaks with spock is the ability to save the state of each hyperparameter 'run' config, since the backend library will be handling the evolution of the parameter set. But that might be overkill, as you might only want to save the range configs and then the final config...
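
On the Optuna side, consuming such a range spec inside a trial could look roughly like the sketch below. The suggest_* calls are real Optuna API; the spec layout is the hypothetical one from above:

```python
import optuna


def suggest_from_spec(trial: optuna.Trial, param_spec: dict) -> dict:
    """Map backend-agnostic ranges onto Optuna's suggest API for one trial."""
    values = {}
    for name, cfg in param_spec.items():
        if cfg["type"] == "float":
            values[name] = trial.suggest_float(
                name, cfg["low"], cfg["high"], log=cfg.get("log", False)
            )
        elif cfg["type"] == "int":
            values[name] = trial.suggest_int(name, cfg["low"], cfg["high"])
        elif cfg["type"] == "choice":
            values[name] = trial.suggest_categorical(name, cfg["choices"])
    return values
```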

@ncilfone (Contributor, Author)

Or, as you alluded to, we could just fully wrap one or two backends to be set-and-forget... Not sure which is the better option.

@ncilfone (Contributor, Author)

@dorukkilitcioglu can you peek at #62 and see what you think? There is a simple example in that PR that shows the basic syntax.

Lmk

@ncilfone ncilfone linked a pull request Jul 13, 2021 that will close this issue
@dorukkilitcioglu

Looking at the simple example, there's a lot going on. I like it overall, and I feel like it looks more dense than it actually is because logistic regression training is super easy, so it's as if half the code is tuning :D

My understanding is that there are multiple steps you have to take:

  1. Set up the tuner (Optuna) with some high-level parameters
  2. Modify the config builder so it's on tuning mode
  3. Generate fixed params? I'm not sure what this does. I'm guessing that after the .tuner() call we don't have a spockspace object anymore, so we can't directly get a parameter from there without sampling?
  4. Save the config and then sample the config? Shouldn't this logically be the other way around? Like you'd first generate a sample using your favorite optimizer, and then save the parameters that it generates?
  5. Tell the optimizer the result. This is fairly standard - is this tuner-agnostic in terms of the API call?

Also, does the tune_dict have to be saved in a different data structure, or are the results being saved somewhere automatically?

I think there's a slight room for simplification (why can't tune_dict just be a part of attrs_obj, so you only interact with attrs_obj?), but I think the number of steps is very reasonable. I'm not sure about the specifics, so maybe you're already handling this, but storing the current state of the hyperparam exploration (internal state of the optimizer, plus all of the results so far) in a file as the optimization is going on could be useful (plus it would make it easier to continue tuning in the future).

@ncilfone (Contributor, Author)

Yeah, LR might be a bit under-kill as an example, but I was just mimicking some of the Optuna docs for simplicity...

> My understanding is that there are multiple steps you have to take:

> 1. Set up the tuner (Optuna) with some high-level parameters

There is a difference between the @spock and @spockTuner decorators: anything you want to be a hyper-parameter needs to be decorated with @spockTuner (this is for backend reasons and to limit the types allowed).
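
To make the split concrete, the fixed vs. tuneable distinction would look something like the sketch below. The decorator names come from this thread; the imports are omitted and the range/choice field types are illustrative, not a committed API:

```python
# Sketch only -- @spock is spock's existing config decorator; @spockTuner is the
# decorator proposed in this issue, and the Range/Choice field types are assumed.

@spock
class TrainConfig:
    # Fixed parameters: normal spock behavior, never touched by the tuner
    max_iter: int = 150
    n_trials: int = 10


@spockTuner
class LogRegHP:
    # Tuneable parameters: restricted types, expressed as ranges/choices
    c: RangeHyperParameter        # e.g. float in [0.01, 10.0], log scale
    solver: ChoiceHyperParameter  # e.g. one of ["lbfgs", "saga"]
```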

> 2. Modify the config builder so it's on tuning mode

All good here...

> 3. Generate fixed params? I'm not sure what this does. I'm guessing that after the .tuner() call we don't have a spockspace object anymore, so we can't directly get a parameter from there without sampling?

Yeah, the two options to get a Spockspace back are generate() and sample(). generate() just returns a Spockspace with all the fixed parameters (anything that was decorated with @spock). Since I parametrized the number of trials, I needed to get a fixed parameter out before I started iterating with sample(), which returns the fixed params plus a sampled set of hyper-parameters. I just wanted to show that both are available and work as expected. So yes, you can get a parameter without sampling; you just need to use the generate() interface.

> 4. Save the config and then sample the config? Shouldn't this logically be the other way around? Like you'd first generate a sample using your favorite optimizer, and then save the parameters that it generates?

Logically, yes. However, since sample() returns a Spockspace, the save() call has to come earlier in the chain in order to get the object back out for chained calls -- this actually follows the fixed logic as well, where one calls .save().generate().
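
Put together, the chaining described above would look roughly like this. Method names are the ones discussed in this thread; the builder object, config class, and signatures are placeholders, not the merged implementation:

```python
# `config` is assumed to be the ConfigArgBuilder already switched into tuning
# mode via .tuner(...), as described earlier in the thread.

# Fixed parameters only -- no sampling involved, mirrors the plain spock flow
fixed = config.save().generate()

for _ in range(fixed.TrainConfig.n_trials):
    # save() comes first in the chain so sample() can still hand back the
    # Spockspace; each iteration dumps the sampled config before it is used
    attrs_obj, tune_dict = config.save().sample()
    # ... train with attrs_obj, then report the result to the tuner backend
```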

> 5. Tell the optimizer the result. This is fairly standard - is this tuner-agnostic in terms of the API call?

It's kinda tuner-agnostic in its current state. The object in the second position of the return from the sample() call will be a dictionary containing whatever you need for the tuner backend (only Optuna for now, which is why I only say "kinda" -- I haven't fully tested any other backends).

> Also, does the tune_dict have to be saved in a different data structure, or are the results being saved somewhere automatically?

Not sure... this comes from the fact that, in order to sample with the define-and-run style interface in Optuna, you need the study object, but when calling tell you also need the currently generated trial and the study. Seemed easiest to package that up into a dict...
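
For reference, this is what Optuna's define-and-run (ask/tell) interface needs on its own, which is why both the study and the current trial end up bundled together (train_and_eval here is a stand-in for the user's objective):

```python
import optuna

study = optuna.create_study(direction="maximize")

trial = study.ask()  # define-and-run: get a Trial without an objective callback
c = trial.suggest_float("c", 0.01, 10.0, log=True)

accuracy = train_and_eval(c)  # user-supplied objective (assumption)

study.tell(trial, accuracy)  # reporting needs the study *and* the trial
```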

> I think there's a slight room for simplification (why can't tune_dict just be a part of attrs_obj, so you only interact with attrs_obj?), but I think the number of steps is very reasonable. I'm not sure about the specifics, so maybe you're already handling this, but storing the current state of the hyperparam exploration (internal state of the optimizer, plus all of the results so far) in a file as the optimization is going on could be useful (plus it would make it easier to continue tuning in the future).

Good point. I haven't dealt with the results yet, or how that's handled -- only the fact that the .save() call can dump the state of every call to sample(). Will have to build out what you've said in some way...

@ncilfone (Contributor, Author)

> I think there's a slight room for simplification (why can't tune_dict just be a part of attrs_obj, so you only interact with attrs_obj?)

Forgot to address this one. It actually makes sense. I think I can just add it as an @property on the object for simple access when needed.
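
A minimal sketch of what that property could look like; the attribute and property names here are hypothetical, not the merged implementation:

```python
class ConfigArgBuilder:
    ...

    @property
    def tuner_payload(self):
        # Hypothetical: surface whatever the active backend needs (for Optuna,
        # the study plus the most recent trial) directly off the builder so the
        # user never has to juggle a separate tune_dict
        return {"study": self._study, "trial": self._current_trial}
```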

ncilfone added a commit that referenced this issue Jul 28, 2021
* Common hyperparameter tuning interface #60 

* Added Optuna support

* Refactored backend to support split of fixed and tuneable parameters 

* Added black/isort

* Handles the usage pattern of a drop-in argparser replacement where no configs (from the command line or as input into ConfigArgBuilder) are passed, falling back on all defaults or definitions from the command line. Fix-up of all cmdline usage patterns; there were certain edge cases that were not getting caught correctly when not overriding an existing payload from a YAML file. #61

* Unit tests

Signed-off-by: Nicholas Cilfone <nicholas.cilfone@fmr.com>
@ncilfone ncilfone unpinned this issue Oct 26, 2021