Visualization of model hyperparameter optimization curves
The goal of this project is to provide users of the mlr package with a way of visualizing what happens during the tuning process that identifies the best hyperparameters for given data. This will enable users to assess the impact of different parameters, and it will show authors of learning methods which parameters matter in practice and how their approaches might be improved.
Many machine learning algorithms have a large number of parameters that must be set to achieve optimal performance on a given data set. Doing this manually is tedious and error-prone. The mlr package implements not only an interface to dozens of different learning algorithms in R, but also a set of generic hyperparameter optimisation methods: given a learner, its parameters and data, it will automatically identify the best parameter setting for the particular case.
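As a rough illustration of what such a tuning run looks like, the sketch below tunes a single parameter of a decision tree with grid search. The learner (classif.rpart), the built-in iris.task, the parameter range and the grid resolution are all assumptions chosen for brevity, not part of the project description.

```r
library(mlr)

set.seed(1)

# Search space: one numeric parameter of rpart (range chosen arbitrarily).
ps <- makeParamSet(
  makeNumericParam("cp", lower = 0.001, upper = 0.1)
)

# Grid search with 10 evenly spaced values, evaluated by 3-fold cross-validation.
ctrl <- makeTuneControlGrid(resolution = 10L)
rdesc <- makeResampleDesc("CV", iters = 3L)

res <- tuneParams("classif.rpart", task = iris.task, resampling = rdesc,
  par.set = ps, control = ctrl, show.info = FALSE)

print(res$x)  # best parameter setting found
print(res$y)  # its aggregated performance (misclassification error by default)
```

The result object reports only the winning setting and its score; how the search arrived there is what the project aims to visualize.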
While good parameter settings can be determined efficiently, mlr currently provides no means of visualizing this process. The user is given a result without much explanation of how it was arrived at. Understanding what happens during tuning is not only interesting from the user's point of view, but also crucial for relating the result back to the behaviour of the machine learning algorithm on the data. Such understanding can inform improvements to the particular approach.
This project will create visualizations of hyperparameter tuning for mlr. It will allow the plotting of a hyperparameter against a scoring function, showing the effect of tuning the specified hyperparameter. It will furthermore include support for plotting multiple hyperparameters and scoring functions, along with ablation analysis (a method for identifying the most important parameters).
The path taken from the starting parameter configuration to the end result is stored in an optimization path data structure that is part of the ParamHelpers package. The data structure should contain all the necessary information, but may need to be extended to accommodate more detail.
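To give a feel for this data structure, here is a minimal sketch that builds an optimization path by hand with ParamHelpers and converts it to a data frame. The parameter names and the performance values are made up for illustration; in a real tuning run mlr fills the path itself.

```r
library(ParamHelpers)

# Search space with two illustrative parameters.
ps <- makeParamSet(
  makeNumericParam("cp", lower = 0.001, upper = 0.1),
  makeIntegerParam("minsplit", lower = 5L, upper = 50L)
)

# Empty optimization path for one performance measure that is minimized.
op <- makeOptPathDF(par.set = ps, y.names = "mmce", minimize = TRUE)

# Append three evaluated configurations (values are invented, not measured).
addOptPathEl(op, x = list(cp = 0.05, minsplit = 20L), y = 0.08)
addOptPathEl(op, x = list(cp = 0.01, minsplit = 10L), y = 0.05)
addOptPathEl(op, x = list(cp = 0.09, minsplit = 40L), y = 0.12)

# One row per evaluation: parameter values, performance, and bookkeeping columns.
path <- as.data.frame(op)
print(path)

# Look up the best element recorded so far.
best <- getOptPathEl(op, getOptPathBestIndex(op))
print(best$x)
```

Because the path converts to an ordinary data frame, it can be passed directly to plotting code.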
The plotting should use ggplot2/ggvis, in line with the other visualizations in mlr. Providing interactive functionality, e.g. through shiny, would be desirable.
Applicants should have:
- Experience using or developing in R, and development tools such as git.
- Experience with visualization methods.
- A background in computer science or engineering will be beneficial.
Implement a simple visualization that plots the points on an optimization path with respect to the achieved performance. The mlr tutorial gives details on how to get started.
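One possible starting point, offered as a sketch rather than a reference solution: run a small random search, convert the resulting optimization path to a data frame, and plot each evaluated point against its performance with ggplot2. The learner, task, parameter ranges and the added best-so-far line are assumptions made for this example.

```r
library(mlr)
library(ggplot2)

set.seed(1)

# Illustrative search space for rpart (ranges chosen arbitrarily).
ps <- makeParamSet(
  makeNumericParam("cp", lower = 0.001, upper = 0.1),
  makeIntegerParam("minsplit", lower = 5L, upper = 50L)
)
ctrl <- makeTuneControlRandom(maxit = 30L)
rdesc <- makeResampleDesc("CV", iters = 3L)

res <- tuneParams("classif.rpart", task = iris.task, resampling = rdesc,
  par.set = ps, control = ctrl, show.info = FALSE)

# The optimization path as a data frame, one row per evaluated configuration.
path <- as.data.frame(res$opt.path)
path$iter <- seq_len(nrow(path))                  # order of evaluation
path$best.so.far <- cummin(path$mmce.test.mean)   # running minimum of the error

# Points: performance of each configuration; step line: best value found so far.
p <- ggplot(path, aes(x = iter, y = mmce.test.mean)) +
  geom_point() +
  geom_step(aes(y = best.so.far)) +
  labs(x = "Evaluation", y = "Mean test misclassification error")
print(p)
```

Mapping a hyperparameter such as cp to the x axis instead of the evaluation index gives the hyperparameter-versus-score view.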
Visualizing Hyperparameter Optimization by Mason
Bernd Bischl (bernd_bischl@gmx.net) is one of the primary authors of mlr and ParamHelpers and has mentored for GSoC before.
Lars Kotthoff (larsko@cs.ubc.ca) is one of the primary authors of mlr and has mentored for GSoC before.