
Add MultiObjective Optimization Notebook #155

Merged: 8 commits merged into mlpack:master on May 28, 2021

Conversation

@jonpsy (Member) commented May 25, 2021

No description provided.

@review-notebook-app commented:

Check out this pull request on ReviewNB. See visual diffs & provide feedback on Jupyter Notebooks.

@jonpsy (Member, Author) commented May 25, 2021

Also, since we are waiting for custom inputs from the user, I'm not sure how the CI will handle this. Any ideas there?

@zoq (Member) commented May 25, 2021

> Also, since we are waiting for custom inputs from the user, I'm not sure how the CI will handle this. Any ideas there?

I would just go with some predefined parameters and add a comment to let the user know how and what to change.
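Something along these lines, for instance (just a rough sketch with made-up values; the parameter order follows the ensmallen NSGA2 documentation, so it should be double-checked against the installed version):

#include <ensmallen.hpp>

// Predefined NSGA-II parameters -- feel free to change these:
//  * populationSize / maxGenerations: larger values usually give a denser
//    Pareto front, but the optimization takes longer.
//  * crossoverProb / mutationProb / mutationStrength: control how new
//    candidate solutions are generated.
const size_t populationSize   = 50;
const size_t maxGenerations   = 500;
const double crossoverProb    = 0.6;
const double mutationProb     = 0.3;
const double mutationStrength = 1e-3;
const double epsilon          = 1e-6;
arma::vec lowerBound = {0.0};
arma::vec upperBound = {1.0};

ens::NSGA2 opt(populationSize, maxGenerations, crossoverProb, mutationProb,
               mutationStrength, epsilon, lowerBound, upperBound);

// `objectives` (a std::tuple of the objective functors) and `coords` are
// assumed to be defined in earlier notebook cells.
// opt.Optimize(objectives, coords);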

@jonpsy (Member, Author) commented May 25, 2021

@zoq Thanks for the help! Do you have any further queries? I think this notebook is good to go.

@zoq (Member) commented May 25, 2021

> @zoq Thanks for the help! Do you have any further queries? I think this notebook is good to go.

I'll test it out later; right now I have to manually pull the latest ensmallen branch, right? I have the conda package for ensmallen ready, I just have to do the same thing for mlpack.

@zoq (Member) commented May 25, 2021

Maybe it already works; can you add zoq to the channels in https://github.com/mlpack/examples/blob/master/binder/environment.yml, as well as ensmallen at the end?

@jonpsy (Member, Author) commented May 25, 2021

Yeah, I tested it manually and it worked. Anyway, the integration test should confirm it for you. I've written a small script to automate the process of pulling the latest ensmallen: https://gist.github.com/jonpsy/de2e1f7e49a93a393beadcdae679784a. Just check out this branch, open a terminal, and copy all the code from the script; it will do everything for you. After that, just run this notebook and voilà.

Let me know if you run into difficulty anywhere.

@jonpsy (Member, Author) commented May 25, 2021

Looks like our build broke:
The command "./download_data_set.py" failed and exited with 1 during . :)

@zoq (Member) commented May 25, 2021

Restarted, let's see.

@jonpsy (Member, Author) commented May 25, 2021

> Maybe it already works; can you add zoq to the channels in https://github.com/mlpack/examples/blob/master/binder/environment.yml, as well as ensmallen at the end?

channels:
- conda-forge
- zoq
- ensmallen

Something like this?

@zoq (Member) commented May 25, 2021

Almost: zoq is correct, but ensmallen goes at the end of the file, since it's the package we'd like to install.

@jonpsy (Member, Author) commented May 25, 2021

Sorry for dragging this on; you mean under dependencies, right?

dependencies:
# Jupyter
- jupyterlab=3
- ....
- ensmallen

@zoq (Member) commented May 25, 2021

Correct, right below mlpack is good.

@jonpsy (Member, Author) commented May 26, 2021

Looks like the tests pass 🥳

"metadata": {},
"outputs": [],
"source": [
"#define ARMA_DONT_USE_WRAPPER"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's already part of #include <mlpack/xeus-cling.hpp> so we can remove the extra line here.

"metadata": {},
"outputs": [],
"source": [
"opt.Optimize(objectives, coords);"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this takes some time; it would be a good idea to add some callbacks (https://ensmallen.org/docs.html#callback-documentation) to show some output in between and at the end (PrintLoss, Report).
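For example, something like this (a rough sketch; whether these callbacks produce useful output for multi-objective optimizers is discussed below):

// Pass the callbacks directly to Optimize(): PrintLoss() prints progress
// while the optimizer runs and Report() prints a summary at the end.
opt.Optimize(objectives, coords, ens::PrintLoss(), ens::Report());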

Member Author:

[image: output of the Report() callback]

Looks kinda redundant. PrintLoss() is doing nothing, and Report() has the output as shown in the picture. I think we need to work on MOO callbacks.

Member:

Interesting, yes I think we have to adapt the callbacks so that they work for multi-objective optimizers as well.

"outputs": [],
"source": [
"plt::figure_size(800, 800);\n",
"plt::scatter(frontX, frontY, 50);\n",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think we could plot the optimization process as well? Something like https://www.machinelearningplus.com/wp-content/uploads/2020/10/ef-300x230.gif

@jonpsy (Member, Author) commented May 26, 2021:

We would need to query the generated Pareto front every x generations; looks like the job of a callback?

Member:

Yes, this can be a custom callback that is implemented as part of the notebook.
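Roughly something like the following (a hypothetical sketch; it assumes the generic ensmallen StepTaken() hook fires once per NSGA-II generation, which may only hold once the multi-objective callback support discussed above is in place):

#include <ensmallen.hpp>
#include <vector>

// Hypothetical callback that stores a copy of the coordinates every
// `interval` generations, so the evolution of the front can be plotted later.
class FrontSnapshot
{
 public:
  FrontSnapshot(const size_t interval) : interval(interval), steps(0) { }

  template<typename OptimizerType, typename FunctionType, typename MatType>
  bool StepTaken(OptimizerType& /* optimizer */,
                 FunctionType& /* function */,
                 const MatType& coordinates)
  {
    if (steps++ % interval == 0)
      snapshots.push_back(coordinates);
    return false;  // false means "keep optimizing".
  }

  std::vector<arma::mat> snapshots;

 private:
  size_t interval;
  size_t steps;
};

// Usage (hypothetical): opt.Optimize(objectives, coords, FrontSnapshot(10));

The stored snapshots could then be drawn with plt::scatter() in a loop to build an animation like the GIF linked above.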

Member Author:

BTW, please try and run the notebook once. Our final Pareto front isn't as cool as theirs :(.

Member:

Agreed. I used https://lab.mlpack.org/, entered your fork and branch to run it, and it worked pretty well, except that the optimization step took a really long time.

Member Author:

Regarding our results, I find it kind of odd that the optimization is this slow in xeus-cling; my mid-tier laptop can knock it off within seconds. Also, I'm worried about the diversity of the solution: if you recall, one of the goals of MOEA is to have a uniform distribution of solutions, but what's happening in our algorithm is that it produces a "cluster" of solutions.

Arguably, since the clusters are far apart, the solution set is indeed "diverse", so we can't really blame the algorithm. I'm not sure whether NSGA-II is to blame or whether our implementation is wrong. It's also possible that our population size is too low to observe the actual Pareto front, but I'm not going to make our users wait an eternity for that.

Member:

Yeah, it seems odd to me too; we should run some more experiments with different parameters. But I think this is good for now, the solution is still valid.

"cell_type": "markdown",
"metadata": {},
"source": [
"### 5. Final Thoughts"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's a good idea to add a section on "How to read the efficient frontier?" to the example as well; see https://www.machinelearningplus.com/machine-learning/portfolio-optimization-python-example/ for an example. What do you think?

Member Author:

I've written something similar under the Plotting section. The documentation you suggested explains the efficient frontier by comparing it with other fronts; the problem is that we can't generate multiple Pareto fronts as of now.

Member:

I have seen that, but maybe we can improve that section a little bit, e.g. by going into more detail about what the plot is trying to visualize.

jonpsy and others added 2 commits May 26, 2021 19:55
- …ipynb (Co-authored-by: Marcus Edel <marcus.edel@fu-berlin.de>)
- This reverts commit af0085e2884af9c5a69af0f8746944d57d5593ba.

Added callback
@zoq (Member) commented May 27, 2021

@jonpsy Let me know if you'd like to change anything here, or if this is good to go and we can update the notebook in another PR.

@jonpsy (Member, Author) commented May 27, 2021

I'll make a few changes; it should be done before the kick-off meeting. We can discuss some nitpicks in the meeting and then approve it by tonight. Cool?

jonpsy added 2 commits May 27, 2021 21:55
- Add dominance relation in Optimize section.
- Explain X and Y-Axis in Plotting section.
- Use the parameters from the blog.
@jonpsy (Member, Author) commented May 27, 2021

@zoq Please feel free to merge

@zoq (Member) left a comment:

Awesome, thanks for putting this together!

@kartikdutt18 merged commit a465236 into mlpack:master on May 28, 2021.
@jonpsy deleted the nsga2-notebook branch on May 28, 2021, 05:16.