feat: Downsampling of observed data for single-fidelity Bayesian optimization #607
Conversation
Codecov Report
Patch coverage:
Additional details and impacted files
@@ Coverage Diff @@
## main #607 +/- ##
==========================================
+ Coverage 64.98% 65.17% +0.18%
==========================================
Files 353 360 +7
Lines 26238 26424 +186
==========================================
+ Hits 17051 17222 +171
- Misses 9187 9202 +15
... and 33 files with indirect coverage changes
mostly comments regarding documentation
@@ -197,6 +197,13 @@ full range of arguments. We list the most important ones:
each round. ``opt_maxiter`` is the maximum number of L-BFGS iterations. We
run ``opt_nstarts`` such optimizations from random starting points and pick
the best.
* ``max_size_data_for_model``, ``max_size_top_fraction``: GP computations scale
Based on this description, it is not clear to me what ``max_size_top_fraction`` is for.
OK, I will add something.
@@ -0,0 +1,2 @@
syne-tune[gpsearchers,blackbox-repository,aws]
tqdm
Where is the dependency ``tqdm`` coming from?
``tqdm`` is not among the Syne Tune dependencies, but it is needed for benchmarking.
is fit to, via ``max_size_data_for_model`` in ``search_options``. If the data
is larger, it is downsampled to this size. Sampling is controlled by another
argument ``max_size_top_fraction``. Namely, this fraction of entries in the
downsampled set are filled by the best points in the full set, while the
"best points" refers to configurations with highest/lowest metric?
I will add a comment.
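For context, the downsampling scheme discussed above (keep a top fraction of the best points by metric, fill the remaining slots by uniform random sampling) can be sketched as follows. This is an illustrative sketch, not Syne Tune's actual implementation; the function and argument names mirror the documented arguments but are otherwise hypothetical.

```python
import random


def downsample(points, metrics, max_size, top_fraction, mode="min"):
    """Downsample (points, metrics) to at most max_size entries.

    Sketch of the scheme described in the docs: a fraction
    ``top_fraction`` of the result is filled with the best points
    (lowest metric if mode == "min", highest if mode == "max"); the
    remainder is sampled uniformly without replacement from the rest.
    Illustrative only; not Syne Tune's actual code.
    """
    n = len(points)
    if n <= max_size:
        return list(points), list(metrics)
    # Sort indices so the best points come first
    order = sorted(range(n), key=lambda i: metrics[i], reverse=(mode == "max"))
    num_top = int(max_size * top_fraction)
    top = order[:num_top]
    # Fill the remaining slots by uniform sampling from the rest
    rest = random.sample(order[num_top:], max_size - num_top)
    keep = top + rest
    return [points[i] for i in keep], [metrics[i] for i in keep]
```

With ``max_size=10`` and ``top_fraction=0.2``, two slots go to the metric-best points and eight are drawn at random, so the GP still sees the most promising region while the rest of the sample keeps some global coverage.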
the dataset gets large. You can use ``opt_skip_init_length``,
``opt_skip_period`` to this end.
Not sure whether it is discussed somewhere else, but at this point it is unclear how to set them (bool, None, 0?) or what exactly they refer to.
I added a link
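As a rough illustration of what such a skip schedule looks like: a common pattern is to refit the GP hyperparameters on every update while the dataset is still small, and afterwards only on every k-th update. The predicate below is a hedged sketch under that assumption; the default values and exact semantics of these arguments in Syne Tune may differ.

```python
def should_refit(num_data, num_updates, opt_skip_init_length=150, opt_skip_period=1):
    """Decide whether to re-optimize GP hyperparameters on this update.

    Sketch of a skip schedule: always refit while the dataset has fewer
    than opt_skip_init_length points; once it is larger, refit only on
    every opt_skip_period-th update (opt_skip_period=1 means "never
    skip"). Defaults and semantics are illustrative assumptions, not
    Syne Tune's documented behavior.
    """
    if num_data < opt_skip_init_length:
        return True
    return num_updates % opt_skip_period == 0
```

This keeps the expensive marginal-likelihood optimization frequent early on, when each new point changes the fit a lot, and amortizes it once the dataset is large and the hyperparameters are stable.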
Nit: multifid/singlefid -> multi_fidelity. I prefer full names; abbreviations might not be easy to understand for everyone, and they might also lead to inconsistent ways of abbreviating things. This is more of a suggestion.
OK
We currently only limit the dataset size for multi-fidelity BO. This closes the gap. It has been asked for by a student working with Ralf Herbrich at HPI.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.