-
Notifications
You must be signed in to change notification settings - Fork 335
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add BaseSearch and RandomSearch #25
Conversation
I'm curious to hear your thoughts on this, and am totally open to reorganize/iterate. 🔧 I imagine the structure will depend on how this will be used. I suppose having GridSearch and RandomSearch as separate methods might feel a bit clunky if a user doesn't really have a preference and just wants the best parameters. Maybe in that case there is simply a search method that defaults to GridSearch, and they can toggle to RandomSearch with bounds and iterations if they want a little more control. In general, I like that users can set a low number of iterations for RandomSearch in situations where an exhaustive search of GridSearch might not be practical. |
Got it! Thanks @SioKCronin! 😃 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @SioKCronin, nice work! Thanks a lot!
It's not yet the end of the week but I got some unexpected free time so I'm now writing this review 😸
As for the structure of the search classes, I believe having separate classes for RandomSearch
and GridSearch
is already fine. Your implementation is good 👍. I'm more for having classes that are as atomic as possible. In this manner, it's easier to construct higher-level methods that combines these atoms in whatever way we please.
This ties in to the use-case where someone doesn't really have a preference on which search method to use. A naive suggestion would be to expound in the documentation the benefits of both, and the necessary tradeoffs in using one over the other. Just to make the user conscious of their choice.
In the same manner, we can write a higher-level method that just calls GridSearch
and a simple toggle for random search. Although I don't know where we can put this higher-level method.
Lastly, for the previous commit, 48a279f, I added support for Python 2.7. Usually we add something like this at the top-level
# Import from __future__
from __future__ import with_statement
from __future__ import absolute_import
from __future__ import print_function
from past.builtins import xrange
This enables the use of print methods and parenthesis when working with Python 2. The last import in xrange
enables you to use Python 2's xrange
in Python 3.
This seems a bit confusing but in Python 2, the range
method gives you a list. But in Python 3, you are given an iterator.
# In Python 2
type(range(5)) # range gives you a list
>>> <type 'list'>
type(xrange(5)) # xrange gives you an iterator
>>> <type 'xrange'>
# In Python 3
type(range(5)) # range gives you an iterator
>>> <type 'range'>
type(xrange(5)) # xrange is not supported
>>> NameError: name 'xrange' is not defined
Iterating through an iterator is much faster than iterating through a list. On the other hand, the xrange
method in Python 2 gives you an iterator. Python 3 doesn't support xrange
unless we import it from past.builtins
. This means that we have to change all range
methods into xrange
so that it outputs an iterator even if we use Python 2 or 3.
# Import xrange from past.builtins
from past.builtins import xrange
# In Python 2
type(xrange(5)) # xrange gives you an iterator
>>> <type 'xrange'>
# In Python 3
type(xrange(5)) # xrange gives you an iterator
>>> <type 'range'>
All in all, thank you so much for your contributions! It's really awesome 👍
pyswarms/utils/search/base_search.py
Outdated
|
||
Attributes | ||
---------- | ||
optimizer: PySwarms class |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can write optimizer: PySwarms class
in a modular format: optimizer : pyswarms.single
. This already defines both pyswarms.single.LocalBestPSO
and pyswarms.single.GlobalBestPSO
😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it ✅
pyswarms/utils/search/base_search.py
Outdated
self.iters = iters | ||
|
||
def generate_score(self, options): | ||
"""Generates score for optimizer's performance on objective function.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even if this is not a user-facing method, maybe we can add docstrings here? Just for others who would look at the code. 👍
"""Generates score for optimizer's performance on objective function.
Parameters
-------------
options : dict
<my description>
"""
The same goes for other methods 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it ✅
pyswarms/utils/search/base_search.py
Outdated
idx, self.best_score = max(enumerate(scores), key=op.itemgetter(1)) | ||
else: | ||
idx, self.best_score = min(enumerate(scores), key=op.itemgetter(1)) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is fine even if we remove the else
construct because the min
serves as the
default behaviour. Can we confirm this based on your unit tests?
# Catches that maximum bool flag
if maximum:
idx, self.best_score = max(enumerate(scores), key=op.itemgetter(1))
# Default behaviour
idx, self.best_score = min(enumerate(scores), key=op.itemgetter(1))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool. I set this up to first run the default assignments, and then reassign if the maximum
bool flag is caught (just flipping the order of what you show here). 🔄
|
||
class GridSearch(object): | ||
class GridSearch(SearchBase): | ||
"""Exhaustive search of optimal performance on selected objective function | ||
over all combinations of specified hyperparameter values.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm... I may be wrong or my GitHub is not working properly, but does this need to have an __init__
method similar to RandomSearch()
? 😕 😄
You also mentioned about the possibility that GridSearch
will take so long when we're looking for a lot of hyperparameters and take too many iterations 👍 . Maybe we can add a logger.warn?
check the implementations in GlobalBestPSO
and LocalBestPSO
.
There is a logger.info('My information')
call in place of print()
functions. I also initialized it with logger = logging.getlogger(__name__)
. You can do the same thing but call logger.warn()
instead.
Thus we can have something like:
if condition_is_met:
logger.warn('Using GridSearch will take so much time')
# of course you can write a better warning than mine hehe
This will still run a search, only with an additional warning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ha! Yes, I think I thought I was being clever because if no new attributes are added in the GridSearch __init__
it could just just inherit SearchBase's __init__
, which would be retrieved when creating a GridSearch instance. But looking at it now, I think it's clearer to call the super().__init__
explicitly.
+ 57 def __init__(self, optimizer, n_particles, dimensions, options,
+ 58 objective_func, iters,bounds=None, velocity_clamp=None):
+ 59 """Initializes the paramsearch."""
+ 60
+ 61 # Assign attributes
+ 62 super().__init__(optimizer, n_particles, dimensions, options,
+ 63 objective_func, iters, bounds=bounds,
+ 64 velocity_clamp=velocity_clamp)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing me to 'logger.warning'. In looking at search
I realized I'm not sure how much of an issue runtime will be, even for GridSearch with lots of hyperparameter options. My hunch was that runtime on big search spaces would need to be flagged for users, but I don't have a good sense of when a user would input a really long list for each of the 'c' weights, for instance.
What I'd like to do next is write up an example for the docs on Tuning PSO hyperparameters
that shows these two methods in action, and use that writing as a chance to read the literature on conventions for setting PSO hyperparameters. I think this will help me have a better sense of kinds of situations might come up. 📚
What I can see happening is including logger.warn
, but also having clear documentation that speaks to the conventions for setting options
(and perhaps resources linked in the Example) that would steer users away from setting search values that don't quite make sense, and might take too long to run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I realized that it's kinda hard to pinpoint when to issue a warning based on user input. I believe your plan is good for now 👍
Regarding the __init__
, the reason why I often add the super()
is that it enables me to pass defaults into the base class. So just in case I want defaults in GridSearch
etc., I just declare them on GridSearch
's init, then do the super. But for now i guess this implementation is fine
@ljvmiranda921 Your reviews are the best!! 🥇 Thanks for the clear suggestions. I'll have time tomorrow morning to make these changes and will respond fully to the questions you flagged. More soon! 😄 |
Hey @SioKCronin, Good job, thanks a lot for your efforts! We are supposed to merge already but I apologize for forgetting to mention one thing: have you added an Just add an $ python -m unittest tests.utils.search.test_gridsearch or $ python -m unittest tests.utils.search.test_randomsearch In Travis-CI, I just call Once everything is clear, we can merge right away. If you wish to add examples and more documentation, just issue another Pull Request 👍 Thank you so much and I hope everything's good over there! |
This commit adds a BaseSearch class for GridSearch and RandomSearch, which defines an __init__ that sets all attributes for instances as well as score and search methods. This commit also adds RandomSearch, which provides search for optimal performance on objective function over combinations of hyperparameter values with specified bounds for specified number of combination iterations. This commit includes 'xrange' functionality for RandomSearch to support Python 2.7 Author: SioKCronin Email: siobhankcronin@gmail.com
Cool. I've added the |
Great, @SioKCronin! I have already merged your branch. Thanks a lot! |
This commit adds a SearchBase class for GridSearch and RandomSearch,
which defines an init that sets all attributes for instances
as well as score and search methods.
This commit also adds RandomSearch, which provides search for optimal
performance on objective function over combinations of hyperparameter
values with specified bounds for specified number of combination
iterations.
Author: SioKCronin
Email: siobhankcronin@gmail.com