Consider making bhmm core compatible with sklearn #42
Comments
Small addition: even if explicit subclassing is necessary in order to use our estimators with sklearn algorithms, it would be simple to do that in another package that uses sklearn. Something like (just a sketch, I haven't tried that code):

```python
from bhmm import MLHMM
from sklearn.base import BaseEstimator

class myMLHMM(MLHMM, BaseEstimator):
    pass

hmm = myMLHMM()
# do some fancy sklearn stuff with hmm object here
```

That way we could avoid an explicit dependency on sklearn in bhmm and still provide all functionality.
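A possible follow-up on that sketch (equally untested, and assuming MLHMM exposes its hyperparameters as explicit keyword arguments in `__init__`, which BaseEstimator requires): the mixin would give the subclass get_params/set_params for free, so generic sklearn utilities such as clone work on it.

```python
from sklearn.base import clone

hmm = myMLHMM()           # the hypothetical subclass from the sketch above
print(hmm.get_params())   # provided by BaseEstimator, introspected from __init__
hmm2 = clone(hmm)         # sklearn can now copy the estimator with identical settings
```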
The current release (0.3.0) defines scikit-learn as a dependency (both in setup.py and the conda recipe), but it is never used in the code. Is this intended?
I think we use sklearn's Gaussian Mixture Model estimator for the [...]

On 06/07/15 at 16:26, Martin K. Scherer wrote:
Addition: sklearn is used in lines 37-39 of this file:
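(A hedged sketch of the kind of usage being discussed, with illustrative names rather than bhmm's actual code: fit a Gaussian mixture to the pooled observations and take the component parameters as initial output-model parameters.)

```python
import numpy as np
from sklearn.mixture import GaussianMixture  # older sklearn versions exposed this as sklearn.mixture.GMM

def initial_output_parameters(observations, nstates):
    """Illustrative only: fit a GMM to all observations, return per-state means and variances."""
    pooled = np.concatenate(observations).reshape(-1, 1)
    gmm = GaussianMixture(n_components=nstates).fit(pooled)
    return gmm.means_.flatten(), gmm.covariances_.flatten()
```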
Sounds reasonable, if it is easy to extract.
OK, please have a look.

On 07/07/15 at 00:53, Martin K. Scherer wrote:
It also uses KMeans for initializing the means during EM, so we would need to extract that code too. Another downside of extracting it is that we would not easily receive updates/bugfixes for that code. For those two reasons I would advise against extracting.
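(For reference, the k-means initialization being referred to is roughly of the following shape; an illustrative sketch, not bhmm's actual code: cluster the pooled observations and use the cluster centers as initial state means for EM.)

```python
import numpy as np
from sklearn.cluster import KMeans

def kmeans_initial_means(observations, nstates):
    """Illustrative only: k-means cluster centers as initial state means."""
    pooled = np.concatenate(observations).reshape(-1, 1)
    centers = KMeans(n_clusters=nstates, n_init=10).fit(pooled).cluster_centers_
    return centers.flatten()
```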
I'll have a look. We certainly don't need k-means initialization at the [...] I definitely want to avoid unnecessary dependencies for pyemma. There's [...]

On 07/07/15 at 01:10, Martin K. Scherer wrote:
We can instead include Moritz' code. It's faster anyway.

On 07/07/15 at 01:10, Martin K. Scherer wrote:
Following #40:
"could optimally be made compatible with scikit-learn, which seems to be the modern way new machine learning codes are written."
http://scikit-learn.org/stable/modules/generated/sklearn.base.BaseEstimator.html#sklearn.base.BaseEstimator
Perhaps we could also implement the ClassifierMixin behavior (score) in order to do cross-validation.
This is very easy to do, but I don't see why we would need a dependency on sklearn for it. I would like to avoid dependencies on heavy packages unless we actually use some of their functionality (currently we do use sklearn's Gaussian Mixture Model, but I guess this is a temporary solution).
@kylebeauchamp: is a dependency on sklearn necessary to have an effective implementation of the BaseEstimator interface (i.e., is issubclass checked explicitly anywhere in sklearn's algorithms), or is duck typing sufficient?
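For illustration, sklearn's model-selection machinery relies mostly on duck typing rather than explicit issubclass checks: clone() only looks for a get_params method, and cross-validation needs fit plus either a score method or an explicit scorer. A minimal sketch (not bhmm code, and using a toy fit/score in place of a real HMM):

```python
import numpy as np
from sklearn.model_selection import cross_val_score

class DuckEstimator:
    """Minimal estimator satisfying sklearn's duck-typed interface without BaseEstimator."""

    def __init__(self, nstates=2):
        self.nstates = nstates

    def get_params(self, deep=True):
        # called by sklearn.base.clone during cross-validation
        return {"nstates": self.nstates}

    def set_params(self, **params):
        for key, value in params.items():
            setattr(self, key, value)
        return self

    def fit(self, X, y=None):
        self.mean_ = X.mean(axis=0)   # stand-in for a real HMM fit
        return self

    def score(self, X, y=None):
        # stand-in for a log-likelihood; having score enables cross-validation
        return -float(np.mean((X - self.mean_) ** 2))

X = np.random.randn(100, 3)
print(cross_val_score(DuckEstimator(), X, cv=3))  # works without subclassing BaseEstimator
```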