Deep Forest (DF) 21


DF21 is an implementation of Deep Forest 2021.2.1. It is designed to have the following advantages:

  • Powerful: Better accuracy than existing tree-based ensemble methods.
  • Easy to Use: Less effort on tuning parameters.
  • Efficient: Fast training speed and high efficiency.
  • Scalable: Capable of handling large-scale data.

DF21 offers an effective and powerful alternative to tree-based machine learning algorithms such as Random Forest or GBDT.

For a quick start, please refer to How to Get Started. For detailed guidance on parameter tuning, please refer to Parameters Tuning.

DF21 is optimized for what a tree-based ensemble excels at (i.e., tabular data). If you want to use the multi-grained scanning part to better handle structured data like images, please refer to the original implementation for details.

Installation

DF21 can be installed from PyPI via pip, the package installer for Python. Refer to the pip documentation for details on installing packages from the Python Package Index and other indexes. Use the following command to install DF21:

pip install deep-forest
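
To verify the installation, a minimal sanity check is to import the package and print its version (a sketch; the __version__ attribute is assumed to follow the usual packaging convention):

# Assumes the deepforest package exposes __version__, as is conventional for PyPI packages.
import deepforest

print(deepforest.__version__)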

Quickstart

Classification

from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

from deepforest import CascadeForestClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)
model = CascadeForestClassifier(random_state=1)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
acc = accuracy_score(y_test, y_pred) * 100
print("\nTesting Accuracy: {:.3f} %".format(acc))
>>> Testing Accuracy: 98.667 %
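
A fitted DF21 model can also be persisted to disk and reloaded later. Below is a minimal sketch assuming the directory-based save/load interface described in the project documentation:

# Save the fitted cascade to a directory and reload it into a fresh estimator.
# Assumes CascadeForestClassifier provides save(dirname) and load(dirname) as documented.
model.save("df21_digits_model")

new_model = CascadeForestClassifier()
new_model.load("df21_digits_model")
assert (new_model.predict(X_test) == y_pred).all()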

Regression

# Note: load_boston was removed in scikit-learn 1.2; run this example with
# scikit-learn < 1.2 or substitute another regression dataset.
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

from deepforest import CascadeForestRegressor

X, y = load_boston(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)
model = CascadeForestRegressor(random_state=1)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
print("\nTesting MSE: {:.3f}".format(mse))
>>> Testing MSE: 8.068
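
As a starting point for the parameter tuning mentioned above, the sketch below configures a larger cascade. Parameter names such as n_estimators, n_trees, max_layers, and n_jobs are assumed from the documented CascadeForestClassifier constructor; please check Parameters Tuning for the options available in your installed version:

from deepforest import CascadeForestClassifier

# A larger cascade: more forests per layer, more trees per forest,
# and a cap on cascade depth. Parameter names assume the documented constructor.
model = CascadeForestClassifier(
    n_estimators=4,     # forests per cascade layer
    n_trees=200,        # trees per forest
    max_layers=20,      # upper bound on the number of cascade layers
    n_jobs=-1,          # use all available CPU cores
    random_state=1,
)
model.fit(X_train, y_train)

Larger values of n_estimators and n_trees generally trade extra training time for accuracy, so it is worth starting from the defaults and increasing them gradually.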

Resources

Reference

@article{zhou2019deep,
    title = {Deep forest},
    author = {Zhi-Hua Zhou and Ji Feng},
    journal = {National Science Review},
    volume = {6},
    number = {1},
    pages = {74--86},
    year = {2019}}

@inproceedings{zhou2017deep,
    title = {{Deep Forest:} Towards an alternative to deep neural networks},
    author = {Zhi-Hua Zhou and Ji Feng},
    booktitle = {IJCAI},
    pages = {3553--3559},
    year = {2017}}

Thanks to all our contributors
