diff --git a/README.md b/README.md
index c5bef1a6..6c81d6b5 100644
--- a/README.md
+++ b/README.md
@@ -1,59 +1,51 @@
-
-
-# LightAutoML - automatic model creation framework
+
+[![GitHub License](https://img.shields.io/github/license/sb-ai-lab/LightAutoML)](https://github.com/sb-ai-lab/LightAutoML/blob/main/LICENSE)
+[![PyPI - Version](https://img.shields.io/pypi/v/lightautoml)](https://pypi.org/project/lightautoml)
+![PyPI - Downloads](https://img.shields.io/pypi/dm/lightautoml?color=green&label=PyPI%20downloads&logo=pypi&logoColor=green)
[![Telegram](https://img.shields.io/badge/chat-on%20Telegram-2ba2d9.svg)](https://t.me/lightautoml)
-![PyPI - Downloads](https://img.shields.io/pypi/dm/lightautoml?color=green&label=PyPI%20downloads&logo=pypi&logoColor=orange&style=plastic)
-![Read the Docs](https://img.shields.io/readthedocs/lightautoml?style=plastic)
-[![Black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
+
+[![GitHub Workflow Status (with event)](https://img.shields.io/github/actions/workflow/status/sb-ai-lab/lightautoml/CI.yml)](https://github.com/sb-ai-lab/lightautoml/actions/workflows/CI.yml?query=branch%3Amain)
![Poetry-Lock](https://img.shields.io/github/workflow/status/sb-ai-lab/LightAutoML/Poetry%20run/master?label=Poetry-Lock)
-
+![Read the Docs](https://img.shields.io/readthedocs/lightautoml)
+[![Black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
LightAutoML (LAMA) is an AutoML framework which provides automatic model creation for the following tasks:
- binary classification
-- multiclass classification
+- multiclass classification
+- multilabel classification
- regression
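When configuring AutoML, each of these problem types maps to a name string passed to `lightautoml.tasks.Task` (per the LightAutoML docs, regression is spelled `"reg"`; `"multilabel"` support depends on the package version). A minimal lookup sketch:

```python
# Task-name strings accepted by lightautoml.tasks.Task, keyed by problem type
TASK_NAMES = {
    "binary classification": "binary",
    "multiclass classification": "multiclass",
    "multilabel classification": "multilabel",
    "regression": "reg",
}

# e.g. Task(TASK_NAMES["regression"]) would configure a regression problem
print(TASK_NAMES["regression"])  # reg
```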
Current version of the package handles datasets that have independent samples in each row. I.e. **each row is an object with its specific features and target**.
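To make the expected input concrete, here is a toy flat table in pandas (column names and values are purely illustrative):

```python
import pandas as pd

# Each row is one independent object: its features plus its target
df = pd.DataFrame({
    "feature_num": [1.0, 2.5, 3.3],
    "feature_cat": ["x", "y", "x"],
    "target": [0, 1, 0],
})
print(df.shape)  # (3, 3)
```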
-Multitable datasets and sequences are a work in progress :)
-
-**Note**: we use [`AutoWoE`](https://pypi.org/project/autowoe) library to automatically create interpretable models.
**Authors**: [Alexander Ryzhkov](https://kaggle.com/alexryzhkov), [Anton Vakhrushev](https://kaggle.com/btbpanda), [Dmitry Simakov](https://kaggle.com/simakov), Rinchin Damdinov, Vasilii Bunakov, Alexander Kirilin, Pavel Shvets.
-**Documentation** of LightAutoML is available [here](https://lightautoml.readthedocs.io/), you can also [generate](https://github.com/AILab-MLTools/LightAutoML/blob/master/.github/CONTRIBUTING.md#building-documentation) it.
-
-# (New features) GPU and Spark pipelines
-Full GPU and Spark pipelines for LightAutoML currently available for developers testing (still in progress). The code and tutorials for:
-- GPU pipeline is [available here](https://github.com/Rishat-skoltech/LightAutoML_GPU)
-- Spark pipeline is [available here](https://github.com/sb-ai-lab/SLAMA)
# Table of Contents
-* [Installation LightAutoML from PyPI](#installation)
+* [Installation](#installation)
+* [Documentation](https://lightautoml.readthedocs.io/)
* [Quick tour](#quicktour)
* [Resources](#examples)
-* [Contributing to LightAutoML](#contributing)
-* [License](#apache)
-* [For developers](#developers)
+* [Advanced features](#advancedfeatures)
* [Support and feature requests](#support)
+* [Contributing to LightAutoML](#contributing)
+* [License](#license)
+
+**Documentation** for LightAutoML is available [here](https://lightautoml.readthedocs.io/); you can also [generate](https://github.com/AILab-MLTools/LightAutoML/blob/master/.github/CONTRIBUTING.md#building-documentation) it yourself.
+
# Installation
-To install LAMA framework on your machine from PyPI, execute following commands:
+To install the LAMA framework from PyPI:
```bash
-
-# Install base functionality:
-
+# Base functionality:
pip install -U lightautoml
-# For partial installation use corresponding option.
-# Extra dependecies: [nlp, cv, report]
-# Or you can use 'all' to install everything
-
+# For partial installation, use the corresponding option.
+# Extra dependencies: [nlp, cv, report]; use 'all' to install everything
pip install -U lightautoml[nlp]
-
```
Additionally, run the following commands to enable PDF report generation:
@@ -77,7 +69,7 @@ sudo yum install redhat-rpm-config libffi-devel cairo pango gdk-pixbuf2
# Quick tour
Let's solve the popular Kaggle Titanic competition below. There are two main ways to solve machine learning problems using LightAutoML:
-* Use ready preset for tabular data:
+### Use a ready preset for tabular data
```python
import pandas as pd
from sklearn.metrics import f1_score
@@ -105,9 +97,82 @@ pd.DataFrame({
}).to_csv('submit.csv', index = False)
```
-LighAutoML framework has a lot of ready-to-use parts and extensive customization options, to learn more check out the [resources](#Resources) section.
+### LightAutoML as a framework: build your own custom pipeline
-[Back to top](#toc)
+```python
+import pandas as pd
+
+from lightautoml.automl.base import AutoML
+from lightautoml.ml_algo.boost_lgbm import BoostLGBM
+from lightautoml.ml_algo.tuning.optuna import OptunaTuner
+from lightautoml.pipelines.features.lgb_pipeline import LGBSimpleFeatures
+from lightautoml.pipelines.ml.base import MLPipeline
+from lightautoml.pipelines.selection.importance_based import ImportanceCutoffSelector, ModelBasedImportanceEstimator
+from lightautoml.reader.base import PandasToPandasReader
+from lightautoml.tasks import Task
+
+df_train = pd.read_csv('../input/titanic/train.csv')
+df_test = pd.read_csv('../input/titanic/test.csv')
+N_THREADS = 4
+
+reader = PandasToPandasReader(Task("binary"), cv=5, random_state=42)
+
+# create a feature selector
+selector = ImportanceCutoffSelector(
+ LGBSimpleFeatures(),
+ BoostLGBM(
+ default_params={'learning_rate': 0.05, 'num_leaves': 64,
+ 'seed': 42, 'num_threads': N_THREADS}
+ ),
+ ModelBasedImportanceEstimator(),
+ cutoff=0
+)
+
+# build first level pipeline for AutoML
+pipeline_lvl1 = MLPipeline([
+ # first model with hyperparams tuning
+ (
+ BoostLGBM(
+ default_params={'learning_rate': 0.05, 'num_leaves': 128,
+ 'seed': 1, 'num_threads': N_THREADS}
+ ),
+ OptunaTuner(n_trials=20, timeout=30)
+ ),
+ # second model without hyperparams tuning
+ BoostLGBM(
+ default_params={'learning_rate': 0.025, 'num_leaves': 64,
+ 'seed': 2, 'num_threads': N_THREADS}
+ )
+], pre_selection=selector, features_pipeline=LGBSimpleFeatures(), post_selection=None)
+
+# build second level pipeline for AutoML
+pipeline_lvl2 = MLPipeline(
+ [
+ BoostLGBM(
+ default_params={'learning_rate': 0.05, 'num_leaves': 64,
+ 'max_bin': 1024, 'seed': 3, 'num_threads': N_THREADS},
+ freeze_defaults=True
+ )
+ ],
+ pre_selection=None,
+ features_pipeline=LGBSimpleFeatures(),
+ post_selection=None
+)
+
+# build AutoML pipeline
+automl = AutoML(reader, [
+ [pipeline_lvl1],
+ [pipeline_lvl2],
+ ],
+ skip_conn=False
+)
+
+# train AutoML and get predictions
+oof_pred = automl.fit_predict(df_train, roles={'target': 'Survived', 'drop': ['PassengerId']})
+test_pred = automl.predict(df_test)
+
+pd.DataFrame({
+    'PassengerId': df_test.PassengerId,
+    'Survived': (test_pred.data[:, 0] > 0.5) * 1
+}).to_csv('submit.csv', index=False)
+```
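The last lines above turn predicted class-1 probabilities into 0/1 labels with a fixed 0.5 threshold. As a standalone numpy sketch (the probability values are made up for illustration):

```python
import numpy as np

# Hypothetical class-1 probabilities for three test rows,
# shaped like test_pred.data in the snippet above
proba = np.array([[0.12], [0.87], [0.50]])

# Same rule as `(test_pred.data[:, 0] > 0.5) * 1`:
# strictly greater than 0.5 becomes 1, everything else 0
labels = (proba[:, 0] > 0.5) * 1
print(labels.tolist())  # [0, 1, 0]
```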
+
+The LightAutoML framework has many ready-to-use components and extensive customization options; to learn more, check out the [resources](#resources) section.
# Resources
@@ -165,96 +230,25 @@ LighAutoML framework has a lot of ready-to-use parts and extensive customization
- (English) [LightAutoML vs Titanic: 80% accuracy in several lines of code (Medium)](https://alexmryzhkov.medium.com/lightautoml-preset-usage-tutorial-2cce7da6f936)
- (English) [Hands-On Python Guide to LightAutoML – An Automatic ML Model Creation Framework (Analytic Indian Mag)](https://analyticsindiamag.com/hands-on-python-guide-to-lama-an-automatic-ml-model-creation-framework/?fbclid=IwAR0f0cVgQWaLI60m1IHMD6VZfmKce0ZXxw-O8VRTdRALsKtty8a-ouJex7g)
-[Back to top](#toc)
+
+# Advanced features
+### GPU and Spark pipelines
+Full GPU and Spark pipelines for LightAutoML are currently available for developer testing (still a work in progress). Code and tutorials:
+- GPU pipeline: [available here](https://github.com/Rishat-skoltech/LightAutoML_GPU)
+- Spark pipeline: [available here](https://github.com/sb-ai-lab/SLAMA)
# Contributing to LightAutoML
If you are interested in contributing to LightAutoML, please read the [Contributing Guide](.github/CONTRIBUTING.md) to get started.
-[Back to top](#toc)
-
-
-# License
-This project is licensed under the Apache License, Version 2.0. See [LICENSE](https://github.com/AILab-MLTools/LightAutoML/blob/master/LICENSE) file for more details.
-
-[Back to top](#toc)
-
-
-# For developers
-
-## Build your own custom pipeline:
-
-```python
-import pandas as pd
-from sklearn.metrics import f1_score
-
-from lightautoml.automl.presets.tabular_presets import TabularAutoML
-from lightautoml.tasks import Task
-
-df_train = pd.read_csv('../input/titanic/train.csv')
-df_test = pd.read_csv('../input/titanic/test.csv')
-
-# define that machine learning problem is binary classification
-task = Task("binary")
-
-reader = PandasToPandasReader(task, cv=N_FOLDS, random_state=RANDOM_STATE)
-
-# create a feature selector
-model0 = BoostLGBM(
- default_params={'learning_rate': 0.05, 'num_leaves': 64,
- 'seed': 42, 'num_threads': N_THREADS}
-)
-pipe0 = LGBSimpleFeatures()
-mbie = ModelBasedImportanceEstimator()
-selector = ImportanceCutoffSelector(pipe0, model0, mbie, cutoff=0)
-
-# build first level pipeline for AutoML
-pipe = LGBSimpleFeatures()
-# stop after 20 iterations or after 30 seconds
-params_tuner1 = OptunaTuner(n_trials=20, timeout=30)
-model1 = BoostLGBM(
- default_params={'learning_rate': 0.05, 'num_leaves': 128,
- 'seed': 1, 'num_threads': N_THREADS}
-)
-model2 = BoostLGBM(
- default_params={'learning_rate': 0.025, 'num_leaves': 64,
- 'seed': 2, 'num_threads': N_THREADS}
-)
-pipeline_lvl1 = MLPipeline([
- (model1, params_tuner1),
- model2
-], pre_selection=selector, features_pipeline=pipe, post_selection=None)
-
-# build second level pipeline for AutoML
-pipe1 = LGBSimpleFeatures()
-model = BoostLGBM(
- default_params={'learning_rate': 0.05, 'num_leaves': 64,
- 'max_bin': 1024, 'seed': 3, 'num_threads': N_THREADS},
- freeze_defaults=True
-)
-pipeline_lvl2 = MLPipeline([model], pre_selection=None, features_pipeline=pipe1,
- post_selection=None)
-
-# build AutoML pipeline
-automl = AutoML(reader, [
- [pipeline_lvl1],
- [pipeline_lvl2],
-], skip_conn=False)
-
-# train AutoML and get predictions
-oof_pred = automl.fit_predict(df_train, roles = {'target': 'Survived', 'drop': ['PassengerId']})
-test_pred = automl.predict(df_test)
-
-pd.DataFrame({
- 'PassengerId':df_test.PassengerId,
- 'Survived': (test_pred.data[:, 0] > 0.5)*1
-}).to_csv('submit.csv', index = False)
-```
-
-[Back to top](#toc)
-
# Support and feature requests
Seek prompt advice in the [Telegram group](https://t.me/lightautoml).
Open bug reports and feature requests on GitHub [issues](https://github.com/AILab-MLTools/LightAutoML/issues).
+
+
+# License
+This project is licensed under the Apache License, Version 2.0. See [LICENSE](https://github.com/AILab-MLTools/LightAutoML/blob/master/LICENSE) file for more details.
+
+[Back to top](#table-of-contents)
diff --git a/imgs/GENERALL2X2.jpg b/docs/imgs/GENERALL2X2.jpg
similarity index 100%
rename from imgs/GENERALL2X2.jpg
rename to docs/imgs/GENERALL2X2.jpg
diff --git a/imgs/lime.jpg b/docs/imgs/lime.jpg
similarity index 100%
rename from imgs/lime.jpg
rename to docs/imgs/lime.jpg
diff --git a/imgs/LightAutoML_logo_big.png b/imgs/LightAutoML_logo_big.png
deleted file mode 100644
index 2e799956..00000000
Binary files a/imgs/LightAutoML_logo_big.png and /dev/null differ
diff --git a/imgs/LightAutoML_logo_small.png b/imgs/LightAutoML_logo_small.png
deleted file mode 100644
index 8d268e39..00000000
Binary files a/imgs/LightAutoML_logo_small.png and /dev/null differ
diff --git a/imgs/Star_scheme_tables.png b/imgs/Star_scheme_tables.png
deleted file mode 100644
index c275d3f5..00000000
Binary files a/imgs/Star_scheme_tables.png and /dev/null differ
diff --git a/imgs/TabularAutoML_model_descr.png b/imgs/TabularAutoML_model_descr.png
deleted file mode 100644
index 4c24cada..00000000
Binary files a/imgs/TabularAutoML_model_descr.png and /dev/null differ
diff --git a/imgs/TabularUtilizedAutoML_model_descr.png b/imgs/TabularUtilizedAutoML_model_descr.png
deleted file mode 100644
index c2330881..00000000
Binary files a/imgs/TabularUtilizedAutoML_model_descr.png and /dev/null differ
diff --git a/imgs/autoint.png b/imgs/autoint.png
deleted file mode 100644
index e898ee92..00000000
Binary files a/imgs/autoint.png and /dev/null differ
diff --git a/imgs/denselight.png b/imgs/denselight.png
deleted file mode 100644
index 6e58464a..00000000
Binary files a/imgs/denselight.png and /dev/null differ
diff --git a/imgs/densenet.png b/imgs/densenet.png
deleted file mode 100644
index 86757951..00000000
Binary files a/imgs/densenet.png and /dev/null differ
diff --git a/imgs/fttransformer.png b/imgs/fttransformer.png
deleted file mode 100644
index 61e3712c..00000000
Binary files a/imgs/fttransformer.png and /dev/null differ
diff --git a/imgs/node.png b/imgs/node.png
deleted file mode 100644
index ca0a4805..00000000
Binary files a/imgs/node.png and /dev/null differ
diff --git a/imgs/resnet.png b/imgs/resnet.png
deleted file mode 100644
index 5d809448..00000000
Binary files a/imgs/resnet.png and /dev/null differ
diff --git a/imgs/swa.png b/imgs/swa.png
deleted file mode 100644
index 63d6df84..00000000
Binary files a/imgs/swa.png and /dev/null differ
diff --git a/imgs/tutorial_11_case_problem_statement.png b/imgs/tutorial_11_case_problem_statement.png
deleted file mode 100644
index 6b08b010..00000000
Binary files a/imgs/tutorial_11_case_problem_statement.png and /dev/null differ
diff --git a/imgs/tutorial_11_general_problem_statement.png b/imgs/tutorial_11_general_problem_statement.png
deleted file mode 100644
index c95b6e16..00000000
Binary files a/imgs/tutorial_11_general_problem_statement.png and /dev/null differ
diff --git a/imgs/tutorial_11_history_step_params.png b/imgs/tutorial_11_history_step_params.png
deleted file mode 100644
index 3fa11113..00000000
Binary files a/imgs/tutorial_11_history_step_params.png and /dev/null differ
diff --git a/imgs/tutorial_11_transformers_params.png b/imgs/tutorial_11_transformers_params.png
deleted file mode 100644
index 5212a24f..00000000
Binary files a/imgs/tutorial_11_transformers_params.png and /dev/null differ
diff --git a/imgs/tutorial_1_initial_report.png b/imgs/tutorial_1_initial_report.png
deleted file mode 100644
index 2648e9dc..00000000
Binary files a/imgs/tutorial_1_initial_report.png and /dev/null differ
diff --git a/imgs/tutorial_1_laml_big.png b/imgs/tutorial_1_laml_big.png
deleted file mode 100644
index e4de6247..00000000
Binary files a/imgs/tutorial_1_laml_big.png and /dev/null differ
diff --git a/imgs/tutorial_1_ml_pipeline.png b/imgs/tutorial_1_ml_pipeline.png
deleted file mode 100644
index ffc24bf3..00000000
Binary files a/imgs/tutorial_1_ml_pipeline.png and /dev/null differ
diff --git a/imgs/tutorial_1_pipeline.png b/imgs/tutorial_1_pipeline.png
deleted file mode 100644
index ce0ce896..00000000
Binary files a/imgs/tutorial_1_pipeline.png and /dev/null differ
diff --git a/imgs/tutorial_1_unfolded_report.png b/imgs/tutorial_1_unfolded_report.png
deleted file mode 100644
index e7517033..00000000
Binary files a/imgs/tutorial_1_unfolded_report.png and /dev/null differ
diff --git a/imgs/tutorial_2_initial_report.png b/imgs/tutorial_2_initial_report.png
deleted file mode 100644
index 277f37ac..00000000
Binary files a/imgs/tutorial_2_initial_report.png and /dev/null differ
diff --git a/imgs/tutorial_2_pipeline.png b/imgs/tutorial_2_pipeline.png
deleted file mode 100644
index e50a29e6..00000000
Binary files a/imgs/tutorial_2_pipeline.png and /dev/null differ
diff --git a/imgs/tutorial_2_unfolded_report.png b/imgs/tutorial_2_unfolded_report.png
deleted file mode 100644
index fde6fce2..00000000
Binary files a/imgs/tutorial_2_unfolded_report.png and /dev/null differ
diff --git a/imgs/tutorial_3_initial_report.png b/imgs/tutorial_3_initial_report.png
deleted file mode 100644
index c6639742..00000000
Binary files a/imgs/tutorial_3_initial_report.png and /dev/null differ
diff --git a/imgs/tutorial_3_unfolded_report.png b/imgs/tutorial_3_unfolded_report.png
deleted file mode 100644
index 87a67d37..00000000
Binary files a/imgs/tutorial_3_unfolded_report.png and /dev/null differ
diff --git a/imgs/tutorial_blackbox_pipeline.png b/imgs/tutorial_blackbox_pipeline.png
deleted file mode 100644
index 6c55fc7f..00000000
Binary files a/imgs/tutorial_blackbox_pipeline.png and /dev/null differ
diff --git a/imgs/tutorial_whitebox_report_1.png b/imgs/tutorial_whitebox_report_1.png
deleted file mode 100644
index 17317f31..00000000
Binary files a/imgs/tutorial_whitebox_report_1.png and /dev/null differ
diff --git a/imgs/tutorial_whitebox_report_2.png b/imgs/tutorial_whitebox_report_2.png
deleted file mode 100644
index c92067d0..00000000
Binary files a/imgs/tutorial_whitebox_report_2.png and /dev/null differ
diff --git a/imgs/tutorial_whitebox_report_3.png b/imgs/tutorial_whitebox_report_3.png
deleted file mode 100644
index eaa094ce..00000000
Binary files a/imgs/tutorial_whitebox_report_3.png and /dev/null differ
diff --git a/imgs/tutorial_whitebox_report_4.png b/imgs/tutorial_whitebox_report_4.png
deleted file mode 100644
index 3b42350a..00000000
Binary files a/imgs/tutorial_whitebox_report_4.png and /dev/null differ