Sprint 12 Release #227

Merged
merged 33 commits into from
Jan 31, 2024
Changes from 31 commits
33 commits
47a28f4
Add AdaBoost model and option to train on a subset of features
luccalb Jan 26, 2024
d97d90f
Added experiment results and marked the best performing classifier pe…
luccalb Jan 26, 2024
076f688
Minor changes to docs
luccalb Jan 26, 2024
6248259
Added LightGBM to the models
ultiwinter7 Jan 28, 2024
029009d
Added documented results about LightGBM
ultiwinter7 Jan 28, 2024
8f99e4f
Added documentation about TabNet architecture
ultiwinter7 Jan 28, 2024
008d5e6
Add AdaBoost model and option to train on a subset of features (#218)
luccalb Jan 29, 2024
d0468c3
Merge branch 'dev' into feature/lightgbm-model
ultiwinter Jan 29, 2024
1c79c6f
Merge pull request #220 from amosproj/feature/lightgbm-model
ultiwinter Jan 29, 2024
7d21b0c
Added results for subset 1. Pinned package versions
felix-zailskas Jan 30, 2024
66c0132
added imp-squared-backlog.jpg for sprint-12
NicoHambauer Jan 30, 2024
6e4edee
Pipeline now creates test coverage report
felix-zailskas Jan 30, 2024
716ca95
Added testing suite for console utils
felix-zailskas Jan 30, 2024
21d66c5
pre-commit hooks
felix-zailskas Jan 30, 2024
4ea01a7
Fix Pipfile
felix-zailskas Jan 30, 2024
6e66231
Add coverage to packages
felix-zailskas Jan 30, 2024
00d032a
Merge pull request #221 from amosproj/feature/216-testing-optional-da…
felix-zailskas Jan 30, 2024
ab98d06
Merge branch 'dev' into feature/67-testing-improvements
felix-zailskas Jan 30, 2024
22d8ffd
modified sagemaker_training.py after meeting Andre
ultiwinter7 Jan 30, 2024
2eeecc3
uploaded requirements file
ultiwinter7 Jan 30, 2024
7c22a47
Merge pull request #222 from amosproj/feature/67-testing-improvements
felix-zailskas Jan 30, 2024
3f61e9b
updated the requirements file and nn_model and sagemaker_training. No…
ultiwinter7 Jan 31, 2024
9cc2166
modified the num_leaves of LightGBM for better performance
ultiwinter7 Jan 31, 2024
f560fff
modified nn_model
ultiwinter7 Jan 31, 2024
57d5999
removing the errors
ultiwinter7 Jan 31, 2024
34f3144
modifying f1 score measurement
ultiwinter7 Jan 31, 2024
9b4b274
updated the save_model
ultiwinter7 Jan 31, 2024
01dab30
Merge pull request #223 from amosproj/feature/sagemaker
ultiwinter Jan 31, 2024
ac4a7a4
Merge pull request #224 from amosproj/deliverables/sprint-12
soapyheas Jan 31, 2024
0df0113
Adding sprint 12 regular homework
soapyheas Jan 31, 2024
735d7fe
Merge pull request #225 from amosproj/sprint-12/deliverables
soapyheas Jan 31, 2024
74d7086
Add demo-day-slide and demo-day-video
Tims777 Jan 31, 2024
2003668
Merge pull request #226 from amosproj/sprint-12/deliverables
soapyheas Jan 31, 2024
5 changes: 3 additions & 2 deletions .github/workflows/python-app.yml
@@ -63,6 +63,7 @@ jobs:
pipenv run flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics
# exit-zero treats all errors as warnings. The GitHub editor is 127 chars wide
pipenv run flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics
- name: Test with pytest
- name: Test with pytest and create coverage report
run: |
pipenv run pytest
pipenv run coverage run --source ./ -m pytest -v tests/
pipenv run coverage report -m
Binary file added Deliverables/sprint-12/feature-board.png
Binary file not shown.
2 changes: 2 additions & 0 deletions Deliverables/sprint-12/feature-board.png.license
@@ -0,0 +1,2 @@
SPDX-License-Identifier: MIT
SPDX-FileCopyrightText: 2023 Simon Zimmermann <tim.simon.zimmermann@fau.de>
Binary file added Deliverables/sprint-12/imp-squared-backlog.jpg
Binary file not shown.
2 changes: 2 additions & 0 deletions Deliverables/sprint-12/imp-squared-backlog.jpg.license
@@ -0,0 +1,2 @@
SPDX-License-Identifier: MIT
SPDX-FileCopyrightText: 2023 Nico Hambauer <nico.hambauer@fau.de>
Binary file added Deliverables/sprint-12/planning-documents.pdf
Binary file not shown.
2 changes: 2 additions & 0 deletions Deliverables/sprint-12/planning-documents.pdf.license
@@ -0,0 +1,2 @@
SPDX-License-Identifier: MIT
SPDX-FileCopyrightText: 2023 Simon Zimmermann <tim.simon.zimmermann@fau.de>
93 changes: 65 additions & 28 deletions Documentation/Classifier-Comparison.md
@@ -44,7 +44,7 @@ Fully Connected Neural Networks (FCNN) achieved overall lower performance than t
### Fully Connected Neural Networks Regression Model

There has been an idea written in the scientific paper "Inter-species cell detection -
datasets on pulmonary hemosiderophages in equine, human and feline specimens" by Marzahl et al. where they proposed using regression model on a classification task. The idea is to train the regression model on the class values, whereas the model predicts a continous values and learns the relation between the classes. The output is then subjected to threshholds (0-0.49,0.5-1.49,1.5-2.49,2.5-3.49,3.5-4.5) for classes XS, S, M, L, XL respectivly. This yielded better performance than the FCNN classifier but still was worse than that of the Random Forest.
datasets on pulmonary hemosiderophages in equine, human and feline specimens" by Marzahl et al. (https://www.nature.com/articles/s41597-022-01389-0) where they proposed using a regression model on a classification task. The idea is to train the regression model on the class values, where the model predicts a continuous value and learns the relation between the classes. The output is then subjected to thresholds (0-0.49, 0.5-1.49, 1.5-2.49, 2.5-3.49, 3.5-4.5) for classes XS, S, M, L, XL respectively. This yielded better performance than the FCNN classifier but was still worse than that of the Random Forest.
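
A minimal sketch of this regression-as-classification idea, using synthetic data and scikit-learn's `RandomForestRegressor` as a stand-in regressor; all names and numbers here are illustrative, not the project's actual training code:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the lead data: 5 ordinal classes 0=XS ... 4=XL.
X, y = make_classification(n_samples=2000, n_features=20, n_informative=8,
                           n_classes=5, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train a regressor on the ordinal class values instead of a classifier.
reg = RandomForestRegressor(n_estimators=100, random_state=42)
reg.fit(X_train, y_train)

# Threshold the continuous output back into classes:
# 0-0.49 -> XS, 0.5-1.49 -> S, 1.5-2.49 -> M, 2.5-3.49 -> L, 3.5-4.5 -> XL,
# i.e. round to the nearest class index and clip to the valid range.
y_pred = np.clip(np.rint(reg.predict(X_test)), 0, 4).astype(int)

print("weighted F1:", f1_score(y_test, y_pred, average="weighted"))
```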

### QDA & Ridge Classifier

@@ -54,53 +54,90 @@ classes had F1-scores of ~0.00-0.15. For this reason we are not considering thes
in future experiments. This resulted in an overall F1-score of ~0.11, which is significantly
outperformed by the other tested models.

### TabNet Architecture

TabNet, short for "Tabular Neural Network," is a novel neural network architecture specifically designed for tabular data, as commonly encountered in structured sources such as databases and CSV files. It was introduced in the paper "TabNet: Attentive Interpretable Tabular Learning" by Arik et al. (https://arxiv.org/abs/1908.07442). TabNet uses sequential attention to choose which features to reason from at each decision step, enabling interpretability and more efficient learning, as the learning capacity is focused on the most salient features. Unfortunately, similarly to our proposed 4-layer network, TabNet only learned the features of the XS class, reaching an XS F1-score of 0.84 while the F1-scores of all other classes were zero. The underlying data does not seem to respond positively to neural network-based approaches.
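
For reference, a minimal sketch of how TabNet could be tried on such a task, assuming the third-party `pytorch-tabnet` package; the random data and all parameter values are illustrative only, not the project's actual setup:

```python
import numpy as np
from pytorch_tabnet.tab_model import TabNetClassifier  # assumes pytorch-tabnet is installed

# Illustrative random data standing in for the lead features (5 classes 0=XS ... 4=XL).
rng = np.random.default_rng(42)
X_train, y_train = rng.normal(size=(1500, 132)).astype(np.float32), rng.integers(0, 5, 1500)
X_valid, y_valid = rng.normal(size=(300, 132)).astype(np.float32), rng.integers(0, 5, 300)

clf = TabNetClassifier(n_d=8, n_a=8, n_steps=3, seed=42)  # small, near-default architecture
clf.fit(
    X_train, y_train,
    eval_set=[(X_valid, y_valid)],
    eval_metric=["accuracy"],
    max_epochs=50,
    patience=10,          # early stopping on the validation set
    batch_size=256,
)
y_pred = clf.predict(X_valid)
```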

## Well performing models

### Random Forest Classifier
In this sub-section we will discuss the results of the well performing models, which are XGBoost, LightGBM, K-Nearest Neighbor (KNN), Random Forest, AdaBoost and Naive Bayes.

### Feature subsets

We have collected many features (~54 data points) for the leads; additionally, one-hot encoding the categorical variables
results in a high-dimensional feature space (132 features). Not all features might be equally relevant for our classification task,
so we want to try different subsets.

The following subsets are available:

Random Forest Classifier with 100 estimators has been able to achieve an overall F1-score of 0.62 and scores of 0.81, 0.13, 0.09, 0.08 and 0.15 for classes XS, S, M, L and XL respectively.
1. `google_places_rating`, `google_places_user_ratings_total`, `google_places_confidence`, `regional_atlas_regional_score`
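
A minimal sketch of how such a subset could be selected from a pandas DataFrame of lead features; the column names are taken from the list above, while the DataFrame `leads` and the label column `size_class` are assumptions for illustration:

```python
import pandas as pd

# Columns making up subset 1, as listed above.
SUBSET_1 = [
    "google_places_rating",
    "google_places_user_ratings_total",
    "google_places_confidence",
    "regional_atlas_regional_score",
]

def select_subset(leads: pd.DataFrame, label_col: str = "size_class"):
    """Return (X, y) restricted to the subset-1 feature columns."""
    X = leads[SUBSET_1]
    y = leads[label_col]
    return X, y
```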

### Overall Results

Note:
The Random Forest Classifier used 100 estimators.
The KNN classifier used a distance based weighting for the evaluated neighbors and considered 10 neighbors in the 5-class split and 19 neighbors for the 3-class split.
The XGBoost was trained for 10000 rounds.
**_Notes:_**

- The Random Forest Classifier used 100 estimators.
- The AdaBoost Classifier used 100 DecisionTree classifiers.
- The KNN classifier used a distance based weighting for the evaluated neighbors and considered 10 neighbors in the 5-class split and 19 neighbors for the 3-class split.
- The XGBoost was trained for 10000 rounds.
- The LightGBM was trained with `num_leaves` set to 2000 (see the configuration sketch below)
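
A minimal sketch of how these configurations might look with scikit-learn, XGBoost and LightGBM; library defaults fill in everything not listed above, so this is an illustration rather than the project's exact training code:

```python
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from xgboost import XGBClassifier
from lightgbm import LGBMClassifier

models_5_class = {
    "Random Forest": RandomForestClassifier(n_estimators=100),
    # `estimator=` in recent scikit-learn; older versions use `base_estimator=`.
    "AdaBoost": AdaBoostClassifier(estimator=DecisionTreeClassifier(), n_estimators=100),
    # 10 neighbors for the 5-class split, 19 for the 3-class split.
    "KNN": KNeighborsClassifier(n_neighbors=10, weights="distance"),
    "XGBoost": XGBClassifier(n_estimators=10000),  # 10000 boosting rounds
    "LightGBM": LGBMClassifier(num_leaves=2000),
}

# Usage: fit each model on the training split and evaluate its weighted F1-score.
# for name, model in models_5_class.items():
#     model.fit(X_train, y_train)
```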


In the following table we can see the model's overall weighted F1-score on the 3-class and
5-class data set split.
5-class data set split. The best performing classifier per row is marked in **bold**.

| | KNN | Naive Bayes | Random Forest | XGBoost | AdaBoost | LightGBM |
| ------- | ------ | ----------- | ------------- | ---------- | -------- | -------- |
| 5-Class | 0.6314 | 0.6073 | 0.6150 | **0.6442** | 0.6098 | 0.6405 |
| 3-Class | 0.6725 | 0.6655 | 0.6642 | **0.6967** | 0.6523 | 0.6956 |

| | KNN | Naive Bayes | Random Forest | XGBoost |
| ------- | ------ | ----------- | ------------- | ------- |
| 5-Class | 0.6314 | 0.6073 | 0.6150 | 0.6442 |
| 3-Class | 0.6725 | 0.6655 | 0.6642 | 0.6967 |
| | KNN (subset=1) | Naive Bayes (subset=1) | RandomForest (subset=1) | XGBoost (subset=1) | AdaBoost (subset=1) | LightGBM (subset=1) |
| ------- | -------------- | ---------------------- | ----------------------- | ------------------ | ------------------- | ------------------- |
| 5-Class | 0.6288 | 0.6075 | 0.5995 | **0.6198** | 0.6090 | 0.6252 |
| 3-Class | 0.6680 | 0.6075 | 0.6506 | **0.6664** | 0.6591 | 0.6644 |

We can see that all classifiers perform better on the 3-class data set split and that the XGBoost classifier is the best performing for both data set splits.
We can see that all classifiers perform better on the 3-class data set split and that the XGBoost classifier performs best for both data set splits. These results are consistent for both the full dataset and subset 1. We observe a slight performance decrease for almost all classifiers when using subset 1 compared to the full dataset (except AdaBoost/3-class and Naive Bayes/5-class). This indicates that the few features retained in subset 1 are not the sole discriminant features of the dataset. However, the performance is still high enough to suggest that the features in subset 1 are highly relevant for classifying the data.
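
The overall numbers above are weighted F1-scores, while the per-class tables below report one F1-score per class; a minimal sketch of how both can be computed with scikit-learn, using hypothetical label arrays:

```python
from sklearn.metrics import f1_score

# Hypothetical ground-truth and predicted labels for the 5-class split (0=XS ... 4=XL).
y_test = [0, 0, 1, 2, 3, 4, 0, 1]
y_pred = [0, 0, 1, 1, 3, 4, 0, 2]

# Overall score: per-class F1 averaged with weights proportional to class support.
overall = f1_score(y_test, y_pred, average="weighted")

# Per-class scores, as reported in the tables below (one value per class).
per_class = f1_score(y_test, y_pred, average=None)

print(overall, per_class)
```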

### Results for each class

#### 5-class split

In the following table we can see the F1-score of each model for each class in the 5-class split:

| Class | KNN | Naive Bayes | Random Forest | XGBoost |
| ----- | ---- | ----------- | ------------- | ------- |
| XS | 0.82 | 0.83 | 0.81 | 0.84 |
| S | 0.15 | 0.02 | 0.13 | 0.13 |
| M | 0.08 | 0.02 | 0.09 | 0.08 |
| L | 0.06 | 0.00 | 0.08 | 0.06 |
| XL | 0.18 | 0.10 | 0.15 | 0.16 |

For every model we can see that the predictions on the XS class are significantly better than every other class. TFor the KNN, Random Forest, and XGBoost all perform similar, having second best classes S and XL and worst classes M and L. The Naive Bayes classifier performs significantly worse on the S, M, and L classes and has second best class XL.
| Class | KNN | Naive Bayes | Random Forest | XGBoost | AdaBoost | LightGBM |
| ----- | ---- | ----------- | ------------- | -------- | -------- | -------- |
| XS | 0.82 | 0.83 | 0.81 | **0.84** | 0.77 | 0.83 |
| S | 0.15 | 0.02 | 0.13 | 0.13 | **0.22** | 0.14 |
| M | 0.08 | 0.02 | 0.09 | 0.08 | **0.14** | 0.09 |
| L | 0.06 | 0.00 | **0.08** | 0.06 | 0.07 | 0.05 |
| XL | 0.18 | 0.10 | 0.15 | 0.16 | 0.14 | **0.21** |

| Class | KNN (subset=1) | Naive Bayes (subset=1) | RandomForest (subset=1) | XGBoost (subset=1) | AdaBoost (subset=1) | LightGBM (subset=1) |
| ----- | -------------- | ---------------------- | ----------------------- | ------------------ | ------------------- | ------------------- |
| XS | 0.82 | 0.84 | 0.78 | **0.84** | 0.78 | 0.82 |
| S | 0.16 | 0.00 | 0.16 | 0.04 | **0.19** | 0.13 |
| M | 0.07 | 0.00 | 0.07 | 0.02 | **0.09** | 0.08 |
| L | **0.07** | 0.00 | 0.06 | 0.05 | **0.07** | 0.06 |
| XL | **0.19** | 0.00 | 0.11 | 0.13 | 0.14 | 0.18 |

For every model we can see that the predictions on the XS class are significantly better than for every other class. The KNN, Random Forest, and XGBoost classifiers all perform similarly, with S and XL as their second best classes and M and L as their worst. The Naive Bayes classifier performs significantly worse on the S, M, and L classes and has XL as its second best class.
Using subset 1 again mostly decreased performance on all classes, with the exception of the KNN classifier on the L and XL classes, where we can observe a slight increase in F1-score.

#### 3-class split

In the following table we can see the F1-score of each model for each class in the 3-class split:

| Class | KNN | Naive Bayes | Random Forest | XGBoost |
| ----- | ---- | ----------- | ------------- | ------- |
| XS | 0.83 | 0.82 | 0.81 | 0.84 |
| S,M,L | 0.27 | 0.28 | 0.30 | 0.33 |
| XL | 0.16 | 0.07 | 0.13 | 0.14 |
| Class | KNN | Naive Bayes | Random Forest | XGBoost | AdaBoost | LightGBM |
| ----- | ---- | ----------- | ------------- | -------- | -------- | -------- |
| XS | 0.83 | 0.82 | 0.81 | **0.84** | 0.78 | 0.83 |
| S,M,L | 0.27 | 0.28 | 0.30 | 0.33 | **0.34** | **0.34** |
| XL | 0.16 | 0.07 | 0.13 | 0.14 | 0.12 | **0.19** |

| Class | KNN (subset=1) | Naive Bayes (subset=1) | RandomForest (subset=1) | XGBoost (subset=1) | AdaBoost (subset=1) | LightGBM (subset=1) |
| ----- | -------------- | ---------------------- | ----------------------- | ------------------ | ------------------- | ------------------- |
| XS | 0.82 | 0.84 | 0.79 | **0.84** | 0.79 | 0.81 |
| S,M,L | 0.29 | 0.00 | 0.30 | 0.22 | **0.32** | 0.28 |
| XL | 0.18 | 0.00 | 0.11 | 0.11 | **0.20** | 0.17 |

For the 3-class split we observe similar performance for the XS and {S, M, L} classes for each model, while the XGBoost model slightly outperforms the other models. The KNN classifier is performing the best on the XL class while the Naive Bayes classifier performs worst. Interestingly, we can observe that the performance of the models on the XS class was barely affected by the merging of the s, M, and L classes while the performance on the XL class got worse for all of them. This needs to be considered, when evaluating the overall performance of the models on this data set split.
For the 3-class split we observe similar performance on the XS and {S, M, L} classes for each model, while the LightGBM model slightly outperforms the other models. The LightGBM classifier performs best on the XL class while the Naive Bayes classifier performs worst. Interestingly, the performance of the models on the XS class was barely affected by merging the S, M, and L classes, while the performance on the XL class got worse for all of them. This needs to be considered when evaluating the overall performance of the models on this data set split.
The AdaBoost classifier, trained on subset 1, performs best for the XL class. The KNN classifier got a slight boost in performance for the {S, M, L} and XL classes when using subset 1. All other models perform worse on subset 1.
10 changes: 6 additions & 4 deletions Pipfile
@@ -8,6 +8,7 @@ name = "pypi"

[dev-packages]
pytest = "==7.4.0"
coverage = "==7.4.1"
pre-commit = "==3.5.0"
flake8 = "==6.0.0"
pytest-env = "==1.0.1"
@@ -44,14 +45,15 @@ textblob = "==0.17.1"
deep-translator = "==1.11.4"
fsspec = "2023.12.2"
s3fs = "2023.12.2"
imblearn = "*"
sagemaker = "*"
imblearn = "==0.0"
sagemaker = "==2.198.0"
joblib = "1.3.2"
xgboost = "*"
colorama = "*"
xgboost = "==2.0.3"
colorama = "==0.4.6"
torch = "2.1.2"
deutschland = "0.4.0"
bs4 = "0.0.2"
lightgbm = "==4.3.0"

[requires]
python_version = "3.10"