
AttributeError in Ensemble Class when Accessing 'validation_score' #200

Open
simonprovost opened this issue Jun 11, 2023 · 0 comments

simonprovost commented Jun 11, 2023

Hello @PGijsbers,

I hope all is well with you. While running the ensemble post-processing on the solution I am building on top of GAMA (for newcomers, see #191), I ran into a small issue. At first I assumed it came from my own design, but the same error occurs when I fork the latest version of the GAMA main branch.

Description:

When running the basic main.py shown below, an AttributeError is raised when the __str__ method is called to print the ensemble model, after all other processing has completed. The error occurs in the Ensemble class and persists across different run times.

Steps to Reproduce:

  1. Fork the GAMA project.
  2. Execute the following main.py:

Note: I believe some of the GamaClassifier parameters are irrelevant to the error, but this is exactly how I obtained it, so I have copied the script as-is.

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import log_loss, accuracy_score
from gama import GamaClassifier
from gama.search_methods import RandomSearch, AsynchronousSuccessiveHalving, AsyncEA
from gama.postprocessing.ensemble import Ensemble, EnsemblePostProcessing

if __name__ == '__main__':
    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

    automl = GamaClassifier(
        max_total_time=300,
        store="all",
        search=RandomSearch(),
        max_memory_mb=10000,
        n_jobs=3,
        post_processing=EnsemblePostProcessing(),
        verbosity=50,
    )
    print("Starting `fit` which will take roughly 5 minutes.")
    automl.fit(X_train, y_train)

    print("AutoML Model Champion:\n", automl.model) # Here it fails!

    label_predictions = automl.predict(X_test)
    probability_predictions = automl.predict_proba(X_test)

    print('accuracy:', accuracy_score(y_test, label_predictions))
    print('log loss:', log_loss(y_test, probability_predictions))
    print('log_loss', automl.score(X_test, y_test))

The above script fails to print the AutoML Model Champion. The error trace is as follows:

Note: Line numbers may vary slightly, because I inserted some print statements in order to debug this issue.

Traceback (most recent call last):
  File "/tmp/test_gama_main_branch/main.py", line 24, in <module>
    print("AutoML Model Champion:\n", automl.model)
  File "/tmp/test_gama_main_branch/venv/lib/python3.10/site-packages/gama/postprocessing/ensemble.py", line 380, in __str__
    models = sorted(self._models.values(), key=lambda x: x[0].validation_score)
  File "/tmp/test_gama_main_branch/venv/lib/python3.10/site-packages/gama/postprocessing/ensemble.py", line 142, in get_validation_score
    print(f"Type of x[0].validation_score: {type(x[0].validation_score)}")
AttributeError: 'Evaluation' object has no attribute 'validation_score'

Expected Behavior:

The 'AutoML Model Champion' should be printed without errors.

Additional Context:

I suspect the following may be the cause, although I am not sure whether the current behaviour is deliberate. The code looks up a validation_score attribute on x[0], which according to the traceback is an Evaluation object without such an attribute. As far as I can tell, the Evaluation object gives access to two kinds of score instead. Was the intent to use the individual's fitness? Or the Evaluation's score attribute, which is a tuple of floats and therefore may not be suitable?
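To make the suspicion concrete, here is a minimal sketch that reproduces the same kind of failure without GAMA installed. StandInEvaluation is a hypothetical stand-in, not GAMA's real Evaluation class; the only things taken from the report are that the object has no validation_score attribute and that its score is a tuple of floats. The dictionary layout (a tuple whose first element is the evaluation) is inferred from the x[0] in the traceback, and the second element is an assumed placeholder.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class StandInEvaluation:
    """Hypothetical stand-in: has `score` (tuple of floats), no `validation_score`."""

    name: str
    score: Tuple[float, ...]


# Assumed shape: key -> (evaluation, placeholder integer).
ModelMap = Dict[str, Tuple[StandInEvaluation, int]]

models: ModelMap = {
    "a": (StandInEvaluation("pipeline_a", (-0.30, -3.0)), 2),
    "b": (StandInEvaluation("pipeline_b", (-0.10, -5.0)), 1),
    "c": (StandInEvaluation("pipeline_c", (-0.20, -2.0)), 3),
}


def sort_by_validation_score(entries: ModelMap) -> List[Tuple[StandInEvaluation, int]]:
    # Same key function as in the traceback; raises AttributeError here.
    return sorted(entries.values(), key=lambda x: x[0].validation_score)


def sort_by_score(entries: ModelMap) -> List[Tuple[StandInEvaluation, int]]:
    # One possible alternative: tuples of floats compare element by element.
    return sorted(entries.values(), key=lambda x: x[0].score)


try:
    sort_by_validation_score(models)
    failed = False
except AttributeError as error:
    failed = True
    print("Reproduced:", error)

ordered_names = [evaluation.name for evaluation, _ in sort_by_score(models)]
print(ordered_names)
```

Sorting on the tuple works mechanically, but whether element-wise tuple ordering is the ordering the Ensemble class is meant to display is exactly the open question above.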

Even though the ensembling procedure appears to function perfectly, this issue prevents us from printing the champion model. I would appreciate any insights you may have regarding this issue.
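Until the attribute question is settled, a script can keep running past the print by guarding the string conversion. The sketch below is only a workaround idea, not a fix: describe is a hypothetical helper, and StandInEnsemble is a stand-in whose __str__ raises the same error as in the traceback, so the example runs without GAMA.

```python
def describe(model: object) -> str:
    """Return str(model), or a short fallback if __str__ raises AttributeError."""
    try:
        return str(model)
    except AttributeError as error:
        return f"{type(model).__name__} (could not be printed: {error})"


class StandInEnsemble:
    """Hypothetical stand-in whose __str__ fails like the one in this report."""

    def __str__(self) -> str:
        raise AttributeError(
            "'Evaluation' object has no attribute 'validation_score'"
        )


summary = describe(StandInEnsemble())
print("AutoML Model Champion:\n", summary)
```

In main.py this would mean printing describe(automl.model) instead of automl.model, so that the predict and score calls afterwards still execute.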

In the meantime, I am sure you are very busy with your lab, so I would be happy to provide a quick PR. I only need confirmation of which score attribute of the Evaluation object you intended to use.

System

Specifications:

  • Device: MacBook Pro 2022 M2.
  • Shells used: fish and zsh.
  • Operating system: macOS Ventura, stable version.

Appreciate your time!
Best wishes,
