Model Size is big in auto sklearn #1359

@shabir1

Models saved by auto-sklearn are much larger than comparable scikit-learn models. Examples below:

1. With ensemble_size=30
AutoSklearnRegressor(
    ensemble_nbest=32,
    ensemble_size=30,
    include={'data_preprocessor': ['NoPreprocessing'],
             'feature_preprocessor': ['no_preprocessing']},
    max_models_on_disc=32,
    per_run_time_limit=100,
    time_left_for_this_task=350)

Model size: 789MB

2. With ensemble_size=10
AutoSklearnRegressor(
    ensemble_nbest=12,
    ensemble_size=10,
    include={'data_preprocessor': ['NoPreprocessing'],
             'feature_preprocessor': ['no_preprocessing']},
    max_models_on_disc=12,
    per_run_time_limit=100,
    time_left_for_this_task=350)

Model size: 786MB

3. With ensemble_size=1
AutoSklearnRegressor(
    ensemble_nbest=3,
    ensemble_size=1,
    include={'data_preprocessor': ['NoPreprocessing'],
             'feature_preprocessor': ['no_preprocessing']},
    max_models_on_disc=3,
    per_run_time_limit=100,
    time_left_for_this_task=350)
Selected Model: Random Forest
Model size: 777MB

4. Run Sklearn Model
ExtraTreesRegressor(n_estimators=30, random_state=0)
Model size: 58MB

5. Run Sklearn Model (the same model that AutoSklearnRegressor selected with ensemble_size=1, run with the same parameters, yet the sizes differ hugely: 122MB vs. 777MB)
RandomForestRegressor(bootstrap=True, criterion='mse')
Model size: 122MB
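
The issue does not say how the sizes above were measured. For a like-for-like comparison, one option is to serialize every object the same way and count the bytes. The sketch below assumes `pickle` is the serialization format; `pickled_size_mb` is a hypothetical helper name, and the lists are stand-ins for fitted models.

```python
import pickle


def pickled_size_mb(obj) -> float:
    """Return the size of obj's pickle in megabytes (1 MB = 1e6 bytes)."""
    payload = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    return len(payload) / 1e6


# Stand-in objects; in practice these would be the fitted
# AutoSklearnRegressor and the fitted scikit-learn estimator.
small = list(range(10))
large = list(range(100_000))

print(f"small: {pickled_size_mb(small):.6f} MB")
print(f"large: {pickled_size_mb(large):.6f} MB")
```

Measuring in memory this way avoids differences that come from file formats or compression settings when the models are written to disk.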

I ran auto-sklearn without feature or data preprocessing, but the model size is still very large.
If the size were due to the ensemble, it should change with ensemble_size, yet with values of 30, 10, and 1 the model size is almost the same. Why?
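
One way to narrow this down is to check which attributes of the saved object account for most of the bytes. The following is a minimal sketch, not an auto-sklearn API: `attribute_sizes` and the `Demo` class are hypothetical, and it assumes the object exposes its state through `__dict__` and that its attributes can be pickled individually.

```python
import pickle


def attribute_sizes(obj):
    """Return [(attribute_name, pickled_bytes)] sorted largest first."""
    sizes = []
    for name, value in vars(obj).items():
        try:
            sizes.append((name, len(pickle.dumps(value))))
        except Exception:
            # Some attributes (locks, open files, ...) cannot be pickled; skip them.
            continue
    return sorted(sizes, key=lambda item: item[1], reverse=True)


class Demo:
    """Hypothetical stand-in for a fitted estimator object."""

    def __init__(self):
        self.small = [0] * 10
        self.large = list(range(100_000))


for name, size in attribute_sizes(Demo()):
    print(f"{name}: {size / 1e6:.3f} MB")
```

The per-attribute numbers are only a rough guide: attributes that share references are counted once each here, so the values need not add up to the size of the whole pickled object.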
