Closed
Description
Model Size is big in auto sklearn
Models produced by auto-sklearn are much larger than comparable plain scikit-learn models. Examples below:
1. With ensemble_size=30
AutoSklearnRegressor(
    ensemble_nbest=32,
    ensemble_size=30,
    include={'data_preprocessor': ['NoPreprocessing'],
             'feature_preprocessor': ['no_preprocessing']},
    max_models_on_disc=32, per_run_time_limit=100,
    time_left_for_this_task=350)
Model size: 789MB
2. With ensemble_size=10
AutoSklearnRegressor(
    ensemble_nbest=12,
    ensemble_size=10,
    include={'data_preprocessor': ['NoPreprocessing'],
             'feature_preprocessor': ['no_preprocessing']},
    max_models_on_disc=12, per_run_time_limit=100,
    time_left_for_this_task=350)
Model size: 786MB
3. With ensemble_size=1
AutoSklearnRegressor(
    ensemble_nbest=3,
    ensemble_size=1,
    include={'data_preprocessor': ['NoPreprocessing'],
             'feature_preprocessor': ['no_preprocessing']},
    max_models_on_disc=3, per_run_time_limit=100,
    time_left_for_this_task=350)
Selected Model: Random Forest
Model size: 777MB
4. Run Sklearn Model
ExtraTreesRegressor(n_estimators=30, random_state=0)
Model size: 58MB
5. Run Sklearn Model (the same model that AutoSklearnRegressor selected with ensemble_size=1, run with the same parameters; the model sizes still differ hugely: 122MB vs. 777MB)
RandomForestRegressor(bootstrap=True, criterion='mse')
Model size: 122MB
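One way to measure sizes like the ones listed above is to serialize the fitted object with pickle and count the bytes. This is only a minimal sketch, assuming pickle serialization, and not necessarily how the numbers above were obtained. `pickled_size_mb` and the stand-in `payload` are hypothetical names used for illustration; they are not part of auto-sklearn or scikit-learn.

```python
import pickle


def pickled_size_mb(obj) -> float:
    """Return the size of the pickled object in megabytes (1 MB = 10**6 bytes)."""
    return len(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)) / 1e6


# Stand-in object for illustration only. In practice, pass the fitted
# estimator instead (the AutoSklearnRegressor or the RandomForestRegressor).
payload = {"weights": bytes(5_000_000)}
size_mb = pickled_size_mb(payload)
print(f"Model size: {size_mb:.2f}MB")
```

Measuring both the auto-sklearn object and the plain scikit-learn model with the same function keeps the comparison consistent.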
I ran auto-sklearn without feature or data preprocessing, but the model size is still very large.
If the size were due to the ensemble, it should shrink with a smaller ensemble, yet with ensemble_size set to 30, 10, and 1 the model size is almost the same. Why?