TabPFNRegressor unstable for scipy<1.11.0 #175

Open
LeoGrin opened this issue Feb 7, 2025 · 3 comments
Labels
bug Something isn't working

Comments

@LeoGrin
Collaborator

LeoGrin commented Feb 7, 2025

Describe the bug

Fitting TabPFNRegressor with scipy versions older than 1.11.0 often fails during preprocessing with a ValueError about non-finite input. It appears to work reliably from scipy 1.11.0 onward.

Steps/Code to Reproduce

import numpy as np
import torch
import sklearn.datasets
from tabpfn import TabPFNRegressor

# Fetch the first 100 samples of the California housing dataset
X, y = sklearn.datasets.fetch_california_housing(return_X_y=True)
X, y = X[:100], y[:100]

regressor = TabPFNRegressor(n_estimators=1, device="cpu", random_state=42)
regressor.fit(X, y)

Expected Results

No error is thrown

Actual Results

/Users/leo/mambaforge/envs/tabpfn_package_3.9/lib/python3.9/site-packages/sklearn/preprocessing/_data.py:3256: RuntimeWarning: overflow encountered in power
  out[~pos] = -(np.power(-x[~pos] + 1, 2 - lmbda) - 1) / (2 - lmbda)
Traceback (most recent call last):
  File "/Users/leo/VSCProjects/new/TabPFN/test_error.py", line 11, in <module>
    regressor.fit(X, y)
  File "/Users/leo/VSCProjects/new/TabPFN/src/tabpfn/regressor.py", line 503, in fit
    self.executor_ = create_inference_engine(
  File "/Users/leo/VSCProjects/new/TabPFN/src/tabpfn/base.py", line 213, in create_inference_engine
    engine = InferenceEngineCachePreprocessing.prepare(
  File "/Users/leo/VSCProjects/new/TabPFN/src/tabpfn/inference.py", line 265, in prepare
    configs, preprocessors, X_trains, y_trains, cat_ixs = list(zip(*itr))
  File "/Users/leo/VSCProjects/new/TabPFN/src/tabpfn/preprocessing.py", line 664, in fit_preprocessing
    yield from executor(  # type: ignore
  File "/Users/leo/mambaforge/envs/tabpfn_package_3.9/lib/python3.9/site-packages/joblib/parallel.py", line 1918, in __call__
    return output if self.return_generator else list(output)
  File "/Users/leo/mambaforge/envs/tabpfn_package_3.9/lib/python3.9/site-packages/joblib/parallel.py", line 1847, in _get_sequential_output
    res = func(*args, **kwargs)
  File "/Users/leo/VSCProjects/new/TabPFN/src/tabpfn/preprocessing.py", line 571, in fit_preprocessing_one
    res = preprocessor.fit_transform(X_train, cat_ix)
  File "/Users/leo/VSCProjects/new/TabPFN/src/tabpfn/model/preprocessing.py", line 398, in fit_transform
    X, categorical_features = step.fit_transform(X, categorical_features)
  File "/Users/leo/VSCProjects/new/TabPFN/src/tabpfn/model/preprocessing.py", line 987, in fit_transform
    Xt = transformer.fit_transform(X[:, self.subsampled_features_])
  File "/Users/leo/mambaforge/envs/tabpfn_package_3.9/lib/python3.9/site-packages/sklearn/utils/_set_output.py", line 142, in wrapped
    data_to_wrap = f(self, X, *args, **kwargs)
  File "/Users/leo/mambaforge/envs/tabpfn_package_3.9/lib/python3.9/site-packages/sklearn/compose/_column_transformer.py", line 726, in fit_transform
    result = self._fit_transform(X, y, _fit_transform_one)
  File "/Users/leo/mambaforge/envs/tabpfn_package_3.9/lib/python3.9/site-packages/sklearn/compose/_column_transformer.py", line 657, in _fit_transform
    return Parallel(n_jobs=self.n_jobs)(
  File "/Users/leo/mambaforge/envs/tabpfn_package_3.9/lib/python3.9/site-packages/joblib/parallel.py", line 1918, in __call__
    return output if self.return_generator else list(output)
  File "/Users/leo/mambaforge/envs/tabpfn_package_3.9/lib/python3.9/site-packages/joblib/parallel.py", line 1847, in _get_sequential_output
    res = func(*args, **kwargs)
  File "/Users/leo/mambaforge/envs/tabpfn_package_3.9/lib/python3.9/site-packages/sklearn/utils/fixes.py", line 117, in __call__
    return self.function(*args, **kwargs)
  File "/Users/leo/mambaforge/envs/tabpfn_package_3.9/lib/python3.9/site-packages/sklearn/pipeline.py", line 894, in _fit_transform_one
    res = transformer.fit_transform(X, y, **fit_params)
  File "/Users/leo/mambaforge/envs/tabpfn_package_3.9/lib/python3.9/site-packages/sklearn/pipeline.py", line 438, in fit_transform
    Xt = self._fit(X, y, **fit_params_steps)
  File "/Users/leo/mambaforge/envs/tabpfn_package_3.9/lib/python3.9/site-packages/sklearn/pipeline.py", line 360, in _fit
    X, fitted_transformer = fit_transform_one_cached(
  File "/Users/leo/mambaforge/envs/tabpfn_package_3.9/lib/python3.9/site-packages/joblib/memory.py", line 312, in __call__
    return self.func(*args, **kwargs)
  File "/Users/leo/mambaforge/envs/tabpfn_package_3.9/lib/python3.9/site-packages/sklearn/pipeline.py", line 894, in _fit_transform_one
    res = transformer.fit_transform(X, y, **fit_params)
  File "/Users/leo/mambaforge/envs/tabpfn_package_3.9/lib/python3.9/site-packages/sklearn/utils/_set_output.py", line 142, in wrapped
    data_to_wrap = f(self, X, *args, **kwargs)
  File "/Users/leo/mambaforge/envs/tabpfn_package_3.9/lib/python3.9/site-packages/sklearn/utils/_set_output.py", line 142, in wrapped
    data_to_wrap = f(self, X, *args, **kwargs)
  File "/Users/leo/mambaforge/envs/tabpfn_package_3.9/lib/python3.9/site-packages/sklearn/preprocessing/_data.py", line 3099, in fit_transform
    return self._fit(X, y, force_transform=True)
  File "/Users/leo/mambaforge/envs/tabpfn_package_3.9/lib/python3.9/site-packages/sklearn/preprocessing/_data.py", line 3126, in _fit
    X = self._scaler.fit_transform(X)
  File "/Users/leo/mambaforge/envs/tabpfn_package_3.9/lib/python3.9/site-packages/sklearn/utils/_set_output.py", line 142, in wrapped
    data_to_wrap = f(self, X, *args, **kwargs)
  File "/Users/leo/mambaforge/envs/tabpfn_package_3.9/lib/python3.9/site-packages/sklearn/utils/_set_output.py", line 142, in wrapped
    data_to_wrap = f(self, X, *args, **kwargs)
  File "/Users/leo/mambaforge/envs/tabpfn_package_3.9/lib/python3.9/site-packages/sklearn/base.py", line 848, in fit_transform
    return self.fit(X, **fit_params).transform(X)
  File "/Users/leo/mambaforge/envs/tabpfn_package_3.9/lib/python3.9/site-packages/sklearn/preprocessing/_data.py", line 824, in fit
    return self.partial_fit(X, y, sample_weight)
  File "/Users/leo/mambaforge/envs/tabpfn_package_3.9/lib/python3.9/site-packages/sklearn/preprocessing/_data.py", line 861, in partial_fit
    X = self._validate_data(
  File "/Users/leo/mambaforge/envs/tabpfn_package_3.9/lib/python3.9/site-packages/sklearn/base.py", line 535, in _validate_data
    X = check_array(X, input_name="X", **check_params)
  File "/Users/leo/mambaforge/envs/tabpfn_package_3.9/lib/python3.9/site-packages/sklearn/utils/validation.py", line 919, in check_array
    _assert_all_finite(
  File "/Users/leo/mambaforge/envs/tabpfn_package_3.9/lib/python3.9/site-packages/sklearn/utils/validation.py", line 161, in _assert_all_finite
    raise ValueError(msg_err)
ValueError: Input X contains infinity or a value too large for dtype('float64').
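
The RuntimeWarning at the top of the output points at the negative-input branch of the Yeo-Johnson transform in scikit-learn. As a minimal sketch of how that expression can produce the infinity that check_array later rejects, the snippet below evaluates the same formula with an extreme lambda. The input value and the lambda are hypothetical, chosen only to trigger the overflow; the lambda actually fitted in the failing run is not reported here, and the helper name is an assumption for illustration.

```python
import numpy as np


def yeo_johnson_negative_branch(x, lmbda):
    """Same expression as the line flagged by the RuntimeWarning (valid for lmbda != 2)."""
    return -(np.power(-x + 1, 2 - lmbda) - 1) / (2 - lmbda)


# Hypothetical values: not taken from the failing run.
x = np.array([-1000.0])
extreme_lmbda = -200.0
moderate_lmbda = 0.5

with np.errstate(over="ignore"):  # silence the overflow warning for this demo
    overflowed = yeo_johnson_negative_branch(x, extreme_lmbda)

well_behaved = yeo_johnson_negative_branch(x, moderate_lmbda)

# The extreme lambda yields a non-finite value; the moderate one stays finite.
print(np.isfinite(overflowed).all(), np.isfinite(well_behaved).all())
```

If the lambda estimate is unstable under older scipy versions, a value like this would then flow into the StandardScaler step and fail validation, which would match the traceback above. That link is a plausible reading, not something this output confirms.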

Versions

PyTorch version: 2.1.0
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: macOS 14.7.3 (arm64)
GCC version: Could not collect
Clang version: 16.0.0 (clang-1600.0.26.6)
CMake version: version 3.27.7
Libc version: N/A

Python version: 3.9.21 | packaged by conda-forge | (main, Dec  5 2024, 13:47:18)  [Clang 18.1.8 ] (64-bit runtime)
Python platform: macOS-14.7.3-arm64-arm-64bit
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Apple M3

Dependency Versions:
--------------------
tabpfn: 2.0.3
torch: 2.1.0
numpy: 1.22.4
scipy: 1.10.0
pandas: 2.2.3
scikit-learn: 1.2.0
typing_extensions: 4.12.2
einops: 0.8.0
huggingface-hub: 0.27.1
LeoGrin added the bug label on Feb 7, 2025
@noahho
Collaborator

noahho commented Feb 7, 2025

Should we update the dependency for now and require scipy 1.11 or newer?
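
Besides pinning the dependency, a runtime guard could surface the problem early for users who already have an older scipy installed. The following is only a sketch using the standard library: the function names and the MIN_SCIPY constant are hypothetical and not part of tabpfn, and the parsing ignores pre-release suffixes rather than ordering them.

```python
import re

MIN_SCIPY = (1, 11, 0)  # first version reported as reliable in this issue


def parse_release(version: str) -> tuple:
    """Return the leading numeric components, e.g. '1.10.0rc1' -> (1, 10, 0)."""
    match = re.match(r"(\d+(?:\.\d+)*)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def scipy_is_supported(version: str, minimum: tuple = MIN_SCIPY) -> bool:
    """Compare a version string against the minimum, ignoring pre-release tags."""
    release = parse_release(version)
    # Pad short versions so that '1.11' compares like (1, 11, 0).
    release += (0,) * (len(minimum) - len(release))
    return release >= minimum
```

A caller could pass scipy.__version__ to scipy_is_supported and emit a warning when it returns False. Where the packaging library is available, its Version class handles pre-releases properly and would be a better choice than the hand-rolled parser.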

@LeoGrin
Collaborator Author

LeoGrin commented Feb 7, 2025 via email

@mert-kurttutan
Contributor

For documentation purposes, the related bug fix is scipy/scipy#17704.
