
Constraint error when tuning with AutoNHITS #957

Closed
zzzrbx opened this issue Apr 7, 2024 · 14 comments

@zzzrbx

zzzrbx commented Apr 7, 2024

What happened + What you expected to happen

When tuning AutoNHITS with Optuna backend (without using Ray Tune) I occasionally get this error:

neuralforecast/lib64/python3.8/site-packages/torch/distributions/distribution.py", line 68, in __init__
    raise ValueError(
ValueError: Expected parameter df (Tensor of shape (1024, 31)) of distribution Chi2() to satisfy the constraint GreaterThan(lower_bound=0.0), but found invalid values:
tensor([[nan, nan, nan,  ..., nan, nan, nan],
        [nan, nan, nan,  ..., nan, nan, nan],
        [nan, nan, nan,  ..., nan, nan, nan],
        ...,
        [nan, nan, nan,  ..., nan, nan, nan],
        [nan, nan, nan,  ..., nan, nan, nan],
        [nan, nan, nan,  ..., nan, nan, nan]], grad_fn=<MulBackward0>)
[W 2024-04-07 20:31:41,964] Trial 5 failed with value None.

I'm using the latest version, without Ray Tune.

Versions / Dependencies

Neuralforecast 1.7.0
Python 3

Reproduction script

It'd be difficult to share a script because the error doesn't always happen.

Issue Severity

High: It blocks me from completing my task.

@zzzrbx zzzrbx added the bug label Apr 7, 2024
@jmoralez
Member

jmoralez commented Apr 8, 2024

Hey @zzzrbx, thanks for using neuralforecast. This isn't a shape error, the check that is failing is that it expects the values to be non-negative and they're all NaNs. Are you using scalers?
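For intuition on why a scaler can produce all-NaN distribution parameters, here is a minimal sketch (an illustration, not neuralforecast's actual scaler code): a robust-type scaler divides by an inter-quartile range, and a window of constant values makes that denominator zero, so the whole scaled window becomes NaN before it ever reaches the distribution head.

```python
import numpy as np

def robust_scale(x):
    """Sketch of robust scaling: subtract the median, divide by the IQR."""
    median = np.median(x)
    iqr = np.percentile(x, 75) - np.percentile(x, 25)
    return (x - median) / iqr  # iqr == 0 -> 0/0 -> NaN for every element

# A constant (or near-constant) window has zero IQR, so scaling
# produces NaNs that then propagate into the loss parameters.
constant_window = np.zeros(24)
print(np.isnan(robust_scale(constant_window)).all())  # True
```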

@jmoralez jmoralez changed the title Shape error when tuning with AutoNHITS Constraint error when tuning with AutoNHITS Apr 8, 2024
@zzzrbx
Author

zzzrbx commented Apr 8, 2024 via email

@jmoralez
Member

jmoralez commented Apr 8, 2024

Is it on the model (batches) or in the NeuralForecast constructor? If you can at least provide the code you're running that'd help a lot.

@zzzrbx
Author

zzzrbx commented Apr 8, 2024

I'm using the code below, basically taken from the documentation. The error comes up more often as I increase the number of trials in optuna (I'm using CPU with no ray tune). Should the time series have a minimum length?

def config_nhits(trial):
    return {
        'futr_exog_list': futr_exog_list,
        'hist_exog_list': hist_exog_list,
        'max_steps': trial.suggest_int("max_steps", 100, 300),
        'input_size': h,
        'activation': 'ReLU',
        'scaler_type': 'robust',
        'pooling_mode': 'AvgPool1d',
        # suggest_loguniform is deprecated; suggest_float(..., log=True) is equivalent
        'learning_rate': trial.suggest_float("learning_rate", 1e-5, 1e-1, log=True),
        'n_pool_kernel_size': trial.suggest_categorical("n_pool_kernel_size", [[2, 2, 2], [16, 8, 1]]),
        'n_freq_downsample': trial.suggest_categorical("n_freq_downsample", [[168, 24, 1], [24, 12, 1], [1, 1, 1]]),
        # suggest_int takes (name, low, high, step); to choose among
        # 8/16/32 use suggest_categorical instead
        'batch_size': trial.suggest_categorical("batch_size", [8, 16, 32]),
        'inference_windows_batch_size': 1,
        'random_seed': trial.suggest_int("random_seed", 1, 10),
        'val_check_steps': 10,
    }

models = [
    AutoNHITS(
        h=h,
        # loss=DistributionLoss(distribution='StudentT', level=[80, 90], return_params=True),
        config=config_nhits,
        search_alg=optuna.samplers.TPESampler(),
        backend='optuna',
        num_samples=50,
        cpus=20,
    )
]
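Since the error is intermittent across trials, one way to narrow it down is to sanity-check the input panel before tuning. This is a hypothetical helper, not part of neuralforecast; `min_multiple` is an arbitrary rule of thumb, and the column names assume the standard long format (`unique_id`, `ds`, `y`):

```python
import pandas as pd

def check_panel(df, h, min_multiple=5):
    """Basic sanity checks on a long-format panel before tuning.

    Flags series that are likely to break training: NaN targets,
    very short histories relative to the horizon `h`, and constant
    series (which make a robust scaler divide by zero).
    """
    problems = []
    for uid, g in df.groupby("unique_id"):
        if g["y"].isna().any():
            problems.append(f"{uid}: contains NaN targets")
        if len(g) < min_multiple * h:
            problems.append(f"{uid}: only {len(g)} rows for horizon {h}")
        if g["y"].nunique() == 1:
            problems.append(f"{uid}: constant series (robust scaler may divide by zero)")
    return problems

df = pd.DataFrame({
    "unique_id": ["a"] * 10 + ["b"] * 10,
    "ds": list(range(10)) * 2,
    "y": list(range(10)) + [1.0] * 10,   # series "b" is constant
})
print(check_panel(df, h=1))
```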

@elephaint
Contributor

I see you commented out the DistributionLoss - what distribution are you using when you see the error? And does the error also occur when you change the distribution type?

@zzzrbx
Author

zzzrbx commented Apr 11, 2024 via email

@elephaint
Contributor

Can you test with a Normal distribution? I want to exclude the possibility of this being an issue related to the distributions.

Also, the initial error you showed above is (I think) when running the Student-t. What is the exact error you get with the Tweedie?
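
For reference, the suggested experiment only changes the loss; everything else stays as posted earlier in the thread. This sketch assumes the `h` and `config_nhits` defined above:

```python
# Sketch: swap the loss to a Normal distribution to see whether the
# NaN constraint error is specific to StudentT/Tweedie.
from neuralforecast.auto import AutoNHITS
from neuralforecast.losses.pytorch import DistributionLoss

model = AutoNHITS(
    h=h,  # `h` and `config_nhits` as defined earlier in this thread
    loss=DistributionLoss(distribution='Normal', level=[80, 90]),
    config=config_nhits,
    backend='optuna',
    num_samples=10,
)
```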

@elephaint
Contributor

Hey @zzzrbx just checking in - did you have any luck trying out with the Normal distribution?

@github-actions
Contributor

This issue has been automatically closed because it has been awaiting a response for too long. When you have time to work with the maintainers to resolve this issue, please post a new comment and it will be re-opened. If the issue has been locked for editing by the time you return to it, please open a new issue and reference this one.

@fariduca

fariduca commented Jun 5, 2024

I am facing the same problem when trying to use the Tweedie distribution, both when scaling on the model (batches) and when using local_scaler. I have tried the 'standard', 'robust', and 'boxcox' scaler types; all end with the same error.
The code I am running:

autonhits = AutoNHITS(
    h=HORIZON,
    loss=DistributionLoss('Tweedie', level=[75], rho=1.5),
    num_samples=25,
    config=NHITS_objective_optuna,
    backend='optuna',
)

nf = NeuralForecast(
    models=[autonhits],  # AutoNHITS
    freq='MS',
    local_scaler_type=LOCAL_SCALER_TYPE,
)

nf.cross_validation(df=ts_train_df, verbose=False, step_size=HORIZON,
                    refit=True, val_size=HORIZON)

Here is the error I get:

File c:\...\torch\distributions\distribution.py:68, in Distribution.__init__(self, batch_shape, event_shape, validate_args)
     66         valid = constraint.check(value)
     67         if not valid.all():
---> 68             raise ValueError(
     69                 f"Expected parameter {param} "
     70                 f"({type(value).__name__} of shape {tuple(value.shape)}) "
     71                 f"of distribution {repr(self)} "
     72                 f"to satisfy the constraint {repr(constraint)}, "
     73                 f"but found invalid values:\n{value}"
     74             )
     75 super().__init__()

ValueError: Expected parameter concentration (Tensor of shape (1000, 128, 3)) of distribution Gamma(concentration: torch.Size([1000, 128, 3]), rate: torch.Size([1000, 128, 3])) to satisfy the constraint GreaterThan(lower_bound=0.0), but found invalid values:
tensor([[[0., 0., 0.],
         [0., 0., 0.],
         ...,
         [0., 0., 0.]],

        ...,

        [[0., 0., 0.],
         [0., 0., 0.],
         ...,
         [0., 0., 0.]]], device='cuda:0')

@elephaint
Contributor

@fariduca I think some of our bounds are too tight for the distributions; this seems to be the same issue you are experiencing. I am working on a fix.
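
The general shape of that kind of fix can be sketched as follows. This is a hypothetical illustration, not the actual neuralforecast patch: parameters that must be strictly positive (like the Gamma `concentration` in the traceback above) are kept bounded away from zero before the torch distribution is constructed, so constraint checks like `GreaterThan(lower_bound=0.0)` pass.

```python
import numpy as np

def clamp_positive(params, eps=1e-6):
    """Hypothetical helper: force strictly-positive distribution
    parameters away from zero so that torch's GreaterThan(0.0)
    constraint check cannot fail on exact zeros."""
    return np.clip(params, eps, None)

concentration = np.array([0.0, 0.0, 2.0])  # zeros violate GreaterThan(0.0)
safe = clamp_positive(concentration)
print((safe > 0).all())  # True
```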

@elephaint elephaint reopened this Jun 5, 2024

github-actions bot commented Jun 6, 2024

This issue has been automatically closed because it has been awaiting a response for too long. When you have time to work with the maintainers to resolve this issue, please post a new comment and it will be re-opened. If the issue has been locked for editing by the time you return to it, please open a new issue and reference this one.

@github-actions github-actions bot closed this as completed Jun 6, 2024
@ll550

ll550 commented Jun 17, 2024

Waiting for a solution. This kind of error is totally beyond my scope. :)

@yurirocha15

yurirocha15 commented Aug 19, 2024

I am facing the same error when using either the Poisson or the Tweedie distribution. Also, I am not using scalers and my input data is nonnegative.
