
Constraint error when tuning with AutoNHITS #957

Closed
zzzrbx opened this issue Apr 7, 2024 · 14 comments

@zzzrbx

zzzrbx commented Apr 7, 2024

What happened + What you expected to happen

When tuning AutoNHITS with Optuna backend (without using Ray Tune) I occasionally get this error:

neuralforecast/lib64/python3.8/site-packages/torch/distributions/distribution.py", line 68, in __init__
    raise ValueError(
ValueError: Expected parameter df (Tensor of shape (1024, 31)) of distribution Chi2() to satisfy the constraint GreaterThan(lower_bound=0.0), but found invalid values:
tensor([[nan, nan, nan,  ..., nan, nan, nan],
        [nan, nan, nan,  ..., nan, nan, nan],
        [nan, nan, nan,  ..., nan, nan, nan],
        ...,
        [nan, nan, nan,  ..., nan, nan, nan],
        [nan, nan, nan,  ..., nan, nan, nan],
        [nan, nan, nan,  ..., nan, nan, nan]], grad_fn=<MulBackward0>)
[W 2024-04-07 20:31:41,964] Trial 5 failed with value None.

I'm using the latest version, without Ray Tune.

Versions / Dependencies

Neuralforecast 1.7.0
Python 3

Reproduction script

It'd be difficult to share a script because the error doesn't always happen.

Issue Severity

High: It blocks me from completing my task.

@zzzrbx zzzrbx added the bug label Apr 7, 2024
@jmoralez
Member

jmoralez commented Apr 8, 2024

Hey @zzzrbx, thanks for using neuralforecast. This isn't a shape error, the check that is failing is that it expects the values to be non-negative and they're all NaNs. Are you using scalers?
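For intuition on why a scaler can produce all-NaN distribution parameters, here is a minimal sketch (an illustration, not neuralforecast's actual scaler code): a robust-type scaler divides by an inter-quartile range, and a window of constant values makes that denominator zero, so the whole scaled window becomes NaN before it ever reaches the distribution head.

```python
import numpy as np

def robust_scale(x):
    """Sketch of robust scaling: subtract the median, divide by the IQR."""
    median = np.median(x)
    iqr = np.percentile(x, 75) - np.percentile(x, 25)
    return (x - median) / iqr  # iqr == 0 -> 0/0 -> NaN for every element

# A constant (or near-constant) window has zero IQR, so scaling
# produces NaNs that then propagate into the loss parameters.
constant_window = np.zeros(24)
print(np.isnan(robust_scale(constant_window)).all())  # True
```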

@jmoralez jmoralez changed the title Shape error when tuning with AutoNHITS Constraint error when tuning with AutoNHITS Apr 8, 2024
@zzzrbx
Author

zzzrbx commented Apr 8, 2024 via email

@jmoralez
Member

jmoralez commented Apr 8, 2024

Is it on the model (batches) or in the NeuralForecast constructor? If you can at least provide the code you're running that'd help a lot.

@zzzrbx
Author

zzzrbx commented Apr 8, 2024

I'm using the code below, basically taken from the documentation. The error comes up more often as I increase the number of trials in optuna (I'm using CPU with no ray tune). Should the time series have a minimum length?

def config_nhits(trial):
    return {
        'futr_exog_list': futr_exog_list,
        'hist_exog_list': hist_exog_list,
        'max_steps': trial.suggest_int("max_steps", 100, 300),
        'input_size': h,
        'activation': 'ReLU',
        'scaler_type': 'robust',
        'pooling_mode': 'AvgPool1d',
        # suggest_loguniform is deprecated; suggest_float(..., log=True) is equivalent
        'learning_rate': trial.suggest_float("learning_rate", 1e-5, 1e-1, log=True),
        'n_pool_kernel_size': trial.suggest_categorical("n_pool_kernel_size", [[2, 2, 2], [16, 8, 1]]),
        'n_freq_downsample': trial.suggest_categorical("n_freq_downsample", [[168, 24, 1], [24, 12, 1], [1, 1, 1]]),
        # suggest_int takes (name, low, high, step); to choose among
        # 8/16/32 use suggest_categorical instead
        'batch_size': trial.suggest_categorical("batch_size", [8, 16, 32]),
        'inference_windows_batch_size': 1,
        'random_seed': trial.suggest_int("random_seed", 1, 10),
        'val_check_steps': 10,
    }

models = [
    AutoNHITS(
        h=h,
        # loss=DistributionLoss(distribution='StudentT', level=[80, 90], return_params=True),
        config=config_nhits,
        search_alg=optuna.samplers.TPESampler(),
        backend='optuna',
        num_samples=50,
        cpus=20,
    )
]
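Since the error is intermittent across trials, one way to narrow it down is to sanity-check the input panel before tuning. This is a hypothetical helper, not part of neuralforecast; `min_multiple` is an arbitrary rule of thumb, and the column names assume the standard long format (`unique_id`, `ds`, `y`):

```python
import pandas as pd

def check_panel(df, h, min_multiple=5):
    """Basic sanity checks on a long-format panel before tuning.

    Flags series that are likely to break training: NaN targets,
    very short histories relative to the horizon `h`, and constant
    series (which make a robust scaler divide by zero).
    """
    problems = []
    for uid, g in df.groupby("unique_id"):
        if g["y"].isna().any():
            problems.append(f"{uid}: contains NaN targets")
        if len(g) < min_multiple * h:
            problems.append(f"{uid}: only {len(g)} rows for horizon {h}")
        if g["y"].nunique() == 1:
            problems.append(f"{uid}: constant series (robust scaler may divide by zero)")
    return problems

df = pd.DataFrame({
    "unique_id": ["a"] * 10 + ["b"] * 10,
    "ds": list(range(10)) * 2,
    "y": list(range(10)) + [1.0] * 10,   # series "b" is constant
})
print(check_panel(df, h=1))
```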

@elephaint
Contributor

I see you commented out the DistributionLoss - what distribution are you using when you see the error? And does the error also occur when you change the distribution type?

@zzzrbx
Author

zzzrbx commented Apr 11, 2024 via email

@elephaint
Contributor

Can you test with a Normal distribution? I want to exclude the possibility of this being an issue related to the distributions.

Also, the initial error you showed above is (I think) when running the Student-t. What is the exact error you get with the Tweedie?
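
For reference, the suggested experiment only changes the loss; everything else stays as posted earlier in the thread. This sketch assumes the `h` and `config_nhits` defined above:

```python
# Sketch: swap the loss to a Normal distribution to see whether the
# NaN constraint error is specific to StudentT/Tweedie.
from neuralforecast.auto import AutoNHITS
from neuralforecast.losses.pytorch import DistributionLoss

model = AutoNHITS(
    h=h,  # `h` and `config_nhits` as defined earlier in this thread
    loss=DistributionLoss(distribution='Normal', level=[80, 90]),
    config=config_nhits,
    backend='optuna',
    num_samples=10,
)
```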

@elephaint
Contributor

Hey @zzzrbx just checking in - did you have any luck trying out with the Normal distribution?

@github-actions
Contributor

This issue has been automatically closed because it has been awaiting a response for too long. When you have time to work with the maintainers to resolve this issue, please post a new comment and it will be re-opened. If the issue has been locked for editing by the time you return to it, please open a new issue and reference this one.

@fariduca

fariduca commented Jun 5, 2024

I am facing the same problem when trying to use the Tweedie distribution, both when scaling on the model (batches) and when using local_scaler. I have tried the 'standard', 'robust', and 'boxcox' scaler types; all end with the same error.
The code I am running:

autonhits = AutoNHITS(
    h=HORIZON,
    loss=DistributionLoss('Tweedie', level=[75], rho=1.5),
    num_samples=25,
    config=NHITS_objective_optuna,
    backend='optuna',
)

nf = NeuralForecast(
    models=[autonhits],  # AutoNHITS
    freq='MS',
    local_scaler_type=LOCAL_SCALER_TYPE,
)

nf.cross_validation(df=ts_train_df, verbose=False, step_size=HORIZON,
                    refit=True, val_size=HORIZON)

Here is the error I get:

File c:\...\torch\distributions\distribution.py:68, in Distribution.__init__(self, batch_shape, event_shape, validate_args)
     66         valid = constraint.check(value)
     67         if not valid.all():
---> 68             raise ValueError(
     69                 f"Expected parameter {param} "
     70                 f"({type(value).__name__} of shape {tuple(value.shape)}) "
     71                 f"of distribution {repr(self)} "
     72                 f"to satisfy the constraint {repr(constraint)}, "
     73                 f"but found invalid values:\n{value}"
     74             )
     75 super().__init__()

ValueError: Expected parameter concentration (Tensor of shape (1000, 128, 3)) of distribution Gamma(concentration: torch.Size([1000, 128, 3]), rate: torch.Size([1000, 128, 3])) to satisfy the constraint GreaterThan(lower_bound=0.0), but found invalid values:
tensor([[[0., 0., 0.],
         [0., 0., 0.],
         ...,
         [0., 0., 0.]],

        ...,

        [[0., 0., 0.],
         [0., 0., 0.],
         ...,
         [0., 0., 0.]]], device='cuda:0')

@elephaint
Contributor

@fariduca I think some of our bounds are too tight for the distributions; this seems to be the same issue you are experiencing. I am working on a fix.
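
The general shape of that kind of fix can be sketched as follows. This is a hypothetical illustration, not the actual neuralforecast patch: parameters that must be strictly positive (like the Gamma `concentration` in the traceback above) are kept bounded away from zero before the torch distribution is constructed, so constraint checks like `GreaterThan(lower_bound=0.0)` pass.

```python
import numpy as np

def clamp_positive(params, eps=1e-6):
    """Hypothetical helper: force strictly-positive distribution
    parameters away from zero so that torch's GreaterThan(0.0)
    constraint check cannot fail on exact zeros."""
    return np.clip(params, eps, None)

concentration = np.array([0.0, 0.0, 2.0])  # zeros violate GreaterThan(0.0)
safe = clamp_positive(concentration)
print((safe > 0).all())  # True
```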

@elephaint elephaint reopened this Jun 5, 2024

github-actions bot commented Jun 6, 2024

This issue has been automatically closed because it has been awaiting a response for too long. When you have time to work with the maintainers to resolve this issue, please post a new comment and it will be re-opened. If the issue has been locked for editing by the time you return to it, please open a new issue and reference this one.

@github-actions github-actions bot closed this as completed Jun 6, 2024
@ll550

ll550 commented Jun 17, 2024

Waiting for a solution. This kind of error is totally beyond my scope. :)

@yurirocha15

yurirocha15 commented Aug 19, 2024

I am facing the same error when using either the Poisson or the Tweedie distribution. Also, I am not using scalers and my input data is nonnegative.
