freq is None for Weather samples, causing finetuning error #107

Closed
zqiao11 opened this issue Aug 18, 2024 · 2 comments
@zqiao11
Contributor

zqiao11 commented Aug 18, 2024

Describe the bug
When finetuning Moirai on the Weather dataset, an error occurs because the freq of each data_entry is None.

To Reproduce

  1. Build the weather dataset for LSF finetuning.
python -m uni2ts.data.builder.simple weather "PATH_TO_LSF_DATASET/weather/weather.csv" --dataset_type wide --offset 36887
  2. Create the config files for the weather dataset.
  • cli/conf/finetune/data/weather.yaml
_target_: uni2ts.data.builder.simple.SimpleDatasetBuilder
dataset: weather
  • cli/conf/finetune/val_data/weather.yaml
_target_: uni2ts.data.builder.ConcatDatasetBuilder
_args_:
  _target_: uni2ts.data.builder.simple.generate_eval_builders
  dataset: weather_eval
  offset: 36887 # Same as _lsf_dataset.py
  eval_length: 5269 # Same as _lsf_dataset.py; test_length=10539
  prediction_lengths: [96]
  context_lengths: [3000]
  patch_sizes: [128]
  3. Run the finetuning command:
python -m cli.train -cp conf/finetune run_name=example_run model=moirai_1.0_R_small data=weather val_data=weather

Expected behavior
Finetuning should start normally. Instead, the following error occurs:

...
Original Traceback (most recent call last):
  File "/home/eee/qzz/SF_main/uni2ts/venv/lib/python3.10/site-packages/torch/utils/data/_utils/worker.py", line 309, in _worker_loop
    data = fetcher.fetch(index)  # type: ignore[possibly-undefined]
  File "/home/eee/qzz/SF_main/uni2ts/venv/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 52, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/eee/qzz/SF_main/uni2ts/venv/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 52, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/eee/qzz/SF_main/uni2ts/venv/lib/python3.10/site-packages/torch/utils/data/dataset.py", line 350, in __getitem__
    return self.datasets[dataset_idx][sample_idx]
  File "/home/eee/qzz/SF_main/uni2ts/src/uni2ts/data/dataset.py", line 73, in __getitem__
    return self.transform(self._flatten_data(self._get_data(idx)))
  File "/home/eee/qzz/SF_main/uni2ts/src/uni2ts/transform/_base.py", line 62, in __call__
    data_entry = t(data_entry)
  File "/home/eee/qzz/SF_main/uni2ts/src/uni2ts/transform/patch.py", line 91, in __call__
    constraints = self.patch_size_constraints(freq)
  File "/home/eee/qzz/SF_main/uni2ts/src/uni2ts/transform/patch.py", line 39, in __call__
    start, stop = self._get_boundaries(offset.n, norm_freq_str(offset.name))
AttributeError: 'NoneType' object has no attribute 'n'
. Did you mean: '_return_value'?

Root cause tracking
This issue arises because the freq of each data_entry is None.

The problem is due to the use of pd.infer_freq in simple.py, which returns None for the Weather dataset (and possibly other datasets).
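
For illustration, here is a minimal sketch of one way pd.infer_freq can end up returning None: a timestamp index with a single missing step. The index below is synthetic and only an assumption for demonstration; it is not taken from weather.csv, and the actual irregularity in the Weather data may differ.

```python
import pandas as pd

# Synthetic 10-minute index (an assumption for illustration, not the real Weather data).
regular = pd.date_range("2020-01-01", periods=6, freq="10min")

# Drop one timestamp so the spacing is no longer uniform.
irregular = regular.delete(2)

# With uniform spacing, pandas can infer a frequency string.
regular_freq = pd.infer_freq(regular)

# With a gap, there is no single discernible frequency, so the result is None.
irregular_freq = pd.infer_freq(irregular)

print("regular:", regular_freq)
print("irregular:", irregular_freq)
```

Any downstream code that treats the inferred value as a valid frequency, as the patch-size transform in the traceback does, will then fail on None.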

Environment

  • Operating system: Linux 6.5.0
  • Python version: Python 3.10.12
  • PyTorch version: 2.4.0
  • uni2ts version: Latest version.
@zqiao11 zqiao11 added the bug Something isn't working label Aug 18, 2024
@liu-jc liu-jc self-assigned this Aug 19, 2024
@liu-jc
Contributor

liu-jc commented Aug 19, 2024

Hi @zqiao11,

Thanks for pointing out this issue! According to the pandas docs, pd.infer_freq can return None if there is no discernible frequency. I would suggest falling back to a default frequency (maybe hourly) when pd.infer_freq returns None. If you want, you can create a PR for this. Otherwise, we will fix this soon :)
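
A rough sketch of what such a fallback could look like is shown here. The helper name resolve_freq, its parameters, and the "h" default are hypothetical and chosen only for illustration; this is not the code in simple.py or the eventual fix.

```python
from typing import Optional

import pandas as pd


def resolve_freq(
    index: pd.DatetimeIndex,
    freq: Optional[str] = None,
    default: str = "h",
) -> str:
    """Hypothetical helper: pick a usable frequency string for a dataset.

    Order of preference: an explicitly supplied freq, then whatever
    pd.infer_freq can detect, then a default (hourly here, by assumption).
    """
    if freq is not None:
        return freq
    inferred = pd.infer_freq(index)
    return inferred if inferred is not None else default


# Synthetic index with one missing step, so inference yields None.
index = pd.date_range("2020-01-01", periods=6, freq="10min").delete(2)

fallback_freq = resolve_freq(index)                 # falls back to the default
manual_freq = resolve_freq(index, freq="10min")     # explicit value wins
```

An explicit freq argument is worth having alongside the default, since a silent hourly fallback would be wrong for data sampled at another interval.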

@zqiao11
Contributor Author

zqiao11 commented Aug 19, 2024

Hi @liu-jc. Thank you for your prompt reply. Sure, I can fix that issue and create a PR later :)

liu-jc pushed a commit that referenced this issue Aug 22, 2024
…builder (#110)

Fix the bug in issue #107. Allow `freq` to be set manually for
datasets where `pd.infer_freq` returns `None`.
@liu-jc liu-jc closed this as completed Aug 23, 2024