Bug description
The StreamingDataLoader returns an empty state dict after it has fetched samples from the dataset.
What version are you seeing the problem on?
master
How to reproduce the bug
```python
import lightning as L
import torch


def run():
    fabric = L.Fabric(devices=4)
    fabric.launch()

    train_dataloader = create_dataloader()
    state = {"train_dataloader": train_dataloader}

    train_iterator = iter(train_dataloader)
    next(train_iterator)
    next(train_iterator)
    next(train_iterator)

    fabric.print("train_dataloader:", train_dataloader.state_dict())  # Why is it empty?

    fabric.save("my-checkpoint.pth", state)

    if fabric.global_rank == 0:
        state = torch.load("my-checkpoint.pth")
        print("saved train_dataloader:", state["train_dataloader"])  # Why is it empty?

    fabric.barrier()


def create_dataloader():
    from lightning.data import StreamingDataset, CombinedStreamingDataset, StreamingDataLoader
    from lightning.data.streaming.item_loader import TokensLoader

    train_datasets = [
        StreamingDataset(
            input_dir="data/slimpajama/train",
            item_loader=TokensLoader(block_size=128),
        ),
        StreamingDataset(
            input_dir="data/starcoder",
            item_loader=TokensLoader(block_size=128),
        ),
    ]
    combined_dataset = CombinedStreamingDataset(datasets=train_datasets)
    train_dataloader = StreamingDataLoader(combined_dataset, batch_size=4, num_workers=8)
    return train_dataloader


if __name__ == "__main__":
    run()
```
Error messages and logs
The state is empty as printed.
```
Initializing distributed: GLOBAL_RANK: 0, MEMBER: 1/4
Initializing distributed: GLOBAL_RANK: 2, MEMBER: 3/4
Initializing distributed: GLOBAL_RANK: 1, MEMBER: 2/4
Initializing distributed: GLOBAL_RANK: 3, MEMBER: 4/4
----------------------------------------------------------------------------------------------------
distributed_backend=nccl
All distributed processes registered. Starting with 4 processes
----------------------------------------------------------------------------------------------------

train_dataloader: {}
saved train_dataloader: {}
```
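To make the expectation behind this report concrete, here is a minimal sketch of a stateful loader whose `state_dict()` is non-empty once batches have been fetched. `CountingLoader` and the keys `samples_yielded` and `batch_size` are made up for illustration. This is not the `StreamingDataLoader` implementation and says nothing about which keys the real loader should report.

```python
class CountingLoader:
    """Hypothetical stand-in loader; NOT lightning's StreamingDataLoader."""

    def __init__(self, data, batch_size):
        self.data = list(data)
        self.batch_size = batch_size
        self._samples_yielded = 0  # progress that a checkpoint should capture

    def __iter__(self):
        # Resume from the recorded position (0 for a fresh loader).
        start = self._samples_yielded
        for i in range(start, len(self.data), self.batch_size):
            batch = self.data[i:i + self.batch_size]
            # Record progress before handing the batch to the caller.
            self._samples_yielded += len(batch)
            yield batch

    def state_dict(self):
        # Empty only if nothing has been fetched yet.
        if self._samples_yielded == 0:
            return {}
        return {"samples_yielded": self._samples_yielded, "batch_size": self.batch_size}

    def load_state_dict(self, state):
        self._samples_yielded = state.get("samples_yielded", 0)


# Mirror the reproduction: fetch three batches, then inspect the state.
loader = CountingLoader(range(100), batch_size=4)
iterator = iter(loader)
next(iterator)
next(iterator)
next(iterator)
saved_state = loader.state_dict()

# A fresh loader restored from that state continues where the first one stopped.
resumed = CountingLoader(range(100), batch_size=4)
resumed.load_state_dict(saved_state)
first_resumed_batch = next(iter(resumed))
```

In this toy version the state after three batches of four records twelve samples, and the restored loader starts at the thirteenth. The report above expects something analogous instead of `{}`.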
Environment
Current environment
- GPU:
  - NVIDIA A10G
  - NVIDIA A10G
  - NVIDIA A10G
  - NVIDIA A10G
- available: True
- version: 12.1
- lightning: 2.2.0.dev0
- lightning-cloud: 0.5.61
- lightning-utilities: 0.10.0
- pytorch-lightning: 2.1.2
- pytorch-triton: 2.2.0+e28a256d71
- torch: 2.3.0.dev20240110+cu121
- torch-tb-profiler: 0.4.3
- torchmetrics: 1.2.0
- absl-py: 2.0.0
- accelerate: 0.24.1
- aiofiles: 22.1.0
- aiohttp: 3.9.0
- aiosignal: 1.3.1
- aiosqlite: 0.19.0
- annotated-types: 0.6.0
- antlr4-python3-runtime: 4.9.3
- anyio: 3.7.1
- appdirs: 1.4.4
- argon2-cffi: 23.1.0
- argon2-cffi-bindings: 21.2.0
- arrow: 1.3.0
- asttokens: 2.4.1
- async-timeout: 4.0.3
- attrs: 23.1.0
- babel: 2.13.1
- beautifulsoup4: 4.12.2
- bitsandbytes: 0.41.0
- black: 23.12.0
- bleach: 6.1.0
- boto3: 1.29.4
- botocore: 1.32.4
- cachetools: 5.3.2
- certifi: 2023.11.17
- cffi: 1.16.0
- chardet: 5.2.0
- charset-normalizer: 3.3.2
- click: 8.1.7
- colorama: 0.4.6
- comm: 0.2.0
- dataproperty: 1.0.1
- datasets: 2.15.0
- debugpy: 1.8.0
- decorator: 5.1.1
- defusedxml: 0.7.1
- dill: 0.3.7
- distro: 1.8.0
- docker-pycreds: 0.4.0
- docstring-parser: 0.15
- einops: 0.7.0
- entrypoints: 0.4
- exceptiongroup: 1.2.0
- executing: 2.0.1
- fastapi: 0.104.1
- fastjsonschema: 2.19.0
- filelock: 3.13.1
- fqdn: 1.5.1
- frozenlist: 1.4.0
- fsspec: 2023.10.0
- gitdb: 4.0.11
- gitpython: 3.1.40
- google-auth: 2.23.4
- google-auth-oauthlib: 1.1.0
- grpcio: 1.59.3
- gviz-api: 1.10.0
- h11: 0.14.0
- httpcore: 1.0.2
- httpx: 0.25.2
- huggingface-hub: 0.19.4
- idna: 3.4
- importlib-resources: 6.1.1
- iniconfig: 2.0.0
- ipykernel: 6.26.0
- ipython: 8.17.2
- ipython-genutils: 0.2.0
- ipywidgets: 8.1.1
- isoduration: 20.11.0
- isort: 5.13.2
- jedi: 0.19.1
- jinja2: 3.1.2
- jmespath: 1.0.1
- joblib: 1.3.2
- json5: 0.9.14
- jsonargparse: 4.27.1
- jsonlines: 4.0.0
- jsonpointer: 2.4
- jsonschema: 4.20.0
- jsonschema-specifications: 2023.11.1
- jupyter-client: 7.4.9
- jupyter-core: 5.5.0
- jupyter-events: 0.9.0
- jupyter-server: 2.10.1
- jupyter-server-fileid: 0.9.0
- jupyter-server-terminals: 0.4.4
- jupyter-server-ydoc: 0.6.1
- jupyter-ydoc: 0.2.5
- jupyterlab: 3.6.1
- jupyterlab-pygments: 0.2.2
- jupyterlab-server: 2.25.2
- jupyterlab-widgets: 3.0.9
- lightning: 2.2.0.dev0
- lightning-cloud: 0.5.61
- lightning-utilities: 0.10.0
- lm-eval: 0.3.0
- markdown: 3.5.1
- markdown-it-py: 3.0.0
- markupsafe: 2.1.3
- matplotlib-inline: 0.1.6
- mbstrdecoder: 1.1.3
- mdurl: 0.1.2
- mistune: 3.0.2
- mpmath: 1.3.0
- multidict: 6.0.4
- multiprocess: 0.70.15
- mypy-extensions: 1.0.0
- nbclassic: 1.0.0
- nbclient: 0.9.0
- nbconvert: 7.11.0
- nbformat: 5.9.2
- nest-asyncio: 1.5.8
- networkx: 3.2.1
- nltk: 3.8.1
- notebook: 6.5.6
- notebook-shim: 0.2.3
- numexpr: 2.8.7
- numpy: 1.26.2
- nvidia-cublas-cu12: 12.1.3.1
- nvidia-cuda-cupti-cu12: 12.1.105
- nvidia-cuda-nvrtc-cu12: 12.1.105
- nvidia-cuda-runtime-cu12: 12.1.105
- nvidia-cudnn-cu12: 8.9.2.26
- nvidia-cufft-cu12: 11.0.2.54
- nvidia-curand-cu12: 10.3.2.106
- nvidia-cusolver-cu12: 11.4.5.107
- nvidia-cusparse-cu12: 12.1.0.106
- nvidia-nccl-cu12: 2.19.3
- nvidia-nvjitlink-cu12: 12.3.101
- nvidia-nvtx-cu12: 12.1.105
- oauthlib: 3.2.2
- omegaconf: 2.3.0
- openai: 1.3.6
- overrides: 7.4.0
- packaging: 23.2
- pandas: 2.1.3
- pandocfilters: 1.5.0
- parso: 0.8.3
- pathspec: 0.12.1
- pathvalidate: 3.2.0
- peft: 0.6.2
- pexpect: 4.8.0
- pillow: 10.1.0
- pip: 23.3
- platformdirs: 4.0.0
- pluggy: 1.3.0
- portalocker: 2.8.2
- prometheus-client: 0.19.0
- prompt-toolkit: 3.0.41
- protobuf: 4.23.4
- psutil: 5.9.6
- ptyprocess: 0.7.0
- pure-eval: 0.2.2
- pyarrow: 14.0.1
- pyarrow-hotfix: 0.6
- pyasn1: 0.5.1
- pyasn1-modules: 0.3.0
- pybind11: 2.11.1
- pycountry: 22.3.5
- pycparser: 2.21
- pydantic: 2.5.1
- pydantic-core: 2.14.3
- pygments: 2.17.1
- pyjwt: 2.8.0
- pytablewriter: 1.2.0
- pytest: 7.4.3
- python-dateutil: 2.8.2
- python-json-logger: 2.0.7
- python-multipart: 0.0.6
- pytorch-lightning: 2.1.2
- pytorch-triton: 2.2.0+e28a256d71
- pytz: 2023.3.post1
- pyyaml: 6.0.1
- pyzmq: 24.0.1
- referencing: 0.31.0
- regex: 2023.10.3
- requests: 2.31.0
- requests-oauthlib: 1.3.1
- rfc3339-validator: 0.1.4
- rfc3986-validator: 0.1.1
- rich: 13.7.0
- rouge-score: 0.1.2
- rpds-py: 0.13.1
- rsa: 4.9
- s3transfer: 0.7.0
- sacrebleu: 1.5.0
- safetensors: 0.4.1
- scikit-learn: 1.3.2
- scipy: 1.11.4
- send2trash: 1.8.2
- sentencepiece: 0.1.99
- sentry-sdk: 1.38.0
- setproctitle: 1.3.3
- setuptools: 68.0.0
- six: 1.16.0
- smmap: 5.0.1
- sniffio: 1.3.0
- soupsieve: 2.5
- sqlitedict: 2.1.0
- stack-data: 0.6.3
- starlette: 0.27.0
- sympy: 1.12
- tabledata: 1.3.3
- tcolorpy: 0.1.4
- tensorboard: 2.15.1
- tensorboard-data-server: 0.7.2
- tensorboard-plugin-profile: 2.14.0
- terminado: 0.18.0
- threadpoolctl: 3.2.0
- tinycss2: 1.2.1
- tokenizers: 0.15.0
- tomli: 2.0.1
- torch: 2.3.0.dev20240110+cu121
- torch-tb-profiler: 0.4.3
- torchmetrics: 1.2.0
- tornado: 6.3.3
- tqdm: 4.66.1
- tqdm-multiprocess: 0.0.11
- traitlets: 5.13.0
- triton: 2.1.0
- typepy: 1.3.2
- types-python-dateutil: 2.8.19.14
- typeshed-client: 2.4.0
- typing-extensions: 4.8.0
- tzdata: 2023.3
- uri-template: 1.3.0
- urllib3: 2.0.7
- uvicorn: 0.24.0.post1
- wandb: 0.16.0
- wcwidth: 0.2.11
- webcolors: 1.13
- webencodings: 0.5.1
- websocket-client: 1.6.4
- werkzeug: 3.0.1
- wheel: 0.41.2
- widgetsnbextension: 4.0.9
- xgboost: 2.0.2
- xxhash: 3.4.1
- y-py: 0.6.2
- yarl: 1.9.3
- ypy-websocket: 0.8.4
- zstandard: 0.22.0
- OS: Linux
- architecture:
  - 64bit
  - ELF
- processor: x86_64
- python: 3.10.10
- release: 5.15.0-1051-aws
- version: #56~20.04.1-Ubuntu SMP Tue Nov 28 15:43:31 UTC 2023
More info
No response