Issues for loading from RAM instead of from Disk #431

DomInvivo · 2023-08-10T14:03:53Z

The DatasetSubsampler class fails if processed_graph_data_path is None, due to this line:

graphium/graphium/data/sampler.py

Line 28 in 10fe04b

path_with_hash = os.path.join(data_path, data_hash)
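
A minimal sketch of the failure and one possible guard follows. `os.path.join` raises a `TypeError` when its first argument is `None`, which is what happens here when no processed-data path is configured. The helper name `resolve_cache_path` is hypothetical and not part of graphium; it only illustrates skipping the join when there is no path.

```python
import os
from typing import Optional


def resolve_cache_path(data_path: Optional[str], data_hash: str) -> Optional[str]:
    """Hypothetical helper: build the hashed cache path, or return None if no path is set."""
    if data_path is None:
        # No on-disk location configured, so there is nothing to join.
        return None
    return os.path.join(data_path, data_hash)


# Reproduce the failure mode: joining with None raises TypeError.
try:
    os.path.join(None, "abc123")
except TypeError as err:
    print(f"TypeError: {err}")

# With the guard, a missing path is handled explicitly.
print(resolve_cache_path(None, "abc123"))
print(resolve_cache_path("processed", "abc123"))
```

Whether returning `None` or raising a clearer error is the right behavior depends on how the caller uses the path; the sketch only shows that the `None` case needs explicit handling.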
cache_data_path should be deprecated in favor of processed_graph_data_path:

graphium/graphium/data/datamodule.py

Line 773 in 10fe04b

cache_data_path: Optional[Union[str, os.PathLike]] = None,
load_from_file should be its own parameter. Currently it depends on processed_graph_data_path, which leaves only two options:
- We cache the data and do the dataloading from disk
- We do not cache and do the dataloading from RAM
We also need the option of caching while doing the dataloading from RAM. This will be enabled by a new load_from_file parameter in the Datamodule.
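
The combinations above can be sketched as follows. This is an illustration only: `LoadingConfig`, `caches_to_disk`, and `describe` are hypothetical names, not graphium's API, and the validation rule (loading from file requires a path) is an assumption about the intended behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoadingConfig:
    """Hypothetical stand-in for the two Datamodule arguments discussed above."""

    processed_graph_data_path: Optional[str] = None
    load_from_file: bool = False

    def __post_init__(self) -> None:
        # Assumption: loading from disk is impossible without a path to load from.
        if self.load_from_file and self.processed_graph_data_path is None:
            raise ValueError("load_from_file=True requires processed_graph_data_path")

    @property
    def caches_to_disk(self) -> bool:
        # Caching is driven by the path alone, independent of load_from_file.
        return self.processed_graph_data_path is not None


def describe(cfg: LoadingConfig) -> str:
    cache = "cache to disk" if cfg.caches_to_disk else "no cache"
    source = "load from disk" if cfg.load_from_file else "load from RAM"
    return f"{cache}, {source}"


print(describe(LoadingConfig("processed/", load_from_file=True)))
print(describe(LoadingConfig(None, load_from_file=False)))
print(describe(LoadingConfig("processed/", load_from_file=False)))
```

Separating the two arguments makes the third case (cache to disk, load from RAM) expressible, which is not possible when the loading mode is inferred from the path.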
Why is normalize_label only applied when dataloading is from RAM? Where is it applied on disk?

graphium/graphium/data/datamodule.py

Line 1300 in 10fe04b

self.normalize_label(multitask_dataset, stage)


WenkelF · 2023-08-10T17:43:12Z

Why is normalize_label only applied when dataloading is from RAM? Where is it applied on disk?

This is fine because _save_data_to_files() internally calls _make_multitask_dataset(..., load_from_file=False), i.e., label normalization is already applied to datasets loaded from disk.
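
A toy model of that flow, under stated assumptions: the functions below are simplified stand-ins, not graphium's `_save_data_to_files()` or `_make_multitask_dataset()`, and the normalization (zero mean, unit variance) and JSON storage are chosen only for illustration. The point is that the save step builds the dataset through the in-RAM path, so what lands on disk is already normalized.

```python
import json
import os
import tempfile
from statistics import mean, pstdev
from typing import List, Optional


def normalize(labels: List[float]) -> List[float]:
    """Illustrative normalization: zero mean, unit (population) standard deviation."""
    mu, sigma = mean(labels), pstdev(labels)
    return [(x - mu) / sigma for x in labels]


def make_dataset(labels: Optional[List[float]], load_from_file: bool,
                 path: Optional[str] = None) -> List[float]:
    if load_from_file:
        # Data on disk was normalized when it was saved; do not normalize again.
        with open(path) as f:
            return json.load(f)
    return normalize(labels)


def save_to_file(labels: List[float], path: str) -> None:
    # The save step goes through the in-RAM path, so normalization is applied here.
    dataset = make_dataset(labels, load_from_file=False)
    with open(path, "w") as f:
        json.dump(dataset, f)


raw = [1.0, 2.0, 3.0]
with tempfile.TemporaryDirectory() as tmp:
    cache_file = os.path.join(tmp, "labels.json")
    save_to_file(raw, cache_file)
    from_disk = make_dataset(None, load_from_file=True, path=cache_file)

from_ram = make_dataset(raw, load_from_file=False)
print(from_disk == from_ram)
```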

DomInvivo changed the title ~~Issues for loading from RAM instead of from DISK~~ Issues for loading from RAM instead of from Disk Aug 10, 2023

DomInvivo mentioned this issue Aug 10, 2023: Caching logic improvement #432 (Merged)

DomInvivo linked a pull request Aug 10, 2023 that will close this issue: Caching logic improvement #432 (Merged)

DomInvivo closed this as completed in #432 Aug 18, 2023
