This repository has been archived by the owner on Sep 9, 2024. It is now read-only.

Parallelize data loading #32

Open
2 of 3 tasks
nshaud opened this issue Nov 25, 2020 · 0 comments
nshaud commented Nov 25, 2020

Currently, the torch DataLoader loads data synchronously (blocking). Although loading itself is very fast (the NumPy arrays are stored in memory), transfer to the GPU and data augmentation (which is done on the CPU) can slow things down.

Setting num_workers > 0 would make data loading asynchronous, and num_workers > 1 could further increase throughput.

TODO:

  • Benchmark speed gain using asynchronous data loading
  • Implement asynchronous data loading for all DataLoader objects
  • Add a user-input option to define the number of jobs
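A minimal sketch of what the change might look like. The tensors here are placeholders standing in for the project's in-memory NumPy arrays, and the worker count is hard-coded where the proposed user-input option would go:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder data standing in for the project's in-memory NumPy arrays.
images = torch.randn(256, 3, 32, 32)
labels = torch.randint(0, 10, (256,))
dataset = TensorDataset(images, labels)

# num_workers > 0 loads batches in background worker processes, so CPU-side
# work (loading, augmentation) overlaps with GPU compute; pin_memory=True
# speeds up host-to-GPU transfers. The value 2 here would be replaced by the
# proposed user-configurable option.
loader = DataLoader(dataset, batch_size=32, num_workers=2, pin_memory=True)

if __name__ == "__main__":
    # Worker processes are spawned lazily, when iteration begins.
    for batch_images, batch_labels in loader:
        pass  # the training step would go here
```

Benchmarking (the first TODO item) would then amount to timing an epoch of this loop at num_workers = 0, 1, 2, ... and comparing.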
@nshaud nshaud self-assigned this Nov 25, 2020
@nshaud nshaud added the enhancement New feature or request label Nov 25, 2020
@nshaud nshaud added this to the 0.1.0 milestone Nov 25, 2020
@nshaud nshaud mentioned this issue Nov 27, 2020