
Add thread args to ThreadBuffer #2862

Merged 6 commits into Project-MONAI:dev on Aug 30, 2021
Conversation

@Nic-Ma (Contributor) commented on Aug 30, 2021

Description

This PR adds the ThreadBuffer args to ThreadDataLoader.

Status

Ready

Types of changes

  • Non-breaking change (fix or new feature that would not break existing functionality).
  • Breaking change (fix or new feature that would cause existing functionality to change).
  • New tests added to cover the changes.
  • Integration tests passed locally by running ./runtests.sh -f -u --net --coverage.
  • Quick tests passed locally by running ./runtests.sh --quick --unittests.
  • In-line docstrings updated.
  • Documentation updated, tested make html command in the docs/ folder.

@Nic-Ma (Contributor, Author) commented on Aug 30, 2021

/black

@Nic-Ma Nic-Ma requested review from ericspod, rijobro and wyli August 30, 2021 04:33
@wyli (Contributor) left a comment

Thanks. I think the default buffer size should be the dataloader's batch size, that is, prefetch a full batch while loading.

@Nic-Ma (Contributor, Author) commented on Aug 30, 2021

Hi @ericspod ,

What do you think about @wyli's suggestion?

Thanks in advance.

@Nic-Ma (Contributor, Author) commented on Aug 30, 2021

Hi @wyli ,

I think the source data fed into the thread_buffer of ThreadDataLoader is already batched:
https://github.com/pytorch/pytorch/blob/v1.9.0/torch/utils/data/dataloader.py#L373
So we don't need to scale the buffer size by batch size.
Of course, the ideal value may still relate to batch_size, depending on the relative processing speed of the transforms and the network.

Thanks.
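The buffering under discussion can be illustrated with a minimal sketch (this is not the MONAI implementation; the class and parameter names here are illustrative): a background thread pulls already-batched items from the source into a bounded queue, so batch generation overlaps with consumption, and `buffer_size` caps how many batches are prefetched ahead of the consumer.

```python
import queue
import threading

_END = object()  # sentinel marking the end of the source iterable


class ThreadBuffer:
    """Minimal sketch: a background thread fills a bounded queue from
    `src` so that item generation overlaps with consumption."""

    def __init__(self, src, buffer_size=1):
        self.src = src
        self.buffer_size = buffer_size

    def __iter__(self):
        buf = queue.Queue(maxsize=self.buffer_size)

        def fill():
            for item in self.src:
                buf.put(item)  # blocks while the buffer is full
            buf.put(_END)

        threading.Thread(target=fill, daemon=True).start()
        while True:
            item = buf.get()
            if item is _END:
                return
            yield item


# Each "batch" here is just an int; in practice it would be a collated batch.
print(list(ThreadBuffer(range(5), buffer_size=2)))  # [0, 1, 2, 3, 4]
```

With `buffer_size=1` (the default kept in this PR), at most one batch is prepared ahead of the consumer; a larger buffer only deepens the prefetch, which mainly helps when batch generation time varies widely.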

@Nic-Ma Nic-Ma merged commit 895592e into Project-MONAI:dev Aug 30, 2021
@ericspod (Member) commented on Sep 1, 2021

> Hi @ericspod,
> What do you think about @wyli's suggestion?
> Thanks in advance.

Hi @Nic-Ma, I don't think a buffer size beyond 1 will make a difference, except in the possible situation where the time taken to generate batches varies wildly from one batch to the next. If it's set to a larger number it will still function, but will take longer to initialize at the start.

@Nic-Ma (Contributor, Author) commented on Sep 1, 2021

> Hi @Nic-Ma, I don't think a buffer size beyond 1 will make a difference, except in the possible situation where the time taken to generate batches varies wildly from one batch to the next. If it's set to a larger number it will still function, but will take longer to initialize at the start.

Hi @ericspod,

Thanks for your comments. Yes, we didn't change the default buffer size in this PR.
