-
Notifications
You must be signed in to change notification settings - Fork 7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Download and Kinetics 400/600/700 Datasets #3680
Conversation
@pmeier is there a nice/neat way to read in CSV annotation file and have it be queryable without using pandas? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey Bruno, thanks for the PR! I did an initial pass. Comments below.
Co-authored-by: Philip Meier <github.pmeier@posteo.de>
Co-authored-by: Philip Meier <github.pmeier@posteo.de>
Co-authored-by: Philip Meier <github.pmeier@posteo.de>
@pmeier I've fixed all the issues you mentioned, plus removed pandas dependency. Can you please check if all looks well. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly style related comments left.
Also, additionally, should we add a download test?
Yes, please. If you need help to set them up, please let me know. Happy to help.
Plus, since this adds new parameters (n_classes
and split
), we should also include them in our dataset test
Line 1067 in d7ad62f
class Kinetics400TestCase(datasets_utils.VideoDatasetTestCase): |
You can do so by adding a ADDITIONAL_CONFIGS
class variable like
ADDITIONAL_CONFIGS = datasets_utils.combinations_grid(
n_classes=("400", "600", "700"), split=("train", "val")
)
Co-authored-by: Philip Meier <github.pmeier@posteo.de>
Co-authored-by: Philip Meier <github.pmeier@posteo.de>
Co-authored-by: Philip Meier <github.pmeier@posteo.de>
… into bkorbar/datasets/kinetics
Co-authored-by: Philip Meier <github.pmeier@posteo.de>
Co-authored-by: Philip Meier <github.pmeier@posteo.de>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks awesome, thanks Bruno!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me, thanks Bruno!
I only have one comment regarding the order of the dimensions for a frame. If we are to be deprecating Kinetics400, I think we might as well change the format to perform a .transpose()
if we use the legacy format
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Summary: * Initial commit * pmeiers comments * pmeiers changes * pmeiers comments * replace pandas with system library to avoid crashes * Lint * Lint * fixing unittest * Minor comments removal * pmeier comments * remove asserts * address pmeier formatting changes * address pmeier changes * pmeier changes * rename n_classes to num_classes * formatting changes * doc change to add ".mp4" to backported class * formatting to correct line length * adding **kwargs to Kinetics400 class * remove urlib request and download the file directly * annotations and files can be already downloaded * test fix * add download tests for Kinetics * users now dont need to provide full path within the root for new Kinetics dataset * linter * Update test/test_datasets_download.py * Update torchvision/datasets/kinetics.py * revert whitespace (3680#discussion_r626382842) * addressing annotation_path parameter which is unnecessary * Update torchvision/datasets/kinetics.py * Update torchvision/datasets/kinetics.py * kwargs update * expose num_download_workers as public * swap os.isfile with check_integrity * nit on private things * special case if there are no default arguments * revert changes to kinetics400 test case for BC * add split_folder changes and support for legacy format * pmeiers suggestions * pmeiers suggestions - root comment * pmeiers comments - annotation attribute remmoved * pmeiers suggestion * pmeiers suggestion * pmeiers suggestion * pmeiers suggestion * Update torchvision/datasets/kinetics.py * Update torchvision/datasets/kinetics.py * Update torchvision/datasets/kinetics.py * Update torchvision/datasets/kinetics.py * Update torchvision/datasets/kinetics.py * Update torchvision/datasets/kinetics.py * minor debugging * nit picks * only include public kwargs into defaults * add _use_legacy_structure in favour of **kwargs * add type hints for Kinetics400 * flake8 * flake8 * flake8 * rename to make thigs clearer * permuting the output Reviewed By: fmassa Differential Revision: D29097736 fbshipit-source-id: 1de2119e82eadbbba682f0897f03aba3929e3604 Co-authored-by: Philip Meier <github.pmeier@posteo.de> Co-authored-by: Philip Meier <github.pmeier@posteo.de> Co-authored-by: Philip Meier <github.pmeier@posteo.de> Co-authored-by: Philip Meier <github.pmeier@posteo.de> Co-authored-by: Philip Meier <github.pmeier@posteo.de> Co-authored-by: Philip Meier <github.pmeier@posteo.de> Co-authored-by: Philip Meier <github.pmeier@posteo.de> Co-authored-by: Philip Meier <github.pmeier@posteo.de> Co-authored-by: Philip Meier <github.pmeier@posteo.de> Co-authored-by: Philip Meier <github.pmeier@posteo.de> Co-authored-by: Philip Meier <github.pmeier@posteo.de> Co-authored-by: Philip Meier <github.pmeier@posteo.de> Co-authored-by: Philip Meier <github.pmeier@posteo.de> Co-authored-by: Philip Meier <github.pmeier@posteo.de> Co-authored-by: Philip Meier <github.pmeier@posteo.de> Co-authored-by: Philip Meier <github.pmeier@posteo.de> Co-authored-by: Philip Meier <github.pmeier@posteo.de> Co-authored-by: Philip Meier <github.pmeier@posteo.de> Co-authored-by: Philip Meier <github.pmeier@posteo.de> Co-authored-by: Philip Meier <github.pmeier@posteo.de> Co-authored-by: Philip Meier <github.pmeier@posteo.de> Co-authored-by: Philip Meier <github.pmeier@posteo.de> Co-authored-by: Francisco Massa <fvsmassa@gmail.com>
Addresses #3452.