Datasets with different aspect ratios vs shuffling of dataset #932
Comments
With default settings this is a non-issue.
@shayanalibhatti good news 😃! Your original issue may now be fixed ✅ in PR #5623 by @werner-duvaud. This PR turns on shuffling in the YOLOv5 training DataLoader by default, which was missing until now. This works for all training formats: CPU, Single-GPU, Multi-GPU DDP.

```python
train_loader, dataset = create_dataloader(train_path, imgsz, batch_size // WORLD_SIZE, gs, single_cls,
                                          hyp=hyp, augment=True, cache=opt.cache, rect=opt.rect, rank=LOCAL_RANK,
                                          workers=workers, image_weights=opt.image_weights, quad=opt.quad,
                                          prefix=colorstr('train: '), shuffle=True)  # <--- NEW
```

I evaluated this PR against master on VOC finetuning for 50 epochs, and the results show a slight improvement in most metrics and losses, particularly in objectness loss and mAP@0.5, perhaps indicating that the shuffle addition may help delay overtraining. https://wandb.ai/glenn-jocher/VOC

To receive this update:
Thank you for spotting this issue and informing us of the problem. Please let us know if this update resolves the issue for you, and feel free to inform us of any other issues you discover or feature requests that come to mind. Happy trainings with YOLOv5 🚀!
@glenn-jocher Thanks. Glad to know my insight was of help to the community, and will continue to be. I am not working on a computer vision project right now, so I can't test it, but your observation of improved results is great. Keep up the great work.
How do I use shuffle in this new update? Or is shuffle turned on by default?
@yizweithree shuffle is enabled by default now for training sets.
@glenn-jocher thanks for updating this and other threads on the issue! Can you elaborate more on the rationale for sorting the images by aspect ratio when doing "rectangular" training? With the above PR, images are still sorted by aspect ratio during evaluation on the val set, since currently the
@yangsiyu007 I don't understand your question. shuffle is enabled for train, aspect ratio sorted in val for speed.
@glenn-jocher I'm wondering why sorting by aspect ratio would speed up inference? :)
@yangsiyu007 you don't have to ask me, you can set rect=False in the val.py dataloader and profile both ways
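For readers wondering where the speed difference might come from, here is a rough sketch of the usual argument: if every image in a batch is padded to the largest width and height in that batch, then batches of similar aspect ratio need less padding, so fewer pixels go through the network. This is a simplified illustration and not YOLOv5's actual loader code. The image sizes, the batch size of 16, and the helper name `padded_pixels` are all assumptions made for the example, and details such as rounding shapes to the model stride are left out.

```python
def padded_pixels(sizes, batch_size=16):
    """Total pixels processed if each batch is padded to its own max width and height."""
    total = 0
    for start in range(0, len(sizes), batch_size):
        batch = sizes[start:start + batch_size]
        batch_w = max(w for w, _ in batch)
        batch_h = max(h for _, h in batch)
        total += batch_w * batch_h * len(batch)
    return total


# Hypothetical dataset: 256 images whose long side is already resized to 640,
# cycling through five aspect ratios (height / width).
ratios = [0.5, 0.75, 1.0, 1.33, 2.0]
sizes = []
for i in range(256):
    ar = ratios[i % len(ratios)]
    if ar <= 1:
        sizes.append((640, round(640 * ar)))   # landscape or square
    else:
        sizes.append((round(640 / ar), 640))   # portrait

unsorted_total = padded_pixels(sizes)                                   # mixed ratios per batch
sorted_total = padded_pixels(sorted(sizes, key=lambda s: s[1] / s[0]))  # similar ratios per batch

print(f"unsorted: {unsorted_total:,} px")
print(f"sorted:   {sorted_total:,} px")
print(f"sorted / unsorted = {sorted_total / unsorted_total:.2f}")
```

In this toy setup every unsorted batch contains both a wide and a tall image, so it is padded to the full 640x640 square, while the sorted batches are mostly padded to a smaller rectangle. Whether this translates into a measurable speedup on a real validation set is exactly what profiling with rect=False and rect=True would show.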
Hi,
Great work developing YOLOv5. I have a question. Imagine you are combining different datasets, such as COCO (which has images of different aspect ratios) and some other datasets whose images all share the same aspect ratio within each dataset.
Wouldn't rectangular training (which sorts images by aspect ratio) hurt the shuffling of the dataset, since it would group images of the same aspect ratio together in one batch? It might then order COCO first and the other datasets after it, one after another. Or would simply turning rectangular training off and shuffling the dataset be enough for the model to generalize across datasets with images of different aspect ratios?
Can you elaborate on this and correct my understanding?
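The concern in this question can be illustrated with a small sketch. It builds a hypothetical mix of two sources, one with varied aspect ratios and one with a single fixed ratio, then counts how many batches contain images from only one source when the list is sorted by aspect ratio versus shuffled. The source names, ratios, dataset sizes, and the helper `pure_batches` are assumptions for illustration only; this is not how YOLOv5 builds its batches internally.

```python
import random


def pure_batches(items, batch_size=16):
    """Count batches whose images all come from a single source dataset."""
    count = 0
    for start in range(0, len(items), batch_size):
        sources = {source for source, _ in items[start:start + batch_size]}
        if len(sources) == 1:
            count += 1
    return count


# Hypothetical mix: "coco" cycles through several aspect ratios,
# "custom" has one fixed aspect ratio for every image.
coco_ratios = [0.5, 0.67, 0.75, 1.33, 1.5]
images = [("coco", coco_ratios[i % len(coco_ratios)]) for i in range(64)]
images += [("custom", 1.0)] * 64

# Sorting by aspect ratio places all fixed-ratio images next to each other.
by_ratio = sorted(images, key=lambda item: item[1])

# Shuffling mixes the two sources within each batch.
shuffled = images[:]
random.Random(0).shuffle(shuffled)

pure_sorted = pure_batches(by_ratio)
pure_shuffled = pure_batches(shuffled)
print("single-source batches when sorted by aspect ratio:", pure_sorted)
print("single-source batches when shuffled:", pure_shuffled)
```

With sorting, the fixed-ratio dataset ends up in one contiguous run, so most batches draw from a single source, which is the effect the question describes. With shuffling, batches mix both sources.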