Improve performance of dataset Logger #2943

Merged
merged 3 commits into from
Apr 30, 2021

@AyushExel AyushExel commented Apr 27, 2021

This PR discards unnecessary operations performed on labels before logging, which should speed things up for large datasets.

@glenn-jocher there's still a bug in this which I couldn't explain properly over mail, let me try again in the meeting today.
Before the images are logged, the dataset directory is registered as artifacts, so we need to log the images from the registered path to make sure they don't get duplicated.

The bug I'm seeing is that this logging code works perfectly when the dataset consists entirely of square images (case 1), but the boxes are misplaced when the images are raw/unaugmented (case 2).
See the example output: case 1 is the augmented dataset, case 2 is the raw dataset.
[I have marked the exact logging code, L245-L255, in wandb_utils.py.]
To reproduce, run:
python utils/wandb_logging/log_dataset.py
This logs the raw coco128 dataset as a W&B Table with slightly misplaced bboxes.
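
The thread does not state the root cause of the offset. Purely as an illustration, the sketch below shows one way normalized boxes can drift on non-square images: coordinates normalized against the original image no longer match once the image is centered on a padded square canvas, while square images are unaffected. Whether this is the actual bug is an assumption; the function name and the sample numbers are hypothetical and are not taken from wandb_utils.py.

```python
def normalized_box_on_padded_canvas(box, img_w, img_h, canvas):
    """Re-express a normalized (x_center, y_center, w, h) box after the image
    is resized to fit a square canvas and centered on it with padding.

    Hypothetical helper for illustration only, not YOLOv5 code.
    """
    x, y, w, h = box
    scale = canvas / max(img_w, img_h)           # resize so the long side fits
    new_w, new_h = img_w * scale, img_h * scale  # resized image size in pixels
    pad_x = (canvas - new_w) / 2                 # padding left/right
    pad_y = (canvas - new_h) / 2                 # padding top/bottom
    return (
        (x * new_w + pad_x) / canvas,
        (y * new_h + pad_y) / canvas,
        w * new_w / canvas,
        h * new_h / canvas,
    )


box = (0.25, 0.25, 0.2, 0.2)

# Square image: no padding is needed, so the box is unchanged.
square = normalized_box_on_padded_canvas(box, 640, 640, 640)

# 640x480 image on a 640x640 canvas: 80 px of padding above and below
# shifts y_center and shrinks the normalized height.
raw = normalized_box_on_padded_canvas(box, 640, 480, 640)
```

If boxes normalized against one frame are drawn on the other, the overlay lines up for square images and is slightly off for everything else, which resembles the symptom described for case 2.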

🛠️ PR Summary

Made with ❤️ by Ultralytics Actions

🌟 Summary

Improvement in Weights & Biases (wandb) dataset artifact logging in the YOLOv5 repository.

📊 Key Changes

  • Updated LoadImagesAndLabels calls to use rect=True and batch_size=1 when creating dataset artifacts for training and validation sets.
  • Simplified bounding box data creation by using the middle point (x, y) and dimensions (width, height) instead of the previous corner points (minX, minY, maxX, maxY).
  • Removed unnecessary attributes such as scores and domain.
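
The box format change listed above can be sketched as follows. This is a minimal sketch, assuming labels arrive as normalized YOLO rows of (class, x_center, y_center, width, height); the helper name, the sample labels, and the class-name mapping are hypothetical, and the merged code in wandb_utils.py may differ in detail.

```python
def yolo_labels_to_wandb_boxes(labels, class_names):
    """Build W&B-style box dicts from normalized YOLO labels.

    labels: iterable of (cls, x_center, y_center, width, height), values in 0-1.
    class_names: dict mapping class id to display name.
    Hypothetical helper for illustration, not the PR's exact code.
    """
    box_data = []
    for cls, x, y, w, h in labels:
        cls = int(cls)
        box_data.append({
            # Middle point plus dimensions; no minX/minY/maxX/maxY conversion
            "position": {"middle": [x, y], "width": w, "height": h},
            "class_id": cls,
            "box_caption": class_names[cls],
            # No "scores" or "domain" keys: without "domain", W&B reads the
            # values as fractions of the image size.
        })
    return box_data


labels = [(0, 0.5, 0.5, 0.2, 0.4), (2, 0.25, 0.75, 0.1, 0.1)]
names = {0: "person", 1: "bicycle", 2: "car"}
boxes = yolo_labels_to_wandb_boxes(labels, names)
```

A list like this would typically be attached to an image with `wandb.Image(path, boxes={"ground_truth": {"box_data": boxes, "class_labels": names}})`, where the "ground_truth" key is an arbitrary label chosen here. Since YOLO labels are already stored as center and size, passing them through unchanged avoids a per-box conversion to corner points.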

🎯 Purpose & Impact

  • These changes are intended to make the dataset artifact logging process for wandb more efficient and the bounding box data more intuitive.
  • Enhancing the dataset artifact creation can potentially lead to easier integration with other tools and clearer visualization in the wandb interface.
  • Users of YOLOv5 leveraging wandb will find it simpler to review and understand their object detection dataset metrics due to these modifications. 🚀

@glenn-jocher glenn-jocher merged commit 801b469 into ultralytics:master Apr 30, 2021
KMint1819 pushed a commit to KMint1819/yolov5 that referenced this pull request May 12, 2021
* Improve performance of Dataset Logger

* Fix scaling bug
danny-schwartz7 pushed a commit to danny-schwartz7/yolov5 that referenced this pull request May 22, 2021
* Improve performance of Dataset Logger

* Fix scaling bug
Lechtr pushed a commit to Lechtr/yolov5 that referenced this pull request Jul 20, 2021
* Improve performance of Dataset Logger

* Fix scaling bug

(cherry picked from commit 801b469)
BjarneKuehl pushed a commit to fhkiel-mlaip/yolov5 that referenced this pull request Aug 26, 2022
* Improve performance of Dataset Logger

* Fix scaling bug