🚀 Implement MVTecAD2 dataset #2562

Draft
wants to merge 8 commits into
base: main

samet-akcay
Contributor

📝 Description

This PR adds support for MVTec AD 2, the follow-up to the original MVTec AD dataset for unsupervised anomaly detection. The implementation supports three test sets:

  • Public test set (test_public/): Contains both normal and anomalous samples with ground truth masks
  • Private test set (test_private/): Contains unseen test samples without ground truth
  • Private mixed test set (test_private_mixed/): Contains unseen test samples with mixed anomalies without ground truth
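
As a rough illustration of the three test sets listed above, the sketch below maps each one to its directory and records whether ground truth is available. It is not the code from this PR: the `describe_test_sets` helper, the dictionary keys (especially `"public"`), and the assumption that each category lives at `<root>/<category>` are all hypothetical.

```python
from __future__ import annotations

from pathlib import Path

# Directory names and ground-truth availability come from the list above.
# The keys are assumed test_type strings; "public" in particular is a guess.
TEST_SETS = {
    "public": ("test_public", True),
    "private": ("test_private", False),
    "private_mixed": ("test_private_mixed", False),
}


def describe_test_sets(category_root: str | Path) -> dict[str, dict]:
    """Report path, presence on disk, and ground-truth availability per test set."""
    root = Path(category_root)
    report = {}
    for test_type, (dirname, has_ground_truth) in TEST_SETS.items():
        path = root / dirname
        report[test_type] = {
            "path": path,
            "exists": path.is_dir(),
            "has_ground_truth": has_ground_truth,
        }
    return report


if __name__ == "__main__":
    # Hypothetical location, assuming the layout <root>/<category>.
    for name, info in describe_test_sets("./datasets/MVTec_AD_2/sheet_metal").items():
        print(name, info)
```

A check like this can be run before building the datamodule to confirm which test sets were actually downloaded, and to remember that only the public set can be scored locally.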

Features

  • Full implementation of the MVTecAD2 dataset class with support for all test types
  • Proper handling of dataset splits (train, validation, test)
  • Support for image augmentations and transforms
  • Comprehensive test coverage for all test types and edge cases
  • Integration with existing Anomalib data pipeline

✨ Changes

  • Added MVTecAD2 datamodule in src/anomalib/data/datamodules/image/mvtecad2.py
  • Added MVTecAD2Dataset class in src/anomalib/data/datasets/image/mvtecad2.py
  • Added TestType enum to handle different test set types
  • Added unit tests for the dataset implementation
  • Updated dummy dataset generation for testing
  • Added configuration example in examples/configs/data/mvtecad2.yaml
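
To make the role of the `TestType` enum more concrete, here is a minimal sketch of how such an enum could accept either a string or an enum member and reject anything else. It is a stand-in written for illustration, not the class added in this PR: the member names, the value `"public"`, and the `parse_test_type` helper are assumptions.

```python
from enum import Enum


class TestType(str, Enum):
    """Illustrative stand-in for the PR's TestType enum (member values are assumed)."""

    PUBLIC = "public"
    PRIVATE = "private"
    PRIVATE_MIXED = "private_mixed"


def parse_test_type(value):
    """Normalize a string or TestType member, raising ValueError on unknown input."""
    if isinstance(value, TestType):
        return value
    try:
        return TestType(value)
    except ValueError:
        valid = ", ".join(member.value for member in TestType)
        # Re-raise with a message that lists the accepted values.
        raise ValueError(f"Unknown test_type {value!r}; expected one of: {valid}") from None


if __name__ == "__main__":
    print(parse_test_type("private_mixed"))
    try:
        parse_test_type("hidden")
    except ValueError as err:
        print(err)
```

Subclassing `str` keeps the members comparable to plain strings, which suits configuration files where `test_type` is written as text.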

Select what type of change your PR is:

  • 🐞 Bug fix (non-breaking change which fixes an issue)
  • 🔨 Refactor (non-breaking change which refactors the code base)
  • 🚀 New feature (non-breaking change which adds functionality)
  • 💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • 📚 Documentation update
  • 🔒 Security update

Testing

The implementation includes comprehensive tests that verify:

  • Dataset loading and preprocessing
  • All three test types (public, private, mixed)
  • Image augmentations and transforms
  • Error handling for invalid test types
  • Dataset split functionality
  • Integration with the Anomalib pipeline

Usage Example

from anomalib.data import MVTecAD2

# Create datamodule with public test set
datamodule = MVTecAD2(
    root="./datasets/MVTec_AD_2",
    category="sheet_metal",
    train_batch_size=32,
    eval_batch_size=32,
)

# Create a separate datamodule that uses the private test set
private_datamodule = MVTecAD2(
    root="./datasets/MVTec_AD_2",
    category="sheet_metal",
    test_type="private",
)

# Access different test sets from the first (public) datamodule
datamodule.setup()
public_loader = datamodule.test_dataloader()  # uses the datamodule's own test_type
private_loader = datamodule.test_dataloader(test_type="private")
mixed_loader = datamodule.test_dataloader(test_type="private_mixed")

License

The MVTec AD 2 dataset will be released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).

Notes

  • This implementation requires proper dataset access and follows the MVTec AD 2 dataset license terms
  • Users need to download the dataset separately from the MVTec website
  • The implementation maintains compatibility with the existing Anomalib API

Signed-off-by: Samet Akcay <samet.akcay@intel.com>