Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update docstring for RandAugment #4457

Merged
merged 2 commits into from
Sep 22, 2021
Merged

Update docstring for RandAugment #4457

merged 2 commits into from
Sep 22, 2021

Conversation

VinhLoiIT
Copy link
Contributor

@VinhLoiIT VinhLoiIT commented Sep 22, 2021

Fixes #4456

Copy link
Contributor

@datumbox datumbox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @VinhLoiIT.

I left a couple of comments, please let me know what you think.

torchvision/transforms/autoaugment.py Outdated Show resolved Hide resolved
torchvision/transforms/autoaugment.py Outdated Show resolved Hide resolved
@@ -333,7 +333,7 @@ class TrivialAugmentWide(torch.nn.Module):
r"""Dataset-independent data-augmentation with TrivialAugment Wide, as described in
`"TrivialAugment: Tuning-free Yet State-of-the-Art Data Augmentation" <https://arxiv.org/abs/2103.10158>`.
If the image is torch Tensor, it should be of type torch.uint8, and it is expected
to have [..., 1 or 3, H, W] shape, where ... means an arbitrary number of leading dimensions.
to have [..., 3, H, W] shape, where ... means an arbitrary number of leading dimensions.
If img is PIL Image, it is expected to be in mode "L" or "RGB".
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
If img is PIL Image, it is expected to be in mode "L" or "RGB".
If img is PIL Image, it is expected to be in mode "RGB".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @datumbox ,

The below snippet still works when I pass a pillow image with the mode is 'L'. This definition of "work" here is that it does not raise the above issue. I've also checked some of the results and see that they are all good. If then, does it come from the tensor transformation side?

from PIL import Image
from pathlib import Path
import torch
import torchvision
from torchvision import transforms

image_path = 'n02100877_4699.jpg'
raw_image = Image.open(image_path).convert('L')  # try with 'L', 'RGB
print(image.size, image.mode) 
tf = transforms.AutoAugment(transforms.autoaugment.AutoAugmentPolicy.CIFAR10)
for seed in range(0, 2000):
    torch.manual_seed(seed)
    image = tf(raw_image)
    image.save(out_dir / f"{seed}.png", 'png')

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is the already setup colab notebook that you could manually check yourself: https://colab.research.google.com/drive/1WtlOAASAGJA927oeFECN2YSFAhlMJ6j2?usp=sharing

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, thanks for investigating. It seems that PIL supports the colour operators even when the image is grayscale.

I think we have a few options:

  1. Update the documentation for Tensors only since PIL supports L
  2. Update the documentation for both to avoid confusing people
  3. Update the implementations of colour transforms of TorchVision that don't support Grayscale images to align them with PIL.

I think option 3 is worth exploring but it will take longer and needs to happen carefully. So I would be in favour of doing 1 or 2 in this PR and follow up on 3 on a separate one. I don't have strong opinion over how we go, so let me know what you prefer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think that this PR is just for updating the documentation only to keep it relates to the original issue. Furthermore, since we have already declared some policies such as ImageNet, CIFAR10 that their input is not a grayscale image. Thus, changing to support grayscale images would have some conflicts with the policy definitions as well that we might have further discussion in another thread.

In conclusion, in this PR, I would change the document so that this transformation only supports RGB for now.

Copy link
Contributor

@datumbox datumbox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@VinhLoiIT LGTM, thanks a lot for the contribution and the detailed investigation.

@datumbox datumbox merged commit 972ca65 into pytorch:main Sep 22, 2021
facebook-github-bot pushed a commit that referenced this pull request Sep 30, 2021
Summary:
* update docstring for RandAugment

* update docstrings for pillow image

Reviewed By: datumbox

Differential Revision: D31268031

fbshipit-source-id: cc688541ff66b7ad78a2528b1c69424838be5971
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

AutoAugment's document is wrong (or unexpected?)
3 participants