Using the ViT for Single-channel Images #727

ahmed1996said · 2021-06-28T07:35:05Z

Hello all,

I'm wondering how to fine-tune a pretrained ViT model on single-channel images.
I tried adjusting the architecture (for vit_base_patch16_224) so that the patch embedding projection layer is

Conv2d(1, 256, kernel_size=(16, 16), stride=(16, 16))  # D = 16*16*1 = 256

instead of

Conv2d(3, 768, kernel_size=(16, 16), stride=(16, 16))

Similarly, I changed the encoder inputs/outputs from 768 to 256 and made the corresponding changes to the attention qkv layer, the attention projection layer and the MLP.
However, training fails in the forward pass with a size-mismatch error: "Sizes of tensors must match except in dimension 2. Got 256 and 768 (The offending index is 0)".

Any suggestions on how to correctly use the model for single channel images?

Abhishek-Prajapat · 2021-07-08T00:02:27Z

Hi there.
This might be a late reply, but one thing you can do is stack your images. I was working on a similar problem and this is what I did:

import cv2
# Read as grayscale (H, W), then copy that channel three times to get (H, W, 3)
image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
image = cv2.merge((image, image, image))
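
For readers without OpenCV installed, the same stacking idea can be sketched with NumPy alone. The `gray` array below is a made-up stand-in for a loaded grayscale image, not data from this thread, and `rgb_like` is just an illustrative name:

```python
import numpy as np

# Hypothetical stand-in for a grayscale image: shape (H, W), dtype uint8.
gray = np.arange(12, dtype=np.uint8).reshape(3, 4)

# Repeat the single channel three times along a new last axis: (H, W) -> (H, W, 3).
rgb_like = np.repeat(gray[:, :, np.newaxis], 3, axis=2)

print(rgb_like.shape)
```

Every channel of `rgb_like` is an identical copy of `gray`, so the image can be fed to a model that expects three input channels. The extra channels carry no new information.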

rwightman · 2021-07-08T21:32:31Z

Maintainer

@Abhishek-Prajapat just create the model with the argument in_chans=1, see https://fastai.github.io/timmdocs/models#My-dataset-doesn't-consist-of-3-channel-images---what-now?

EDIT: also @ahmed1996said

2 replies

ahmed1996said Jul 13, 2021
Author

Great, thank you, this worked fine!

yipliu Sep 5, 2024

How about ViTImageProcessor?
