- Clone this repo:

```bash
git clone https://github.com/Alpha-VL/FastConvMAE
cd FastConvMAE
```
- Create a conda environment and activate it:

```bash
conda create -n fastconvmae python=3.7
conda activate fastconvmae
```
- Install PyTorch==1.8.0 and torchvision==0.9.0 with CUDA==11.1:

```bash
conda install pytorch==1.8.0 torchvision==0.9.0 cudatoolkit=11.1 -c pytorch -c conda-forge
```
- Install timm==0.3.2:

```bash
pip install timm==0.3.2
```
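
After installing, a quick sanity check can confirm the pinned versions are in place. This is a minimal sketch; whether CUDA is reported as available also depends on your local NVIDIA driver, not just the packages above:

```python
import torch
import torchvision
import timm

# Versions pinned by the steps above: torch 1.8.0, torchvision 0.9.0, timm 0.3.2.
print(torch.__version__, torchvision.__version__, timm.__version__)

# This torch build ships with CUDA 11.1 support; availability at runtime
# additionally requires a compatible NVIDIA driver on the machine.
print("CUDA available:", torch.cuda.is_available())
```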
You can download ImageNet-1K here and prepare it in the following format:
```
imagenet
├── train
│   ├── class1
│   │   ├── img1.jpeg
│   │   ├── img2.jpeg
│   │   └── ...
│   ├── class2
│   │   ├── img3.jpeg
│   │   └── ...
│   └── ...
```
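
This layout is the standard torchvision `ImageFolder` convention, where each subdirectory under `train` is treated as one class. As a quick way to verify your data is laid out correctly, the folder can be loaded directly (a minimal sketch; the transform here is only for the check, since the pretraining script applies its own augmentation pipeline):

```python
import torchvision.datasets as datasets
import torchvision.transforms as transforms

# Minimal transform just to make images loadable for this layout check.
transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])

# ImageFolder maps each class subdirectory under train/ to an integer label.
dataset = datasets.ImageFolder("imagenet/train", transform=transform)
print(len(dataset), "images across", len(dataset.classes), "classes")
```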
To pretrain FastConvMAE-Base with distributed training, run the following on 1 node with 8 GPUs (only a mask ratio of 0.75 is supported):
```bash
python submitit_pretrain.py \
    --job_dir ${JOB_DIR} \
    --nodes 1 \
    --batch_size 64 \
    --model fastconvmae_convvit_base_patch16 \
    --norm_pix_loss \
    --mask_ratio 0.75 \
    --epochs 50 \
    --warmup_epochs 10 \
    --blr 6.0e-4 --weight_decay 0.05 \
    --data_path ${IMAGENET_DIR}
```
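
Here `--batch_size 64` is the per-GPU batch size. If this codebase follows the linear learning-rate scaling convention of its MAE/ConvMAE lineage (an assumption, not confirmed by this README), the actual learning rate is derived from the base learning rate `--blr` as sketched below:

```python
# Hypothetical illustration of the MAE-style linear LR scaling rule
# (lr = blr * effective_batch_size / 256). Whether FastConvMAE applies
# the same rule is an assumption based on its MAE/ConvMAE lineage.
blr = 6.0e-4
batch_size_per_gpu = 64
nodes, gpus_per_node = 1, 8

effective_batch_size = batch_size_per_gpu * nodes * gpus_per_node  # 512
lr = blr * effective_batch_size / 256                              # 1.2e-3
print(f"effective batch size = {effective_batch_size}, lr = {lr:.2e}")
```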