LAKE-RED: Camouflaged Images Generation by Latent Background Knowledge Retrieval-Augmented Diffusion
Pancheng Zhao1,2 · Peng Xu3+ · Pengda Qin4 · Deng-Ping Fan2,1 · Zhicheng Zhang1,2 · Guoli Jia1 · Bowen Zhou3 · Jufeng Yang1,2
1 VCIP & TMCC & DISSec, College of Computer Science, Nankai University
2 Nankai International Advanced Research Institute (SHENZHEN · FUTIAN)
3 Department of Electronic Engineering, Tsinghua University · 4Alibaba Group
+corresponding authors
CVPR 2024
- 🔥2024-07-15🔥: Revised a misspelling in Fig. 2 and an error in Eq. 4. The latest version can be downloaded on arXiv.
- 2024-04-13: Updated Fig. 3, including the computational flow of $\tilde{\mathrm{c}}^f$ and some of the variable names. The latest version can be downloaded on arXiv (after 16 Apr 2024 00:00:00 GMT).
- 2024-04-13: Full code, dataset, and model weights have been released!
- 2024-04-03: The preprint is now available on arXiv.
- 2024-03-17: Basic code uploaded. Data, checkpoints, and more code will come soon ...
- 2024-03-11: Repository created. The code will come soon ...
- 2024-02-27: LAKE-RED has been accepted to CVPR 2024!
If you already have the ldm environment, you can skip this step. A suitable conda environment named ldm can be created and activated with:

```bash
conda env create -f ldm/environment.yaml
conda activate ldm
```
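Once the environment is active, a quick sanity check can confirm that PyTorch sees your GPU. This is a minimal sketch that only assumes PyTorch is installed by `environment.yaml`:

```python
# Sanity check for the ldm environment.
# Only assumes that environment.yaml installs PyTorch.
import torch

print("PyTorch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```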
We collected and organized the LAKERED dataset from existing datasets. The training set comes from COD10K and CAMO, and the test set includes three subsets: Camouflaged Objects (CO), Salient Objects (SO), and General Objects (GO).
| Datasets | GoogleDrive | BaiduNetdisk (v245) |
| --- | --- | --- |
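After downloading and extracting LAKERED, a small script can report the split sizes. The directory names below (TrainDataset, TestDataset/CO, SO, GO) are illustrative assumptions and may not match the actual archive layout; adjust them after extraction:

```python
# Count images per LAKERED split.
# NOTE: the directory names are assumptions for illustration only;
# adjust them to match the extracted archive.
from pathlib import Path

DATA_ROOT = Path("datasets/LAKERED")  # hypothetical location of the extracted dataset
SPLITS = ["TrainDataset", "TestDataset/CO", "TestDataset/SO", "TestDataset/GO"]

for split in SPLITS:
    folder = DATA_ROOT / split
    count = sum(1 for p in folder.rglob("*") if p.suffix.lower() in {".jpg", ".png"})
    print(f"{split}: {count} images")
```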
The results of this paper can be downloaded at the following link:
| Results | GoogleDrive | BaiduNetdisk (berx) |
| --- | --- | --- |
The Pre-trained Latent-Diffusion-Inpainting Model
| Pretrained Autoencoding Models | Link |
| --- | --- |
| Pretrained LDM | Link |
Put them into the specified paths:
- Pretrained Autoencoding Models: `ldm/models/first_stage_models/vq-f4-noattn/model.ckpt`
- Pretrained LDM: `ldm/models/ldm/inpainting_big/last.ckpt`
The Pre-trained LAKERED Model
| LAKERED | GoogleDrive | BaiduNetdisk (dzi8) |
| --- | --- | --- |
Put it into the specified path:
- LAKERED: `ckpt/LAKERED.ckpt`
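Before running the demo or training, you can confirm that every checkpoint is where the code expects it. The paths in this sketch are exactly the ones listed above:

```python
# Verify that the pretrained checkpoints are placed at the expected paths.
from pathlib import Path

EXPECTED = [
    "ldm/models/first_stage_models/vq-f4-noattn/model.ckpt",  # pretrained autoencoder
    "ldm/models/ldm/inpainting_big/last.ckpt",                # pretrained inpainting LDM
    "ckpt/LAKERED.ckpt",                                      # pretrained LAKE-RED model
]

for path in EXPECTED:
    status = "OK" if Path(path).is_file() else "MISSING"
    print(f"[{status}] {path}")
```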
You can quickly experience the model with the following commands:

```bash
sh demo.sh
python combine.py
```
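If you want to inspect the released weights before running the demo, the checkpoint can be opened directly with PyTorch. This sketch assumes a standard Lightning-style `.ckpt` containing a `state_dict`; if the layout differs, print `ckpt.keys()` and adapt:

```python
# Inspect the released LAKE-RED checkpoint.
# Assumes a Lightning-style .ckpt with a "state_dict" entry.
import torch

ckpt = torch.load("ckpt/LAKERED.ckpt", map_location="cpu")
print("Top-level keys:", list(ckpt.keys()))

state_dict = ckpt.get("state_dict", ckpt)
print("Number of parameter tensors:", len(state_dict))
for name in list(state_dict)[:5]:
    print(name, tuple(state_dict[name].shape))
```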
You can edit the `config_LAKERED.yaml` file to modify settings.
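For a quick look at the settings, latent-diffusion-style repositories usually read their YAML configs with OmegaConf. The sketch below assumes `config_LAKERED.yaml` sits at the repository root and that omegaconf is available (it ships with the ldm environment); no particular keys are assumed:

```python
# Load and print the LAKE-RED config.
# Assumes config_LAKERED.yaml is at the repository root and omegaconf is installed.
from omegaconf import OmegaConf

config = OmegaConf.load("config_LAKERED.yaml")
print("Top-level keys:", list(config.keys()))
print(OmegaConf.to_yaml(config))  # full config as plain YAML
```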
Start training with:

```bash
sh train.sh
```
Note: the solution to the KeyError 'global_step'

Quick fix: resume with `--resume` from the checkpoint that was saved when the run was terminated by the error (logs/checkpoints/last.ckpt).

You can also skip 4.1 and download LAKERED_init.ckpt to start training.
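Another workaround, shown below as a hedged sketch rather than an official fix, is to patch the checkpoint so the missing key exists before resuming. It assumes the error is simply an absent 'global_step' entry in the checkpoint dict; back up the file first:

```python
# Workaround sketch for KeyError: 'global_step'.
# Assumption: the checkpoint dict is merely missing the key.
# Back up the file before overwriting it.
import torch

CKPT_PATH = "logs/checkpoints/last.ckpt"  # adjust to your run

ckpt = torch.load(CKPT_PATH, map_location="cpu")
if "global_step" not in ckpt:
    ckpt["global_step"] = 0  # resume step counting from zero
    torch.save(ckpt, CKPT_PATH)
    print("Added missing 'global_step' key.")
else:
    print("'global_step' already present:", ckpt["global_step"])
```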
Generate camouflaged images with the foreground objects in the test set:

```bash
sh test.sh
```

Note that this will take a long time; alternatively, you can download the results from the link above.
Use torch-fidelity to calculate FID and KID:

```bash
pip install torch-fidelity
```

You need to specify the result root and the data root, then evaluate by running:

```bash
sh eval.sh
```
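If you prefer calling torch-fidelity from Python instead of `eval.sh`, the library exposes `calculate_metrics`. The two directory paths below are placeholders for your result root and data root:

```python
# Compute FID and KID with the torch-fidelity Python API.
# The directory paths are placeholders: point input1 at the generated images
# and input2 at the reference images.
import torch_fidelity

metrics = torch_fidelity.calculate_metrics(
    input1="path/to/result_root",
    input2="path/to/data_root",
    cuda=True,
    fid=True,
    kid=True,
)
print(metrics)
```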
For "RuntimeError: stack expects each tensor to be equal size":

This is caused by inconsistent image sizes. Fix it by following these steps:

(1) Find datasets.py inside the installed torch-fidelity package:

anaconda3/envs/envs-name/lib/python3.8/site-packages/torch_fidelity/datasets.py

(2) Import torchvision.transforms:

```python
import torchvision.transforms as TF
```

(3) Revise line 24 so that images are resized to 299x299 before being stacked:

```python
self.transforms = TF.Compose([TF.Resize((299, 299)), TransformPILtoRGBTensor()]) if transforms is None else transforms
```

Alternatively, you can manually resize the images so that they all have the same size.
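If you take the manual route, a short script like the one below resizes every image in a folder to a common resolution. The folder path and the 299x299 target are placeholders, and Pillow is assumed to be available (torchvision already depends on it):

```python
# Resize all images in a folder to one size so torch-fidelity can stack them.
# The folder path and target size are placeholders.
from pathlib import Path
from PIL import Image

FOLDER = Path("path/to/result_root")
TARGET_SIZE = (299, 299)

for img_path in FOLDER.rglob("*"):
    if img_path.suffix.lower() not in {".jpg", ".jpeg", ".png"}:
        continue
    img = Image.open(img_path).convert("RGB")  # convert() forces a full load
    img.resize(TARGET_SIZE).save(img_path)
```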
If you have any questions, please feel free to contact me:
zhaopancheng@mail.nankai.edu.cn
If you find this project useful, please consider citing:
@inproceedings{zhao2024camouflaged,
author = {Zhao, Pancheng and Xu, Peng and Qin, Pengda and Fan, Deng-Ping and Zhang, Zhicheng and Jia, Guoli and Zhou, Bowen and Yang, Jufeng},
title = {LAKE-RED: Camouflaged Images Generation by Latent Background Knowledge Retrieval-Augmented Diffusion},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2024},
}
This code borrows heavily from latent-diffusion-inpainting. Thanks to nickyisadog for the contribution.