This repository is the official open-source implementation of Dual Stream Fusion U-Net Transformers for 3D Medical Image Segmentation
by Seungkyun Hong*, Sunghyun Ahn*, Youngwan Jo and Sanghyun Park. (*equal contribution)
- [2024/08/20] DS-UNETR network codes are released!
- [2023/12/12] Our DS-UNETR has been accepted by BigComp 2024!
We propose Dual Stream fusion U-NEt TRansformers (DS-UNETR), comprising a Dual Stream Attention Encoder (DS-AE) and a Bidirectional All Scale Fusion (Bi-ASF) module. We designed the DS-AE to extract both spatial and channel features in parallel streams, so the network better captures the relations between channels. When transferring the extracted features from the DS-AE to the decoder, the Bi-ASF module fuses features at all scales. On the Synapse dataset, DS-UNETR improves the average Dice similarity coefficient (Dice score) by 0.97% and the 95% Hausdorff distance (HD95) by 7.43% compared with a state-of-the-art model. It is also more efficient, reducing parameters by 80.73% and FLoating point OPerationS (FLOPS) by 78.86%. DS-UNETR thus achieves superior segmentation accuracy and lower model complexity (both space and time) than existing state-of-the-art models on the 3D medical image segmentation benchmark dataset, and its approach can be effectively applied in various medical big-data analysis applications.
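For reference, the Dice score reported above measures the overlap between a predicted mask and the ground truth. Below is a minimal NumPy sketch of the binary Dice score for illustration only; the actual evaluation in this repository follows the nnFormer pipeline, and the function name here is our own.

```python
import numpy as np

def dice_score(pred, gt):
    """Binary Dice score: 2|P ∩ G| / (|P| + |G|)."""
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum())

# Toy 2D masks: prediction covers rows 0-1, ground truth rows 1-2,
# so they overlap on exactly one row (4 pixels out of 8 + 8).
pred = np.zeros((4, 4), dtype=bool); pred[:2, :] = True
gt = np.zeros((4, 4), dtype=bool); gt[1:3, :] = True
print(dice_score(pred, gt))  # 2*4 / (8+8) = 0.5
```

A Dice score of 1.0 means perfect overlap; the paper's 0.97% figure is the improvement in this metric averaged over the Synapse organ classes.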
Overview of the DS-UNETR framework. In the DS-AE, the outputs of the Swin block and the C-MSA block at each stage of each stream are fused by the Fusion block. In the Bi-ASF module, the Fnf is applied to the features received from the DS-AE, followed by a depth-wise separable convolution.
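The key difference between the two streams is the axis over which self-attention is computed: spatial attention relates positions (an N×N map over tokens), while channel attention relates feature channels (a C×C map). The NumPy sketch below illustrates that distinction only; it omits learned projections, multi-head splitting, and window partitioning, and the simple averaging "fusion" at the end is a stand-in, not the paper's Fusion block.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatial_attention(x):
    # x: (N, C) tokens; attention map is (N, N) — relations between positions
    a = softmax(x @ x.T / np.sqrt(x.shape[1]))
    return a @ x

def channel_attention(x):
    # Transpose so queries/keys are channels; attention map is (C, C)
    xt = x.T                                     # (C, N)
    a = softmax(xt @ xt.T / np.sqrt(xt.shape[1]))
    return (a @ xt).T                            # back to (N, C)

x = np.random.default_rng(0).normal(size=(16, 8))  # 16 tokens, 8 channels
# Naive stand-in for fusing the two parallel streams:
fused = 0.5 * (spatial_attention(x) + channel_attention(x))
print(fused.shape)  # (16, 8)
```

Both streams keep the (N, C) shape, so their outputs can be fused stage by stage before being passed to the decoder, which is the role the Fusion block and Bi-ASF play in the full model.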
Comparison on the abdominal multi-organ segmentation (Synapse) dataset. Abbreviations are: Spl: spleen, RKid: right kidney, LKid: left kidney, Gal: gallbladder, Liv: liver, Sto: stomach, Aor: aorta, Pan: pancreas. Best results are bolded; second-best results are underlined.
We experimented with the abdominal multi-organ segmentation (Synapse) dataset. We followed the same dataset preprocessing as described in nnFormer. You can download it by following the instructions on the nnFormer page.
If you use our work, please consider citing:
@inproceedings{hong2024dual,
title={Dual Stream Fusion U-Net Transformers for 3D Medical Image Segmentation},
author={Hong, Seungkyun and Ahn, Sunghyun and Jo, Youngwan and Park, Sanghyun},
booktitle={2024 IEEE International Conference on Big Data and Smart Computing (BigComp)},
pages={301--308},
year={2024},
organization={IEEE}
}
Should you have any questions, please open an issue on this repository or contact me at skd@yonsei.ac.kr.