Domain-Specific Pre-training Improves Confidence in Whole Slide Image Classification

Published at EMBC 2023. Available as a preprint and on IEEE Xplore.

Abstract: Whole Slide Images (WSIs) or histopathology images are used in digital pathology. WSIs pose great challenges to deep learning models for clinical diagnosis, owing to their size and lack of pixel-level annotations. With the recent advancements in computational pathology, newer multiple-instance learning-based models have been proposed. Multiple-instance learning for WSIs necessitates creating patches and uses the encoding of these patches for diagnosis. These models use generic pre-trained models (ResNet-50 pre-trained on ImageNet) for patch encoding. The recently proposed KimiaNet, a DenseNet121 model pre-trained on TCGA slides, is a domain-specific pre-trained model. This paper shows the effect of domain-specific pre-training on WSI classification. To investigate the effect of domain-specific pre-training, we considered the current state-of-the-art multiple-instance learning models, 1) CLAM, an attention-based model, and 2) TransMIL, a self-attention-based model, and evaluated the models' confidence and predictive performance in detecting primary brain tumors - gliomas. Domain-specific pre-training improves the confidence of the models and also achieves a new state-of-the-art performance of WSI-based glioma subtype classification, showing a high clinical applicability in assisting glioma diagnosis.

Installation

Create a conda environment and install the requirements

conda create --name domain-wsi --file requirements.txt

Now, install smooth-topk

git clone https://github.com/oval-group/smooth-topk.git
cd smooth-topk
python setup.py install

Install the net:cal Python library

pip install netcal

For patch extraction, clone CLAM into this repository

git clone https://github.com/mahmoodlab/CLAM.git

Patch and Feature Extraction

Download the file from here. Place the file in CLAM/presets

The following directory structure is required for data

├── root
│   ├── data
│   │   ├── Class1
│   │   ├── Class2
│   │   └── ...
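
Before running the extraction script, it can help to confirm that the data folder matches the layout above. The following is a minimal sketch, not part of this repository: `list_slides` is a hypothetical helper that assumes every sub-folder of `root/data` is a class and every file inside it is a slide.

```python
from pathlib import Path


def list_slides(root):
    """Map each class folder under <root>/data to its sorted slide file names."""
    data_dir = Path(root) / "data"
    slides = {}
    # Every sub-folder of data/ is treated as one class (assumption).
    for class_dir in sorted(p for p in data_dir.iterdir() if p.is_dir()):
        slides[class_dir.name] = sorted(
            f.name for f in class_dir.iterdir() if f.is_file()
        )
    return slides
```

Calling `list_slides("root")` returns a dictionary such as `{"Class1": [...], "Class2": [...]}`, which makes empty or misplaced class folders easy to spot.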

Before running the script, set the path in the script itself.

bash create_patches_features.sh

This creates the following directory structure

├── root
│   ├── data
│   │   ├── Class1
│   │   │   ├── Slide1.ndpi
│   │   │   ├── Slide2.ndpi
│   │   │   └── ...
│   │   ├── Class2
│   │   │   ├── Slide1.ndpi
│   │   │   ├── Slide2.ndpi
│   │   │   └── ...
│   │   └── ...
│   ├── patches
│   │   ├── masks
│   │   │   ├── Slide1.png
│   │   │   ├── Slide2.png
│   │   │   └── ...
│   │   ├── patches
│   │   │   ├── Slide1.h5
│   │   │   ├── Slide2.h5
│   │   │   └── ...
│   │   ├── stitches
│   │   │   ├── Slide1.png
│   │   │   ├── Slide2.png
│   │   │   └── ...
│   │   └── process_list_autogen.csv
│   ├── features
│   │   ├── Feature_model
│   │   │   ├── Class1
│   │   │   │   ├── Slide1
│   │   │   │   ├── Slide2
│   │   │   │   └── ...
│   │   │   ├── Class2
│   │   │   │   ├── Slide1
│   │   │   │   ├── Slide2
│   │   │   │   └── ...
│   │   │   └── ...
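
The tree above suggests a simple completeness check after extraction: every slide in `data` should have a patch file and a feature entry. The sketch below is an illustration under assumptions, not code from this repository. `find_missing_outputs` is a hypothetical helper, and it assumes output names are the slide file name without its extension, as the tree shows.

```python
from pathlib import Path


def find_missing_outputs(root, feature_model):
    """Return (class, slide_id, missing_item) tuples for slides lacking outputs."""
    root = Path(root)
    missing = []
    for class_dir in sorted(p for p in (root / "data").iterdir() if p.is_dir()):
        for slide in sorted(f for f in class_dir.iterdir() if f.is_file()):
            slide_id = slide.stem  # "Slide1.ndpi" -> "Slide1"
            # Patch coordinates are stored flat under patches/patches.
            patch_file = root / "patches" / "patches" / f"{slide_id}.h5"
            # Features are grouped by feature model, then by class.
            feature_entry = (
                root / "features" / feature_model / class_dir.name / slide_id
            )
            if not patch_file.exists():
                missing.append((class_dir.name, slide_id, "patches"))
            if not feature_entry.exists():
                missing.append((class_dir.name, slide_id, "features"))
    return missing
```

An empty list means every slide has both outputs. Any returned tuple names the class, the slide, and the stage whose output is absent.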

Training

To train a model, use the following command. Because the experiments were run with multiple data and model seeds, options to set these seeds are available.

python train_seeded.py --name WANDB_PROJECT_NAME --n_classes NUM_CLASSES --feat_dir FEATURE_DIR --csv CSV_PATH --feature_model FEATURE_MODEL --model MODEL --drop_out --early_stopping --opt OPTIMIZER --result_dir RESULT_DIR
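
When launching several runs, it may be convenient to assemble this command programmatically. The sketch below uses only the flags shown above. `build_train_command` is a hypothetical helper, and every argument value in the example call is a placeholder rather than a value taken from this repository.

```python
import shlex


def build_train_command(name, n_classes, feat_dir, csv, feature_model,
                        model, opt, result_dir):
    """Assemble the train_seeded.py call using only the documented flags."""
    return [
        "python", "train_seeded.py",
        "--name", name,
        "--n_classes", str(n_classes),
        "--feat_dir", feat_dir,
        "--csv", csv,
        "--feature_model", feature_model,
        "--model", model,
        "--drop_out",
        "--early_stopping",
        "--opt", opt,
        "--result_dir", result_dir,
    ]


# Placeholder values only; substitute your own settings.
cmd = build_train_command(
    name="WANDB_PROJECT_NAME", n_classes=3, feat_dir="FEATURE_DIR",
    csv="CSV_PATH", feature_model="FEATURE_MODEL", model="MODEL",
    opt="OPTIMIZER", result_dir="RESULT_DIR",
)
print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` to start the run. Passing a list rather than a single string avoids shell quoting problems with paths that contain spaces.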

Evaluation

To evaluate a model, use the following command.

python eval.py --n_classes NUM_CLASSES --device GPU_DEVICE --feat_dir FEATURE_DIR --csv_path CSV_PATH --model_path MODEL_CHECKPOINT --model MODEL --result_dir RESULT_DIR
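
The paper evaluates model confidence alongside predictive performance. As a rough illustration of what a calibration metric measures, the sketch below computes the expected calibration error (ECE), a common choice. It is a plain-Python illustration, not this repository's evaluation code, and this README does not state which metrics the evaluation script reports. Libraries such as net:cal provide tested implementations.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Bin-size-weighted mean of |accuracy - mean confidence| over equal-width bins."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # A confidence of exactly 1.0 falls in the last bin.
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, hit))
    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, h in members if h) / len(members)
        ece += (len(members) / total) * abs(accuracy - mean_conf)
    return ece
```

Here `confidences` holds the predicted probability of the chosen class for each slide and `correct` marks whether that prediction was right. A lower value means the stated confidence tracks the observed accuracy more closely.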

Heatmaps

To generate heatmaps for a given set of slides for a specific model, use the following command.

Before running the script, set the paths and the desired configuration in the script itself. Model checkpoints can be found in the results folder.

bash create_heatmaps.sh

