This repository contains the code for the paper *Temporal Label Smoothing for Early Prediction of Adverse Events*. It also includes code from both datasets' original repositories, M3B and HiB, which we extended to extract labels at multiple horizons and to add further components.
For all our experiments we assume a Linux installation; however, other platforms may also work:
- Install Conda (see the official installation instructions).
- Clone this repository and change into its directory.
- Run `conda env update` (creates the `tls-env` environment from the `environment.yml` file).
- Run `pip install -e .` (installs the `tls` package; the full setup sequence is sketched below).
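Put together, the setup amounts to the following commands. This is a minimal sketch: the directory name of the clone is an assumption.

```
# Assumes Conda is installed and this repository has been cloned.
cd tls                   # hypothetical name of the cloned repository directory
conda env update         # creates the tls-env environment from environment.yml
conda activate tls-env
pip install -e .         # installs the tls package in editable mode
```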
- Get access to the HiRID 1.1.1 dataset on PhysioNet. This entails:
  - getting a credentialed PhysioNet account
  - submitting a usage request to the data depositor
- Once access is granted, download the required files and unpack them into the `hirid-data-root` directory using e.g. `cat *.tar.gz | tar zxvf - -i` (see the sketch below).
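PhysioNet project pages typically offer a recursive `wget` download. The following is only a sketch; check the exact URL and archive layout on the HiRID 1.1.1 page:

```
# Sketch only: the URL and archive names must match the PhysioNet project page.
wget -r -N -c -np --user <physionet-username> --ask-password \
    https://physionet.org/files/hirid/1.1.1/
# Move the downloaded .tar.gz archives into hirid-data-root, then unpack them:
cd hirid-data-root
cat *.tar.gz | tar zxvf - -i
```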
- Get access to the MIMIC-III dataset on PhysioNet. This entails:
  - getting a credentialed PhysioNet account
  - completing the required training
  - signing the data use agreement
- Once access is granted, download all `CSV` files provided on the page and place them in a directory `mimic3-source`.
- Run all the steps described in the M3B repository to obtain the MIMIC-III Benchmark data (sketched below). You should place this data in the so-called `mimic3-data-root` folder.
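As a rough guide, the M3B pipeline involves commands along the following lines. The M3B repository's own README is authoritative; the paths below are placeholders matching the directory names used here.

```
# Run from a clone of the M3B (mimic3-benchmarks) repository; script names and
# steps may change upstream, so treat this as an illustrative sketch only.
python -m mimic3benchmark.scripts.extract_subjects mimic3-source/ mimic3-data-root/
python -m mimic3benchmark.scripts.validate_events mimic3-data-root/
python -m mimic3benchmark.scripts.extract_episodes_from_subjects mimic3-data-root/
python -m mimic3benchmark.scripts.split_train_and_test mimic3-data-root/
```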
Here we describe how to obtain the dataset in a format compatible with the deep learning models we use.
You can directly obtain our preprocessed version of the HiB dataset with the following steps:
- Activate the conda environment using `conda activate tls-env`.
- Complete the arguments in `run_script/preprocess/hirid.sh` for `--hirid-data-root` and `--work-dir`.
- Run pre-processing with `sh run_script/preprocess/hirid.sh`.

This last step wraps the following command, which you can adapt to your needs:
```
tls preprocess --dataset hirid \
    --hirid-data-root [path to source] #TODO User \
    --work-dir [path to output] #TODO User \
    --resource-path ./preprocessing/resources/ \
    --horizons 2 4 6 8 10 12 14 16 18 20 22 \
    --nr-worker 8
```
The above command requires about 10GB of RAM per core and, in total, approximately 40GB of disk space.
Similarly, you can directly obtain our preprocessed version of the M3B dataset with the following steps:
- Activate the conda environment using `conda activate tls-env`.
- Complete the arguments in `run_script/preprocess/mimic3.sh` for `--mimic3-data-root` and `--work-dir`.
- Run pre-processing with `sh run_script/preprocess/mimic3.sh`.

This last step wraps the following command, which you can adapt to your needs:
```
tls preprocess --dataset mimic3 \
    --mimic3-data-root [path to source] #TODO User \
    --work-dir [path to output] #TODO User \
    --resource-path ./preprocessing/resources/ \
    --horizons 4 8 12 16 20 24 28 32 36 40 44 \
    --mimic3-static-columns Height
```
The above command requires about 10GB of RAM per core and, in total, approximately 20GB of disk space.
The code is built around gin-config files. These files need to be modified with the source path to the data.
You should update the files in `./configs` wherever there is a `#TODO User` comment, as in the previous step.
For instance, in `./configs/hirid/GRU.gin` you should insert the correct path at line 36:
```
train_common.data_path = [path to output of pipe] #TODO User
```
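For example, one quick way to patch it in place. The output path below is hypothetical; use the `--work-dir` you passed to `tls preprocess`, note that the line number may differ in other configs, and that this assumes the value is a plain gin string.

```
# Replaces line 36 of the config with an example path; gin string values are quoted.
sed -i "36s|.*|train_common.data_path = '/data/tls/hirid_work_dir'|" ./configs/hirid/GRU.gin
```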
If you are interested in reproducing the experiments from the paper, you can directly use the pre-built scripts in `./run_scripts/`. For instance, you can run the following command to reproduce the GRU baseline on the Circulatory Failure task:

```
sh run_script/baseline/Circ/GRU.sh
```
This will create a new directory `[path to logdir]/[task name]/[seed number]/` containing:
- `val_metrics.pkl` and `test_metrics.pkl`: Pickle files with the model's performance on the validation and test sets, respectively (see the loading snippet after this list).
- `train_config.gin`: The so-called "operative" config, which saves the configuration used at training time.
- `model.torch`: The weights of the trained model.
- `tensorboard/`: (Optional) Directory with TensorBoard logs; run `tensorboard --logdir ./tensorboard` to visualize them.
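The metric files are ordinary pickles, so they can be inspected with a few lines of Python. The path below is a placeholder following the directory layout above.

```
# Print the saved test metrics for one run (replace the bracketed placeholders).
python - <<'EOF'
import pickle, pprint
with open('[path to logdir]/[task name]/[seed number]/test_metrics.pkl', 'rb') as f:
    pprint.pprint(pickle.load(f))
EOF
```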
The pre-built scripts are divided into two categories as follows:
- `baseline`: This folder contains scripts to reproduce the main benchmark experiment. Each of them will run a model with the best parameters we provide, for ten fixed seeds.
- `hp-search`: This folder contains the scripts we used to search hyperparameters for our method and the baselines.
You can evaluate any previously trained model using the `evaluate` command as follows:

```
tls evaluate -c [path to gin config] \
    -l [path to logdir] \
    -t [task name]
```

This command will evaluate the model at `[path to logdir]/[task name]/model.torch` on the test set of the dataset provided in the config. Results are saved to the `test_metrics.pkl` file.
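For example, to re-evaluate the GRU from the baseline run above (placeholder paths; point `-l` at the directory that contains `[task name]/model.torch`):

```
# Hypothetical invocation; adapt the config, log directory, and task name to your run.
tls evaluate -c ./configs/hirid/GRU.gin \
    -l [path to logdir] \
    -t [task name]
```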