Su-informatics-lab/MedACE2

MedACE: ICI and irAKI concept extraction

ContextGem-based extractor that pulls immune checkpoint inhibitor (ICI) drug exposures and acute kidney injury (AKI) context from clinical notes. It is designed to run against a local OSS-120B model served by vLLM.

Quick start

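# install dependencies
pip install -U contextgem pandas pyarrow tqdm requests
# placeholder value; the local endpoint does not need a real key
export OPENAI_API_KEY=not-needed
# start the OSS model endpoint
sbatch serve.sbatch
# run the pipeline (replace path_to_notes_parquet with the path to your notes Parquet file)
python run.py --notes path_to_notes_parquet
# or keep only exposures that match the ICI drug list
python run.py --notes path_to_notes_parquet --filter-ici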

Inputs

The notes Parquet file is expected to contain these columns: SERVICE_NAME, PHYSIOLOGIC_TIME, OBFUSCATED_GLOBAL_PERSON_ID, ENCOUNTER_ID, REPORT_TEXT
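
Before launching a long run, it can help to confirm that a notes file has the columns listed above. The following is a minimal sketch, not part of the repository: `missing_columns` and `check_notes_file` are hypothetical helper names, and the check assumes column names are matched case-sensitively. It reads only the Parquet schema through `pyarrow`, so no note text is loaded.

```python
from typing import Iterable, List

# Column names taken from the Inputs section above.
REQUIRED_COLUMNS = (
    "SERVICE_NAME",
    "PHYSIOLOGIC_TIME",
    "OBFUSCATED_GLOBAL_PERSON_ID",
    "ENCOUNTER_ID",
    "REPORT_TEXT",
)


def missing_columns(columns: Iterable[str]) -> List[str]:
    """Return the required columns absent from `columns`, in README order."""
    present = set(columns)
    return [name for name in REQUIRED_COLUMNS if name not in present]


def check_notes_file(path: str) -> List[str]:
    """Read only the Parquet schema (no row data) and report missing columns."""
    import pyarrow.parquet as pq  # imported lazily; installed in Quick start

    return missing_columns(pq.read_schema(path).names)
```

An empty list from `check_notes_file(path)` means every expected column is present.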

Outputs

outputs/.../drug_mentions.parquet (one row per ICI exposure)

  • patient_id(Int64), encounter_id(Int64), note_id(str), note_date(ts), note_type(str)
  • drug_text(str), normalized_name(str), canonical_generic(str|null), class(str|null)
  • rxcui(str|null), dose_text(str|null), route_text(str|null), when_text(str|null)
  • sentence(str)

outputs/.../note_concepts.parquet (one row per note)

  • patient_id(Int64), encounter_id(Int64), note_id(str), note_date(ts), note_type(str)
  • has_alt_cause(bool), n_drug_exposures_raw(int), n_drug_exposures(int), n_aki_mentions(int)
  • concepts_json(str) with keys: drug_exposures, aki_mentions, attributions, alt_causes
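
Because `concepts_json` is stored as a string, downstream code has to decode it before use. The sketch below shows one way to do that with the standard library. It assumes each of the four keys holds a JSON array, which the README does not state; `count_concepts` is a hypothetical helper, and the example payload is made up for illustration.

```python
import json
from typing import Dict

# Keys listed for concepts_json in the Outputs section.
CONCEPT_KEYS = ("drug_exposures", "aki_mentions", "attributions", "alt_causes")


def count_concepts(concepts_json: str) -> Dict[str, int]:
    """Decode one concepts_json cell and count the items under each key.

    Assumption: each key holds a JSON array; a missing or null key counts as 0.
    """
    concepts = json.loads(concepts_json)
    return {key: len(concepts.get(key) or []) for key in CONCEPT_KEYS}


# Made-up payload for illustration only; the item fields are hypothetical.
example = json.dumps(
    {
        "drug_exposures": [{"drug_text": "pembrolizumab"}],
        "aki_mentions": [],
        "attributions": [],
        "alt_causes": [{"text": "contrast exposure"}],
    }
)
print(count_concepts(example))
```

The same function can be applied to a whole column after loading `note_concepts.parquet` with pandas, for example through `Series.map`.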

Useful flags

  • --debug: process only the first 10 notes
  • --offset N: skip the first N notes, then process the rest
  • --filter-ici: keep only drug exposures that match the ICI vocab
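
The effect of `--offset` and `--debug` on which notes are processed can be pictured as simple slicing. This is an illustrative sketch only, not the code in `run.py`: `select_notes` is a hypothetical function, and applying the offset before the debug limit when both flags are given is an assumption.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")

DEBUG_LIMIT = 10  # from the flag description: "--debug process first 10 notes"


def select_notes(notes: Sequence[T], offset: int = 0, debug: bool = False) -> List[T]:
    """Skip `offset` notes, then optionally keep only the first DEBUG_LIMIT.

    Assumption: when both flags are given, the offset is applied first.
    """
    if offset < 0:
        raise ValueError("offset must be non-negative")
    remaining = list(notes[offset:])
    return remaining[:DEBUG_LIMIT] if debug else remaining
```

Under these assumptions, `select_notes(notes, offset=5, debug=True)` would return notes 5 through 14 of a 25-note input.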
