kondratevakate/medical-image-report-generation-tools


Medical Image Report Generation Models

This document surveys state-of-the-art models and tools for generating reports from medical images. It compares two main approaches: end-to-end generation with large language models (LLMs), and image-segmentation report approaches that pair segmentation models with report templates.

End-to-End LLM Approach

Pros:

  • Direct generation of reports from images.
  • Handles raw data with rich context.
  • Flexible and scalable for various medical imaging tasks.

Cons:

  • Requires extensive training data.
  • Prone to hallucinations and overfitting.
  • Needs robust filtering and augmentation.

Additional Context:

"Using LLMs like GPT-4 can help with normalization detection. CheXagent provides a solid baseline for report generation with a large dataset. Fine-tuning on private data (e.g., with LLaVA) can yield good results. However, current models often face issues like hallucinations and overfitting."
Senior AI Researcher in AI University

"At our organization, we generate reports using fixed logic and templates, not LLMs, due to their unreliability and limited added value in this context."
CTO at MedAI Startup

"End-to-end models that combine segmentation and text generation are being developed but often have poor performance in practice."
Senior Research Engineer in Startup

"We anchor our models to a set of AI validated outputs to ensure reliability and accuracy."
Annalise.ai Representative (YouTube Video)
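
One way to read "anchoring" generated text to validated outputs is sketched below. This is an illustrative assumption, not the method used by any vendor quoted above: the `FINDING_TERMS` vocabulary, the `anchor_report` helper, and the negation rule are all hypothetical. The idea is to keep a generated sentence only if every finding it asserts also appears in a set of findings confirmed by a separate, validated model, and to flag the rest for review.

```python
import re
from typing import Dict, List, Set, Tuple

# Hypothetical vocabulary: finding label -> phrases that mention it.
FINDING_TERMS: Dict[str, Tuple[str, ...]] = {
    "pneumothorax": ("pneumothorax",),
    "pleural_effusion": ("pleural effusion", "effusion"),
    "cardiomegaly": ("cardiomegaly", "enlarged heart"),
}

# Very crude negation cue; a real system would need proper negation handling.
NEGATION = re.compile(r"\b(no|without)\b")


def split_sentences(report: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", report) if s.strip()]


def mentioned_findings(sentence: str) -> Set[str]:
    """Return the labels whose phrases occur in the sentence."""
    lowered = sentence.lower()
    return {
        label
        for label, terms in FINDING_TERMS.items()
        if any(term in lowered for term in terms)
    }


def anchor_report(report: str, validated: Set[str]) -> Tuple[List[str], List[str]]:
    """Split sentences into (kept, flagged).

    A sentence is flagged when it asserts a known finding that is not in
    `validated`. Negated sentences are kept, since they assert nothing.
    """
    kept: List[str] = []
    flagged: List[str] = []
    for sentence in split_sentences(report):
        if NEGATION.search(sentence.lower()):
            kept.append(sentence)
            continue
        unsupported = mentioned_findings(sentence) - validated
        (flagged if unsupported else kept).append(sentence)
    return kept, flagged


generated = (
    "There is a large left pleural effusion. "
    "A small right pneumothorax is present. "
    "No cardiomegaly."
)
kept, flagged = anchor_report(generated, validated={"pleural_effusion"})
print("kept:", kept)
print("flagged:", flagged)
```

With only `pleural_effusion` validated, the pneumothorax sentence is flagged as unsupported while the effusion sentence and the negated cardiomegaly sentence are kept. Keyword matching like this is only a starting point; it misses paraphrases and more complex negation.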

| Model Name | # stars | Unique Features | Performance Highlights | Source | Code Link |
|---|---|---|---|---|---|
| PromptMRG | ⭐⭐⭐⭐ | Uses diagnosis-driven prompts (DDP), cross-modal feature enhancement | Higher diagnostic accuracy, improved clinical relevance of reports | arXiv | GitHub |
| KERP | ⭐⭐⭐⭐ | Combines abnormality graph learning with template retrieval and paraphrasing | Structured and accurate reports, state-of-the-art results in classification | AAAI | GitHub |
| IIHT | ⭐⭐⭐⭐ | Classifier, indicator expansion, and generator modules mimicking radiologists' workflow | Effective modeling of hierarchical report generation | SpringerLink | GitHub |
| MedRAT | ⭐⭐⭐⭐ | Does not require paired image-report data, uses auxiliary tasks | Detailed, contextually relevant reports, surpasses previous methods | Papers With Code | GitHub |
| CheXagent | ⭐⭐⭐⭐ | Trained on the largest publicly available dataset of image and text pairs | Solid baseline for medical report generation | Hugging Face | GitHub |
| LLaVA | ⭐⭐⭐⭐ | Fine-tuned on private datasets for flexible and customizable results | Comparable to other top models, flexible influence on results | BioNLP Workshop | GitHub |

Image-Segmentation Report Approach

Pros:

  • Reliable and interpretable results.
  • Facilitates precise measurements and visualizations.
  • Easier management of segmentation tasks.

Cons:

  • Requires detailed segmentation models for each pathology.
  • Time-consuming development and re-training when templates change.
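
To make the "precise measurements" point concrete, here is a minimal sketch of how measurements can be read directly off a binary segmentation mask. It uses plain Python lists so it runs without dependencies; in practice the mask would come from a segmentation model and would usually be a NumPy array. The function names, the `mask[z][y][x]` layout, and the example spacing are assumptions for illustration.

```python
from typing import Sequence, Tuple

# Binary mask indexed as mask[z][y][x]; nonzero means "lesion".
Mask3D = Sequence[Sequence[Sequence[int]]]
Spacing = Tuple[float, float, float]  # voxel size in mm along (z, y, x)


def lesion_volume_ml(mask: Mask3D, spacing_mm: Spacing) -> float:
    """Volume of all foreground voxels in millilitres (1 ml = 1000 mm^3)."""
    voxel_count = sum(1 for plane in mask for row in plane for v in row if v)
    voxel_mm3 = spacing_mm[0] * spacing_mm[1] * spacing_mm[2]
    return voxel_count * voxel_mm3 / 1000.0


def bounding_box_extent_mm(mask: Mask3D, spacing_mm: Spacing) -> Tuple[float, ...]:
    """Size of the axis-aligned bounding box along (z, y, x) in mm.

    Returns zeros for an empty mask.
    """
    coords = [
        (z, y, x)
        for z, plane in enumerate(mask)
        for y, row in enumerate(plane)
        for x, v in enumerate(row)
        if v
    ]
    if not coords:
        return (0.0, 0.0, 0.0)
    return tuple(
        (max(c[axis] for c in coords) - min(c[axis] for c in coords) + 1)
        * spacing_mm[axis]
        for axis in range(3)
    )


# Toy 2 x 3 x 3 mask with three lesion voxels.
mask = [
    [[0, 0, 0], [0, 1, 1], [0, 0, 0]],
    [[0, 0, 0], [0, 1, 0], [0, 0, 0]],
]
spacing = (5.0, 1.0, 1.0)  # 5 mm slices, 1 mm in-plane (assumed values)

volume = lesion_volume_ml(mask, spacing)
extent = bounding_box_extent_mm(mask, spacing)
print(f"volume: {volume:.3f} ml, extent (z, y, x): {extent} mm")
```

Because every number is derived from the mask and the voxel spacing, a reader can trace each reported value back to the pixels that produced it, which is the interpretability advantage listed above.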

Additional Context:

"We use fixed logic and templates for report generation instead of LLMs due to their unreliability."
CTO at MedAI Startup
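
The "fixed logic and templates" idea in the quote above can be sketched roughly as follows. This is not the quoted organization's implementation: the `Finding` structure, the size thresholds, and the sentence template are hypothetical placeholders chosen only to show the pattern of structured findings going in and deterministic text coming out.

```python
from typing import List, NamedTuple


class Finding(NamedTuple):
    """Structured output of an upstream detection or segmentation model."""
    name: str
    location: str
    size_mm: float


# Hypothetical wording; a real template would come from clinical guidelines.
SENTENCE_TEMPLATE = "A {severity} {name} measuring {size_mm:.0f} mm is seen in the {location}."
NORMAL_SENTENCE = "No abnormality detected."


def severity(size_mm: float) -> str:
    """Fixed rule mapping size to a descriptor (thresholds are illustrative)."""
    if size_mm < 10:
        return "small"
    if size_mm < 30:
        return "moderate"
    return "large"


def render_report(findings: List[Finding]) -> str:
    """Render findings into text, largest first, using the fixed template."""
    if not findings:
        return NORMAL_SENTENCE
    ordered = sorted(findings, key=lambda f: f.size_mm, reverse=True)
    return " ".join(
        SENTENCE_TEMPLATE.format(
            severity=severity(f.size_mm),
            name=f.name,
            size_mm=f.size_mm,
            location=f.location,
        )
        for f in ordered
    )


findings = [
    Finding(name="nodule", location="right upper lobe", size_mm=8.0),
    Finding(name="mass", location="left lower lobe", size_mm=34.0),
]
report = render_report(findings)
print(report)
```

The same input always yields the same text, so the output cannot contain a finding that the upstream model did not report. The cost, as noted in the cons above, is that any change to the template or thresholds has to be made and validated by hand.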

"Segmentation is often not used for modalities like chest X-rays due to their limited detail. However, end-to-end segmentation and text generation can be useful for other imaging modalities."
Senior Research Engineer in Startup

| Project Name | # stars | Description | Scenario | Source |
|---|---|---|---|---|
| Raidionics | ⭐⭐⭐⭐ | Provides a complete pipeline for medical image segmentation and report generation using templates | Detection, Segmentation, Reporting | GitHub |
| MONAI | ⭐⭐⭐⭐ | PyTorch-based framework for deep learning in healthcare imaging | Preprocessing, Classification, Segmentation | GitHub |
| Medical Detection Toolkit | ⭐⭐⭐ | Contains 2D + 3D implementations of prevalent object detectors for medical images | Detection, Segmentation | GitHub |
| TransUnet | ⭐⭐⭐ | Transformers for medical image segmentation | Segmentation | GitHub |

Comparison of Approaches

End-to-End LLM Approach:

  • Pros: Direct generation of reports from images, handles raw data with rich context, flexible and scalable for various medical imaging tasks.
  • Cons: Requires extensive training data, is prone to hallucinations and overfitting, needs robust filtering and augmentation.

Image-Segmentation Report Approach:

  • Pros: Reliable and interpretable results, facilitates precise measurements and visualizations, easier management of segmentation tasks.
  • Cons: Requires detailed segmentation models for each pathology, time-consuming development and re-training when templates change.

Both approaches have their strengths and are suited to different aspects of medical imaging and report generation. End-to-end LLM approaches are more flexible and scalable, while image-segmentation report approaches offer precision and reliability.

References

  1. PromptMRG: arXiv
  2. KERP: AAAI
  3. IIHT: SpringerLink
  4. MedRAT: Papers With Code
  5. CheXagent: Hugging Face
  6. LLaVA: BioNLP Workshop
  7. Raidionics: GitHub
  8. MONAI: GitHub
  9. Medical Detection Toolkit: GitHub
  10. TransUnet: GitHub
  11. Annalise.ai: YouTube Video

About

A collection of state-of-the-art (SOTA) LLM and computer vision solutions
