Our project introduces a metric designed to evaluate the quality of textual summaries. This metric is pivotal in fields like finance, where precise information synthesis is critical.
- Quality Discrimination: Distinguishes effectively between superior and inferior summaries, ensuring clear differentiation in factual accuracy.
- Factual Accuracy Measurement: Detects and quantifies any factual deviations, assigning lower scores to less accurate summaries.
- Detail-Oriented Assessment: Provides comprehensive evaluations, focusing on how well the summary captures the essence and details of the original text.
This metric is not merely a tool for evaluation; it's a step towards enhancing the integrity of information processing in sectors where factual accuracy is non-negotiable.
Named Entity Comparison: Extracts and compares finance-related named entities in the original text and the summary, then analyzes and visualizes entity accuracy and presence in the summary versus the original text.
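A minimal sketch of this idea, using spaCy (which is pinned in requirements.txt); the entity label set and the overlap score below are illustrative assumptions, not the project's exact implementation in src/:

```python
# Illustrative sketch of named-entity comparison -- the label set and the
# precision-style overlap score are assumptions, not the project's exact logic.
# Requires: python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")
FIN_LABELS = {"ORG", "MONEY", "PERCENT", "DATE", "CARDINAL"}

def extract_entities(text: str) -> set:
    """Return the set of finance-related entity strings found in the text."""
    return {ent.text.lower() for ent in nlp(text).ents if ent.label_ in FIN_LABELS}

def entity_overlap(original: str, summary: str) -> float:
    """Fraction of the summary's entities that also appear in the original text."""
    source_ents = extract_entities(original)
    summary_ents = extract_entities(summary)
    if not summary_ents:
        return 1.0  # no entities in the summary, so nothing to contradict
    return len(summary_ents & source_ents) / len(summary_ents)
```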
Sentence-Level Summary Checking: Applies LLMs to check the consistency between the summary and the original text sentence by sentence, highlighting and identifying inconsistencies for in-depth analysis.
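A minimal sketch of this approach, assuming an OpenAI chat model is used as the checker; the prompt wording and model name are illustrative, not the project's exact setup (see src/pipeline.py for the actual logic):

```python
# Illustrative sketch of sentence-level consistency checking with an LLM.
# The prompt and model choice are assumptions, not the project's exact setup.
import nltk
from openai import OpenAI

nltk.download("punkt", quiet=True)  # sentence tokenizer model
client = OpenAI()  # reads OPENAI_API_KEY from the environment

def check_summary(original: str, summary: str) -> list:
    """Ask the LLM whether each summary sentence is supported by the original."""
    verdicts = []
    for sentence in nltk.sent_tokenize(summary):
        prompt = (
            f"Original text:\n{original}\n\n"
            f"Summary sentence:\n{sentence}\n\n"
            "Is the summary sentence factually consistent with the original text? "
            "Answer 'consistent' or 'inconsistent' and briefly explain."
        )
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}],
            temperature=0,
        )
        verdicts.append((sentence, response.choices[0].message.content.strip()))
    return verdicts
```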
│ .gitignore
│ LICENSE.txt
│ README.md
│
├───config
│ config.sh
│ requirements.txt
│
├───data
│ 10summary_with_result.csv
│ falsified_summary.csv
│ falsified_summary_level.csv
│ final_version_cropped_first1000.csv
│ final_version_withouttext.csv
│
├───doc
│ ├───About_Us
│ │ Team's Bio.pdf
│ │
│ ├───Academic Paper
│ │ 5054_factuality_enhanced_language_m.pdf
│ │ Evaluating Factuality.pdf
│ │ Evaluating the Factual Consistency.pdf
│ │
│ ├───Project Description
│ │ Benchmarking LLM .pdf
│ │ CAPSTONE PROJECT PROPOSAL Fidelity Summarization Metrics.pdf
│ │
│ └───Report
│ Capstone Project Initial Due Diligence Report.pdf
│ F23_Fidelity_Benchmarking LLM_1st_report.pdf
│ F23_Fidelity_Benchmarking LLM_final_report.pdf
│ F23_Fidelity_BenchmarkLLM_poster.pdf
│ Project Proposal.pdf
│
├───res
│ │ 10levels.svg
│ │ good_to_bad.svg
│ │ LLM_Assisted_Framework.jpg
│ │ NER_Framework.jpg
│ │
│ └───Baseline
│ Boxplot_for_Scores.png
│
├───samples
│ documents_extraction.ipynb
│ presentation.ipynb
│ summary_level_with_result.csv
│
├───src
│ │ Bart.py
│ │ PaLM.py
│ │ pipeline.py
│ │ summary_generation.py
│
└───test
├───Data_Pipeline
└───Summary_Generation
bart.ipynb
llama2.ipynb
PaLM2.ipynb
test.py
python==3.10.0
ipython==8.15.0
nltk==3.8.1
numpy==1.24.3
openai==1.3.7
pandas==1.5.3
python-dotenv==1.0.0
rouge_score==0.1.2
scikit_learn==1.2.2
sentence_transformers==2.2.2
spacy==3.7.2
stanza==1.6.1
bash ./config/config.sh
conda install --file ./config/requirements.txt
import os
import sys
sys.path.append('../src/')
import pipeline
os.environ['OPENAI_API_KEY'] = 'Your OpenAI API Key'
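Since python-dotenv is pinned in requirements.txt, the key can presumably also be loaded from a local .env file rather than hard-coded; a minimal sketch of that alternative (the .env filename and variable follow standard dotenv conventions and are not project-specific instructions):

```python
# Optional: load OPENAI_API_KEY from a local .env file instead of hard-coding it.
# Assumes a .env file in the working directory containing: OPENAI_API_KEY=sk-...
from dotenv import load_dotenv

load_dotenv()  # populates os.environ from .env, so the pipeline can read the key
```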
The data extraction process is in documents_extraction.ipynb.
You can also find the demo and a comparison of results against baseline metrics in presentation.ipynb.
- Initial Due Diligence Report: Initial Due Diligence Report
- Project Proposal: Project Proposal
- 1st Milestone Report: 1st Milestone Report
- Final Report: Final Report
- Poster: Poster
Lilli Ann Rowan, Indraneel Biswas, Michael Threlfall, and Diana Kulmizev
Adam Kelleher