MAPLE (Bill Summarization, Tagging, Explanation)

In this project, we generate summaries and category tags for Massachusetts bills for the MAPLE platform. The goal is to simplify the legal language and content so that it is comprehensible to a broader audience (9th-grade comprehension level) by exploring different ML and LLM services.
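
One way to check a generated summary against the 9th-grade target is a readability score. The sketch below computes the standard Flesch-Kincaid grade level in plain Python. It is an illustration only, not necessarily the evaluation this project used; the function names and the vowel-group syllable heuristic are assumptions made for this example.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate: count groups of consecutive vowels (heuristic)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing "e" as silent ("made"), except in "-le" endings ("table").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade: 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


if __name__ == "__main__":
    legal = ("Notwithstanding any general or special law to the contrary, "
             "the commissioner shall promulgate regulations.")
    plain = "The state must write new rules."
    print(round(flesch_kincaid_grade(legal), 1), round(flesch_kincaid_grade(plain), 1))
```

A summary scoring well above 9 would be a candidate for another simplification pass. The syllable heuristic is approximate, so scores are best read as a rough signal.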

This repository contains a pipeline that takes bills from the Massachusetts legislature, generates summaries and category tags using Massachusetts General Law sections as context, displays and saves the generated text in a dashboard, and deploys and integrates the results into the MAPLE platform.

Roadmap of Repository Directories

Documentation:
Research.md: our research on large language models and the evaluation methods we planned to use for this project.
Documentation MAPLE.pdf: details how our model operates, for future use and improvement.
EDA: the notebook eda.ipynb covers retrieving bills from the MAPLE Swagger API, building a dataframe to clean and process the data, and making visualizations to analyze the dataset and explore its characteristics.
demoapp:
app.py: contains the code for the LLM service we used and the webapp we built with Streamlit. The webapp lets users search across all bills.
app2.py: tests on the top 12 bills from the MAPLE website and extracts information from the Massachusetts General Laws to add context to the summaries of these bills.
Other files: helper files imported by the two Python app files above.
Prompts Engineering: prompts.md stores all the prompts that we tested.
Tagging: contains the list of categories and tags.
Deployment: contains the link to our deployed Streamlit webapp.
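
The flow described for demoapp and Tagging (bill text plus General Law context goes into a prompt, and returned tags are checked against a fixed list) can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in app.py or app2.py: the `Bill` class, both function names, the prompt wording, and the sample tags are all hypothetical, and no LLM is called.

```python
from dataclasses import dataclass, field


@dataclass
class Bill:
    """Hypothetical container for one bill and its related General Law excerpts."""
    number: str
    title: str
    text: str
    mgl_sections: list[str] = field(default_factory=list)


def build_summary_prompt(bill: Bill, grade_level: int = 9) -> str:
    """Assemble a summarization prompt that includes General Law context."""
    context = "\n\n".join(bill.mgl_sections) or "(no related sections provided)"
    return (
        f"Summarize the following Massachusetts bill for a reader at a "
        f"grade-{grade_level} reading level.\n\n"
        f"Bill {bill.number}: {bill.title}\n\n"
        f"Bill text:\n{bill.text}\n\n"
        f"Related Massachusetts General Law sections:\n{context}\n"
    )


def filter_tags(raw_output: str, allowed_tags: set[str]) -> list[str]:
    """Keep only comma-separated tags that appear in the allowed list.

    Matching is case-insensitive, duplicates are dropped, and the canonical
    spelling from allowed_tags is returned in first-seen order.
    """
    lookup = {tag.lower(): tag for tag in allowed_tags}
    seen: set[str] = set()
    result: list[str] = []
    for candidate in raw_output.split(","):
        key = candidate.strip().lower()
        if key in lookup and key not in seen:
            seen.add(key)
            result.append(lookup[key])
    return result


if __name__ == "__main__":
    # All values below are made up for illustration.
    bill = Bill("H.0000", "An Act relative to examples", "Section 1. ...",
                ["Chapter 1, Section 1: ..."])
    print(build_summary_prompt(bill))
    print(filter_tags("Education, unicorns, education , Housing",
                      {"Education", "Housing"}))
```

Filtering model output against the fixed tag list keeps any invented categories out of the dashboard, which is one reason to store the list separately from the prompts.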

Ethical Implications

The dataset used for this project is fully open source and can be accessed through the Mass General Laws API.

Our team and MAPLE agree on including a disclaimer that the generated text is AI-generated.

Although we use open-source transformers to evaluate hallucination with Vectara, expert and human evaluation is still needed to maintain a trustworthy LLM system.
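
To show how an automated score and human review can work together, the sketch below routes low-scoring summaries to a reviewer. It assumes a scorer that returns a value in [0, 1], where higher means the summary is more consistent with the source. The `overlap_score` function is a toy stand-in so the example runs without any model; it is not the Vectara model, and the function names and the 0.5 threshold are assumptions.

```python
from typing import Callable, Sequence


def overlap_score(source: str, summary: str) -> float:
    """Toy stand-in scorer: fraction of summary words that also occur in the source."""
    source_words = set(source.lower().split())
    summary_words = summary.lower().split()
    if not summary_words:
        return 0.0
    return sum(w in source_words for w in summary_words) / len(summary_words)


def flag_for_review(
    pairs: Sequence[tuple[str, str]],
    score_fn: Callable[[str, str], float],
    threshold: float = 0.5,
) -> list[tuple[int, float]]:
    """Return (index, score) for each (source, summary) pair scoring below threshold."""
    flagged = []
    for index, (source, summary) in enumerate(pairs):
        score = score_fn(source, summary)
        if score < threshold:
            flagged.append((index, score))
    return flagged


if __name__ == "__main__":
    pairs = [
        ("the bill funds schools", "the bill funds schools"),
        ("the bill funds schools", "taxes rise sharply"),
    ]
    # Only the second pair falls below the threshold with the toy scorer.
    print(flag_for_review(pairs, overlap_score))
```

Swapping `overlap_score` for a real consistency model only requires a function with the same two-string signature. The flagged items would then go to an expert rather than being published automatically.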

Resources and Citation

Team Members

Vy Nguyen - Email: nptv1207@bu.edu
Andy Yang - Email: ayang903@bu.edu
Gauri Bhandarwar - Email: gaurib3@bu.edu
Weining Mai - Email: weimai@bu.edu

BU-Spark/ml-maple-bill-summarization
