š Overview
This project simulates a Brazilian Health Plan environment, analyzing key metrics through 20 SQL queries. The goal is to derive actionable insights from a fictional database created using Python with synthetic data generation. All data are fictitious and intended for portfolio and learning purposes only.
š§ Key Skills Demonstrated
- SQL (CTEs, joins, aggregations, ranking functions)
- Data modeling & healthcare KPIs
- Analytical thinking and storytelling
- Realistic database simulation with Brazilian health plan structure
šļø Database Structure
The project uses the following core tables (in .csv format):
- beneficiarios.csv ā Health plan beneficiaries
- planos.csv ā Health plans
- prestadores.csv ā Healthcare providers
- atendimentos1.csv, atendimentos2.csv, atendimentos3.csv ā Patient visits (split due to file size)
The full database simulates records from 2007 to 2024, following realistic patterns of a Brazilian healthcare operator.
š Analytical Scripts
Grouped by topic for better clarity:
š Evolution & Volume
- Evolution of active beneficiaries by year
- Annual evolution of appointments
- Total billing per year
- Most common procedures performed
š„ Beneficiary Profile
- Distribution by age group
- Distribution by gender (volume and cost)
- Distribution by city and state
- Beneficiaries with the highest cost
- Average appointments per beneficiary
- Average tenure of beneficiaries in the plan
š¼ Plan & Provider Performance
- Count of beneficiaries per health plan
- Cost comparison across plans (volume and value)
- Ticket average per appointment
- Providers with highest appointment volume
- Top specialties by volume
- Top providers by specialty
- Reutilization by provider
- Reutilization by specialty
š Temporal Analysis
- Appointment frequency by day of week
- Average length of stay in the plan (in days, weeks, months, years)
šļø Files
/1 - Evolution and Volume of Assistance ā 5 SQL scripts (.sql format)
/2 - Profile and Behavior of Beneficiaries ā 6 SQL scripts (.sql format)
/3 - Performance and Production of Providers ā 5 SQL scripts (.sql format)
/4 - Financial and Plans ā 3 SQL scripts (.sql format)
/5 - Administrative Information ā 1 SQL script (.sql format)
/Creating Database ā Database tables in CSV
README.md ā Project documentation
š How to Use
You can import the .csv files into any relational database (like MySQL, PostgreSQL, SQLite) and execute the scripts directly.
š Important Notes
This dataset was entirely generated with Python and does not reflect real patient data.
Some scripts use CTEs (WITH), window functions like ROW_NUMBER(), and date functions like DATEDIFF().
- Email: gioferracini97@gmail.com
- LinkedIn: linkedin.com/in/giovanni-ferracini
ā If you found this analysis useful or insightful, feel free to star the repository!