Divyanalam98/sfo-exploratory-data-analysis

PLACES TO AVOID ON YOUR FIRST DATE IN SAN FRANCISCO

Prerequisites: Python and basic Python libraries

Packages and libraries used: NumPy, pandas, Matplotlib, ipyleaflet

San Francisco offers more interesting dates than the typical coffee meetup or happy-hour drinks. It is a beautiful city, well known for its landscape and restaurants, and full of young people.

First dates are important, and you do not want to end up in an embarrassing situation.

Image

In this project, we identify restaurants to avoid based on risk category, inspection score, and other parameters. The good news: we identify some safe places as well!

The dataset was downloaded from Kaggle (https://www.kaggle.com/datasets/datasf/sf-restaurant-inspection-scores). Here is a description of the dataset:

The SF Health Department has developed an inspection report and scoring system. After conducting an inspection of the facility, the Health Inspector calculates a score based on the violations observed. Violations can fall into:

● high risk category: records specific violations that directly relate to the transmission of food borne illnesses, the adulteration of food products and the contamination of food-contact surfaces.
● moderate risk category: records specific violations that are of a moderate risk to the public health and safety.
● low risk category: records violations that are low risk or have no immediate risk to the public health and safety.

The score card that will be issued by the inspector is maintained at the food establishment and is available to the public in this dataset.

San Francisco's LIVES restaurant inspection data leverages the LIVES Flattened Schema (https://goo.gl/c3nNvr), which is based on LIVES version 2.0, cited on Yelp's website (http://www.yelp.com/healthscores).

In this project, we look at:

● Top 20 high-risk restaurants
● Top 20 violations observed in the restaurants
● ZIP codes with the most high-risk restaurants
● Whether a single inspection score is enough to determine a restaurant's safety
● A map with the locations of safe restaurants marked
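The first four analyses above can be sketched with pandas roughly as follows. This is a minimal sketch, not the notebook's actual code: the column names (`business_id`, `business_name`, `business_postal_code`, `inspection_score`, `violation_description`, `risk_category`) and the `"High Risk"` label are assumed from the Kaggle CSV, the helper function names are hypothetical, and "top high-risk restaurants" is read here as "most high-risk violation records", which is only one plausible interpretation. The `sample` frame is made-up toy data so the block runs without the download.

```python
import pandas as pd

# Assumed column names from the Kaggle CSV; adjust them if your copy differs.
HIGH = "High Risk"


def top_high_risk_restaurants(df, n=20):
    """Count high-risk violation records per restaurant, largest first."""
    high = df[df["risk_category"] == HIGH]
    return high["business_name"].value_counts().head(n)


def top_violations(df, n=20):
    """Most frequent violation descriptions (missing values are ignored)."""
    return df["violation_description"].value_counts().head(n)


def high_risk_by_zip(df, n=20):
    """Number of distinct restaurants with a high-risk record, per ZIP code."""
    high = df[df["risk_category"] == HIGH]
    counts = high.groupby("business_postal_code")["business_id"].nunique()
    return counts.sort_values(ascending=False).head(n)


def score_spread(df):
    """Min, max and spread of inspection scores per restaurant.

    A wide spread suggests that one inspection score alone says little
    about how safe a restaurant is over time.
    """
    scored = df.dropna(subset=["inspection_score"])
    stats = scored.groupby("business_name")["inspection_score"].agg(["min", "max"])
    stats["spread"] = stats["max"] - stats["min"]
    return stats.sort_values("spread", ascending=False)


# Toy rows (one row per violation record), invented for illustration only.
sample = pd.DataFrame({
    "business_id": [1, 1, 2, 3, 3, 3],
    "business_name": ["Cafe A", "Cafe A", "Diner B", "Grill C", "Grill C", "Grill C"],
    "business_postal_code": ["94110", "94110", "94110", "94103", "94103", "94103"],
    "inspection_score": [72, 90, 100, 65, 65, 88],
    "violation_description": ["violation X", "violation Y", None,
                              "violation X", "violation Z", "violation X"],
    "risk_category": ["High Risk", "Low Risk", None,
                      "High Risk", "High Risk", "Moderate Risk"],
})

risky = top_high_risk_restaurants(sample)
violations = top_violations(sample)
zips = high_risk_by_zip(sample)
spread = score_spread(sample)
print(risky, violations, zips, spread, sep="\n\n")
```

With the real data, `sample` would be replaced by something like `pd.read_csv(...)` on the downloaded file, and each returned Series could be passed to `.plot(kind="barh")` to produce bar charts similar to the figures below.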

The results are shown below, each with its graph and a short description.

Image

Figure 1: head and tail of dataframe

Image

Figure 2: Top 20 High Risk Restaurants in San Francisco (based on risk category)

Image

Figure 3: Top 20 violations found in the restaurants

Image

Figure 4: ZIP codes with the most high-risk restaurants

Image

Figure 5: Safe Restaurants in San Francisco (low risk category and inspection score = 100)
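The "safe" criterion in the Figure 5 caption (low risk category and an inspection score of 100) can be expressed as a boolean mask in pandas. The sketch below assumes the column names `risk_category`, `inspection_score`, `business_name`, `business_address` and `business_postal_code` and the label `"Low Risk"`; the function name and the toy rows are hypothetical.

```python
import pandas as pd


def safe_restaurants(df):
    """Restaurants with a low-risk record and a perfect inspection score."""
    mask = (df["risk_category"] == "Low Risk") & (df["inspection_score"] == 100)
    cols = ["business_name", "business_address", "business_postal_code"]
    # The data has one row per violation, so drop repeated restaurants.
    return df.loc[mask, cols].drop_duplicates().reset_index(drop=True)


# Toy rows invented for illustration only.
inspections = pd.DataFrame({
    "business_name": ["Cafe A", "Cafe A", "Diner B", "Grill C"],
    "business_address": ["1 Example St", "1 Example St", "2 Example St", "3 Example St"],
    "business_postal_code": ["94110", "94110", "94103", "94103"],
    "risk_category": ["Low Risk", "Low Risk", "High Risk", "Low Risk"],
    "inspection_score": [100, 100, 70, 96],
})

safe = safe_restaurants(inspections)
print(safe)
```

Both conditions are combined with `&`, so a low-risk record with a score below 100 (like "Grill C" here) is excluded.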

Image

Figure 6: Safe Restaurants plotted/marked on San Francisco map
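A map like the one in Figure 6 could be built with ipyleaflet along these lines. This is a sketch under stated assumptions: the coordinate columns are assumed to be named `business_latitude` and `business_longitude`, the helper names `marker_points` and `build_map` are hypothetical, and the map centre is simply the commonly quoted coordinates of San Francisco. ipyleaflet is imported inside `build_map`, so the data preparation step runs even where the widget library is not installed.

```python
import pandas as pd


def marker_points(df):
    """Return (name, (lat, lon)) pairs for restaurants with known coordinates."""
    coords = df.dropna(subset=["business_latitude", "business_longitude"])
    coords = coords.drop_duplicates(
        subset=["business_name", "business_latitude", "business_longitude"]
    )
    return [
        (row.business_name,
         (float(row.business_latitude), float(row.business_longitude)))
        for row in coords.itertuples(index=False)
    ]


def build_map(points, center=(37.7749, -122.4194), zoom=12):
    """Create an ipyleaflet map with one marker per (name, location) pair."""
    from ipyleaflet import Map, Marker  # imported lazily; needs a Jupyter setup

    m = Map(center=center, zoom=zoom)
    for name, location in points:
        # Recent ipyleaflet versions prefer m.add(...); add_layer is the older name.
        m.add_layer(Marker(location=location, title=name, draggable=False))
    return m


# Toy rows invented for illustration; the second restaurant has no coordinates.
safe_places = pd.DataFrame({
    "business_name": ["Cafe A", "Cafe A", "Diner B"],
    "business_latitude": [37.76, 37.76, None],
    "business_longitude": [-122.42, -122.42, None],
})

points = marker_points(safe_places)
print(points)
```

In a Jupyter notebook, evaluating `build_map(points)` as the last expression of a cell displays the interactive map. Rows without coordinates are dropped rather than guessed.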
