
    🚕 NYC Taxi Trip Records Data Analysis

Data Engineering Project Using GCP & MageAI

Dashboard 🌀 Data ☄️ Request Feature

🎯 Goal

The goal of this project is to analyze NYC Taxi Trip Records using GCP Storage, Python, a Compute Instance, the Mage data pipeline tool, BigQuery, and Looker Studio.

💾 Dataset Used

Yellow trip records include fields capturing pick-up and drop-off dates/times, pick-up and drop-off locations, trip distances, itemized fares, rate types, payment types, and driver-reported passenger counts. The data used in the attached datasets were collected and provided to the NYC Taxi and Limousine Commission (TLC) by technology providers authorized under the Taxicab & Livery Passenger Enhancement Programs (TPEP/LPEP).

More information about the dataset can be found here:

  1. Website - https://www.nyc.gov/site/tlc/about/tlc-trip-record-data.page
  2. Data Dictionary - https://www.nyc.gov/assets/tlc/downloads/pdf/data_dictionary_trip_records_yellow.pdf

📊 Dashboard

image image

🕵️ Key Insights

  • 🧳 Total Trips

    • "VeriFone Inc" is the provider with the most trips (over 88k), while "Creative Mobile Technologies" accounts for only 11k.
  • 💳 Top Payment Types

    • N°1: Credit Card with 66%
    • N°2: Cash with 33%
  • 👨‍👩‍👧‍👧 Number of passengers per trip

    • 65% of the trips have only 1 passenger.
    • 13% have 2 passengers.
    • 8% have 5 passengers.
  • 💵 Common Rate Code

    • The most common final rate code in effect at the end of the trip is "Standard rate" at over 97%, followed by JFK at 2.2%, with Negotiated fare and the remaining codes below 1%.
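
The percentages above are shares of trips per category. A minimal sketch of that calculation follows. It is not the project's actual code: it uses only the Python standard library, the `share_by_category` helper and the four sample trips are hypothetical, and the label mappings are taken from the TLC yellow taxi data dictionary. With pandas, `Series.value_counts(normalize=True)` would likely give the same shares.

```python
from collections import Counter

# Code-to-label mappings from the TLC yellow taxi data dictionary.
PAYMENT_TYPES = {1: "Credit card", 2: "Cash", 3: "No charge",
                 4: "Dispute", 5: "Unknown", 6: "Voided trip"}
RATE_CODES = {1: "Standard rate", 2: "JFK", 3: "Newark",
              4: "Nassau or Westchester", 5: "Negotiated fare", 6: "Group ride"}


def share_by_category(values, labels=None):
    """Return {category: percentage of trips}, largest share first.

    Hypothetical helper: `labels` optionally maps raw codes to readable names;
    codes missing from the mapping are kept as-is.
    """
    counts = Counter(labels.get(v, v) if labels else v for v in values)
    total = sum(counts.values())
    return {key: round(100 * n / total, 1) for key, n in counts.most_common()}


# Made-up sample rows for illustration only (not the project's data).
# Column names follow the yellow trip record schema.
sample_trips = [
    {"payment_type": 1, "passenger_count": 1, "RatecodeID": 1},
    {"payment_type": 1, "passenger_count": 1, "RatecodeID": 1},
    {"payment_type": 2, "passenger_count": 2, "RatecodeID": 1},
    {"payment_type": 1, "passenger_count": 5, "RatecodeID": 2},
]

payment_share = share_by_category(
    [t["payment_type"] for t in sample_trips], PAYMENT_TYPES)
passenger_share = share_by_category(
    [t["passenger_count"] for t in sample_trips])
rate_share = share_by_category(
    [t["RatecodeID"] for t in sample_trips], RATE_CODES)

print(payment_share)
print(passenger_share)
print(rate_share)
```

The same helper covers payment types, passenger counts, and rate codes, since each insight is a normalized count over a single column.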

🛠️ Technologies Used

Python Pandas Jupyter Google Cloud mageai

📝 Project Architecture

Banner

📄 Data Model

nyctaxi-data-model

🔧 Mage ETL

nyctaxi-mage-etl

📨 Contact Me

LinkedIn Website Gmail