Data Wrangling, Exploration and Visualization
Data is cleansed and enriched using SparkR and SparkSQL. The curated dataset is written to Azure Blob storage in Parquet format (parquet.apache.org, n.d.), partitioned by City Name. The code is in the "Step02a_Data_Wrangling" R notebook in the artifacts section.
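As a rough sketch of this step (the storage paths, container names, and column names such as city and dateTime are assumptions for illustration, not taken from the notebook), the wrangling boils down to reading the raw data with SparkR, dropping incomplete rows, and writing the curated set back to Blob storage as Parquet partitioned by city:

```r
library(SparkR)
sparkR.session()  # a session is already active inside a Databricks notebook

# Read the raw safety-call data (path and schema options are illustrative)
raw_df <- read.df(
  "wasbs://raw@<storage-account>.blob.core.windows.net/safety_calls/",
  source = "csv", header = "true", inferSchema = "true"
)

# Cleanse: drop rows missing the partition key or the event timestamp
curated_df <- dropna(raw_df, cols = c("city", "dateTime"))

# Write the curated dataset as Parquet, partitioned by city name
write.df(
  curated_df,
  path = "wasbs://curated@<storage-account>.blob.core.windows.net/safety_calls_parquet/",
  source = "parquet", mode = "overwrite", partitionBy = "city"
)
```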
Data exploration and visualization are done using SparkR, SparkSQL, ggplot2, htmltools, htmlwidgets, leaflet with the ESRI plugin, magrittr, etc. The code and detailed exploration are in the "Step02b_Data_Exploration_Visualization" R notebook in the artifacts section. Below are some highlights from this notebook.
Changes Over Time - Volume of All Safety Calls and specific Safety Calls (Graffiti in this example):
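A minimal sketch of how such a trend chart could be produced, assuming a curated Spark table named safety_calls with dateTime and serviceName columns (both names are assumptions): aggregate monthly call counts with SparkSQL, collect the small result into R, and plot it with ggplot2.

```r
library(SparkR)
library(ggplot2)

# Monthly call volume, overall and for graffiti-related calls (column names assumed)
monthly_sdf <- sql("
  SELECT date_format(dateTime, 'yyyy-MM') AS month,
         COUNT(*) AS all_calls,
         SUM(CASE WHEN serviceName LIKE '%Graffiti%' THEN 1 ELSE 0 END) AS graffiti_calls
  FROM safety_calls
  GROUP BY date_format(dateTime, 'yyyy-MM')
  ORDER BY month
")

# The aggregate is small, so it is safe to collect into a local R data frame
monthly_df <- collect(monthly_sdf)

# Plot total call volume over time
ggplot(monthly_df, aes(x = month, y = all_calls, group = 1)) +
  geom_line() +
  labs(x = "Month", y = "Number of safety calls",
       title = "Volume of safety calls over time") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5))
```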
A fully explorable geoplot, built using leaflet with the ESRI plugin on a subset of the data, is attached in the artifacts:
The interactive map with a subset of the data can be viewed here
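A hedged sketch of how such an interactive geoplot could be assembled, assuming a collected R data frame calls_sample with latitude, longitude, and serviceName columns (these names, and the use of an Esri provider basemap rather than the exact ESRI plugin calls in the notebook, are assumptions):

```r
library(leaflet)
library(magrittr)
library(htmlwidgets)

# Build an interactive map on an Esri basemap from a small sample of the data
m <- leaflet(calls_sample) %>%
  addProviderTiles(providers$Esri.WorldStreetMap) %>%
  addCircleMarkers(lng = ~longitude, lat = ~latitude,
                   radius = 3, stroke = FALSE, fillOpacity = 0.6,
                   popup = ~serviceName)

# Save the widget so it can be shared as a standalone HTML artifact
saveWidget(m, "safety_calls_map.html", selfcontained = TRUE)
```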