
This repository contains my IBM Applied Data Science Capstone. I used Foursquare location data to explore and compare neighborhoods and cities of my choosing, and defined a problem that the Foursquare location data could help solve.


lopez-christian/IBM-Coursera-Capstone


IBM Data Science Professional Certificate - Data Science Capstone


Week 1 - Introduction to Capstone Project

  • Introduction to Capstone Project
  • Location Data Providers
  • Signing-up for a Watson Studio Account

Peer-review Assignment: Capstone Project Notebook

In this assignment, you will be asked to create a new repository on your GitHub account, create a Jupyter Notebook in it, and submit a shareable link to the notebook for peer evaluation.

Week 2 - Foursquare API

  • Introduction to Foursquare
  • Getting Foursquare API Credentials
  • Using Foursquare API

Lab: Foursquare API

In this lab, you will learn in detail how to make calls to the Foursquare API for different purposes. You will learn how to construct a URL to send a request to the API to search for a specific type of venue, to explore a particular venue, to explore a Foursquare user, to explore a geographical location, and to get trending venues around a location. You will also learn how to use the visualization library Folium to visualize the results.
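As a rough illustration of the URL-construction step, here is a minimal sketch that only builds a venue-search URL and sends no request. It assumes the legacy Foursquare v2 endpoint layout and parameter names (`client_id`, `client_secret`, `ll`, `v`, `query`, `radius`, `limit`); the function name, the default values, and the placeholder credentials are hypothetical, and the current Foursquare API may differ.

```python
from urllib.parse import urlencode

# Assumption: base path of the legacy Foursquare v2 "venues" endpoints.
BASE_URL = "https://api.foursquare.com/v2/venues"


def build_search_url(client_id, client_secret, lat, lng, query,
                     radius=500, limit=30, version="20180604"):
    """Return a venue-search URL as a string; no HTTP request is sent."""
    params = {
        "client_id": client_id,          # placeholder credentials
        "client_secret": client_secret,
        "ll": f"{lat},{lng}",            # latitude,longitude of the search center
        "v": version,                    # API version date (assumed value)
        "query": query,                  # type of venue to search for
        "radius": radius,                # search radius in meters
        "limit": limit,                  # maximum number of results
    }
    # urlencode percent-escapes the values, e.g. the comma in "ll".
    return f"{BASE_URL}/search?{urlencode(params)}"


url = build_search_url("YOUR_ID", "YOUR_SECRET", 40.7128, -74.0060, "coffee")
print(url)
```

The resulting string could then be passed to an HTTP client such as `requests.get(url)`, with valid credentials substituted for the placeholders.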

  • Quiz: Foursquare API

Week 3 - Neighborhood Segmentation and Clustering

  • Clustering

Lab: Clustering

There are many clustering models. In this lab, we present one of the simplest: k-means. Despite its simplicity, k-means is widely used for clustering in many data science applications, and it is especially useful when you need to quickly discover insights from unlabeled data.
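To make the idea concrete, here is a minimal dependency-free sketch of k-means (Lloyd's algorithm). It is not the lab's code: the function name, the sample points, and the choice of starting centroids are assumptions made for illustration, and a library implementation would normally be used instead.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Plain k-means on points given as tuples of coordinates."""
    centroids = list(centroids)
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for each point.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return labels, centroids


# Made-up sample data: two visually separate groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
print(labels)
print(centroids)
```

Starting centroids are passed in explicitly so the run is deterministic; real implementations usually pick them randomly or with a seeding heuristic.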

Lab: Segmenting and Clustering Neighborhoods in New York City

In this lab, you will learn how to convert addresses into their equivalent latitude and longitude values. Also, you will use the Foursquare API to explore neighborhoods in New York City. You will use the explore function to get the most common venue categories in each neighborhood, and then use this feature to group the neighborhoods into clusters. You will use the k-means clustering algorithm to complete this task. Finally, you will use the Folium library to visualize the neighborhoods in New York City and their emerging clusters.
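The "most common venue categories in each neighborhood" step can be sketched as follows. This is an illustrative assumption about how such a feature might be computed, not the lab's actual code; the venue rows are made-up sample data and `top_categories` is a hypothetical helper.

```python
from collections import Counter, defaultdict

# Made-up sample rows of (neighborhood, venue category); in the lab these
# would come from Foursquare API responses.
venues = [
    ("Chelsea", "Coffee Shop"),
    ("Chelsea", "Coffee Shop"),
    ("Chelsea", "Art Gallery"),
    ("Astoria", "Greek Restaurant"),
    ("Astoria", "Bar"),
    ("Astoria", "Greek Restaurant"),
    ("Astoria", "Greek Restaurant"),
]


def top_categories(venue_rows, n=2):
    """Return the n most frequent venue categories per neighborhood."""
    counts = defaultdict(Counter)
    for neighborhood, category in venue_rows:
        counts[neighborhood][category] += 1
    return {
        hood: [category for category, _ in counter.most_common(n)]
        for hood, counter in counts.items()
    }


top = top_categories(venues)
print(top)
```

Per-neighborhood category frequencies like these are the kind of feature that can then be fed to a clustering algorithm such as k-means.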

Peer-review Assignment: Segmenting and Clustering Neighborhoods in Toronto

In this assignment, you will be required to explore, segment, and cluster the neighborhoods in the city of Toronto. However, unlike New York, the neighborhood data is not readily available on the internet. What is interesting about the field of data science is that each project can be challenging in its unique way, so you need to learn to be agile and refine the skill to learn new libraries and tools quickly depending on the project.

For the Toronto neighborhood data, a Wikipedia page exists that has all the information we need to explore and cluster the neighborhoods in Toronto. You will be required to scrape the Wikipedia page and wrangle the data, clean it, and then read it into a pandas dataframe so that it is in a structured format like the New York dataset.
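The scrape-and-wrangle step might look roughly like the sketch below, which parses an HTML table using only the standard library. The sample HTML, its column names, and the "Not assigned" cleaning rule are all assumptions made for illustration and are not taken from the actual Wikipedia page; `TableParser` is a hypothetical helper.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Hypothetical snippet standing in for the downloaded page.
SAMPLE_HTML = """
<table>
  <tr><th>Postal Code</th><th>Borough</th><th>Neighborhood</th></tr>
  <tr><td>X1A</td><td>Example Borough</td><td>Example Heights</td></tr>
  <tr><td>X2B</td><td>Not assigned</td><td>Not assigned</td></tr>
</table>
"""

parser = TableParser()
parser.feed(SAMPLE_HTML)
header, *records = parser.rows

# Assumed cleaning rule: drop rows whose borough is a placeholder value.
clean = [dict(zip(header, r)) for r in records if r[1] != "Not assigned"]
print(clean)
```

A list of dictionaries like `clean` can be passed directly to `pandas.DataFrame(clean)`; `pandas.read_html` is another option when its parser dependencies are installed.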

Once the data is in a structured format, you can replicate the analysis that we did on the New York City dataset to explore and cluster the neighborhoods in the city of Toronto.

Your submission will be a link to your Jupyter Notebook on your GitHub repository.

Week 4 - Capstone Project

  • Define a problem for your capstone project
  • Discuss the data that you will use to solve the problem

Peer-graded Assignment: Capstone Project - The Battle of Neighborhoods (Week 1)

Now that you have been equipped with the skills and the tools to use location data to explore a geographical location, over the course of two weeks, you will have the opportunity to be as creative as you want and come up with an idea to leverage the Foursquare location data to explore or compare neighborhoods or cities of your choice or to come up with a problem that you can use the Foursquare location data to solve. If you cannot think of an idea or a problem, here are some ideas to get you started:

In Module 3, we explored New York City and the city of Toronto and segmented and clustered their neighborhoods. Both cities are very diverse and are the financial capitals of their respective countries. One interesting idea would be to compare the neighborhoods of the two cities and determine how similar or dissimilar they are. Is New York City more like Toronto or Paris or some other multicultural city? I will leave it to you to refine this idea.

In a city of your choice, if someone is looking to open a restaurant, where would you recommend that they open it? Similarly, if a contractor is trying to start their own business, where would you recommend that they set up their office? These are just a couple of the many ideas and problems that can be addressed using location data in addition to other datasets. No matter what you decide to do, make sure to justify why the problem you want to solve is important and why a client or a group of people would be interested in your project.

Week 5 - Capstone Project (Cont'd)

  • Carry out the remaining work to complete the capstone project
  • Submit a link to your project notebook and a complete project report

Peer-graded Assignment: Capstone Project - The Battle of Neighborhoods (Week 2)

This week, you will continue working on your capstone project. Please remember that by the end of this week, you will need to submit the following:

1. A full report consisting of all of the following components (15 marks):

  • Introduction, where you discuss the business problem and who would be interested in this project.
  • Data, where you describe the data that will be used to solve the problem and the source of the data.
  • Methodology, the main component of the report, where you discuss and describe any exploratory data analysis that you did, any inferential statistical testing that you performed, and which machine learning techniques were used and why.
  • Results, where you discuss the results.
  • Discussion, where you discuss any observations you noted and any recommendations you can make based on the results.
  • Conclusion, where you conclude the report.

2. A link to your notebook, pushed to your GitHub repository, showing your code. (15 marks)

3. Your choice of a presentation or blogpost. (10 marks)


___

LINK TO COURSE:

https://www.coursera.org/professional-certificates/ibm-data-science

LINK TO BLOG POST:

https://lopez-christian.github.io/2020-11-01-IBM-applied-data-science-capstone/

LINK TO PROFESSIONAL CERTIFICATE:

https://www.coursera.org/account/accomplishments/specialization/certificate/WUM7FEYCNTNX
