This repository has been archived by the owner on Apr 28, 2023. It is now read-only.
{
course-completeness: 100
course-attempts: 2
default-quiz-attempts: 2
default-random-choice-order: true
default-quiz-show-answers: none
}
# What is data?
The basics of data were covered in the Introductory course, where we defined data as "any information that you can store on a computer." Examples we discussed included text messages, Facebook posts, websites you visit, things you buy with a credit card, pictures of your car taken by speed cameras, and information you fill out in profiles for your work, school, or community organizations. We said that if you can take a picture of it, write about it, make a video of it, or record it on audio, then it is probably data. All of this information can be collected and saved on a computer. This definition still holds, and it is the one we're going to use. In this lesson, we will discuss what data are in a bit more detail.
### Types of data
Generally, there are just two types of data. If data are numerical, consisting of counts or measurements, they're referred to as quantitative data. If they are **not** numerical, they're qualitative. Qualitative data would include words or text, but they could also include photographs, videos, or audio recordings. Every photo you have ever taken with your phone is data, and more specifically, it's qualitative data.
Examples of each type:
* Qualitative data - eye color, gender, TRUE/FALSE responses, hometown, photos, text files, audio files, videos, etc.
* Quantitative data - height, weight, heart rate, daily step count, body temperature, test scores, etc.
While people often distinguish between qualitative data and quantitative data, the truth is that qualitative data can become quantitative data. For example, if you collect eye color information from individuals in each country around the world, you could then summarize *how many* people in each country have brown eyes. Suddenly, you have quantitative data: the number of individuals in each country with brown eyes. We will be working with lots of different data sets throughout this curriculum. Knowing the distinction between qualitative and quantitative data is not crucial. What is crucial for you as a data scientist is knowing how to work with many different kinds of data sets. Throughout these courses, you'll do projects that give you practice with a variety of data sets, so that you can work with data no matter what type it is.
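The eye color example can be sketched in a few lines of code. This is only an illustration written in Python: the survey records, the country names, and the helper `count_eye_color` are all made up for this sketch and are not part of the course materials. It takes qualitative records (a country and an eye color for each person) and turns them into quantitative data (a count per country).

```python
from collections import Counter

# Hypothetical survey records: (country, eye color) pairs, invented for illustration.
responses = [
    ("Brazil", "brown"),
    ("Brazil", "brown"),
    ("Brazil", "green"),
    ("Norway", "blue"),
    ("Norway", "brown"),
    ("Norway", "blue"),
]


def count_eye_color(records, color):
    """Count how many records in each country have the given eye color."""
    return Counter(
        country for country, eye_color in records if eye_color == color
    )


# Qualitative input (eye colors) becomes quantitative output (counts).
brown_by_country = count_eye_color(responses, "brown")
print(dict(brown_by_country))  # prints {'Brazil': 2, 'Norway': 1}
```

Each eye color on its own is a qualitative value, but the per-country counts are numbers you can compare, add, or plot.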
### Why is data important?
You generate both qualitative and quantitative data all the time. The taps on a touchscreen at the bank are data. The GPS in your phone's maps app generates data about you. And every credit card purchase you have ever made has generated data. Taken together, data can tell you a lot about a person, a company, or an entire population.
Data are being used to make business decisions, to decide who to advertise a product to, and to decide how much someone should pay for insurance. As a data scientist, you will be using data to answer interesting questions. The role of the data scientist is to collect and clean the data, study the data, create models that help understand and answer the question, and share the findings with other people. We will work through each of these steps throughout this curriculum; however, understanding the importance of data and what qualifies as data is an important first step.
### Slides and Video
* [Automated Video](https://youtu.be/zMFI9z47psg)
* [Slides](https://docs.google.com/presentation/d/1btywbP59z-QJKtRw5yjQn81Gsck8Tc6Q-q3djeCfnYI/edit?usp=sharing)