-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy path11-example_cancer.Rmd
108 lines (64 loc) · 2.88 KB
/
11-example_cancer.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
# Intro: Cancer Experiemnt {#cancerprep}
```{r, message=FALSE, error=FALSE}
library(tidyverse) # super helpful everything!
library(haven) # inporting SPSS, SAS, & Stata data files
library(psych) # lots of nice tidbits
```
## Source of Data
Mid-Michigan Medical Center, Midland, Michigan, 1999: A study of oral condition of cancer patients.
## Description of the Study
The data set contains part of the data for a study of **oral condition** of cancer patients conducted at the Mid-Michigan Medical Center. The oral conditions of the patients were measured and recorded at the *initial* stage, at the end of the *second week*, at the end of the *fourth week*, and at the end of the *sixth week*. The variables age, initial weight and initial cancer stage of the patients were recorded. Patients were divided into two groups **at random**: One group received a placebo and the other group received aloe juice treatment.
> Sample size n = 25 patients with neck cancer.
>
> The treatment is Aloe Juice.
## Variables
* `ID` patient identification number
* `trt` treatment group
+ `0` *placebo*
+ `1` *aloe juice*
* `age` patient's age, *in years*
* `weightin` patient's weight *(pounds)* at the initial stage
* `stage` initial cancer stage
+ coded `1` through `4`
* `totalcin` oral condition at the *initial stage*
* `totalcw2` oral condition at the end of *week 2*
* `totalcw4` oral condition at the end of *week 4*
* `totalcw6` oral condition at the end of *week 6*
## Import Data
The `Cancer` dataset is saved in SPSS format, which is evident from the `.sav` ending on the file name.
The `haven` package is downloaded as part of the `tidyverse` set of packages, but is not automatically loaded. It must have its own `library()` function call *(see above)*. The `haven::read_spss()` function works very simarly to the `readxl::read_excel()` function we used last chapter [@R-haven].
* Make sure the **dataset** is saved in the same *folder* as this file
* Make sure the that *folder* is the **working directory**
```{r}
cancer_raw <- haven::read_spss("https://github.com/CEHS-research/PSY-6600_students/raw/master/Data/Cancer.sav")
```
```{r}
tibble::glimpse(cancer_raw)
```
## Wrangle Data
```{r}
cancer_clean <- cancer_raw %>%
dplyr::rename_all(tolower) %>%
dplyr::mutate(id = factor(id)) %>%
dplyr::mutate(trt = factor(trt,
labels = c("Placebo",
"Aloe Juice"))) %>%
dplyr::mutate(stage = factor(stage))
```
## Overview
### Dimentions (size) of the dataset
```{r}
dim(cancer_clean)
```
### Variable Names
```{r}
names(cancer_clean)
```
### Quick Glimpse
```{r}
tibble::glimpse(cancer_clean)
```
### Top and Bottom Rows
```{r}
psych::headTail(cancer_clean)
```