-
Notifications
You must be signed in to change notification settings - Fork 5
/
Copy path0005_analyze_multi_subject_trial_data.Rmd
158 lines (115 loc) · 7.23 KB
/
0005_analyze_multi_subject_trial_data.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
---
title: "Analyze multi-subject/trial RT data"
description: Simple tutorial and workflow for analyzing reaction time (RT) and accuracy data from multiple experimental subjects
output:
radix::radix_article:
toc: true
toc_depth: 3
editor_options:
chunk_output_type: console
---
```{r setup, include=FALSE}
# this chunk is for knitting Rmarkdown; you don't need this chunk if you're not knitting
knitr::opts_chunk$set(echo = TRUE, cache = FALSE, comment = NA, message = FALSE, warning = FALSE, R.options = list(width = 80), tidy = TRUE, tidy.opts = list(width.cutoff = 80))
```
This tutorial shows you how to analyze simple reaction time and acccuracy data that resemble data collected from standard psychology experiments (note that the three subjects' data were actually simulated with the `rwiener` function from the [RWiener](https://cran.r-project.org/web/packages/RWiener/index.html) package). This package also assumes you know how to set your working directory; if not, check out my intro tutorials [here](https://hausetutorials.netlify.com/0001_rbasics.html#workingcurrent-directory-where-are-you-and-where-should-you-be-getwd).
I use the [data.table](https://github.com/Rdatatable/data.table/wiki) package extensively in my analyses. `tidyverse` and `dplyr` are great, but I prefer the speed and concise syntax of `data.table`. I won't explain how `data.table` works in this tutorial, but you'll see in the code below how flexible and concise it is. If you want to learn more, see [my tutorials](https://hausetutorials.netlify.com/0002_tidyverse_datatable.html) and extra [resources](https://hausetutorials.netlify.com/posts/2019-04-11-datatable-resources/).
## Consider being a patron and supporting my work?
[Donate and become a patron](https://donorbox.org/support-my-teaching): If you find value in what I do and have learned something from my site, please consider becoming a patron. It takes me many hours to research, learn, and put together tutorials. Your support really matters.
## Getting the data and code
* Download Rmarkdown script for this tutorial [here](https://raw.githubusercontent.com/hauselin/rtutorialsite/master/0005_analyze_multi_subject_trial_data.Rmd).
* Download data for all my tutorials [here](https://www.dropbox.com/sh/2ix85ij7chfuqdg/AABpkIMymvjOE5Dh9iV7SULga?dl=1).
* Inside the data folder, you should see a `rt_acc_data_raw` folder that contains three subjects' data, which we will analyze in this tutorial.
* Leave the folders and files in the unzipped folder as they are because we'll use R to read the csv files into our R environment.
## Dataset information
* 3 subjects: 30, 35, 39
* Completed a task where reaction time (rt) and accuracy (acc) were collected
* 2 blocks in the experiment (1, 2)
* 40 trials per block; 4 blocks in total; 80 trials in total
* 2 experimental manipulations: reward available on trial (high vs low); instructions for block (focus on responding carefully [caution] vs focusing on responding quickly (speed))
### Variables
* subject: subject id
* completed: whether subject completed study (1, 0)
* rt: reaction time
* acc: accuracy
* reward: reward available on trial (high or low)
* instructions: block instructions (caution or speed)
* block: block number (1, 2)
* trialinblock: trial in block (1 to 4 0)
* overalltrial: overall trial number (1 to 80)
## Organizing folders/directory
This is how my directories/folders/files look like. You should make sure yours look similar before you continue.
— 0005_analyze_multi_subject_trial_data.Rmd
——— data
————— rt_acc_data_raw
——————— subject_30.csv
——————— subject_35.csv
——————— subject_39.csv
## Load libraries
```{r libraries}
library(tidyverse); library(data.table); library(lme4); library(lmerTest); library(ggbeeswarm); library(hausekeep)
```
<aside>
[hausekeep](https://hauselin.github.io/hausekeep/) is my package. It contains a few helper functions. You may or may not want to use them. If you want to, install the package by running `devtools::install_github("hauselin/hausekeep")` (you might need to install `devtools` package first).
</aside>
## Read all subjects' data into R
```{r read data}
(files <- list.files(path = "./data/rt_acc_data_raw", pattern = "subject_", full.names = TRUE)) # find matching file names
dt1 <- bind_rows(lapply(files, fread)) %>% as.data.table() # read data, combine items in list, convert to data.table
```
Note and explanation: `lapply` loops through each item in `files` and applies the `fread` function to each item. `bind_rows` combines all the dataframes stored as separate items in the list, and `as.data.table()` converts the merged dataframe into `data.table` class.
## Summarize experimental design
```{r summarize design}
dt1[, .N, by = subject] # rows (trials) per subject
dt1[, .N, keyby = .(subject, block)] # blocks per subject and rows/trials per block
dt1[, .N, keyby = .(subject, instructions, reward)] # row (trials) per subject for instructions/reward conditions
```
## Remove outlier trials
```{r remove outliers}
dt1[rt < 0.2, `:=` (rt = NA, acc = NA)] # if rt < 0.2, convert rt and acc to NA (too fast response on that trial)
dt1[rt > 3.0, `:=` (rt = NA, acc = NA)] # if rt > 3.0, convert rt and acc to NA (too slow response on that trial)
summary(dt1$rt) # check range and number of NA
summary(dt1$acc) # check range and number of NA
```
## Compute mean RT and ACC for each subject, for each condition
```{r mean rt/acc}
dt1_avg <- dt1[, .(rt = mean(rt, na.rm = T), acc = mean(acc, na.rm = T)), keyby = .(subject, instructions, reward)]
dt1_avg
```
## Compute grand average RT and ACC (and within-subject error bars)
```{r grand average rt/acc}
dt1_grdavg <- seWithin(data = dt1_avg, measurevar = c("rt", "acc"), withinvars = c("instructions", "reward"), idvar = "subject")
```
<aside>
`seWithin` computes within-subject error bars and is a function from my `hausekeep` package.
</aside>
## Plot reaction time
```{r plot rt}
plot_rt <- ggplot(dt1_grdavg$rt, aes(instructions, rt, col = reward)) +
geom_quasirandom(data = dt1_avg, alpha = 0.8, dodge = 1, size = 1.5) +
geom_point(position = position_dodge(1), shape = 95, size = 6) +
geom_errorbar(aes(ymin = rt - ci, ymax = rt + ci), position = position_dodge(1), width = 0, size = 1.1) +
labs(y = "reaction time (95% CI)")
plot_rt
```
## Plot accuracy
```{r plot acc}
plot_acc <- ggplot(dt1_grdavg$acc, aes(instructions, acc, col = reward)) +
geom_quasirandom(data = dt1_avg, alpha = 0.8, dodge = 1, size = 1.5) +
geom_point(position = position_dodge(1), shape = 95, size = 6) +
geom_errorbar(aes(ymin = acc - ci, ymax = acc + ci), position = position_dodge(1), width = 0, size = 1.1) +
scale_y_continuous(limits = c(0, 1.2), breaks = seq(0, 1.0, 0.25)) +
labs(y = "accuracy (95% CI)")
plot_acc
```
## Fit linear mixed models to single-trial RT and ACC
```{r fit models}
model_rt <- lmer(rt ~ instructions + reward + (1 | subject), data = dt1)
summary(model_rt)
summaryh(model_rt) # summary function from my hausekeep package
model_acc <- glmer(acc ~ instructions + reward + (1 | subject), data = dt1, family = binomial)
summary(model_acc)
summaryh(model_acc) # summary function from my hausekeep package
```
## Support my work
[Support my work and become a patron here](https://donorbox.org/support-my-teaching)!