-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathREADME.Rmd
205 lines (153 loc) · 6.64 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.retina = 2,
fig.width = 6,
fig.height = 3,
fig.path = "man/figures/README-",
out.width = "80%"
)
library(tidyverse)
library(Shewhart) %>% suppressPackageStartupMessages()
```
# Shewhart <a href="https://github.com/castlaboratory/Shewhart"><img src="man/figures/logo.png" align="right" height="114" alt="Shewhart website" /></a>
<!-- badges: start -->
![](https://img.shields.io/badge/devel%20version-0.1.1-blue.svg)
<!-- badges: end -->
The goal of Shewhart is to provide a streamlined workflow for detecting and modeling process phase changes using Shewhart control charts. It features tools for fitting common transformations (log, log-log, Gompertz), identifying phases via the “7 consecutive points” rule, and computing control limits on both transformed and original scales. The package leverages the tidyverse for intuitive data manipulation and is designed to be easily integrated into broader data analysis pipelines.
## Installation
You can install the development version of Shewhart from [GitHub](https://github.com/castlaboratory/Shewhart) with:
``` r
# install.packages("pak")
pak::pak("castlaboratory/Shewhart")
```
## Usage Overview
This package provides functions for creating and analyzing **Shewhart control charts** (using either **ggplot2** or **plotly**) and detecting potential phase changes via the "7 consecutive points" rule. Key functions include:
- `shewhart()`: Creates the control chart plot (either ggplot2 or plotly).
- `shewhart_7points()`: Detects phase changes by checking for 7 consecutive points above or below the central line.
- `shewhart_model()`: Builds a data frame (or tibble) with annotated phase information and control limits.
- `shewhart_fit()`: Internally fits the specified model (e.g., "log", "loglog", or "gompertz").
## Example Data
For demonstration, we'll use a dataset (`cvd_recife`) included in the package:
```{r, eval = TRUE}
cvd_recife <- readr::read_rds(
system.file("extdata",
package = "Shewhart",
file = "recife_2020_covid19.rds")
) %>%
slice_head(n = 50)
cvd_recife %>%
glimpse()
```
In this dataset, `data` is a date column, and `obitosNovos` are new daily deaths (integer counts).
### Plotting with ggplot2
You can create a **ggplot2**-based Shewhart chart simply by setting `type = "ggplot"` (the default). The `shewhart()` function will detect whether the dataset already has phase information—if not, it internally calls `shewhart_7points()` and `shewhart_model()`.
```{r, eval = TRUE}
shewhart(
data = cvd_recife,
index_col = data,
values_col = obitosNovos,
locale = "en_US"
)
```
You can specify a different model (e.g., "loglog"):
```{r, eval = TRUE}
shewhart(
data = cvd_recife,
index_col = data,
values_col = obitosNovos,
model = "loglog",
locale = "en_US"
)
```
In the next example, we first add a new factor column `wd` to the cvd_recife dataset by extracting the day of the week from the data column. By specifying `dummy_col = wd` in the `shewhart()` call, we allow the underlying Shewhart model (here using `model = "log"`) to account for potential day-of-week effects. In other words, each weekday becomes a “dummy” factor that might influence the estimated control limits, helping reveal whether there are systematic shifts or patterns associated with particular days. We also set locale = "en_US" so that date labels and text are formatted in English.
```{r, eval = TRUE}
shewhart(data = cvd_recife %>% mutate(wd = factor(wday(data))),
index_col = data,
values_col = obitosNovos,
model = "log",
dummy_col = wd,
locale = "en_US")
```
```{r, eval = TRUE}
shewhart(data = cvd_recife %>% mutate(wd = factor(wday(data))),
index_col = data,
values_col = obitosNovos,
model = "loglog",
dummy_col = wd,
locale = "en_US")
```
### Plotting with plotly
To generate an **interactive** chart with **plotly**, set `type = "plotly"`:
```{r, eval = FALSE}
shewhart(
data = cvd_recife,
index_col = data,
values_col = obitosNovos,
locale = "en_US",
model = "log",
type = "plotly"
)
```
Hover over points or ribbons to see tooltips, zoom in/out, and more.
## Autodetecting Phases
You can explicitly detect phase changes using the **7-point rule** via `shewhart_7points()`:
```{r, eval = TRUE}
phase_dates <- shewhart_7points(
data = cvd_recife,
index_col = data,
values_col = obitosNovos
)
print(phase_dates)
```
## Building a Model for Custom Usage
You can also build the phase model data frame yourself by calling `shewhart_model()`, passing the detected phase changes:
```{r, eval = TRUE}
shwt_model <- shewhart_model(
data = cvd_recife,
index_col = data,
values_col = obitosNovos,
phase_changes = phase_dates
)
shwt_model %>% head()
```
This tibble now has columns like `phase`, `CL`, `UL_EXP`, `LL_EXP`, etc., which can be used directly in your custom plots or analyses.
## Reusing an Existing Model or Phase Changes
If you already have the `phase_changes` or the annotated data frame from `shewhart_model`, you can pass them to `shewhart()` to avoid recalculating:
```{r, eval = TRUE}
# Using phase_changes explicitly:
shewhart(
data = cvd_recife,
index_col = data,
values_col = obitosNovos,
model = "loglog",
locale = "en_US",
phase_changes = phase_dates
)
# Or using the 'shwt_model' directly:
shewhart(
data = shwt_model,
index_col = data,
values_col = obitosNovos,
model = "loglog",
locale = "en_US"
)
```
Both approaches yield the same final plot but skip re-running the 7-point detection or the modeling if the data is already annotated.
## Dependencies
This package uses:
- **lubridate (>= 1.8.0)** for date handling and wday extraction
- **tidyverse (>= 1.3.0)** for data manipulation (dplyr, tidyr, etc.)
- **tibbletime (>= 0.1.6)** or **slider** for rolling sum calculations
- **tidymodels (>= 1.0.0)** for modeling convenience
- **pals (>= 1.7)** and **scales (>= 1.2.1)** for color palettes and scaling
- **plotly (>= 4.1)** for interactive plots
Ensure these are installed and loaded (as needed) before using `Shewhart`.
## More Information
Visit <https://castlab.org> for updates and other packages/research from the CASTLab team. If you have suggestions or encounter issues, please open an Issue on [GitHub](https://github.com/castlaboratory/Shewhart/issues) or contact us at <leite@castlab.org>.
Happy charting!