-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy path12_nibrs_administrative.Rmd
489 lines (394 loc) · 38.5 KB
/
12_nibrs_administrative.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
# Administrative Segment
```{r, echo=FALSE}
knitr::opts_chunk$set(
echo = FALSE,
warning = FALSE,
error = FALSE,
message = FALSE
)
```
```{r, results='hide'}
administrative <- readRDS("data/nibrs_administrative_segment_2023.rds")
window_exceptional <- readRDS("data/nibrs_window_exceptional_clearance_segment_2021.rds")
nibrs_administrative_summary_stats <- readRDS("data/nibrs_summary_stats/nibrs_administrative_summary_stats.rds")
administrative$total_offender_segments_temp <- administrative$total_offender_segments
administrative$total_arrestee_segments_temp <- administrative$total_arrestee_segments
gc()
```
The Administrative Segment provides information about the incident itself, such as how many victims or offenders there were. In practice this means that it tells us how many other segments - Offense, Offender, Victim, and Arrestee Segments - there are for this particular incident. It also has several important variables at the incident-level such as what hour of the day the incident occurred and whether the "incident date" variable refers to the date the incident occurred or the date it was reported to the police. Finally, it tells us whether the case was cleared exceptionally and, if so, what type of exceptional clearance it was. An exceptional clearance is one where the police can declare the case closed but without making an arrest. This can tell us, for example, how many crimes were cleared because the offender died or the victim refused to cooperate. In comparison, the Offenses Known and Clearances by Arrest data that is part of the SRS and is detailed in Chapter \@ref(offensesKnown), tells us how many offenses were cleared by either arrest or exceptional clearance, but does not differentiate between the two. NIBRS, therefore, provides a deeper understanding of case outcomes.
## The incident and report date
An important variable, especially for policy analyses, is when the crime happened. NIBRS tells you both the date and the hour of the day for when the crime occurred. We will start with the date. We can convert the date a few different ways, such as daily, weekly, monthly, quarterly. We could use this precise date to do regression discontinuity studies where we look at days just before and just after some policy change or natural experiment. In this chapter we will look simply at the percent of crimes each month and each day of the month (overall, not within each month). And we will look at all incidents; if you want to see the distribution for certain offenses or victim/offender groups you will need to merge this segment with one of the other segments.
Figure \@ref(fig:administrativeIncidentMonth) shows the percent of incidents in the 2022 for each month. Past research has found that crimes are lowest when it is cold and highest when it is hot^[Summer also comes with many teens and young adults out of school so have more free time to offend or be victimized, so the weather is only part of the cause.]. Consistent with previous research, we find that crime rates are lowest in February, steadily increasing through the warmer months before peaking in July and August, then decreasing as temperatures cool. These seasonal patterns are important in understanding how environmental factors, such as weather, influence criminal activity, and they can help law enforcement agencies plan how many officers they want on patrol since they can determine which times of the year have the highest expected crime.
```{r administrativeIncidentMonth, fig.cap = "The percent of crime incidents in 2022 NIBRS by the month of incident."}
administrative$month <- lubridate::month(administrative$incident_date, label = TRUE,
abbr = FALSE)
administrative$month <- factor(administrative$month, levels = rev(levels(administrative$month)))
administrative %>%
filter(!report_date_indicator %in% "report date") %>%
ggplot2::ggplot(aes(x = month)) +
theme_crim() +
ggplot2::coord_flip() +
ggplot2::xlab("") +
ggplot2::geom_bar(ggplot2::aes_string(y = "(..count..)/sum(..count..)")) +
crimeutils::theme_crim() +
ggplot2::scale_y_continuous(labels = scales::percent,
expand = c(0, 0)) +
ylab("% of Crimes")
```
We can also look at the days of the month to see if there is any variation there. Figure \@ref(fig:administrativeMonthDayIncident) shows the percent of incidents on each day of the month. There's not much variation other than a few days. The 29th and 30th day of the month have fewer incidents than average, and the 31st day has by far the fewest incidents These findings are reasonable since not all months have more than 28 days so by definition there are fewer 31st (and 29th, and 30th) days of the month for crimes to occur on.
The most common day of the month is the 1st which accounts for 3.95% of all incidents. In this data the agencies must report a date, even if they do not know the exact date; there is no option to put "unknown date". When agencies are unsure of the exact date of a crime, they appear to default to entering the 1st of the month as a placeholder. This practice introduces a potential source of error, and researchers should be cautious when analyzing trends that rely on specific dates, as the 1st of the month may disproportionately represent incidents with unknown dates.
```{r administrativeMonthDayIncident, fig.cap = "The percent of incidents that occur using the day of the incident, even if the incidents was not reported that day, each day of the month for all agencies reporting to NIBRS in 2022.", fig.height = 15}
administrative$mday <- lubridate::mday(administrative$incident_date)
administrative$mday <- factor(administrative$mday,
levels = rev(1:31))
administrative %>%
filter(!report_date_indicator %in% "report date") %>%
ggplot2::ggplot(aes(x = mday)) +
theme_crim() +
ggplot2::coord_flip() +
ggplot2::xlab("Day of Month") +
ggplot2::geom_bar(ggplot2::aes_string(y = "(..count..)/sum(..count..)")) +
crimeutils::theme_crim() +
ggplot2::scale_y_continuous(labels = scales::percent,
expand = c(0, 0)) +
ylab("% of Crimes")
```
The above graph showed the days of the month where the incident was said to occur. There is also a variable that says if the date included was the incident date or the date the crime was reported to the police. Figure \@ref(fig:administrativeMonthDayReport) replicates Figure \@ref(fig:administrativeMonthDayIncident) but now shows only report dates rather than incident date. Here too we see the same pattern of the 1st of the month having a disproportionate share of data, again suggesting that it is a placeholder for "unknown" dates.
```{r administrativeMonthDayReport, fig.cap = "The percent of incidents that are reported (the day of the report, even if not the day of the incident) each day of the month for all agencies reporting to NIBRS in 2022.", fig.height = 15}
administrative$mday <- mday(administrative$incident_date)
administrative$mday <- factor(administrative$mday,
levels = rev(1:31))
ggplot2::ggplot(administrative[administrative$report_date_indicator %in% "report date", ], aes(x = mday)) +
theme_crim() +
ggplot2::coord_flip() +
ggplot2::xlab("Day of Month") +
ggplot2::geom_bar(ggplot2::aes_string(y = "(..count..)/sum(..count..)")) +
crimeutils::theme_crim() +
ggplot2::scale_y_continuous(labels = scales::percent,
expand = c(0, 0)) +
ylab("% of Crimes")
```
## Hour of incident
Understanding the exact time of day when crimes occur is crucial for developing effective anti-crime strategies. For example, if crime spikes consistently at the end of the local high school day, this may indicate that students are involved in these incidents, either as victims or offenders. Law enforcement agencies can use this information to adjust patrol schedules and allocate resources more effectively to areas and times with higher crime rates. Luckily, NIBRS data does have the time of each incident, though it is only at the hour level.
Figure \@ref(fig:administrativeHours) shows the distribution of incidents that occurred in the 2022 for each hour of the day. There are two key trends in this figure. First, past research has found that crime tends to increase during the night (this is especially true during weekends), drop to a daily low in the couple of hours before sunrise, and then slowly increase as the day progresses.^[In all of the nighttime police ride-alongs I have been on the police tend to stop patrolling in early morning (e.g. 3-4am) and go back to the station to do paperwork. I think this likely partially explains the findings that crime is lowest around 4-5am.] What we find here is a little different. Crime peaks at night at 5-5:59pm which is far earlier than other estimates. Since this is all crimes it could be biased by large increases of certain crimes at this time, such as people coming home from work and finding their house burgled. As crimes differ in their timing (e.g. burglary happens often during the day, fights are more common at night), you will need to merge this segment with the Offense Segment to be able to look at certain types of crimes alone.
The substantial spike at midnight is unlikely to reflect actual crime patterns, as the number of incidents during this hour is more than triple that of neighboring hours. The noon hour is about 50% larger than in the neighboring hours, so is a sizable increase though continues the trend of increasing crime during the day and is a far smaller increase than at midnight. This suggests that, similar to the "1st of the month" issue, officers may be using midnight and (less so) noon as a placeholder when the exact time of the crime is unknown. Researchers should exclude the midnight and noon hours from time-sensitive analyses to avoid skewed results.
```{r administrativeHours, fig.cap = "The percent of crimes that are reported each hour for all agencies reporting to NIBRS in 2022.", fig.height = 14.3}
administrative$hour <- administrative$incident_date_hour
administrative$hour <- gsub("on or between ", "",
administrative$hour)
administrative$hour <- gsub(" and ", "-",
administrative$hour)
administrative$hour <- gsub("midnight", "Midnight",
administrative$hour)
administrative$hour <- factor(administrative$hour,
levels = rev(c(
"Midnight-00:59",
"01:00-01:59",
"02:00-02:59",
"03:00-03:59",
"04:00-04:59",
"05:00-05:59",
"06:00-06:59",
"07:00-07:59",
"08:00-08:59",
"09:00-09:59",
"10:00-10:59",
"11:00-11:59",
"12:00-12:59",
"13:00-13:59",
"14:00-14:59",
"15:00-15:59",
"16:00-16:59",
"17:00-17:59",
"18:00-18:59",
"19:00-19:59",
"20:00-20:59",
"21:00-21:59",
"22:00-22:59",
"23:00-23:59")))
ggplot2::ggplot(administrative, aes(x = hour)) +
theme_crim() +
ggplot2::coord_flip() +
ggplot2::xlab("Hour of Day") +
ggplot2::geom_bar(ggplot2::aes_string(y = "(..count..)/sum(..count..)")) +
crimeutils::theme_crim() +
ggplot2::scale_y_continuous(labels = scales::percent,
expand = c(0, 0)) +
ylab("% of Crimes")
```
To look at these trends over time, Figure \@ref(fig:nibrsAdministrativeHours) shows the percent of incidents each year that are reported at noon, at midnight, and where the hour is unknown. The noon hour has slowly and steadily become more common, moving from about 4% in 1991 to 6% in 2022. The midnight hour has seen more fluctuations, increasing to 9% by 1993 before steadily decreasing until a large and sustained spike to 9% in 2017. The spike was caused by the end of data being reported as having an unknown hour. While the share of incidents with an unknown hour has also fluctuated - from around 2.5% to 5% of incidents depending on the year - that dropped to 0% in 2017, as unknown hours stopped being reported after 2016.
```{r nibrsAdministrativeHours, fig.cap="Annual percent of incidents that occurred at midnight, noon, and at an unknown time, 1991-2023."}
nibrs_administrative_summary_stats %>%
ggplot(aes(x = year)) +
geom_line(aes(y = percent_hour_midnight, color = "Midnight Hour"), linewidth = 1.05) +
geom_line(aes(y = percent_hour_noon, color = "Noon Hour"), linewidth = 1.05) +
geom_line(aes(y = percent_hour_unknown , color = "Unknown Hour"), linewidth = 1.05) +
xlab("Year") +
ylab("% of Incidents") +
theme_crim() +
scale_y_continuous(labels = scales::percent, expand = c(0, 0), limits = c(0, NA)) +
labs(color = "") +
expand_limits(y = 0) +
scale_color_manual(values = c("Midnight Hour" = "#1b9e77",
"Noon Hour" = "#d95f02",
"Unknown Hour" = "#7570b3")) +
scale_x_continuous(breaks = time_series_x_axis_year_breaks)
```
Another way to visualize this is to see what hour is most and least common for every year we have data, as shown in Table \@ref(tab:nibrsAdministrativeCommonHours). Results are strikingly similar for the entire time period we have NIBRS. In every year except for 1991 the most common hour is midnight, and in every year the least common is 5am. When excluding midnight the most common hours are are the end of the work day at 5PM-5:59PM and 6PM-6:59PM, or at noon.
NIBRS data is available since 1991, and the number of agencies reporting has grown each year. This is also a time period which has seen considerable changes in crimes, an increase in the 1990s followed by a sustained decrease since then until a (now seemingly temporary) spike starting in 2020. Yet throughout all these changes the most and least common hours remain very consistent, suggesting that there appear to strong rules of when crime occurs regardless of other changes. Or at least strong rules in what appears in our data, as I do not believe the midnight or noon hour results are real.
```{r}
hour_cleanup <- c("^18:00-18:59$" = "6PM",
"^17:00-17:59$" = "5PM",
"^12:00-12:59$" = "Noon",
"^00:00-00:59$" = "Midnight",
"^05:00-05:59$" = "5AM")
nibrs_administrative_summary_stats %>%
select(year,
most_common_hour,
least_common_hour,
most_common_hour_excluding_midnight,
most_common_hour_excluding_midnight_and_noon) %>%
mutate(most_common_hour = gsub("on or between", "", most_common_hour),
most_common_hour = gsub("midnight", "00:00", most_common_hour),
most_common_hour = gsub(" and ", "-", most_common_hour),
least_common_hour = gsub("on or between", "", least_common_hour),
least_common_hour = gsub(" and ", "-", least_common_hour),
most_common_hour_excluding_midnight = gsub("on or between", "", most_common_hour_excluding_midnight),
most_common_hour_excluding_midnight = gsub(" and ", "-", most_common_hour_excluding_midnight),
most_common_hour_excluding_midnight_and_noon = gsub("on or between", "", most_common_hour_excluding_midnight_and_noon),
most_common_hour_excluding_midnight_and_noon = gsub(" and ", "-", most_common_hour_excluding_midnight_and_noon),
most_common_hour = str_replace_all(trimws(most_common_hour), hour_cleanup),
least_common_hour = str_replace_all(trimws(least_common_hour), hour_cleanup),
most_common_hour_excluding_midnight = str_replace_all(trimws(most_common_hour_excluding_midnight), hour_cleanup),
most_common_hour_excluding_midnight_and_noon = str_replace_all(trimws(most_common_hour_excluding_midnight_and_noon), hour_cleanup)) %>%
rename(Year = year,
`Most Common` = most_common_hour,
`Least Common` = least_common_hour,
`Most Common, Exclude Midnight` = most_common_hour_excluding_midnight,
`Most Common, Exclude Midnight/Noon` = most_common_hour_excluding_midnight_and_noon) %>%
kableExtra::kbl(# format = "html",
digits = 2,
align = c("l", "l", "r", "r"),
#booktabs = TRUE,
longtable = TRUE,
escape = TRUE,
label = "nibrsAdministrativeCommonHours",
caption = "The most and least common incident hours, and the most common hours excluding midnight and noon.") %>%
kable_styling(bootstrap_options = "striped", full_width = FALSE, latex_options = c("hold_position", "repeat_header"))
```
## Exceptional clearance
When we speak of clearances we generally mean that a person was arrested for the crime.^[While a more expansive definition may include a conviction in a court for that crime (including pleading guilty), NIBRS data only extends to the arrest stage so we never know the judicial outcome of the case.] Cases may also be cleared "through exceptional means" which is also called an "exceptional clearance." Exceptional clearances, which occur in about 3% of cases, are important because they indicate that the police have identified the offender and have enough evidence to arrest them, but are unable to do so for specific reasons. Unlike standard clearances, which involve arrests, exceptional clearances may result from factors such as the offender's death, the victim's refusal to cooperate, or the prosecution's decision not to pursue the case. Basically, if they could arrest them they would but for some reason they cannot. NIBRS data tells us if the case is exceptionally cleared as well as the reason for the exceptional clearance.
Figure \@ref(fig:administrativeExceptionalClearances) shows the breakdown of reasons why the case was cleared for these ~3 of cases that are exceptionally cleared. The most common reason, at 45% of exceptional clearances, is that the victim refused to cooperate with the police or prosecution. This can include cases where the victim reported their crime to the police and then later decide to stop assisting. The next most common reason, also at at 45% of exceptionally cleared cases, is that the prosecution decides to not prosecute the case. This excludes cases where the prosecution believes that there is not probable cause to make the arrest. Therefore it largely includes cases that the prosecution either does not believe they could win or where the agency has a policy of non-prosecution - this is likely increasingly common in jurisdiction which has "progressive prosecutors" who de facto legalize certain crimes.
Next we have when the offender is a juvenile and the police chose to avoid arresting them due to their age, which makes up about 7% of these incidents. This generally occurs in minor offenses such as property crimes and the police do give notice to the juvenile offender's parents or guardians about the situation. If the offender is in another agency's jurisdiction (including out of the country) and the agency reporting data is unable to make an arrest, including when the agency who has the offender in their jurisdiction refuses to extradite the offender, the case can be exceptionally cleared. This occurs in 2% of exceptional clearances. In these cases we do not know any information about which jurisdiction the offender is in at the time of the exceptional clearance. Finally, if the offender dies (by any means) before the arrest they cannot be arrested so the case is exceptionally cleared. This makes up about 1% of exceptional clearances.
The values shown in Figure \@ref(fig:administrativeExceptionalClearances) are for all incidents so can be quite different when examining subsets of the data such as by offender demographics or incident type. Doing this would require merging the Administrative Segment with another segment such as Offense or Victim.
```{r administrativeExceptionalClearances, fig.cap = "The distribution of exceptional clearances for all exceptional clearances reported to NIBRS, 1991-2023."}
nibrs_administrative_summary_stats %>%
ggplot(aes(x = year)) +
geom_line(aes(y = percent_exceptional_offender_death, color = "Offender\n Death"), linewidth = 1.05) +
geom_line(aes(y = percent_exceptional_extradition_denied , color = "Extradition\n Denied"), linewidth = 1.05) +
geom_line(aes(y = percent_exceptional_juvenile_no_custody , color = "Juvenile/\nNo Custody"), linewidth = 1.05) +
geom_line(aes(y = percent_exceptional_prosecution_declined , color = "Prosecution\n Declined"), linewidth = 1.05) +
geom_line(aes(y = percent_exceptional_victim_refused_cooperate , color = "Victim\n Refused to\n Cooperate"), linewidth = 1.05) +
xlab("Year") +
ylab("% of Incidents") +
theme_crim() +
scale_y_continuous(labels = scales::percent, expand = c(0, 0), limits = c(0, NA)) +
labs(color = "") +
expand_limits(y = 0) +
scale_color_manual(values = c("Offender\n Death" = "#1b9e77",
"Extradition\n Denied" = "#d95f02",
"Juvenile/\nNo Custody" = "#7570b3",
"Prosecution\n Declined" = "#1f78b4",
"Victim\n Refused to\n Cooperate" = "black")) +
scale_x_continuous(breaks = time_series_x_axis_year_breaks)
```
In Figure \@ref(fig:nibrsAdministrativeClearance) we can see trends in the percent of incidents that involve an arrest or an exceptional clearance. Ignoring the spike in the arrest rate in the first few years of data, likely part of growing pains of any new dataset, the share of incidents with an arrest is relatively steady over time, increasing until it peaks at a little under 30% of incidents in the mid-2010s and then declining since then. The share of incidents that are exceptionally cleared likewise are relatively steady but do show a slow decline over time, moving from a bit over 5% at the start of our data to about 3% by the end. These changes may simply be due to different agencies reporting over time but they are steady enough that I think the trend likely accurately reflects arrest and exceptional clearance rates in the US.
```{r nibrsAdministrativeClearance, fig.cap="Percent of incidents with an arrest or exceptional clearance, 1991-2023."}
nibrs_administrative_summary_stats %>%
ggplot(aes(x = year)) +
geom_line(aes(y = percent_with_arrest, color = "Arrest"), linewidth = 1.05) +
geom_line(aes(y = percent_cleared_exceptionally , color = "Cleared\n Exceptionally"), linewidth = 1.05) +
geom_line(aes(y = percent_with_arrest_or_cleared_exceptionally , color = "Arrest/Cleared\n Exceptionally"), linewidth = 1.05) +
xlab("Year") +
ylab("% of Incidents") +
theme_crim() +
scale_y_continuous(labels = scales::percent, expand = c(0, 0), limits = c(0, NA)) +
labs(color = "") +
expand_limits(y = 0) +
scale_color_manual(values = c("Arrest" = "#1b9e77",
"Cleared\n Exceptionally" = "#d95f02",
"Arrest/Cleared\n Exceptionally" = "#7570b3")) +
scale_x_continuous(breaks = time_series_x_axis_year_breaks)
```
## Number of other segments
The "Administrative" part of this segment is that it tells us about other segments related to this incident. Here we know how many Offense, Offender, Victim, and Arrestee segments there are for the incident. In effect it says how many crimes were committed, how many offenders involved (at least the number known to police), how many victims there were, and how many people were arrested for this particular incident. This can be useful both as a check to make sure you are not missing anything when looking at the other segments and to quickly subset data, such as to only single-victim or only multiple-offender incidents.
### Offense Segments
This variable indicates how many offense segments there are associated with this incident. Each incident can have multiple offenses, so this just says how many of these crimes there were. For example, if a person is raped and robbed then there would be two offense segments related to that incident.
Figure \@ref(fig:administrativeOffenseSegments) shows the number of offense segments - and thus the number of crimes - associated with each incident. The vast majority of incidents only have one offense reported, making up 88% of incidents.^[In reality a person who commits a crime may be arrested or charged with many (often highly related) offenses related to a single criminal incident. So this data does report fewer incidents than you would likely find in other data sources, such as if you request data from a local police agency or district attorney's office.] This drops considerably to 10% of incidents having two offenses, 1% having three, and then under 0.15% of incidents having four through nine offenses.
```{r administrativeOffenseSegments, fig.cap = "The distribution for the number of Offender Segments per incident, for all incidents in NIBRS 2022."}
administrative$total_offense_segments[administrative$total_offense_segments >= 10] <- "10 or more"
administrative$total_offense_segments <- factor(administrative$total_offense_segments,
levels = rev(c(0:9, "10 or more")))
ggplot2::ggplot(administrative, aes(x = total_offense_segments)) +
theme_crim() +
ggplot2::coord_flip() +
ggplot2::xlab("# of Offense Segments") +
ggplot2::geom_bar(ggplot2::aes_string(y = "(..count..)/sum(..count..)")) +
crimeutils::theme_crim() +
ggplot2::scale_y_continuous(labels = scales::percent,
expand = c(0, 0)) +
ylab("% of Incidents")
```
This trend is consistent over time. As shown in Figure \@ref(fig:nibrsAdministrativeNumberOffense), the median number of offense segments each year is one, while the mean number is slightly over one.
```{r nibrsAdministrativeNumberOffense, fig.cap="Annual mean and median number of Offense Segments, 1991-2023."}
nibrs_administrative_summary_stats %>%
ggplot(aes(x = year)) +
geom_line(aes(y = mean_number_offense_segments, color = "Mean"), linewidth = 1.05) +
geom_line(aes(y = median_number_offense_segments , color = "Median"), linewidth = 1.05) +
xlab("Year") +
ylab("# of Offense Segments") +
theme_crim() +
scale_y_continuous(labels = scales::comma, expand = c(0, 0), limits = c(0, NA)) +
labs(color = "") +
expand_limits(y = 0) +
scale_color_manual(values = c("Mean" = "#1b9e77",
"Median" = "#d95f02")) +
scale_x_continuous(breaks = time_series_x_axis_year_breaks)
```
### Offender Segments
The Administrative Segment tells you how many offenders are involved with an incident. This is, of course, an estimate because in some incidents the police do not know how many people are involved. If, for example, someone was robbed then they can tell the police how many robbers there were. But if someone comes home to find their home burglarized then they do not know how many burglars there were. If there is no video evidence (e.g. a home security camera) and neighbors did not see anything then the police would not know how many offenders were involved in the incident. In these cases they put in a single offender and in the Offender Segment all of the information about the offender is "unknown." The remaining number of offenders are still estimates as the police may not know for sure how many offenders were involved, but this is more reliable than when there is only a single offender reported.
With that major caveat in mind, Figure \@ref(fig:administrativeOffenderSegments) shows the distribution in how many offenders there were per incident. The vast majority of incidents have only one (or potentially an unknown number) offenders, at 91% percent of incidents. Incidents with two offenders make up only 7% of incidents while those with three make up 1% of incidents. No other number of offenders make up more than 0.5% of incidents. The data does have the exact number of offenders but I have top coded it to 10 in the graph for simplicity. There can potentially be a large number of offenders involved in an incident and in the 2022 NIBRS data the incident with the higher number of offenders had 86. However, it is exceedingly rare for there to be even more than a handful of offenders.
```{r administrativeOffenderSegments, fig.cap = "The distribution for the number of Offender Segments per incident, for all incidents in NIBRS 2022."}
administrative$total_offender_segments[administrative$total_offender_segments >= 10] <- "10 or more"
administrative$total_offender_segments <- factor(administrative$total_offender_segments,
levels = rev(c(0:9, "10 or more")))
ggplot2::ggplot(administrative, aes(x = total_offender_segments)) +
theme_crim() +
ggplot2::coord_flip() +
ggplot2::xlab("# of Offender Segments") +
ggplot2::geom_bar(ggplot2::aes_string(y = "(..count..)/sum(..count..)")) +
crimeutils::theme_crim() +
ggplot2::scale_y_continuous(labels = scales::percent,
expand = c(0, 0)) +
ylab("% of Incidents")
```
As seen in Figure \@ref(fig:nibrsAdministrativeNumberOffender), in every year the median number of offenders is one and the mean number is just above one.
```{r nibrsAdministrativeNumberOffender, fig.cap="Annual mean and median number of Offender Segments, 1991-2023."}
nibrs_administrative_summary_stats %>%
ggplot(aes(x = year)) +
geom_line(aes(y = mean_number_offender_segments, color = "Mean"), linewidth = 1.05) +
geom_line(aes(y = median_number_offender_segments , color = "Median"), linewidth = 1.05) +
xlab("Year") +
ylab("# of Offender Segments") +
theme_crim() +
scale_y_continuous(labels = scales::comma, expand = c(0, 0), limits = c(0, NA)) +
labs(color = "") +
expand_limits(y = 0) +
scale_color_manual(values = c("Mean" = "#1b9e77",
"Median" = "#d95f02"))+
scale_x_continuous(breaks = time_series_x_axis_year_breaks)
```
### Victim Segments
In cases where the offense is a "victimless crime" (or at least one where there is no specific victim) such as a drug offense, the victim and the offender can be the same individual. For other cases, being a victim in an incident just means that you were the victim of at least one offense committed in the incident.
Figure \@ref(fig:administrativeVictimSegments) shows the distribution in the number of victims per incident. Like the number of offenses and offenders, this is massively skewed to the left with 91% of incidents having a single victim. Incidents with two victims make up 8% of the data while incidents with three victims are 1%. All remaining numbers of victims are less than one third of 0.5% of the data each. The data does have the exact number of victims but I have top coded it to 10 in the graph for simplicity. The incident with the most victims in 2022 had 163 victims.
```{r administrativeVictimSegments, fig.cap = "The distribution for the number of Victim Segments per incident, for all incidents in NIBRS 2022."}
administrative$total_victim_segments[administrative$total_victim_segments >= 10] <- "10 or more"
administrative$total_victim_segments <- factor(administrative$total_victim_segments,
levels = rev(c(0:9, "10 or more")))
ggplot2::ggplot(administrative, aes(x = total_victim_segments)) +
theme_crim() +
ggplot2::coord_flip() +
ggplot2::xlab("# of Victim Segments") +
ggplot2::geom_bar(ggplot2::aes_string(y = "(..count..)/sum(..count..)")) +
crimeutils::theme_crim() +
ggplot2::scale_y_continuous(labels = scales::percent,
expand = c(0, 0)) +
ylab("% of Incidents")
```
Similar to what we have seen with offenses and offenders, we can see in Figure \@ref(fig:nibrsAdministrativeNumberVictim) that the median number of victims is one and the mean number is just a bit more than one.
```{r nibrsAdministrativeNumberVictim, fig.cap="Annual mean and median number of Victim Segments, 1991-2023."}
nibrs_administrative_summary_stats %>%
ggplot(aes(x = year)) +
geom_line(aes(y = mean_number_victim_segments, color = "Mean"), linewidth = 1.05) +
geom_line(aes(y = median_number_victim_segments , color = "Median"), linewidth = 1.05) +
xlab("Year") +
ylab("# of Victim Segments") +
theme_crim() +
scale_y_continuous(labels = scales::comma, expand = c(0, 0), limits = c(0, NA)) +
labs(color = "") +
expand_limits(y = 0) +
scale_color_manual(values = c("Mean" = "#1b9e77",
"Median" = "#d95f02"))+
scale_x_continuous(breaks = time_series_x_axis_year_breaks)
```
### Arrestee Segments
Unlike the previous segments, there may not always be an arrestee segment since not all crimes lead to an arrest. Figure \@ref(fig:administrativeArresteeSegments) shows the distribution in the number of arrestee segments per incident in the 2022 NIBRS data. Indeed, the vast majority - 77% of incidents - did not lead to a single arrest. In 21% of incidents a single person was arrested while in 2% of incidents two people were arrested. The remaining numbers of people arrested are increasingly small with fewer than 0.3% of incidents having more than three people arrested. The incident with the most arrests in 2022 led to 65 people arrested.
```{r administrativeArresteeSegments, fig.cap = "The distribution for the number of Arrestee Segments per incident, for all incidents in NIBRS 2022."}
administrative$total_arrestee_segments[administrative$total_arrestee_segments >= 10] <- "10 or more"
administrative$total_arrestee_segments <- factor(administrative$total_arrestee_segments,
levels = rev(c(0:9, "10 or more")))
ggplot2::ggplot(administrative, aes(x = total_arrestee_segments)) +
theme_crim() +
ggplot2::coord_flip() +
ggplot2::xlab("# of Arrestee Segments") +
ggplot2::geom_bar(ggplot2::aes_string(y = "(..count..)/sum(..count..)")) +
crimeutils::theme_crim() +
ggplot2::scale_y_continuous(labels = scales::percent,
expand = c(0, 0)) +
ylab("% of Incidents")
```
Of course, to really understand these arrests we would need to know how many people committed the crime. Having one arrest for an incident with one offender is good, having one arrest when there are multiple offenders means some criminals are walking free. While we do not know the true number of offenders (as police may not know how many there actually were), we can use the Offender Segment count as an estimate. Figure \@ref(fig:administrativeArrestsAny) shows the percent of incidents where at least one offender was arrested and where all offenders were arrested, broken down by the number of reported offenders.
There is wide variability in the percent of offenders arrested by the number of offenders in an incident. In cases with one offender, there was an arrest made only 22% of the time. Given that this includes incidents without a known offender, I expect the true arrest rate for incidents that actually have one offender to be higher.
When there are two offenders, about 39% of incidents have at least one arrest and 26% of incidents have both offenders arrested. For having at least one person arrested we see a fairly steady rate of mid- to high-30% for each number of offenders. In contrast, the share of incidents where all offenders are arrested declines with each additional offender, reaching to only 9% with 10 or more offenders.
```{r administrativeArrestsAny, fig.cap = "The percent of incidents by number of offenders where at least one offender is arrested and where all offenders are arrested."}
administrative$arrests_equal_offenders <- 0
administrative$arrests_equal_offenders[administrative$total_offender_segments_temp == administrative$total_arrestee_segments_temp] <- 1
percent_arrested_all_offenders <-
administrative %>%
group_by(total_offender_segments) %>%
summarize(all_offenders_arrested = mean(arrests_equal_offenders)) %>%
ungroup()
percent_arrested_all_offenders$total_offender_segments <-
gsub("10.*", "10+", percent_arrested_all_offenders$total_offender_segments)
administrative$total_arrestee_segments_any <- 0
administrative$total_arrestee_segments_any[administrative$total_arrestee_segments_temp > 0] <- 1
percent_arrested <-
administrative %>%
group_by(total_offender_segments) %>%
summarize(any_arrests = mean(total_arrestee_segments_any))
percent_arrested$total_offender_segments <-
gsub("10.*", "10+", percent_arrested$total_offender_segments)
percent_arrested <-
percent_arrested %>%
left_join(percent_arrested_all_offenders)
percent_arrested$total_offender_segments <- factor(percent_arrested$total_offender_segments,
levels = c(0:9, "10+"))
ggplot(percent_arrested, aes(x = total_offender_segments, y = any_arrests, group = 1)) +
geom_line(linewidth = 1.05, aes(color = "At least one arrest")) +
geom_line(linewidth = 1.05, aes(y = all_offenders_arrested, color = "All offenders arrested")) +
ylab("Arrest rate") +
xlab("# of Offenders") +
theme_crim() +
scale_y_continuous(labels = scales::percent, expand = c(0, 0), limits = c(0, NA)) +
labs(color = "") +
expand_limits(y = 0) +
scale_color_manual(values = c("At least one arrest" = "#d7191c",
"All offenders arrested" = "#fdae61"))
```
The median number of arrestee segments over time, as shown in Figure \@ref(fig:nibrsAdministrativeNumberArrestee) is zero, with the mean number slightly higher at around 0.3.
```{r nibrsAdministrativeNumberArrestee, fig.cap="Annual mean and median number of Arrestee Segments, 1991-2023."}
nibrs_administrative_summary_stats %>%
ggplot(aes(x = year)) +
geom_line(aes(y = mean_number_arrestee_segments, color = "Mean"), linewidth = 1.05) +
geom_line(aes(y = median_number_arrestee_segments , color = "Median"), linewidth = 1.05) +
xlab("Year") +
ylab("# of Arrestee Segments") +
theme_crim() +
scale_y_continuous(labels = scales::comma, expand = c(0, 0), limits = c(0, NA)) +
labs(color = "") +
expand_limits(y = 0) +
scale_color_manual(values = c("Mean" = "#1b9e77",
"Median" = "#d95f02"))+
scale_x_continuous(breaks = time_series_x_axis_year_breaks)
```
In summary, the Administrative Segment provides a useful metadata for understanding what other segments are available for each incident. Although it is often necessary to combine this data with other segments to gain a full understanding of the incident, the information in the Administrative Segment - such as the timing of the crime and exceptional clearance details - offers useful insights into the broader patterns of criminal activity and law enforcement responses.