---
title: "A Longitudinal Approach to Loss Reserving"
subtitle: ST 537
author: Brian A. Fannin ACAS
abstract: "Reserving methods like chain-ladder are inherently linear models. In practice, they are often applied to one business segment or coverage item at a time. This misses an opportunity to incorporate exogenous data to smooth estimates. This paper examines the use of linear mixed models for reserving."
bibliography: references.bib
format:
docx:
number-sections: true
reference-doc: cas_rp_template.docx
---
<!-- This Source Code Form is subject to the terms of the Mozilla Public
- License, v. 2.0. If a copy of the MPL was not distributed with this
- file, You can obtain one at https://mozilla.org/MPL/2.0/. -->
```{r include=FALSE}
knitr::opts_chunk$set(
echo = FALSE,
include = FALSE,
message = FALSE,
warning = FALSE
)
```
# Introduction
## Research objective
_The initial paragraphs are a holdover from my final project. This explanation is probably too simple for an actuarial audience._
An insurance company takes on a financial obligation to a policyholder as soon as a policy is sold. However, although the premium is known with certainty, the amount owed for claims is not. Even after a claim occurs, the insurance company is still uncertain of the claim's ultimate cost. The amount associated with repairing a damaged car, the medical costs for someone injured in an accident, or the expenses to repair a damaged roof take time to become fully established. In some cases, this process may take years. For example, a plaintiff may seek recovery for a tort claim, which will work its way through the civil courts and then --- in all likelihood --- be appealed after a jury has rendered a verdict. Asbestos litigation --- which has taken decades and is still not fully resolved --- is an example. Finally, the existence of a claim may not be known for some period of time after a policy has expired. Again, asbestos liability provides a good case in point. Claimants who had been exposed to asbestos were unaware of the negative health effects for years. However, under prevailing civil liability law and the provisions of the insurance contract, they were still able to seek financial recovery and insurers were obligated to respond[^claims_made].
[^claims_made]: The policy requirement that an insurance contract respond many years after policy expiry led to the development of the "claims-made" coverage trigger in the late 1970s. In these contracts, the policy covers only those claims which are made during the policy period.
Although insurance companies do not know what funds are owed to policyholders at the inception of the policy, they do know what premium has been charged. The premium is an estimate of the average funds needed to cover losses and expenses. Because the largest share of expenses is proportionate to premium, the premium has a nearly direct linear relationship to the estimated loss. Given that, a linear function of premium is often used as an estimate of loss when case reserve and payment data have yet to emerge.
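As a minimal sketch of that premium-based estimate (the expected loss ratio and premium below are hypothetical assumptions, not values taken from the data):

```{r}
#| eval: false
# Hypothetical sketch: an a priori loss estimate as net earned premium
# times an assumed expected loss ratio.
expected_loss_ratio <- 0.65   # assumed for illustration only
net_earned_premium <- 1e6     # hypothetical premium for one accident year
a_priori_loss <- net_earned_premium * expected_loss_ratio
```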
_Here we get to the actuarial stuff_
In this paper, we will explore the use of longitudinal analysis to estimate loss payments for workers compensation insurance. Before doing so, we will review some existing work that examines linear modeling techniques for loss reserving. Additionally, we will look at how these models, and others, have been extended.
We will examine several potential covariates: development lag, prior cumulative paid loss, and net earned premium. Additionally, we will explore several covariance structures and examine the use of random effects tied to the insurance company. Because each insurance company has its own portfolio of insureds and its own claims department, we may expect both the character of the claims and the settlement of claims to be specific to an insurer. However, because each insurance company is subject to similar regulation[^similar_regulation], we can also expect some similarities in behavior.
[^similar_regulation]: In the United States, each state is responsible for the regulation of insurance. Insofar as insurance companies concentrate in particular states or geographic regions, they may experience regulatory regimes that resemble the national picture to a greater or lesser degree.
```{r results='hide', message = FALSE}
#| label: load-tidyverse
library(tidyverse)
```
# Mathematical background
## Literature review
_Start with the earliest Zehnwirth or Taylor paper? Then Mack, then Murphy. Circle back to Stanard and the responses to his paper._
_Reference early works on linear mixed models. Cite a few textbooks: Frees, West/Welch, Gelman/Hill._
_Guszcza mixed model paper._
_Earliest reference to GLMs in reserving? Taylor/McGuire_
# The data
Glenn Meyers and Peng Shi collated ten years of financial statements to generate a $10 \times 10$ matrix of financial amounts, giving ten development ages for each of ten accident years. Values exist for each of 132 companies, making this a balanced design. The data is publicly available and has been captured in the `raw` R package. For this paper, we will augment the original data by noting the prior amounts for cumulative paid loss and case reserves and also by calculating incremental values for the paid data. A sample of the data is shown in @tbl-sched-p-example.
In the data, the columns with a suffix of `_ep` refer to "earned premium" elements. The term "earned" means that the premium has been adjusted so that it is on the same accounting basis as the related losses; in this case, accident year[^premium_earning]. Note that there are three different amounts shown. This reflects the use of reinsurance: insurance bought by insurance companies. The `direct_ep` is the amount collected from policyholders, `ceded_ep` is the amount paid to reinsurers, and `net_ep` is the difference between the two. Because the loss amounts reflect payments made by reinsurers, the `net_ep` column is the one most compatible with the observed losses.
_TO-DO: Add a cross-validation fold_
```{r }
#| label: get-data
library(raw)
source('wrangle.R')
tbl_wkcomp <- raw::wkcomp |>
group_by(Company) |>
wrangle_triangle(1997) |>
mutate(
zero_paid_incremental = incremental_paid == 0,
zero_incurred_incremental = incremental_incurred == 0
)
```
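The function `wrangle_triangle()` lives in `wrangle.R` and is not reproduced here. A minimal sketch of how the prior cumulative and incremental paid columns could be derived is below; column names beyond those shown in @tbl-sched-p-example are assumptions.

```{r}
#| eval: false
# A sketch (not the actual wrangle.R) of deriving the prior cumulative and
# incremental paid columns from a cumulative paid column. dplyr::lag() is
# the lead/lag function, distinct from the development-lag column `lag`.
derive_increments <- function(tbl) {
  tbl |>
    arrange(company, accident_year, lag) |>
    group_by(company, accident_year) |>
    mutate(
      prior_cumulative_paid = dplyr::lag(cumulative_paid),
      incremental_paid = cumulative_paid - coalesce(prior_cumulative_paid, 0)
    ) |>
    ungroup()
}
```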
```{r}
#| label: tbl-sched-p-example
#| tbl-cap: "A sample of the data being studied"
#| include: TRUE
tbl_wkcomp |>
slice_sample(n = 5) |>
select(Company = company, `Accident Year` = accident_year,
Lag = lag, `Net Earned Premium` = net_ep,
`Incremental Paid` = incremental_paid,
`Case Reserve` = case) |>
knitr::kable()
```
## Exploratory data analysis
Before constructing models, we will first examine some exploratory and summary plots. Taking a look at the observed values in @fig-inc-paid-histogram and @fig-inc-paid-histogram-logged, we note two things. First, the response is highly skewed. Second, there is a probability mass at zero. The latter point will make it particularly challenging to fit using linear methods.
```{r}
#| label: fig-inc-paid-histogram
#| fig-cap: "A histogram of incremental paid losses, which exhibits skew"
#| include: TRUE
tbl_wkcomp |>
ggplot(aes(incremental_paid)) +
geom_histogram() +
labs(x = "Incremental Paid Loss") +
theme_minimal()
```
```{r}
#| label: fig-inc-paid-histogram-logged
#| fig-cap: "A histogram of incremental paid losses on a log scale, which shows a probability mass at zero"
#| include: TRUE
tbl_wkcomp |>
ggplot(aes(incremental_paid)) +
geom_histogram() +
labs(x = "Incremental Paid Loss (log scale)") +
scale_x_log10() +
theme_minimal()
```
_Do zero incremental payments depend on lag?_
```{r}
tbl_wkcomp |>
group_by(lag_factor) |>
summarise(
zero_paid_incremental = sum(zero_paid_incremental)
) |>
ggplot(aes(lag_factor, zero_paid_incremental)) +
geom_col()
```
```{r}
tbl_wkcomp |>
count(lag_factor, zero_paid_incremental) |>
group_by(lag_factor) |>
mutate(n_pct = n / sum(n)) |>
filter(zero_paid_incremental) |>
ggplot(aes(lag_factor, n_pct)) +
geom_point() +
scale_y_continuous(limits = c(0,1)) +
theme_minimal()
```
```{r}
tbl_summary_lag <- tbl_wkcomp |>
group_by(lag) |>
summarise(
paid_mean = mean(incremental_paid, na.rm = TRUE),
paid_median = median(incremental_paid, na.rm = TRUE),
paid_mean_log = mean(log(incremental_paid), na.rm = TRUE),
paid_median_log = median(log(incremental_paid), na.rm = TRUE),
paid_sd = sd(incremental_paid, na.rm = TRUE),
paid_cv = paid_sd / paid_mean
)
```
```{r}
tbl_summary_lag |>
ggplot(aes(lag, paid_median_log)) +
geom_point() +
geom_line() +
scale_x_continuous(breaks = 1:10) +
labs(
x = 'Development Lag',
y = 'Median log incremental paid loss'
) +
theme_minimal()
```
```{r}
tbl_summary_lag |>
ggplot(aes(lag, paid_median)) +
geom_point() +
geom_line() +
scale_x_continuous(breaks = 1:10) +
labs(
x = 'Development Lag',
y = 'Median incremental paid loss'
) +
theme_minimal()
```
```{r}
#| label: fig-mean-paid-by-lag
#| fig-cap: 'Mean incremental paid loss by development lag'
#| include: TRUE
tbl_summary_lag |>
ggplot(aes(lag, paid_mean)) +
geom_point() +
geom_line() +
scale_x_continuous(breaks = 1:10) +
labs(
x = 'Development Lag',
y = 'Mean incremental paid loss'
) +
theme_minimal()
```
We next turn to the mean and standard deviation of incremental payments, grouped by lag. In @fig-mean-paid-by-lag, we note that the average payment amount rises between lags 1 and 2, but then declines. This pattern is also observed in the standard deviation shown in @fig-sd-paid-by-lag.
```{r}
#| label: fig-sd-paid-by-lag
#| fig-cap: 'Standard deviation of incremental paid loss by development lag'
#| include: TRUE
tbl_summary_lag |>
ggplot(aes(lag, paid_sd)) +
geom_point() +
geom_line() +
scale_x_continuous(breaks = 1:10) +
labs(
x = 'Development Lag',
y = 'Standard deviation of incremental paid loss'
) +
theme_minimal()
```
```{r}
tbl_wkcomp_prior <- tbl_wkcomp |>
filter(lag != 1)
```
```{r }
#| label: fig-inc-by-prior
#| fig-cap: 'Incremental paid losses against prior cumulative losses'
#| include: TRUE
tbl_wkcomp_prior |>
ggplot(aes(prior_cumulative_paid, incremental_paid)) +
geom_point() +
labs(
x = 'Prior cumulative paid loss',
y = 'Incremental paid loss'
) +
theme_minimal()
```
```{r}
#| label: fig-inc-by-prior-facet
#| fig-cap: 'Incremental paid losses against prior cumulative losses, by lag'
#| include: TRUE
tbl_wkcomp_prior |>
ggplot(aes(prior_cumulative_paid, incremental_paid)) +
geom_point() +
facet_wrap(~ lag, scales = 'free') +
labs(
x = 'Prior cumulative paid loss',
y = 'Incremental paid loss'
) +
theme_minimal()
```
In actuarial practice, it is common to model payments on the prior cumulative value of loss. However, this is generally done with a distinct model for each lag [@friedland]. @fig-inc-by-prior and @fig-inc-by-prior-facet offer a visual justification for this. In @fig-inc-by-prior, we note a cluster of points near the origin, consistent with the probability mass seen in @fig-inc-paid-histogram-logged. Although there is a very rough sense of linearity, there is no clear model suggested. When we separate the data by lag as in @fig-inc-by-prior-facet, we see a much stronger linear pattern emerge.
```{r}
tbl_wkcomp_ep <- tbl_wkcomp
```
```{r include=FALSE}
tbl_wkcomp_ep |>
ggplot(aes(net_ep, incremental_paid)) +
geom_point() +
labs(
x = 'Net earned premium',
y = 'Incremental paid loss'
) +
theme_minimal()
```
```{r}
#| include: false
tbl_wkcomp_ep |>
ggplot(aes(net_ep, incremental_paid)) +
geom_point() +
facet_wrap(~ lag, scales = 'free') +
labs(
x = 'Net earned premium',
y = 'Incremental paid loss'
) +
theme_minimal()
```
Based on the exploratory analysis, we will explore the following classes of models:

* models which use the development lag as a predictor,
* models which use the prior cumulative paid loss as a predictor, and
* a model which uses net earned premium as a predictor.
# Methods
We will fit several different models for the incremental payments. In all of the formulas which follow, $Y_{ij}$ denotes incremental paid loss from company $i$ at development time $j$.
## Models w/ lag
As noted above, @fig-mean-paid-by-lag suggests a different treatment for lags greater than 2. Accordingly, we form the variable $\delta_{ij}$ to differentiate between the two time periods.
$$
\begin{aligned}
Y_{ij} &= \beta_{0,1}\, \delta_{ij} + \beta_{0,2} \left(1 - \delta_{ij}\right)
  + \beta_1\, \delta_{ij}\, \text{lag}_{ij}
  + \beta_2 \left(1 - \delta_{ij}\right) \text{lag}_{ij} + \epsilon_{ij} \\
\delta_{ij} &= \begin{cases}
1 & \text{if } \text{lag}_{ij} \le 2 \\
0 & \text{if } \text{lag}_{ij} > 2
\end{cases} \\
\epsilon_{ij} &\sim N\left(0, \pmb{\Sigma}_i(\omega)\right)
\end{aligned}
$$ {#eq-model-w-lag}
```{r }
tbl_wkcomp_lag <- tbl_wkcomp |>
mutate(
delta = ifelse(lag <= 2, 1, 0),
lag_le2 = delta * lag,
lag_gt2 = (1 - delta) * lag,
delta_ind = ifelse(delta, '_le2', '_gt2')
)
```
The first model we fit will use @eq-model-w-lag, with the assumption that $\epsilon_{ij}$ follows a compound symmetric covariance structure where variances are different from one company to another. That is:
$$
\pmb{\Sigma}_i(\omega) = \pmb{D} \begin{bmatrix}
1 & \rho & \ldots & \rho \\
& 1 & \ldots & \rho \\
& & \ddots & \vdots \\
& & & 1
\end{bmatrix}
\pmb{D}
$$
The diagonal matrix $\pmb{D}$ has an entry for each company.
```{r}
#| label: gls-one
library(nlme)
fit_lag_one <- gls(
incremental_paid ~ 0 + delta_ind + lag_le2 + lag_gt2,
correlation = corCompSymm(form = ~ 1 | company),
  data = tbl_wkcomp_lag
)
```
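The estimated within-company correlation can be inspected directly from the fitted object; a quick check:

```{r}
# Extract the estimated compound-symmetry correlation parameter (rho)
# from the fitted correlation structure.
coef(fit_lag_one$modelStruct$corStruct, unconstrained = FALSE)
```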
```{r}
#| label: club-sandwich
library(clubSandwich)
beta_hat <- fit_lag_one$coefficients
degrees_of_freedom <- nrow(tbl_wkcomp) - length(beta_hat)
alpha <- 0.05
t_crit <- qt(1 - alpha / 2, degrees_of_freedom)
standard_error <- fit_lag_one |>
vcovCR(type = "CR0") |>
diag() |>
sqrt()
tbl_beta_hat_fit_lag_one <- tibble(
Coefficient = names(beta_hat),
Estimate = beta_hat,
Lower = beta_hat - t_crit * standard_error,
Upper = beta_hat + t_crit * standard_error
)
```
```{r}
#| label: tbl-beta-hat-fit-lag-one
#| tbl-cap: Estimated coefficients using lag as a predictor with variances different by company
#| include: true
tbl_beta_hat_fit_lag_one |>
knitr::kable(digits = 2)
```
Our parameter estimates are shown in @tbl-beta-hat-fit-lag-one, which includes a 95% confidence interval around each estimate. All appear to be significant, though the confidence interval for the slope associated with lags of two or less is wider than we might like. Conforming with @fig-mean-paid-by-lag, the signs of the slopes appear reasonable.
Our second model assumes that the variance changes by development lag. This is motivated by @fig-sd-paid-by-lag.
```{r }
#| label: fit-lag-two
#| cache: true
fit_lag_two <- gls(
incremental_paid ~ 0 + delta_ind + lag_le2 + lag_gt2,
correlation = corCompSymm(form = ~ 1 | company),
weights = varIdent(form = ~ 1 | lag),
data = tbl_wkcomp_lag
)
```
```{r}
beta_hat <- fit_lag_two$coefficients
degrees_of_freedom <- nrow(tbl_wkcomp) - length(beta_hat)
alpha <- 0.05
t_crit <- qt(1 - alpha / 2, degrees_of_freedom)
standard_error <- fit_lag_two |>
vcovCR(type = "CR0") |>
diag() |>
sqrt()
tbl_beta_hat_fit_lag_two <- tibble(
Coefficient = names(beta_hat),
Estimate = beta_hat,
Lower = beta_hat - t_crit * standard_error,
Upper = beta_hat + t_crit * standard_error
)
```
```{r}
#| label: tbl-beta-hat-fit-lag-two
#| tbl-cap: Estimated coefficients using lag as a predictor with variances different by company and weighted by lag
#| include: TRUE
tbl_beta_hat_fit_lag_two |>
knitr::kable(digits = 2)
```
Examining @tbl-beta-hat-fit-lag-two, we see that the model gives a negative value for the slope for lags <= 2. In addition, the confidence intervals for the coefficient estimates associated with lags <= 2 include zero, indicating that we cannot readily conclude that their values differ from zero.
```{r }
#| label: fit-lag-three
fit_lag_three <- lme(
fixed = incremental_paid ~ 0 + delta_ind + lag_le2 + lag_gt2,
random = list(company = pdBlocked(
list(~ 0 + lag_le2, ~ 0 + lag_gt2)
)),
data = tbl_wkcomp_lag
)
```
```{r}
standard_error <- fit_lag_three$varFix |>
diag() |>
sqrt()
degrees_of_freedom <- fit_lag_three$fixDF$X
alpha <- 0.05
crit <- qt(1 - alpha / 2, df = degrees_of_freedom)
beta_hat <- fixed.effects(fit_lag_three)
tbl_beta_hat_fit_lag_three <- tibble(
coefficient = names(beta_hat),
estimate = beta_hat,
degrees_of_freedom = degrees_of_freedom,
standard_error = standard_error,
lower = beta_hat - crit * standard_error,
upper = beta_hat + crit * standard_error
)
```
```{r}
#| label: tbl-beta-hat-fit-lag-three
#| tbl-cap: Estimated coefficients using lag as a predictor with random effects for the slopes
#| include: TRUE
tbl_beta_hat_fit_lag_three |>
knitr::kable(digits = 2)
```
We now examine a random effects model, using insurance company as the grouping element for the random effects. @tbl-beta-hat-fit-lag-three shows estimates for the fixed effects. The slope for lag <= 2 has an estimated standard error nearly equal to the coefficient itself, indicating this parameter is likely not significant.
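The estimated random effects (BLUPs) can also be examined directly; the spread of these values gives a sense of how much the slopes vary from one company to another:

```{r}
# Company-level random effects for the two slope terms; wide variation
# here supports treating company as a grouping factor.
summary(nlme::ranef(fit_lag_three))
```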
```{r}
#| label: fit-lag-four
fit_lag_four <- lme(
fixed = incremental_paid ~ 0 + delta_ind + lag_le2 + lag_gt2,
random = list(company = pdBlocked(
list(~ 0 + delta_ind, ~ 0 + lag_le2, ~ 0 + lag_gt2)
)),
data = tbl_wkcomp_lag
)
```
```{r}
standard_error <- fit_lag_four$varFix |>
diag() |>
sqrt()
degrees_of_freedom <- fit_lag_four$fixDF$X
alpha <- 0.05
crit <- qt(1 - alpha / 2, df = degrees_of_freedom)
beta_hat <- fixed.effects(fit_lag_four)
tbl_beta_hat_fit_lag_four <- tibble(
Coefficient = names(beta_hat),
Estimate = beta_hat,
`Degrees of Freedom` = degrees_of_freedom,
`Standard Error` = standard_error,
Lower = beta_hat - crit * standard_error,
Upper = beta_hat + crit * standard_error
)
```
```{r}
#| label: tbl-beta-hat-fit-lag-four
#| tbl-cap: Estimated coefficients using lag as a predictor with random effects for the slopes and intercept
#| include: TRUE
tbl_beta_hat_fit_lag_four |>
knitr::kable(digits = 2)
```
Finally, we estimate a mixed effects model where both the intercept and the slope have random effects by company. @tbl-beta-hat-fit-lag-four shows that this helps resolve the issue with the significance of the slope term for lags <= 2.
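Approximate intervals for the variance components give a check on how much company-to-company variation the random effects capture:

```{r}
# Approximate confidence intervals for the random-effect standard
# deviations and the residual standard error.
intervals(fit_lag_four, which = "var-cov")
```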
```{r}
tbl_aic_lag <- tibble(
model = paste0('Lag model ', c('one', 'two', 'three', 'four')),
description = c(
'GLS compound symmetric',
'GLS unequal variance',
'LMM w/random slope',
'LMM w/random slope and intercept'
),
aic = map_dbl(
list(fit_lag_one, fit_lag_two, fit_lag_three, fit_lag_four),
AIC
)
)
```
```{r}
#| label: tbl-aic-lag
#| tbl-cap: AIC for models based on development lag
#| include: TRUE
tbl_aic_lag |>
knitr::kable(format.args = list(big.mark = ",", nsmall = 0))
```
```{r}
tbl_wkcomp_lag <- tbl_wkcomp_lag |>
mutate(
predict_lag = predict(fit_lag_two),
residual_lag = incremental_paid - predict_lag
)
```
Examining @tbl-aic-lag, we see that the GLS fit with unequal variance has the lowest observed AIC. However, we reject this model for two reasons. One, the parameters do not align with our observations of the data: the signs of the slopes are reversed from our exploratory data analysis, which suggests that the weighted fit is overly influenced by some extreme observations. Two, the coefficients for lags of two or less are not significant.
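Because the two mixed models were fit by REML with identical fixed effects and nested random structures, they can also be compared with a likelihood ratio test; note the usual caveat that variance parameters sit on the boundary under the null, making the test conservative:

```{r}
# Likelihood ratio comparison of the random structures of the two
# mixed models; the fixed effects are identical in both.
anova(fit_lag_three, fit_lag_four)
```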
## Models using prior cumulative
The use of cumulative paid loss at the prior evaluation date is a well-established technique in the actuarial profession. In general, the observed quantity being modeled is cumulative. However, see [@halliwell] for compelling reasons to favor modeling the incremental change between evaluation dates. In general, an intercept is not used. See [@murphy] for an example of a model which uses an intercept.
We will express this model as in @eq-prior-one where $X_{ij}$ denotes cumulative paid losses. In @eq-prior-one, note that the cumulative losses from the prior period are being used. This form means that we cannot make an estimate for lag 1. In actuarial practice this is not a concern as this value is known when financial statements are being prepared.
$$
Y_{ij} = \beta_0 + \beta_1 X_{i,j-1} + \epsilon_{ij}
$$ {#eq-prior-one}
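With no intercept, @eq-prior-one recovers the familiar chain-ladder link ratio, since the expected cumulative amount is then a constant multiple of the prior cumulative amount:

$$
E\left[X_{ij}\right] = X_{i,j-1} + E\left[Y_{ij}\right] = \left(1 + \beta_1\right) X_{i,j-1}
$$

so that the age-to-age development factor is $1 + \beta_1$.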
Our first example will use an intercept and make no distinction by lag. This is analogous to the first model using lag as a predictor.
```{r}
#| label: fit-prior-one
fit_prior_one <- gls(
incremental_paid ~ 1 + prior_cumulative_paid,
data = tbl_wkcomp_prior,
correlation = corCompSymm(form = ~ 1 | company)
)
```
```{r }
beta_hat <- fit_prior_one$coefficients
degrees_of_freedom <- nrow(tbl_wkcomp) - length(beta_hat)
t_crit <- qt(1 - alpha / 2, degrees_of_freedom)
standard_error <- fit_prior_one |>
vcovCR(type = "CR0") |>
diag() |>
sqrt()
tbl_beta_hat_fit_prior_one <- tibble(
Coefficient = names(beta_hat),
Estimate = beta_hat,
Lower = beta_hat - t_crit * standard_error,
Upper = beta_hat + t_crit * standard_error
)
```
```{r}
#| label: tbl-beta-hat-fit-prior-one
#| tbl-cap: Estimated coefficients using prior cumulative as a predictor with variances different by company
#| include: TRUE
tbl_beta_hat_fit_prior_one |>
knitr::kable(digits = 2)
```
@tbl-beta-hat-fit-prior-one shows that the intercept and $\beta_1$ are both significant. The negative slope results because later lags tend to have smaller incremental payments, yet the prior cumulative amount always increases. We may control for this by including the lag in our model.
As suggested by @fig-inc-by-prior-facet, we will explore a model which uses the lag as an interaction term. This is effectively the same as having a different slope for each lag.
$$
\begin{aligned}
Y_{ij} &= \beta_0 + \beta_1 \delta_1 X_{i,j-1} + \ldots + \beta_9 \delta_9 X_{i,j-1} + \epsilon_{ij} \\
\delta_{k} &= \begin{cases}
1 & \text{if } j - 1 = k \\
0 & \text{if } j - 1 \ne k
\end{cases}
\end{aligned}
$$ {#eq-prior-lag-interaction}
```{r}
fit_prior_two <- gls(
incremental_paid ~ 1 + prior_cumulative_paid:lag_factor,
data = tbl_wkcomp_prior,
correlation = corCompSymm(form = ~ 1 | company)
)
```
```{r }
beta_hat <- fit_prior_two$coefficients
degrees_of_freedom <- nrow(tbl_wkcomp) - length(beta_hat)
t_crit <- qt(1 - alpha / 2, degrees_of_freedom)
standard_error <- fit_prior_two$varBeta |>
diag() |>
sqrt()
tbl_beta_hat_fit_prior_two <- tibble(
Coefficient = names(beta_hat),
Estimate = beta_hat,
Lower = beta_hat - t_crit * standard_error,
Upper = beta_hat + t_crit * standard_error
)
```
```{r}
#| label: tbl-beta-hat-fit-prior-two
#| tbl-cap: Estimated coefficients using prior cumulative and lag
#| include: TRUE
tbl_beta_hat_fit_prior_two |>
knitr::kable(digits = 2)
```
Now all of the $\beta$ terms are positive and the intercept is no longer significant.
```{r}
fit_prior_three <- lme(
fixed = incremental_paid ~ 1 + prior_cumulative_paid:lag_factor,
random = ~ 1 | company,
data = tbl_wkcomp_prior
)
```
```{r}
beta_hat <- fixed.effects(fit_prior_three)
standard_error <- fit_prior_three$varFix |>
diag() |>
sqrt()
degrees_of_freedom <- fit_prior_three$fixDF$X
crit <- qt(1 - alpha / 2, df = degrees_of_freedom)
tbl_beta_hat_fit_prior_three <- tibble(
Coefficient = names(beta_hat),
Estimate = beta_hat,
`Degrees of Freedom` = degrees_of_freedom,
`Standard Error` = standard_error,
Lower = beta_hat - crit * standard_error,
Upper = beta_hat + crit * standard_error
)
```
```{r}
#| label: tbl-beta-hat-fit-prior-three
#| tbl-cap: Estimated coefficients using prior cumulative and lag with a random effect intercept by company
#| include: TRUE
tbl_beta_hat_fit_prior_three |>
knitr::kable(digits = 2)
```
Having seen that the inclusion of company as a random effect may improve the fit, we construct the analogous model here, beginning with a random intercept by company.
An attempt was made to fit random effects for the slopes; however, the model did not converge[^rstudio_crashed]. Theorizing that this had something to do with a general variance structure for the random effects, we switched to a block design wherein each lag has its own assumed variance for the random effects. This is very similar to assuming nine independent models, one for each lag. (Recall that lag 1 cannot be estimated.) Although such an assumption may seem extreme, it is a common approach [again, see @friedland].
[^rstudio_crashed]: Actually, RStudio crashed.
$$
\pmb{b}_i \sim N\left(\pmb{0}, \pmb{D}\right), \qquad
\pmb{D} = \begin{pmatrix}
D_{11} & 0 & \cdots & 0 \\
0 & D_{22} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & D_{99}
\end{pmatrix}
$$
```{r}
mat_int <- model.matrix(
incremental_paid ~ 0 + prior_cumulative_paid:lag_factor,
data = tbl_wkcomp_prior
)
mat_int <- mat_int[, -1]
colnames(mat_int) <- paste0('prior_', 2:10)
tbl_wkcomp_prior <- cbind(
tbl_wkcomp_prior, mat_int
)
```
```{r}
formula_prior <- paste(
'incremental_paid ~ 0 +',
paste0('prior_', 2:10, collapse = ' + ')
) |>
as.formula()
fit_prior_four <- lme(
fixed = formula_prior,
random = list(company = pdBlocked(list(
~ 0 + prior_2,
~ 0 + prior_3,
~ 0 + prior_4,
~ 0 + prior_5,
~ 0 + prior_6,
~ 0 + prior_7,
~ 0 + prior_8,
~ 0 + prior_9,
~ 0 + prior_10
))
),
data = tbl_wkcomp_prior
)
```
```{r}
beta_hat <- fixed.effects(fit_prior_four)
standard_error <- fit_prior_four$varFix |>
diag() |>
sqrt()
degrees_of_freedom <- fit_prior_four$fixDF$X
crit <- qt(1 - alpha / 2, df = degrees_of_freedom)
tbl_beta_hat_fit_prior_four <- tibble(
Coefficient = names(beta_hat),
Estimate = beta_hat,
`Degrees of Freedom` = degrees_of_freedom,
`Standard Error` = standard_error,
Lower = beta_hat - crit * standard_error,
Upper = beta_hat + crit * standard_error
)
```
```{r}
#| label: tbl-beta-hat-fit-prior-four
#| tbl-cap: Estimated coefficients using prior cumulative and lag with random effects for the slopes by company
#| include: TRUE
tbl_beta_hat_fit_prior_four |>
knitr::kable(digits = 2)
```
The coefficients appear reasonable: they are positive and decrease with lag. Notably, none of the confidence intervals include zero, leading us to conclude that the coefficients are significant.
```{r}
tbl_aic_prior <- tibble(
model = paste0('Prior model ', c('one', 'two', 'three', 'four')),
description = c(
'GLS',
'GLS w/lag interaction',
'LMM w/random intercept',
'LMM w/random slope'
),
aic = map_dbl(
list(fit_prior_one, fit_prior_two, fit_prior_three, fit_prior_four),
AIC
)
)
```
```{r}
#| label: tbl-aic-prior
#| tbl-cap: AIC for models based on prior cumulative paid
#| include: TRUE
tbl_aic_prior |>
knitr::kable(format.args = list(big.mark = ",", nsmall = 0))
```
As shown in @tbl-aic-prior, the final model with random effects for slope has the lowest AIC.
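A quick residual diagnostic for the selected model, using the default plot method in `nlme`:

```{r}
# Standardized residuals against fitted values for the selected model;
# a funnel shape would indicate remaining heteroscedasticity.
plot(fit_prior_four)
```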
```{r}
#| label: fig-beta-prior-four
#| fig-cap: 'Estimate and confidence interval of beta estimates for prior model four'
#| include: TRUE
tbl_beta_hat_fit_prior_four |>
mutate(
Coefficient = Coefficient |> fct_relevel('prior_10', after = Inf)
) |>
ggplot(aes(Coefficient)) +
geom_point(aes(y = Estimate)) +
geom_errorbar(aes(ymin = Lower, ymax = Upper))
```
```{r}
#| label: fig-beta-prior-four-lag-6-10
#| fig-cap: 'Estimate and confidence interval of beta estimates for lags 6 through 10 for prior model four'
#| include: TRUE
tbl_beta_hat_fit_prior_four |>
filter(Coefficient %in% paste0('prior_', c(6:10))) |>
mutate(
Coefficient = Coefficient |> fct_relevel('prior_10', after = Inf)
) |>
ggplot(aes(Coefficient)) +
geom_point(aes(y = Estimate)) +
geom_errorbar(aes(ymin = Lower, ymax = Upper)) +
theme_minimal()
```
```{r}
# Contrast vectors over the nine fixed effects prior_2 through prior_10;
# positions 7, 8, and 9 correspond to prior_8, prior_9, and prior_10.
L_8_9 <- c(0, 0, 0, 0, 0, 0, 1, -1, 0)
L_9_10 <- c(0, 0, 0, 0, 0, 0, 0, 1, -1)
```
It is interesting to observe the confidence intervals around the fixed effect coefficients for model four. @fig-beta-prior-four and @fig-beta-prior-four-lag-6-10 show the estimates and 95% confidence intervals. Looking particularly at @fig-beta-prior-four-lag-6-10, we may wonder whether those coefficients are truly different. Using contrasts of $L_1 = \left[0, 0, 0, 0, 0, 0, 1, -1, 0 \right]$ and $L_2 = \left[0, 0, 0, 0, 0, 0, 0, 1, -1 \right]$, we perform tests to assess whether $\pmb{L}\beta = 0$.
For testing $H_0: \beta_8 = \beta_9$, we find sufficient evidence to reject the null hypothesis.
```{r}
anova.lme(fit_prior_four, L = L_8_9, adjustSigma = TRUE)
```
And we find the same for testing $H_0: \beta_9 = \beta_{10}$.
```{r}
anova.lme(fit_prior_four, L = L_9_10, adjustSigma = TRUE)
```
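The two hypotheses may also be tested jointly by stacking the contrast vectors into a matrix:

```{r}
# Joint test of beta_8 = beta_9 and beta_9 = beta_10 using a two-row
# contrast matrix.
anova.lme(fit_prior_four, L = rbind(L_8_9, L_9_10), adjustSigma = TRUE)
```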
## Models using earned premium
We conclude our modeling with a look at one model using net earned premium as a predictor. Such a model was first suggested in a paper by @stanard.
As with modeling against the prior cumulative loss, we will assume that the random effects are independent. Note that --- in contrast to the use of prior cumulative loss --- we are able to estimate values for lag 1.
$$
\begin{aligned}
Y_{ij} &= \left(\beta_j + b_{ij}\right) X_{ij} + \epsilon_{ij} \\
\pmb{b}_i &\sim N\left(\pmb{0}, \pmb{D}\right), \qquad
\pmb{D} = \begin{pmatrix}
D_{11} & 0 & \cdots & 0 \\
0 & D_{22} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & D_{10,10}
\end{pmatrix}
\end{aligned}
$$

Here $X_{ij}$ denotes net earned premium, with a fixed slope $\beta_j$ and a company-specific random slope $b_{ij}$ for each development lag $j$.
```{r}
mat_int <- model.matrix(
incremental_paid ~ 0 + net_ep:lag_factor,
data = tbl_wkcomp_ep
)
colnames(mat_int) <- paste0('net_ep_', 1:10)
tbl_wkcomp_ep <- cbind(
tbl_wkcomp_ep, mat_int
)
```
```{r}
formula_net_ep <- paste(
'incremental_paid ~ 0 +',
paste0('net_ep_', 1:10, collapse = ' + ')
) |>
as.formula()
fit_ep_four <- lme(
fixed = formula_net_ep,
random = list(company = pdBlocked(list(
~ 0 + net_ep_1,
~ 0 + net_ep_2,
~ 0 + net_ep_3,
~ 0 + net_ep_4,
~ 0 + net_ep_5,
~ 0 + net_ep_6,
~ 0 + net_ep_7,
~ 0 + net_ep_8,
~ 0 + net_ep_9,
~ 0 + net_ep_10
))
),
data = tbl_wkcomp_ep
)
```
```{r}
beta_hat <- fixed.effects(fit_ep_four)
standard_error <- fit_ep_four$varFix |>
diag() |>
sqrt()
degrees_of_freedom <- fit_ep_four$fixDF$X
crit <- qt(1 - alpha / 2, df = degrees_of_freedom)
tbl_beta_hat_fit_ep_four <- tibble(
Coefficient = names(beta_hat),
Estimate = beta_hat,
`Degrees of Freedom` = degrees_of_freedom,
`Standard Error` = standard_error,
Lower = beta_hat - crit * standard_error,
Upper = beta_hat + crit * standard_error
)
```
```{r}
#| label: tbl-beta-hat-fit-ep-four
#| tbl-cap: Estimated coefficients using net earned premium by lag with random effects for the slopes by company
#| include: TRUE
tbl_beta_hat_fit_ep_four |>
knitr::kable(digits = 2)
```
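To put such a fit to work for reserving, one would predict the incremental payments for the unobserved future cells and sum them. A sketch follows; `tbl_future` is a hypothetical data frame, not constructed in this paper, holding one row per future company/accident-year/lag cell with the same `net_ep_*` columns used in fitting.

```{r}
#| eval: false
# Hypothetical sketch: project future incremental payments and sum them
# into a reserve estimate by company. level = 1 uses the company-specific
# random effects (BLUPs); level = 0 would give population-average predictions.
pred_future <- predict(fit_ep_four, newdata = tbl_future, level = 1)

tbl_future |>
  mutate(predicted_incremental = pred_future) |>
  group_by(company) |>
  summarise(reserve = sum(predicted_incremental))
```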
# Conclusion
For this data set we find that the use of covariates in addition to the lag provides a more reasonable estimate. In particular, the use of prior cumulative loss or the net earned premium appear to outperform a model which uses the lag alone. Additionally, treating the company as a random effect improves the fit. This is consistent with our understanding of the operation of insurance companies and the manner of regulation in the markets they serve.
Future work could focus on the granularity of insurance company as a random effect. Using market knowledge, one could group companies in similar industries to reduce the number of random effect parameters. Alternatively, one could explore factor analysis to identify latent groupings in the data we have.
The data under examination exhibits a significant amount of skew. A model type such as a generalized linear mixed model may accommodate this better.
# References
::: {#refs}
:::