---
title: "Homework"
output:
  pdf_document: default
  html_document:
    df_print: paged
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, message = FALSE, warning=FALSE)
```
# Exercise 2
Data: The series are of various lengths but all end in 1988. The data set contains the following series: consumer price index, industrial production, nominal GNP, velocity, employment, interest rate, nominal wages, GNP deflator, money stock, real GNP, stock prices (S&P500), GNP per capita, real wages, unemployment.

We look only at GNP per capita, nominal GNP and real GNP.

Source: C. R. Nelson and C. I. Plosser (1982), Trends and Random Walks in Macroeconomic Time Series. Journal of Monetary Economics, 10, 139–162. doi: 10.1016/0304-3932(82)90012-5. Formerly in the Journal of Business and Economic Statistics data archive, currently at http://korora.econ.yale.edu/phillips/data/np&enp.dat.
## 1.
### Stationarity
```{r echo=FALSE, results='hide'}
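# library() stops with an error if a package is missing; require() only returns FALSE
library(MTS)       # VAR, refVAR, VARorder, VARpred, mq
library(tseries)   # NelPlo data, arma
library(sparsevar) # fitVAR, plotMatrix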
```
First we read the data and do some preprocessing.
```{r}
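# Load the Nelson-Plosser series (gnp.capita, gnp.nom, gnp.real, ...)
data(NelPlo)
# Columns 1 and 2 are constant placeholders, so the GNP series sit in columns 3 to 5
gnp <- cbind(1,2,gnp.capita, gnp.nom, gnp.real)
# Number of observations
n <- nrow(gnp)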
```
We look at three versions of the data: the original series, the log-transformed series, and the differences of the log-transformed series (scaled by 100, i.e. approximate percentage growth rates).
```{r}
Y_orig <- gnp[, 3:5]
Y_log <- log(gnp[, 3:5])
# First differences of the logs, scaled to percent
Y_rate <- Y_log[2:n, ] - Y_log[1:(n - 1), ]
Y_rate <- 100 * Y_rate
```
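The differences of the logs are approximate percentage growth rates. A minimal sketch with a made-up series (`demo_x` and the other `demo_` names are hypothetical, not part of the data set) shows that the manual row subtraction used above agrees with base R's `diff()`:

```{r}
# Hypothetical series growing by exactly 5% per period (not real GNP data)
demo_x <- 100 * 1.05^(0:9)
demo_n <- length(demo_x)

# Manual log difference, as in the chunk above
demo_manual <- 100 * (log(demo_x)[2:demo_n] - log(demo_x)[1:(demo_n - 1)])

# Equivalent computation with diff()
demo_diff <- 100 * diff(log(demo_x))

all.equal(demo_manual, demo_diff)
# Each value equals 100 * log(1.05), which is close to the 5% growth rate
round(demo_diff[1], 3)
```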
*Original:*
```{r}
par(mfrow=c(2,3))
plot(Y_orig[,1],type="l",xlab="",ylab="Level",main="GNP per Capita")
plot(Y_orig[,2],type="l",xlab="",ylab="Level",main="Nominal GNP")
plot(Y_orig[,3],type="l",xlab="",ylab="Level",main="Real GNP")
acf(Y_orig[,1],main="")
acf(Y_orig[,2],main="")
acf(Y_orig[,3],main="")
```
The original series trend upward and their autocorrelations decay only slowly, so the data are not stationary.
*Log-Transformation:*
```{r fig.height=5, fig.width=9}
par(mfrow=c(2,3))
plot(Y_log[,1],type="l",xlab="",ylab="Log",main="GNP per Capita")
plot(Y_log[,2],type="l",xlab="",ylab="Log",main="Nominal GNP")
plot(Y_log[,3],type="l",xlab="",ylab="Log",main="Real GNP")
acf(Y_log[,1],main="")
acf(Y_log[,2],main="")
acf(Y_log[,3],main="")
```
The same holds for the log-transformed series: they are still not stationary.
*Differences of the log-transformation (rates):*
```{r fig.height=5, fig.width=9}
par(mfrow=c(2,3))
plot(Y_rate[,1],type="l",xlab="",ylab="GNP per Capita")
plot(Y_rate[,2],type="l",xlab="",ylab="Nominal GNP")
plot(Y_rate[,3],type="l",xlab="",ylab="Real GNP")
acf(Y_rate[,1],main="")
acf(Y_rate[,2],main="")
acf(Y_rate[,3],main="")
```
For the rates we find stationarity for all three GNP series. For all three the autocorrelation vanishes from lag 3 onward, which suggests $q=2$ for the MA part.
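The visual impression can be backed by a unit-root test. The sketch below uses simulated data (all `demo_` names are hypothetical) and the Phillips-Perron test `PP.test()` from base R's stats package. This is a swapped-in technique that the analysis here does not use; a small p-value speaks against a unit root.

```{r}
set.seed(1)
# Simulated random walk (non-stationary by construction) and its first differences
demo_walk <- cumsum(rnorm(200))
demo_steps <- diff(demo_walk)

# Phillips-Perron test: the null hypothesis is a unit root
demo_pp_walk <- PP.test(demo_walk)
demo_pp_steps <- PP.test(demo_steps)

# p-values for the level series and for the differenced series
c(walk = demo_pp_walk$p.value, steps = demo_pp_steps$p.value)
```

The same call could be applied to each column of `Y_log` and `Y_rate` to check whether differencing removes the unit root.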
Looking at the partial autocorrelation we find the following:
```{r fig.height=2}
par(mfrow=c(1,3))
pacf(Y_rate[,1], main="GNP per Capita")
pacf(Y_rate[,2], main="Nominal GNP")
pacf(Y_rate[,3], main="Real GNP")
```
For the GNP series the partial autocorrelation vanishes from lag 2 onward, which suggests $p=1$ for the AR part.
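These order rules come from the theoretical correlation patterns: an MA($q$) process has zero autocorrelation beyond lag $q$, and an AR($p$) process has zero partial autocorrelation beyond lag $p$. A minimal sketch with `ARMAacf()` from the stats package, using made-up coefficients rather than fitted values:

```{r}
# Theoretical ACF of an MA(2) process with hypothetical coefficients
demo_acf_ma2 <- ARMAacf(ma = c(0.5, 0.3), lag.max = 5)
round(demo_acf_ma2, 3)   # lags 0 to 5; zero from lag 3 onward

# Theoretical PACF of an AR(1) process with a hypothetical coefficient
demo_pacf_ar1 <- ARMAacf(ar = 0.6, lag.max = 5, pacf = TRUE)
round(demo_pacf_ar1, 3)  # lags 1 to 5; zero from lag 2 onward
```

In sample ACF and PACF plots the cut-off is never exact, so values inside the confidence bands are treated as zero.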
### ARMA
We can create an ARMA model for each series individually.
*GNP per Capita:*
```{r}
arma.1 <- arma(Y_rate[,1], order = c(1, 2))
summary(arma.1)
```
*Nominal GNP:*
```{r}
arma.2 <- arma(Y_rate[,2], order = c(1, 2))
summary(arma.2)
```
*Real GNP:*
```{r}
arma.3 <- arma(Y_rate[,3], order = c(1, 2))
summary(arma.3)
```
None of the three ARMA(1,2) fits is perfect; the model for real GNP in particular shows flaws.
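One way to make the quality of such a fit concrete is a Ljung-Box test on the residuals. The sketch below uses simulated data and `arima()` from the stats package instead of `tseries::arma()`, so it illustrates the diagnostic only and says nothing about the GNP fits. All `demo_` names and coefficients are hypothetical; `fitdf = 3` accounts for the three estimated ARMA coefficients.

```{r}
set.seed(2)
# Simulated ARMA(1,2) series with hypothetical coefficients
demo_series <- arima.sim(model = list(ar = 0.5, ma = c(0.4, 0.2)), n = 300)

# Fit an ARMA(1,2), i.e. an ARIMA model with d = 0
demo_fit <- arima(demo_series, order = c(1, 0, 2))

# Ljung-Box test on the residuals; a large p-value is compatible with white noise
demo_lb <- Box.test(residuals(demo_fit), lag = 10, type = "Ljung-Box", fitdf = 3)
demo_lb$p.value
```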
## 2.
### VAR(1) model
```{r}
mod <- VAR(Y_rate, 1)
res <- mod$residuals
```
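Conceptually, a VAR(1) regresses each series on the previous values of all series, $Y_t = c + A Y_{t-1} + \varepsilon_t$. The following sketch estimates such a model by ordinary least squares on simulated data. The matrix `demo_A` and all other `demo_` names are hypothetical, and this is not the implementation used by `MTS::VAR`.

```{r}
set.seed(3)
demo_k <- 2
demo_T <- 5000
# Hypothetical stable coefficient matrix (eigenvalues 0.5 and 0.3)
demo_A <- matrix(c(0.5, 0.1,
                   0.0, 0.3), nrow = demo_k, byrow = TRUE)

# Simulate Y_t = A Y_{t-1} + e_t with standard normal errors
demo_Y <- matrix(0, nrow = demo_T, ncol = demo_k)
for (i in 2:demo_T) {
  demo_Y[i, ] <- demo_A %*% demo_Y[i - 1, ] + rnorm(demo_k)
}

# Regress Y_t on an intercept and Y_{t-1}; one column of coefficients per equation
demo_lagged <- cbind(1, demo_Y[1:(demo_T - 1), ])
demo_coef <- solve(crossprod(demo_lagged), crossprod(demo_lagged, demo_Y[2:demo_T, ]))

# Drop the intercept row and transpose to obtain the estimate of A
demo_A_hat <- t(demo_coef[-1, ])
round(demo_A_hat, 2)
```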
Checking the white noise (WN) assumption for the residuals:
```{r, results='hide', fig.keep='all', fig.height=3}
mq(res, adj = 1 * 3^2) # adj = p * k^2, the number of estimated AR coefficients
par(mfrow=c(1,3))
acf(res[,1], main="")
acf(res[,2], main="")
acf(res[,3], main="")
```
```{r results='hide'}
VARorder(Y_rate) # Selected order is 1
mod2 <- refVAR(mod, thres = 1.96) # remove non-significant coefficients using t statistics
# Information criteria of the full and the reduced model
mod$aic
mod2$aic
mod$bic
mod2$bic
```
Judging by the AIC and BIC, the reduced model performs better.
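The idea behind the refinement, dropping coefficients whose t statistic is below a threshold and re-estimating, can be sketched for a single regression equation. The example uses `lm()` on simulated data with hypothetical variable names; it mimics the principle only and is not how `refVAR` is implemented.

```{r}
set.seed(4)
demo_n_obs <- 200
demo_df <- data.frame(x1 = rnorm(demo_n_obs),
                      x2 = rnorm(demo_n_obs),
                      x3 = rnorm(demo_n_obs))
# Only x1 truly enters the model; x2 and x3 are pure noise regressors
demo_df$y <- 1 + 2 * demo_df$x1 + rnorm(demo_n_obs)

demo_full <- lm(y ~ x1 + x2 + x3, data = demo_df)

# Keep regressors with |t| >= 1.96 (the intercept is always kept)
demo_t <- summary(demo_full)$coefficients[-1, "t value"]
demo_keep <- names(demo_t)[abs(demo_t) >= 1.96]
demo_reduced <- lm(reformulate(demo_keep, response = "y"), data = demo_df)

# Lower values are better for both criteria
rbind(AIC = c(full = AIC(demo_full), reduced = AIC(demo_reduced)),
      BIC = c(full = BIC(demo_full), reduced = BIC(demo_reduced)))
```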
```{r message=FALSE, results='hide'}
pred1 <- VARpred(mod,1)
pred2 <- VARpred(mod2,1)
rmse <- rbind(mod1=pred1$rmse, mod2=pred2$rmse)
rownames(rmse) <- c("model1", "model2")
```
```{r echo=FALSE}
rmse
```
The prediction error is slightly lower for the full model (model1), but the difference is small. It can therefore make sense to prefer the simpler reduced model (model2).
## 3. VAR with LASSO
```{r}
mod_lasso <- fitVAR(Y_rate, p = 1, penalty = "ENET", method = "cv")
```
Looking at the coefficients, we see that only the coefficients for real GNP are of considerable magnitude.
```{r}
# Lag-1 coefficient matrix of the penalized fit
A1lasso <- mod_lasso$A[[1]]
plotMatrix(A1lasso)
```
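Coefficients that are exactly zero or very small are typical for an L1 penalty, which shrinks small coefficients to zero. For an orthonormal design the lasso solution is the soft-thresholded least-squares estimate; the sketch below applies that operator to a made-up coefficient matrix (`demo_soft`, `demo_ols` and `demo_lambda` are hypothetical). The cross-validated fit in `fitVAR` is computed differently, so this only illustrates the shrinkage effect.

```{r}
# Soft-thresholding operator: shrink towards zero by lambda, set small values to zero
demo_soft <- function(z, lambda) sign(z) * pmax(abs(z) - lambda, 0)

# Hypothetical least-squares coefficient matrix of a 3-variable VAR(1)
demo_ols <- matrix(c( 0.05, -0.02, 0.40,
                     -0.08,  0.03, 0.55,
                      0.01,  0.06, 0.35), nrow = 3, byrow = TRUE)
demo_lambda <- 0.1

# Entries with absolute value below lambda become zero, the others shrink by lambda
demo_shrunk <- demo_soft(demo_ols, demo_lambda)
demo_shrunk
```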
Checking the white noise (WN) assumption for the residuals:
```{r results='hide', fig.keep='all', fig.height=3}
res_lasso <- mod_lasso$residuals
mq(res_lasso, adj = 1 * 3^2) # adj = p * k^2, the number of AR coefficients
par(mfrow=c(1,3))
acf(res_lasso[,1], main="")
acf(res_lasso[,2], main="")
acf(res_lasso[,3], main="")
```
We see that the white noise assumption holds for the residuals of all three series.
### Comparison with the simple VAR
```{r}
mod_lasso$A
```
I don't know :(
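A possible starting point for the comparison is to put the two lag-1 coefficient matrices side by side, for example by counting non-zero entries and measuring their distance. The sketch uses two made-up matrices (`demo_var` and `demo_lasso` are hypothetical stand-ins for the estimates of the simple VAR and of the penalized fit) and makes no claim about the actual results.

```{r}
# Hypothetical lag-1 coefficient matrices (not estimates from the GNP data)
demo_var <- matrix(c(0.10, -0.20, 0.50,
                     0.05,  0.15, 0.60,
                     0.02, -0.10, 0.40), nrow = 3, byrow = TRUE)
demo_lasso <- matrix(c(0, 0, 0.30,
                       0, 0, 0.45,
                       0, 0, 0.25), nrow = 3, byrow = TRUE)

# Sparsity: number of non-zero coefficients in each matrix
c(var = sum(demo_var != 0), lasso = sum(demo_lasso != 0))

# Overall distance between the two estimates (Frobenius norm)
norm(demo_var - demo_lasso, type = "F")
```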