# Dependency
Generally speaking, there is dependence within a sequence of random variables.
Recall the difference between independent and dependent data:
*Definition:* **Independence**
$X_1, X_2, \ldots, X_T$ are independent if and only if
$$P\left(X_1 \le x_1, X_2 \le x_2,\ldots, X_{T} \le x_T \right) = P\left(X_1 \le x_1\right) P\left(X_2 \le x_2\right) \cdots P\left(X_{T} \le x_T \right)$$
for any $T \ge 2$ and $x_1, \ldots, x_T \in \mathbb{R}$.
*Definition:* **Dependence**
If $X_1, X_2, \ldots, X_T$ are identically distributed but dependent, then
\[\left| {P\left( {{X_1} \le {x_1},{X_2} \le {x_2}, \ldots ,{X_T} \le {x_T}} \right) - P\left( {{X_1} \le {x_1}} \right)P\left( {{X_2} \le {x_2}} \right) \cdots P\left( {{X_T} \le {x_T}} \right)} \right| \ne 0\]
for some $x_1, \ldots, x_T \in \mathbb{R}$.
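To make the contrast concrete, here is a minimal simulation sketch (the sample size and AR coefficient are illustrative choices, not from the text): an iid Gaussian sequence satisfies the factorization above, while an AR(1) sequence is identically distributed but dependent, and the dependence shows up when comparing neighboring observations.
```{r}
set.seed(1)
n <- 500
x_iid <- rnorm(n)                                  # independent N(0,1) draws
x_dep <- arima.sim(model = list(ar = 0.8), n = n)  # dependent AR(1) draws

# Compare each observation with its successor, i.e. (X_t, X_{t+1}) pairs:
cor(x_iid[-n], x_iid[-1])  # near 0: no dependence across time
cor(x_dep[-n], x_dep[-1])  # near 0.8: strong dependence across time
```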
## Measuring (Linear) Dependence
There are many forms of dependency...
![dependency](images/1280px-Correlation_examples2.png)
However, the methods we will use, covariance and correlation, are specific to measuring linear dependence. As a result, these tools are less helpful for measuring monotonic dependence, and much less helpful for measuring nonlinear dependence.
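As a minimal sketch of this limitation (simulated values, not from the text): with $X$ standard normal and $Y = X^2$, $Y$ is completely determined by $X$, yet both the linear (Pearson) and monotonic (Spearman) correlations are essentially zero.
```{r}
set.seed(2)
x <- rnorm(10000)
y <- x^2  # fully dependent on x, but neither linearly nor monotonically

cor(x, y)                       # approx. 0: the linear measure misses it
cor(x, y, method = "spearman")  # approx. 0: a monotonic measure misses it too
plot(x, y, pch = ".", main = "Dependent, yet uncorrelated")
```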
### Autocovariance Function
Dependence between $T$ different random variables is difficult to measure in one shot! So we consider just two random variables, $X_t$ and $X_{t+h}$. One (linear) measure of dependence is then the covariance between $\left(X_t , X_{t+h}\right)$. Since both are observations of the same process at two different time
points, the covariance between $X_t$ and $X_{t+h}$ is called the autocovariance.
*Definition:* **Autocovariance Function**
The **Autocovariance Function** is defined as the second moment product
\[{\gamma _X}\left( {t,t+h} \right) = \operatorname{cov} \left( {{X_t},{X_{t+h}}} \right) = E\left[ {\left( {{X_t} - {\mu _t}} \right)\left( {{X_{t+h}} - {\mu _{t+h}}} \right)} \right]\]
for all $t$ and $h$.
The notation used above corresponds to:
\[\begin{aligned}
\operatorname{cov} \left( {{X_t},{X_{t+h}}} \right) &= E\left[ {{X_t}{X_{t+h}}} \right] - E\left[ {{X_t}} \right]E\left[ {{X_{t+h}}} \right] \\
E\left[ {{X_t}} \right] &= \int\limits_{ - \infty }^\infty {x \cdot {f_{X_t}}\left( x \right)dx} \\
E\left[ {{X_t}{X_{t+h}}} \right] &= \int\limits_{ - \infty }^\infty {\int\limits_{ - \infty }^\infty {{x_1}{x_2} \cdot {f_{X_t,X_{t+h}}}\left( {{x_1},{x_2}} \right)d{x_1}d{x_2}} } \\
\end{aligned} \]
We normally drop the subscript referring to the time series when it is clear which series the autocovariance function references,
e.g. ${\gamma _X}\left( {t,t+h} \right) = {\gamma}\left( {t,t+h} \right)$.
The more commonly used formulation for weakly stationary processes (more in the next section) is:
\[\gamma \left( {t,t + h} \right) = \operatorname{cov} \left( {{X_t},{X_{t+h}}} \right) = \gamma \left( h \right)\]
A few other notes:
1. The autocovariance function is symmetric. That is, ${\gamma}\left( {t,t+h} \right) = {\gamma}\left( {t+h,t} \right)$
2. Just as any covariance, ${\gamma}\left( {t,t+h} \right)$ is "scale dependent", since ${\gamma}\left( {t,t+h} \right) \in \mathbb{R}$, i.e. $-\infty < {\gamma}\left( {t,t+h} \right) < +\infty$
    1. If $\left| {\gamma}\left( {t,t+h} \right) \right|$ is "close" to 0, then $X_t$ and $X_{t+h}$ are "less dependent";
    2. If $\left| {\gamma}\left( {t,t+h} \right) \right|$ is "far" from 0, then $X_t$ and $X_{t+h}$ are "more dependent".
3. ${\gamma}\left( {t,t+h} \right)=0$ does not imply that $X_t$ and $X_{t+h}$ are independent.
4. However, if $X_t$ and $X_{t+h}$ are jointly normally distributed, then ${\gamma}\left( {t,t+h} \right)=0$ does imply that $X_t$ and $X_{t+h}$ are independent.
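As a sketch of the stationary formulation $\gamma \left( h \right)$, assume an AR(1) process with coefficient $\phi = 0.6$ and unit innovation variance (an illustrative choice), for which $\gamma \left( h \right) = \phi^{|h|} / \left(1 - \phi^2\right)$. The empirical autocovariances from `acf()` should be close to these theoretical values.
```{r}
set.seed(3)
phi <- 0.6
x <- arima.sim(model = list(ar = phi), n = 5000)

emp  <- acf(x, lag.max = 5, type = "covariance", plot = FALSE)$acf[, 1, 1]
theo <- phi^(0:5) / (1 - phi^2)  # theoretical gamma(h) for this AR(1)

round(cbind(lag = 0:5, empirical = emp, theoretical = theo), 3)
```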
### Autocorrelation Function (ACF)
A "simplified" $\gamma \left(t, t+h\right)$ is the Autocorrelation (AC) between $X_t$ and $X_{t+h}$, which is scale free! It is simply defined as
$$\rho \left( {{X_t},{X_{t + h}}} \right) = Corr\left( {{X_t},{X_{t + h}}} \right) = \frac{{Cov\left( {{X_t},{X_{t + h}}} \right)}}{{{\sigma _{{X_t}}}{\sigma _{{X_{t + h}}}}}}$$
The more commonly used formulation for weakly stationary processes (more in the next section) is:
\[\rho \left( {{X_t},{X_{t + h}}} \right) = \frac{{Cov\left( {{X_t},{X_{t + h}}} \right)}}{{{\sigma _{{X_t}}}{\sigma _{{X_{t + h}}}}}} = \frac{{\gamma \left( h \right)}}{{\gamma \left( 0 \right)}} = \rho \left( h \right)\]
Therefore, the autocorrelation function is only a function of the lag $h$ between observations.
Just as any correlation:
1. $\rho \left( {{X_t},{X_{t + h}}} \right)$ is scale free, with $-1 \le \rho \left( {{X_t},{X_{t + h}}} \right) \le 1$
2. The closer $\rho \left( {{X_t},{X_{t + h}}} \right)$ is to $\pm 1$, the more linearly dependent $\left({ X_t, X_{t+h} } \right)$ are.
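A minimal sketch of the scale-free property (the rescaling constants are arbitrary): applying the linear transformation $100 X_t + 5$ changes the autocovariance but leaves the autocorrelation unchanged.
```{r}
set.seed(4)
x <- arima.sim(model = list(ar = 0.7), n = 2000)
y <- 100 * x + 5  # arbitrary linear rescaling of the same series

acf(x, lag.max = 3, plot = FALSE)$acf[, 1, 1]  # autocorrelations of x ...
acf(y, lag.max = 3, plot = FALSE)$acf[, 1, 1]  # ... are identical for y
```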
Remember... When using correlation....
![correlation_sillies](images/correlation-does-not-imply-causation.jpg)
### Cross dependency functions
Consider two time series: $\left(X_t \right)$ and $\left(Y_t \right)$.
Then the cross-covariance function between two series $\left(X_t \right)$ and $\left(Y_t \right)$ is:
\[{\gamma _{XY}}\left( {t,t + h} \right) = \operatorname{cov} \left( {{X_t},{Y_{t + h}}} \right) = E\left[ {\left( {{X_t} - E\left[ {{X_t}} \right]} \right)\left( {{Y_{t + h}} - E\left[ {{Y_{t + h}}} \right]} \right)} \right]\]
The cross-correlation function is given by
\[{\rho _{XY}}\left( {t,t + h} \right) = Corr\left( {{X_t},{Y_{t + h}}} \right) = \frac{{{\gamma _{XY}}\left( {t,t + h} \right)}}{{{\sigma _{{X_t}}}{\sigma _{{Y_{t + h}}}}}}\]
These ideas can be extended beyond the bivariate case to a general multivariate setting.
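As a sketch (the lag-2 construction and noise level are illustrative assumptions), suppose $Y_t = X_{t-2} + \varepsilon_t$. Under the convention of `ccf()`, which estimates the correlation between $X_{t+h}$ and $Y_t$, the cross-correlation should then spike at $h = -2$.
```{r}
set.seed(5)
n <- 1000
x <- rnorm(n)
y <- c(rep(0, 2), x[1:(n - 2)]) + rnorm(n, sd = 0.5)  # Y_t = X_{t-2} + noise

# ccf(x, y) estimates cor(X_{t+h}, Y_t); expect a spike at h = -2
ccf(x, y, lag.max = 5)
```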
### Sample Autocovariance and Autocorrelation Functions
*Definition:* **Sample Autocovariance Function**
The **Sample Autocovariance Function** is defined as:
\[\hat \gamma \left( h \right) = \frac{1}{T}\sum\limits_{t = 1}^{T - h} {\left( {{X_t} - \bar X} \right)\left( {{X_{t + h}} - \bar X} \right)} \]
*Definition:* **Sample Autocorrelation function**
The **Sample Autocorrelation function** is defined as:
\[\hat \rho \left( h \right) = \frac{{\hat \gamma \left( h \right)}}{{\hat \gamma \left( 0 \right)}}\]
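A minimal sketch computing $\hat \gamma \left( h \right)$ and $\hat \rho \left( h \right)$ directly from the definitions above and checking them against `acf()`, which uses the same divisor $T$; the simulated AR(1) series is an illustrative choice.
```{r}
set.seed(6)
x <- arima.sim(model = list(ar = 0.5), n = 300)

# Sample autocovariance exactly as defined above (note the divisor T, not T - h)
gamma_hat <- function(h, x) {
  T <- length(x)
  xbar <- mean(x)
  sum((x[1:(T - h)] - xbar) * (x[(1 + h):T] - xbar)) / T
}

g   <- sapply(0:5, gamma_hat, x = x)
rho <- g / g[1]  # hat rho(h) = hat gamma(h) / hat gamma(0)

cbind(h = 0:5, by_hand = rho, via_acf = acf(x, lag.max = 5, plot = FALSE)$acf[, 1, 1])
```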