
Commit

Merge pull request #23 from rickecon/mle
Merging
rickecon authored Dec 6, 2023
2 parents 51dc8ca + a38fda4 commit 66afef4
Showing 8 changed files with 48 additions and 50 deletions.
File renamed without changes.
File renamed without changes.
2 changes: 1 addition & 1 deletion docs/book/_toc.yml
@@ -38,7 +38,7 @@ parts:
numbered: True
chapters:
- file: struct_est/intro
- file: struct_est/MaxLikelihood
- file: struct_est/MLE
- file: struct_est/GMM
- file: struct_est/SMM
- caption: Appendix
2 changes: 1 addition & 1 deletion docs/book/basic_empirics/BasicEmpirMethods.md
@@ -386,7 +386,7 @@ results = reg1.fit()
type(results)
```

We now have the fitted regression model stored in `results` (see [statsmodels.regression.linear_model.RegressionResultsWrapper](http://www.statsmodels.org/dev/generated/statsmodels.regression.linear_model.RegressionResults.html)). The `results` from the `reg1.fit()` command is a regression results object with a lot of information, similar to the results object of the `scipy.optimize.minimize()` function we worked with in the {ref}`Chap_MaxLikeli` and {ref}`Chap_GMM` chapters.
We now have the fitted regression model stored in `results` (see [statsmodels.regression.linear_model.RegressionResultsWrapper](http://www.statsmodels.org/dev/generated/statsmodels.regression.linear_model.RegressionResults.html)). The `results` from the `reg1.fit()` command is a regression results object with a lot of information, similar to the results object of the `scipy.optimize.minimize()` function we worked with in the {ref}`Chap_MLE` and {ref}`Chap_GMM` chapters.

To view the OLS regression results, we can call the `.summary()` method.
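
For example, a minimal illustration using the fitted `results` object (standard `statsmodels` usage; the chapter's own output cells are not shown in this excerpt):

```python
# Display coefficient estimates, standard errors, t-statistics, and R-squared
print(results.summary())
```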

2 changes: 1 addition & 1 deletion docs/book/basic_empirics/LogisticReg.md
@@ -442,4 +442,4 @@ The footnotes from this chapter.

[^GMM]: See the {ref}`Chap_GMM` chapter of this book.

[^MaxLikeli]: See the {ref}`Chap_MaxLikeli` chapter of this book.
[^MaxLikeli]: See the {ref}`Chap_MLE` chapter of this book.
docs/book/struct_est/{MaxLikelihood.md → MLE.md}
@@ -10,19 +10,19 @@ kernelspec:
name: python3
---

(Chap_MaxLikeli)=
(Chap_MLE)=
# Maximum Likelihood Estimation

This chapter describes the maximum likelihood estimation (MLE) method. All data and images from this chapter can be found in the data directory ([./data/maxlikeli/](https://github.com/OpenSourceEcon/CompMethods/tree/main/data/maxlikeli/)) and images directory ([./images/maxlikeli/](https://github.com/OpenSourceEcon/CompMethods/tree/main/images/maxlikeli/)) for the GitHub repository for this online book.
This chapter describes the maximum likelihood estimation (MLE) method. All data and images from this chapter can be found in the data directory ([./data/mle/](https://github.com/OpenSourceEcon/CompMethods/tree/main/data/mle/)) and images directory ([./images/mle/](https://github.com/OpenSourceEcon/CompMethods/tree/main/images/mle/)) for the GitHub repository for this online book.


(SecMaxLikeli_GenModel)=
(SecMLE_GenModel)=
## General characterization of a model and data generating process

Each of the model estimation approaches that we will discuss in this section on Maximum Likelihood estimation (MLE) and in subsequent sections on generalized method of moments (GMM) and simulated method of moments (SMM) involves choosing values of the parameters of a model to make the model match some number of properties of the data. Define a model or a data generating process (DGP) as,

```{math}
:label: EqMaxLikeli_GenMod
:label: EqMLE_GenMod
F(x_t, z_t|\theta) = 0
```

@@ -31,45 +31,45 @@ where $x_t$ and $z_t$ are variables, $\theta$ is a vector of parameters, and $F(
In richer examples, a model could also include inequalities representing constraints. But this is sufficient for our discussion. The goal of maximum likelihood estimation (MLE) is to choose the parameter vector of the model $\theta$ to maximize the likelihood of seeing the data produced by the model $(x_t, z_t)$.


(SecMaxLikeli_GenModel_SimpDist)=
(SecMLE_GenModel_SimpDist)=
### Simple distribution example

A simple example of a model is a statistical distribution [e.g., the normal distribution $N(\mu, \sigma)$].

```{math}
:label: EqMaxLikeli_GenMod_NormDistPDF
:label: EqMLE_GenMod_NormDistPDF
f(x|\theta) = \frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{(x - \mu)^2}{2\sigma^2}}
```

The probability of drawing value $x_i$ from the distribution $f(x|\theta)$ is $f(x_i|\theta)$. The probability of drawing the following vector of two observations $(x_1,x_2)$ from the distribution $f(x|\theta)$ is $f(x_1|\theta)\times f(x_2|\theta)$. We define the likelihood function of $N$ draws $(x_1,x_2,...x_N)$ from a model or distribution $f(x|\theta)$ as $\mathcal{L}$.

```{math}
:label: EqMaxLikeli_GenMod_NormDistLike
:label: EqMLE_GenMod_NormDistLike
\mathcal{L}(x_1,x_2,...x_N|\theta) \equiv \prod_{i=1}^N f(x_i|\theta)
```

Because it can be numerically difficult to maximize a product of many small numbers (one small value can dominate the entire product), it is almost always easier to use the log likelihood function $\ln(\mathcal{L})$.

```{math}
:label: EqMaxLikeli_GenMod_NormDistLnLike
:label: EqMLE_GenMod_NormDistLnLike
\ln\Bigl(\mathcal{L}(x_1,x_2,...x_N|\theta)\Bigr) \equiv \sum_{i=1}^N \ln\Bigl(f(x_i|\theta)\Bigr)
```
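
As a rough numerical illustration of this point (not the chapter's own code; synthetic standard normal draws):

```python
# The product of thousands of pdf values underflows to 0.0, while the sum of
# their logs stays finite, which is why we maximize the log likelihood instead.
import numpy as np
import scipy.stats as sts

rng = np.random.default_rng(0)
x = rng.normal(size=5_000)
pdf_vals = sts.norm.pdf(x)       # thousands of values, most of them less than 1
print(np.prod(pdf_vals))         # 0.0 because of numerical underflow
print(np.sum(np.log(pdf_vals)))  # finite log likelihood, on the order of -7,000
```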

The maximum likelihood estimate $\hat{\theta}_{MLE}$ is the following:

```{math}
:label: EqMaxLikeli_GenMod_NormDistMLE
:label: EqMLE_GenMod_NormDistMLE
\hat{\theta}_{MLE} = \theta:\quad \max_\theta \: \ln\mathcal{L} = \sum_{i=1}^N\ln\Bigl(f(x_i|\theta)\Bigr)
```
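
A minimal sketch of this estimator with synthetic normal data (illustrative only; the chapter itself works with `scipy.optimize.minimize()` in the same spirit) minimizes the negative log likelihood:

```python
# Estimate (mu, sigma) of a normal distribution by minimizing the negative
# log likelihood of synthetic draws; estimates should be near (5.0, 2.0)
import numpy as np
import scipy.stats as sts
import scipy.optimize as opt

rng = np.random.default_rng(42)
x = rng.normal(loc=5.0, scale=2.0, size=1_000)

def neg_log_lik(params, xvals):
    mu, sigma = params
    return -np.sum(sts.norm.logpdf(xvals, loc=mu, scale=sigma))

res = opt.minimize(neg_log_lik, x0=np.array([0.0, 1.0]), args=(x,),
                   method='L-BFGS-B', bounds=((None, None), (1e-8, None)))
mu_mle, sigma_mle = res.x
print(mu_mle, sigma_mle)
```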


(SecMaxLikeli_GenModel_Econ)=
(SecMLE_GenModel_Econ)=
### Economic example

An example of an economic model that follows the more general definition of $F(x_t, z_t|\theta) = 0$ is {cite}`BrockMirman:1972`. This model has multiple nonlinear dynamic equations, 7 parameters, 1 exogenous time series of variables, and about 5 endogenous time series of variables. Let's look at a simplified piece of that model, the production function, which is commonly used in total factor productivity estimation.

```{math}
:label: EqMaxLikeli_GenMod_EconProdFunc
:label: EqMLE_GenMod_EconProdFunc
Y_t = e^{z_t}(K_t)^\alpha(L_t)^{1-\alpha} \quad\text{where}\quad z_t = \rho z_{t-1} + (1 - \rho)\mu + \varepsilon_t \quad\text{and}\quad \varepsilon_t\sim N(0,\sigma^2)
```

Expand All @@ -82,54 +82,47 @@ The likelihood of a given data point is determined by $\varepsilon_t = z_t - \rh
The likelihood function of all the data is:

```{math}
:label: EqMaxLikeli_GenMod_EconProdFuncLike
:label: EqMLE_GenMod_EconProdFuncLike
\mathcal{L}\left(z_1,z_2,...z_T|\rho,\mu,\sigma\right) = \prod_{t=2}^T f(z_t, z_{t-1}|\rho,\mu,\sigma)
```

The log likelihood function of all the data is:

```{math}
:label: EqMaxLikeli_GenMod_EconProdFuncLnLike
:label: EqMLE_GenMod_EconProdFuncLnLike
\ln\Bigl(\mathcal{L}\bigl(z_1,z_2,...z_T|\rho,\mu,\sigma\bigr)\Bigr) = \sum_{t=2}^T \ln\Bigl(f(z_t, z_{t-1}|\rho,\mu,\sigma)\Bigr)
```

The maximum likelihood estimate of $\rho$, $\mu$, and $\sigma$ is given by the following maximization problem.

```{math}
:label: EqMaxLikeli_GenMod_EconProdFuncMLE
:label: EqMLE_GenMod_EconProdFuncMLE
(\hat{\rho}_{MLE},\hat{\mu}_{MLE},\hat{\sigma}_{MLE})=(\rho,\mu,\sigma):\quad \max_{\rho,\mu,\sigma}\ln\mathcal{L} = \sum_{t=2}^T \ln\Bigl(f(z_t, z_{t-1}|\rho,\mu,\sigma)\Bigr)
```
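
A hedged sketch of how these three parameters could be estimated is below. The values of `rho_true`, `mu_true`, and `sigma_true` are made up for the simulation and are not taken from {cite}`BrockMirman:1972`:

```python
# Simulate z_t = rho*z_{t-1} + (1 - rho)*mu + eps_t and recover (rho, mu, sigma)
# by maximizing the log likelihood of the implied errors eps_t
import numpy as np
import scipy.stats as sts
import scipy.optimize as opt

rng = np.random.default_rng(25)
rho_true, mu_true, sigma_true, T = 0.7, 10.0, 0.2, 200
z = np.empty(T)
z[0] = mu_true
for t in range(1, T):
    z[t] = rho_true * z[t - 1] + (1 - rho_true) * mu_true + rng.normal(0.0, sigma_true)

def neg_log_lik(params, zvals):
    rho, mu, sigma = params
    eps = zvals[1:] - rho * zvals[:-1] - (1 - rho) * mu  # implied errors for t = 2,...,T
    return -np.sum(sts.norm.logpdf(eps, loc=0.0, scale=sigma))

res = opt.minimize(neg_log_lik, x0=np.array([0.5, np.mean(z), np.std(z)]), args=(z,),
                   method='L-BFGS-B',
                   bounds=((-0.99, 0.99), (None, None), (1e-8, None)))
print(res.x)  # estimates of (rho, mu, sigma)
```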


(SecMaxLikeli_DistData)=
(SecMLE_DistData)=
## Comparisons of distributions and data

Import some data from the total points earned by all the students in two sections of an intermediate macroeconomics class for undergraduates at an unnamed University in a certain year (two semesters).
Import some data from the total points earned by all the students in two sections of an intermediate macroeconomics class for undergraduates at an unnamed University in a certain year (two semesters). Let's create a histogram of the data.

```{code-cell} ipython3
:tags: []
# Import the necessary libraries
import numpy as np
import scipy.stats as sts
import matplotlib.pyplot as plt
import requests
import io
# Download and save the data file Econ381totpts.txt
# Download and save the data file Econ381totpts.txt as NumPy array
url = ('https://raw.githubusercontent.com/OpenSourceEcon/CompMethods/' +
'main/data/maxlikeli/Econ381totpts.txt')
# data_file = requests.get(url, allow_redirects=True)
# open('../../../data/maxlikeli/Econ381totpts.txt', 'wb').write(data_file.content)
# Load the data as a NumPy array
data = np.loadtxt('../../../data/maxlikeli/Econ381totpts.txt')
```

Let's create a histogram of the data.

```{code-cell} ipython3
:tags: []
import matplotlib.pyplot as plt
'main/data/mle/Econ381totpts.txt')
data_file = requests.get(url)
if data_file.status_code == 200:
    # Wrap the downloaded text so np.loadtxt can parse it into a NumPy array
    data = np.loadtxt(io.StringIO(data_file.text))
else:
    print('Error downloading the file')
num_bins = 30
count, bins, ignored = plt.hist(data, num_bins, density=True,
@@ -138,15 +131,24 @@ plt.title('Intermediate macro scores: 2011-2012', fontsize=15)
plt.xlabel(r'Total points')
plt.ylabel(r'Percent of scores')
plt.xlim([0, 550])  # This gives the xmin and xmax to be plotted
plt.show()
```
<!-- ```{figure} ../../../images/mle/Econ381scores_hist.png
---
height: 500px
name: FigMLE_EconScoreHist
---
Intermediate macroeconomics midterm scores over two semesters
``` -->


(SecMaxLikeli_Exerc)=
(SecMLE_Exerc)=
## Exercises



(SecMaxLikeliFootnotes)=
(SecMLEfootnotes)=
## Footnotes

The footnotes from this chapter.
22 changes: 9 additions & 13 deletions docs/book/struct_est/SMM.md
@@ -28,7 +28,7 @@ Let the data be represented, in general, by $x$. This could have many variables,
\theta \equiv \left[\theta_1, \theta_2, ...\theta_K\right]^T
```

In the {ref}`Chap_MaxLikeli` chapter, we used data $x$ and model parameters $\theta$ to maximize the likelihood of drawing that data $x$ from the model given parameters $\theta$,
In the {ref}`Chap_MLE` chapter, we used data $x$ and model parameters $\theta$ to maximize the likelihood of drawing that data $x$ from the model given parameters $\theta$,

```{math}
:label: EqSMM_MLestimator
@@ -271,7 +271,7 @@ Let the parameter vector $\theta$ have length $K$ such that $K$ parameters are b

Recall that each element of $e(\tilde{x},x|\theta)$ is an average moment error across all simulations. $\hat{\Omega}$ from the previous section is the $R\times R$ variance-covariance matrix of the $R$ moment errors used to identify the $K$ parameters $\theta$ to be estimated. The estimated variance-covariance matrix $\hat{\Sigma}$ of the estimated parameter vector is a $K\times K$ matrix. We say the model is *exactly identified* if $K = R$ (number of parameters $K$ equals number of moments $R$). We say the model is *overidentified* if $K<R$. We say the model is *not identified* or *underidentified* if $K>R$.

Similar to the inverse Hessian estimator of the variance-covariance matrix of the maximum likelihood estimator from the {ref}`Chap_MaxLikeli` chapter, the SMM variance-covariance matrix is related to the derivative of the criterion function with respect to each parameter. The intuition is that if the second derivative of the criterion function with respect to the parameters is large, there is a lot of curvature around the criterion minimizing estimate. In other words, the parameters of the model are precisely estimated. The inverse of the Hessian matrix will be small.
Similar to the inverse Hessian estimator of the variance-covariance matrix of the maximum likelihood estimator from the {ref}`Chap_MLE` chapter, the SMM variance-covariance matrix is related to the derivative of the criterion function with respect to each parameter. The intuition is that if the second derivative of the criterion function with respect to the parameters is large, there is a lot of curvature around the criterion minimizing estimate. In other words, the parameters of the model are precisely estimated. The inverse of the Hessian matrix will be small.

Define $R\times K$ matrix $d(\tilde{x},x|\theta)$ as the Jacobian matrix of derivatives of the $R\times 1$ error vector $e(\tilde{x},x|\theta)$ from {eq}`EqSMM_MomError_vec`.
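
As a hedged sketch of how such a Jacobian could be computed numerically, where `err_vec()` is a hypothetical stand-in for the chapter's moment-error function (not its actual code):

```python
# Centered, second-order finite-difference approximation of the R x K Jacobian
# of the moment-error vector e(x~, x | theta) with respect to theta
import numpy as np

def jac_err(err_vec, theta, args=(), h_frac=1e-6):
    theta = np.asarray(theta, dtype=float)
    err0 = np.asarray(err_vec(theta, *args))
    jac = np.zeros((err0.shape[0], theta.shape[0]))
    for k in range(theta.shape[0]):
        h = h_frac * max(abs(theta[k]), 1.0)  # step size scaled to the parameter
        step = np.zeros_like(theta)
        step[k] = h
        # column k is (e(theta + step) - e(theta - step)) / (2h)
        jac[:, k] = (np.asarray(err_vec(theta + step, *args)) -
                     np.asarray(err_vec(theta - step, *args))) / (2 * h)
    return jac
```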

@@ -324,12 +324,12 @@ The following is a centered second-order finite difference numerical approximati
(SecSMM_CodeExmp)=
## Code Examples

In this section, we will use SMM to estimate parameters of the models from the {ref}`Chap_MaxLikeli` chapter and from the {ref}`Chap_GMM` chapter.
In this section, we will use SMM to estimate parameters of the models from the {ref}`Chap_MLE` chapter and from the {ref}`Chap_GMM` chapter.

(SecSMM_CodeExmp_MacrTest)=
### Fitting a truncated normal to intermediate macroeconomics test scores

Let's revisit the problem from the MLE and GMM notebooks of fitting a truncated normal distribution to intermediate macroeconomics test scores. The data are in the text file [`Econ381totpts.txt`](https://github.com/OpenSourceEcon/CompMethods/blob/main/data/smm/Econ381totpts.txt). Recall that these test scores are between 0 and 450. {numref}`Figure %s <FigSMM_EconScoreTruncNorm>` below shows a histogram of the data, as well as three truncated normal PDF's with different values for $\mu$ and $\sigma$. The black line is the maximum likelihood estimate of $\mu$ and $\sigma$ of the truncated normal pdf from the {ref}`Chap_MaxLikeli` chapter. The red, green, and black lines are just the PDF's of two "arbitrarily" chosen combinations of the truncated normal parameters $\mu$ and $\sigma$.[^TruncNorm]
Let's revisit the problem from the MLE and GMM notebooks of fitting a truncated normal distribution to intermediate macroeconomics test scores. The data are in the text file [`Econ381totpts.txt`](https://github.com/OpenSourceEcon/CompMethods/blob/main/data/smm/Econ381totpts.txt). Recall that these test scores are between 0 and 450. {numref}`Figure %s <FigSMM_EconScoreTruncNorm>` below shows a histogram of the data, as well as three truncated normal PDF's with different values for $\mu$ and $\sigma$. The black line is the maximum likelihood estimate of $\mu$ and $\sigma$ of the truncated normal pdf from the {ref}`Chap_MLE` chapter. The red and green lines are just the PDF's of two "arbitrarily" chosen combinations of the truncated normal parameters $\mu$ and $\sigma$.[^TruncNorm]

```{code-cell} ipython3
:tags: ["hide-input", "remove-output"]
@@ -394,20 +394,16 @@ def trunc_norm_pdf(xvals, mu, sigma, cut_lb, cut_ub):
return pdf_vals
# Download and save the data file Econ381totpts.txt
# Download and save the data file Econ381totpts.txt as NumPy array
url = ('https://raw.githubusercontent.com/OpenSourceEcon/CompMethods/' +
'main/data/smm/Econ381totpts.txt')
data_file = requests.get(url, allow_redirects=True)
open('../../../data/smm/Econ381totpts.txt', 'wb').write(data_file.content)
# Load the data as a NumPy array
data = np.loadtxt('../../../data/smm/Econ381totpts.txt')
data = np.loadtxt(url)
num_bins = 30
count, bins, ignored = plt.hist(
data, num_bins, density=True, edgecolor='k', label='data'
)
plt.title('Econ 381 scores: 2011-2012', fontsize=20)
plt.title('Intermediate macro scores: 2011-2012', fontsize=20)
plt.xlabel(r'Total points')
plt.ylabel(r'Percent of scores')
plt.xlim([0, 550])  # This gives the xmin and xmax to be plotted
@@ -975,7 +971,7 @@ name: FigSMM_Econ381_SMM1
SMM-estimated PDF function and data histogram, 2 moments, identity weighting matrix, Econ 381 scores (2011-2012)
```

That looks just like the maximum likelihood estimate from the {ref}`Chap_MaxLikeli` chapter. {numref}`Figure %s <FigSMM_Econ381_crit1>` below shows what the minimizer is doing. The figure shows the criterion function surface for different of $\mu$ and $\sigma$ in the truncated normal distribution. The minimizer is searching for the parameter values that give the lowest criterion function value.
That looks just like the maximum likelihood estimate from the {ref}`Chap_MLE` chapter. {numref}`Figure %s <FigSMM_Econ381_crit1>` below shows what the minimizer is doing. The figure shows the criterion function surface for different values of $\mu$ and $\sigma$ in the truncated normal distribution. The minimizer is searching for the parameter values that give the lowest criterion function value.

```{code-cell} ipython3
:tags: ["remove-output"]
@@ -1071,7 +1067,7 @@ In the next section, we see if we can get more accurate estimates (lower criteri

(SecSMM_CodeExmp_MacrTest_2m2st)=
#### Two moments, two-step optimal weighting matrix
Similar to the maximum likelihood estimation problem in Chapter {ref}`Chap_MaxLikeli`, it looks like the minimum value of the criterion function shown in {numref}`Figure %s <FigSMM_Econ381_crit1>` is roughly equal for a specific portion increase of $\mu$ and $\sigma$ together. That is, the estimation problem with these two moments probably has a correspondence of values of $\mu$ and $\sigma$ that give roughly the same minimum criterion function value. This issue has two possible solutions.
Similar to the maximum likelihood estimation problem in Chapter {ref}`Chap_MLE`, it looks like the minimum value of the criterion function shown in {numref}`Figure %s <FigSMM_Econ381_crit1>` is roughly equal for a specific proportional increase of $\mu$ and $\sigma$ together. That is, the estimation problem with these two moments probably has a correspondence of values of $\mu$ and $\sigma$ that give roughly the same minimum criterion function value. This issue has two possible solutions.

1. Maybe we need the two-step variance-covariance estimator to calculate a "more" optimal weighting matrix $W$ (a sketch of this option follows the list).
2. Maybe our two moments aren't very good moments for fitting the data.
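
A hedged sketch of the first option is below. The name `err_mat` is hypothetical and stands for an $R\times S$ array of moment errors from the step-1 estimate, one column per simulation; the step-2 weighting matrix is the inverse of the implied $\hat{\Omega}$:

```python
# Two-step SMM weighting matrix: estimate Omega-hat from step-1 moment errors
# and use its inverse as the step-2 weighting matrix W
import numpy as np

def two_step_weight_matrix(err_mat):
    R, S = err_mat.shape
    omega_hat = (err_mat @ err_mat.T) / S  # R x R variance-covariance of the moment errors
    return np.linalg.inv(omega_hat)        # W_2step = Omega-hat^{-1}
```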
Binary file added images/mle/Econ381scores_hist.png
