
Commit

Merge pull request #23 from rickecon/mle
Merging
rickecon authored Dec 6, 2023
2 parents 51dc8ca + a38fda4 commit 66afef4
Showing 8 changed files with 48 additions and 50 deletions.
File renamed without changes.
File renamed without changes.
2 changes: 1 addition & 1 deletion docs/book/_toc.yml
@@ -38,7 +38,7 @@ parts:
numbered: True
chapters:
- file: struct_est/intro
- file: struct_est/MaxLikelihood
- file: struct_est/MLE
- file: struct_est/GMM
- file: struct_est/SMM
- caption: Appendix
2 changes: 1 addition & 1 deletion docs/book/basic_empirics/BasicEmpirMethods.md
@@ -386,7 +386,7 @@ results = reg1.fit()
type(results)
```

We now have the fitted regression model stored in `results` (see [statsmodels.regression.linear_model.RegressionResultsWrapper](http://www.statsmodels.org/dev/generated/statsmodels.regression.linear_model.RegressionResults.html)). The `results` from the `reg1.fit()` command is a regression results object with a lot of information, similar to the results object of the `scipy.optimize.minimize()` function we worked with in the {ref}`Chap_MaxLikeli` and {ref}`Chap_GMM` chapters.
We now have the fitted regression model stored in `results` (see [statsmodels.regression.linear_model.RegressionResultsWrapper](http://www.statsmodels.org/dev/generated/statsmodels.regression.linear_model.RegressionResults.html)). The `results` from the `reg1.fit()` command is a regression results object with a lot of information, similar to the results object of the `scipy.optimize.minimize()` function we worked with in the {ref}`Chap_MLE` and {ref}`Chap_GMM` chapters.

To view the OLS regression results, we can call the `.summary()` method.
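
For example, a minimal illustration using the fitted `results` object (standard `statsmodels` usage; the chapter's own output cells are not shown in this excerpt):

```python
# Display coefficient estimates, standard errors, t-statistics, and R-squared
print(results.summary())
```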

2 changes: 1 addition & 1 deletion docs/book/basic_empirics/LogisticReg.md
@@ -442,4 +442,4 @@ The footnotes from this chapter.

[^GMM]: See the {ref}`Chap_GMM` chapter of this book.

[^MaxLikeli]: See the {ref}`Chap_MaxLikeli` chapter of this book.
[^MaxLikeli]: See the {ref}`Chap_MLE` chapter of this book.
docs/book/struct_est/{MaxLikelihood.md → MLE.md}
@@ -10,19 +10,19 @@ kernelspec:
name: python3
---

(Chap_MaxLikeli)=
(Chap_MLE)=
# Maximum Likelihood Estimation

This chapter describes the maximum likelihood estimation (MLE) method. All data and images from this chapter can be found in the data directory ([./data/maxlikeli/](https://github.com/OpenSourceEcon/CompMethods/tree/main/data/maxlikeli/)) and images directory ([./images/maxlikeli/](https://github.com/OpenSourceEcon/CompMethods/tree/main/images/maxlikeli/)) for the GitHub repository for this online book.
This chapter describes the maximum likelihood estimation (MLE) method. All data and images from this chapter can be found in the data directory ([./data/mle/](https://github.com/OpenSourceEcon/CompMethods/tree/main/data/mle/)) and images directory ([./images/mle/](https://github.com/OpenSourceEcon/CompMethods/tree/main/images/mle/)) for the GitHub repository for this online book.


(SecMaxLikeli_GenModel)=
(SecMLE_GenModel)=
## General characterization of a model and data generating process

Each of the model estimation approaches that we will discuss in this section on Maximum Likelihood estimation (MLE) and in subsequent sections on generalized method of moments (GMM) and simulated method of moments (SMM) involves choosing values of the parameters of a model to make the model match some number of properties of the data. Define a model or a data generating process (DGP) as,

```{math}
:label: EqMaxLikeli_GenMod
:label: EqMLE_GenMod
F(x_t, z_t|\theta) = 0
```

@@ -31,45 +31,45 @@ where $x_t$ and $z_t$ are variables, $\theta$ is a vector of parameters, and $F(
In richer examples, a model could also include inequalities representing constraints. But this is sufficient for our discussion. The goal of maximum likelihood estimation (MLE) is to choose the parameter vector of the model $\theta$ to maximize the likelihood of seeing the data produced by the model $(x_t, z_t)$.


(SecMaxLikeli_GenModel_SimpDist)=
(SecMLE_GenModel_SimpDist)=
### Simple distribution example

A simple example of a model is a statistical distribution [e.g., the normal distribution $N(\mu, \sigma)$].

```{math}
:label: EqMaxLikeli_GenMod_NormDistPDF
:label: EqMLE_GenMod_NormDistPDF
f(x|\theta) = \frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{(x - \mu)^2}{2\sigma^2}}
```

The probability of drawing value $x_i$ from the distribution $f(x|\theta)$ is $f(x_i|\theta)$. The probability of drawing the following vector of two observations $(x_1,x_2)$ from the distribution $f(x|\theta)$ is $f(x_1|\theta)\times f(x_2|\theta)$. We define the likelihood function of $N$ draws $(x_1,x_2,...x_N)$ from a model or distribution $f(x|\theta)$ as $\mathcal{L}$.

```{math}
:label: EqMaxLikeli_GenMod_NormDistLike
:label: EqMLE_GenMod_NormDistLike
\mathcal{L}(x_1,x_2,...x_N|\theta) \equiv \prod_{i=1}^N f(x_i|\theta)
```

Because it can be numerically difficult to maximize a product of many small numbers (one small value can dominate the entire product), it is almost always easier to use the log likelihood function $\ln(\mathcal{L})$.

```{math}
:label: EqMaxLikeli_GenMod_NormDistLnLike
:label: EqMLE_GenMod_NormDistLnLike
\ln\Bigl(\mathcal{L}(x_1,x_2,...x_N|\theta)\Bigr) \equiv \sum_{i=1}^N \ln\Bigl(f(x_i|\theta)\Bigr)
```
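
As a rough numerical illustration of this point (not the chapter's own code; synthetic standard normal draws):

```python
# The product of thousands of pdf values underflows to 0.0, while the sum of
# their logs stays finite, which is why we maximize the log likelihood instead.
import numpy as np
import scipy.stats as sts

rng = np.random.default_rng(0)
x = rng.normal(size=5_000)
pdf_vals = sts.norm.pdf(x)       # thousands of values, most of them less than 1
print(np.prod(pdf_vals))         # 0.0 because of numerical underflow
print(np.sum(np.log(pdf_vals)))  # finite log likelihood, on the order of -7,000
```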

The maximum likelihood estimate $\hat{\theta}_{MLE}$ is the following:

```{math}
:label: EqMaxLikeli_GenMod_NormDistMLE
:label: EqMLE_GenMod_NormDistMLE
\hat{\theta}_{MLE} = \theta:\quad \max_\theta \: \ln\mathcal{L} = \sum_{i=1}^N\ln\Bigl(f(x_i|\theta)\Bigr)
```
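
A minimal sketch of this estimator with synthetic normal data (illustrative only; the chapter itself works with `scipy.optimize.minimize()` in the same spirit) minimizes the negative log likelihood:

```python
# Estimate (mu, sigma) of a normal distribution by minimizing the negative
# log likelihood of synthetic draws; estimates should be near (5.0, 2.0)
import numpy as np
import scipy.stats as sts
import scipy.optimize as opt

rng = np.random.default_rng(42)
x = rng.normal(loc=5.0, scale=2.0, size=1_000)

def neg_log_lik(params, xvals):
    mu, sigma = params
    return -np.sum(sts.norm.logpdf(xvals, loc=mu, scale=sigma))

res = opt.minimize(neg_log_lik, x0=np.array([0.0, 1.0]), args=(x,),
                   method='L-BFGS-B', bounds=((None, None), (1e-8, None)))
mu_mle, sigma_mle = res.x
print(mu_mle, sigma_mle)
```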


(SecMaxLikeli_GenModel_Econ)=
(SecMLE_GenModel_Econ)=
### Economic example

An example of an economic model that follows the more general definition of $F(x_t, z_t|\theta) = 0$ is {cite}`BrockMirman:1972`. This model has multiple nonlinear dynamic equations, 7 parameters, 1 exogenous time series of variables, and about 5 endogenous time series of variables. Let's look at a simplified piece of that model, the production function, which is commonly used in total factor productivity estimation.

```{math}
:label: EqMaxLikeli_GenMod_EconProdFunc
:label: EqMLE_GenMod_EconProdFunc
Y_t = e^{z_t}(K_t)^\alpha(L_t)^{1-\alpha} \quad\text{where}\quad z_t = \rho z_{t-1} + (1 - \rho)\mu + \varepsilon_t \quad\text{and}\quad \varepsilon_t\sim N(0,\sigma^2)
```

Expand All @@ -82,54 +82,47 @@ The likelihood of a given data point is determined by $\varepsilon_t = z_t - \rh
The likelihood function of all the data is:

```{math}
:label: EqMaxLikeli_GenMod_EconProdFuncLike
:label: EqMLE_GenMod_EconProdFuncLike
\mathcal{L}\left(z_1,z_2,...z_T|\rho,\mu,\sigma\right) = \prod_{t=2}^T f(z_t, z_{t-1}|\rho,\mu,\sigma)
```

The log likelihood function of all the data is:

```{math}
:label: EqMaxLikeli_GenMod_EconProdFuncLnLike
:label: EqMLE_GenMod_EconProdFuncLnLike
\ln\Bigl(\mathcal{L}\bigl(z_1,z_2,...z_T|\rho,\mu,\sigma\bigr)\Bigr) = \sum_{t=2}^T \ln\Bigl(f(z_t, z_{t-1}|\rho,\mu,\sigma)\Bigr)
```

The maximum likelihood estimate of $\rho$, $\mu$, and $\sigma$ is given by the following maximization problem.

```{math}
:label: EqMaxLikeli_GenMod_EconProdFuncMLE
:label: EqMLE_GenMod_EconProdFuncMLE
(\hat{\rho}_{MLE},\hat{\mu}_{MLE},\hat{\sigma}_{MLE})=(\rho,\mu,\sigma):\quad \max_{\rho,\mu,\sigma}\ln\mathcal{L} = \sum_{t=2}^T \ln\Bigl(f(z_t, z_{t-1}|\rho,\mu,\sigma)\Bigr)
```
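
A hedged sketch of how these three parameters could be estimated is below. The values of `rho_true`, `mu_true`, and `sigma_true` are made up for the simulation and are not taken from {cite}`BrockMirman:1972`:

```python
# Simulate z_t = rho*z_{t-1} + (1 - rho)*mu + eps_t and recover (rho, mu, sigma)
# by maximizing the log likelihood of the implied errors eps_t
import numpy as np
import scipy.stats as sts
import scipy.optimize as opt

rng = np.random.default_rng(25)
rho_true, mu_true, sigma_true, T = 0.7, 10.0, 0.2, 200
z = np.empty(T)
z[0] = mu_true
for t in range(1, T):
    z[t] = rho_true * z[t - 1] + (1 - rho_true) * mu_true + rng.normal(0.0, sigma_true)

def neg_log_lik(params, zvals):
    rho, mu, sigma = params
    eps = zvals[1:] - rho * zvals[:-1] - (1 - rho) * mu  # implied errors for t = 2,...,T
    return -np.sum(sts.norm.logpdf(eps, loc=0.0, scale=sigma))

res = opt.minimize(neg_log_lik, x0=np.array([0.5, np.mean(z), np.std(z)]), args=(z,),
                   method='L-BFGS-B',
                   bounds=((-0.99, 0.99), (None, None), (1e-8, None)))
print(res.x)  # estimates of (rho, mu, sigma)
```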


(SecMaxLikeli_DistData)=
(SecMLE_DistData)=
## Comparisons of distributions and data

Import some data from the total points earned by all the students in two sections of an intermediate macroeconomics class for undergraduates at an unnamed University in a certain year (two semesters).
Import some data from the total points earned by all the students in two sections of an intermediate macroeconomics class for undergraduates at an unnamed University in a certain year (two semesters). Let's create a histogram of the data.

```{code-cell} ipython3
:tags: []
# Import the necessary libraries
import numpy as np
import scipy.stats as sts
import matplotlib.pyplot as plt
import requests
import io
# Download and save the data file Econ381totpts.txt
# Download and save the data file Econ381totpts.txt as NumPy array
url = ('https://raw.githubusercontent.com/OpenSourceEcon/CompMethods/' +
'main/data/maxlikeli/Econ381totpts.txt')
# data_file = requests.get(url, allow_redirects=True)
# open('../../../data/maxlikeli/Econ381totpts.txt', 'wb').write(data_file.content)
# Load the data as a NumPy array
data = np.loadtxt('../../../data/maxlikeli/Econ381totpts.txt')
```

Let's create a histogram of the data.

```{code-cell} ipython3
:tags: []
import matplotlib.pyplot as plt
'main/data/mle/Econ381totpts.txt')
data_file = requests.get(url)
if data_file.status_code == 200:
    # Wrap the downloaded text so np.loadtxt can parse it into a NumPy array
    data = np.loadtxt(io.StringIO(data_file.text))
else:
    print('Error downloading the file')
num_bins = 30
count, bins, ignored = plt.hist(data, num_bins, density=True,
@@ -138,15 +131,24 @@ plt.title('Intermediate macro scores: 2011-2012', fontsize=15)
plt.xlabel(r'Total points')
plt.ylabel(r'Percent of scores')
plt.xlim([0, 550])  # This gives the xmin and xmax to be plotted
plt.show()
```
<!-- ```{figure} ../../../images/mle/Econ381scores_hist.png
---
height: 500px
name: FigMLE_EconScoreHist
---
Intermediate macroeconomics midterm scores over two semesters
``` -->


(SecMaxLikeli_Exerc)=
(SecMLE_Exerc)=
## Exercises



(SecMaxLikeliFootnotes)=
(SecMLEfootnotes)=
## Footnotes

The footnotes from this chapter.
22 changes: 9 additions & 13 deletions docs/book/struct_est/SMM.md
@@ -28,7 +28,7 @@ Let the data be represented, in general, by $x$. This could have many variables,
\theta \equiv \left[\theta_1, \theta_2, ...\theta_K\right]^T
```

In the {ref}`Chap_MaxLikeli` chapter, we used data $x$ and model parameters $\theta$ to maximize the likelihood of drawing that data $x$ from the model given parameters $\theta$,
In the {ref}`Chap_MLE` chapter, we used data $x$ and model parameters $\theta$ to maximize the likelihood of drawing that data $x$ from the model given parameters $\theta$,

```{math}
:label: EqSMM_MLestimator
@@ -271,7 +271,7 @@ Let the parameter vector $\theta$ have length $K$ such that $K$ parameters are b

Recall that each element of $e(\tilde{x},x|\theta)$ is an average moment error across all simulations. $\hat{\Omega}$ from the previous section is the $R\times R$ variance-covariance matrix of the $R$ moment errors used to identify the $K$ parameters $\theta$ to be estimated. The estimated variance-covariance matrix $\hat{\Sigma}$ of the estimated parameter vector is a $K\times K$ matrix. We say the model is *exactly identified* if $K = R$ (number of parameters $K$ equals number of moments $R$). We say the model is *overidentified* if $K<R$. We say the model is *not identified* or *underidentified* if $K>R$.

Similar to the inverse Hessian estimator of the variance-covariance matrix of the maximum likelihood estimator from the {ref}`Chap_MaxLikeli` chapter, the SMM variance-covariance matrix is related to the derivative of the criterion function with respect to each parameter. The intuition is that if the second derivative of the criterion function with respect to the parameters is large, there is a lot of curvature around the criterion minimizing estimate. In other words, the parameters of the model are precisely estimated. The inverse of the Hessian matrix will be small.
Similar to the inverse Hessian estimator of the variance-covariance matrix of the maximum likelihood estimator from the {ref}`Chap_MLE` chapter, the SMM variance-covariance matrix is related to the derivative of the criterion function with respect to each parameter. The intuition is that if the second derivative of the criterion function with respect to the parameters is large, there is a lot of curvature around the criterion minimizing estimate. In other words, the parameters of the model are precisely estimated. The inverse of the Hessian matrix will be small.

Define $R\times K$ matrix $d(\tilde{x},x|\theta)$ as the Jacobian matrix of derivatives of the $R\times 1$ error vector $e(\tilde{x},x|\theta)$ from {eq}`EqSMM_MomError_vec`.
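
As a hedged sketch of how such a Jacobian could be computed numerically, where `err_vec()` is a hypothetical stand-in for the chapter's moment-error function (not its actual code):

```python
# Centered, second-order finite-difference approximation of the R x K Jacobian
# of the moment-error vector e(x~, x | theta) with respect to theta
import numpy as np

def jac_err(err_vec, theta, args=(), h_frac=1e-6):
    theta = np.asarray(theta, dtype=float)
    err0 = np.asarray(err_vec(theta, *args))
    jac = np.zeros((err0.shape[0], theta.shape[0]))
    for k in range(theta.shape[0]):
        h = h_frac * max(abs(theta[k]), 1.0)  # step size scaled to the parameter
        step = np.zeros_like(theta)
        step[k] = h
        # column k is (e(theta + step) - e(theta - step)) / (2h)
        jac[:, k] = (np.asarray(err_vec(theta + step, *args)) -
                     np.asarray(err_vec(theta - step, *args))) / (2 * h)
    return jac
```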

@@ -324,12 +324,12 @@ The following is a centered second-order finite difference numerical approximati
(SecSMM_CodeExmp)=
## Code Examples

In this section, we will use SMM to estimate parameters of the models from the {ref}`Chap_MaxLikeli` chapter and from the {ref}`Chap_GMM` chapter.
In this section, we will use SMM to estimate parameters of the models from the {ref}`Chap_MLE` chapter and from the {ref}`Chap_GMM` chapter.

(SecSMM_CodeExmp_MacrTest)=
### Fitting a truncated normal to intermediate macroeconomics test scores

Let's revisit the problem from the MLE and GMM notebooks of fitting a truncated normal distribution to intermediate macroeconomics test scores. The data are in the text file [`Econ381totpts.txt`](https://github.com/OpenSourceEcon/CompMethods/blob/main/data/smm/Econ381totpts.txt). Recall that these test scores are between 0 and 450. {numref}`Figure %s <FigSMM_EconScoreTruncNorm>` below shows a histogram of the data, as well as three truncated normal PDF's with different values for $\mu$ and $\sigma$. The black line is the maximum likelihood estimate of $\mu$ and $\sigma$ of the truncated normal pdf from the {ref}`Chap_MaxLikeli` chapter. The red, green, and black lines are just the PDF's of two "arbitrarily" chosen combinations of the truncated normal parameters $\mu$ and $\sigma$.[^TruncNorm]
Let's revisit the problem from the MLE and GMM notebooks of fitting a truncated normal distribution to intermediate macroeconomics test scores. The data are in the text file [`Econ381totpts.txt`](https://github.com/OpenSourceEcon/CompMethods/blob/main/data/smm/Econ381totpts.txt). Recall that these test scores are between 0 and 450. {numref}`Figure %s <FigSMM_EconScoreTruncNorm>` below shows a histogram of the data, as well as three truncated normal PDF's with different values for $\mu$ and $\sigma$. The black line is the maximum likelihood estimate of $\mu$ and $\sigma$ of the truncated normal pdf from the {ref}`Chap_MLE` chapter. The red and green lines are just the PDF's of two "arbitrarily" chosen combinations of the truncated normal parameters $\mu$ and $\sigma$.[^TruncNorm]

```{code-cell} ipython3
:tags: ["hide-input", "remove-output"]
@@ -394,20 +394,16 @@ def trunc_norm_pdf(xvals, mu, sigma, cut_lb, cut_ub):
return pdf_vals
# Download and save the data file Econ381totpts.txt
# Download and save the data file Econ381totpts.txt as NumPy array
url = ('https://raw.githubusercontent.com/OpenSourceEcon/CompMethods/' +
'main/data/smm/Econ381totpts.txt')
data_file = requests.get(url, allow_redirects=True)
open('../../../data/smm/Econ381totpts.txt', 'wb').write(data_file.content)
# Load the data as a NumPy array
data = np.loadtxt('../../../data/smm/Econ381totpts.txt')
data = np.loadtxt(url)
num_bins = 30
count, bins, ignored = plt.hist(
data, num_bins, density=True, edgecolor='k', label='data'
)
plt.title('Econ 381 scores: 2011-2012', fontsize=20)
plt.title('Intermediate macro scores: 2011-2012', fontsize=20)
plt.xlabel(r'Total points')
plt.ylabel(r'Percent of scores')
plt.xlim([0, 550])  # This gives the xmin and xmax to be plotted
@@ -975,7 +971,7 @@ name: FigSMM_Econ381_SMM1
SMM-estimated PDF function and data histogram, 2 moments, identity weighting matrix, Econ 381 scores (2011-2012)
```

That looks just like the maximum likelihood estimate from the {ref}`Chap_MaxLikeli` chapter. {numref}`Figure %s <FigSMM_Econ381_crit1>` below shows what the minimizer is doing. The figure shows the criterion function surface for different of $\mu$ and $\sigma$ in the truncated normal distribution. The minimizer is searching for the parameter values that give the lowest criterion function value.
That looks just like the maximum likelihood estimate from the {ref}`Chap_MLE` chapter. {numref}`Figure %s <FigSMM_Econ381_crit1>` below shows what the minimizer is doing. The figure shows the criterion function surface for different values of $\mu$ and $\sigma$ in the truncated normal distribution. The minimizer is searching for the parameter values that give the lowest criterion function value.

```{code-cell} ipython3
:tags: ["remove-output"]
@@ -1071,7 +1067,7 @@ In the next section, we see if we can get more accurate estimates (lower criteri

(SecSMM_CodeExmp_MacrTest_2m2st)=
#### Two moments, two-step optimal weighting matrix
Similar to the maximum likelihood estimation problem in Chapter {ref}`Chap_MaxLikeli`, it looks like the minimum value of the criterion function shown in {numref}`Figure %s <FigSMM_Econ381_crit1>` is roughly equal for a specific portion increase of $\mu$ and $\sigma$ together. That is, the estimation problem with these two moments probably has a correspondence of values of $\mu$ and $\sigma$ that give roughly the same minimum criterion function value. This issue has two possible solutions.
Similar to the maximum likelihood estimation problem in Chapter {ref}`Chap_MLE`, it looks like the minimum value of the criterion function shown in {numref}`Figure %s <FigSMM_Econ381_crit1>` is roughly equal for a specific proportional increase of $\mu$ and $\sigma$ together. That is, the estimation problem with these two moments probably has a correspondence of values of $\mu$ and $\sigma$ that give roughly the same minimum criterion function value. This issue has two possible solutions.

1. Maybe we need the two-step variance-covariance estimator to calculate a "more" optimal weighting matrix $W$ (a sketch of this option follows the list).
2. Maybe our two moments aren't very good moments for fitting the data.
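
A hedged sketch of the first option is below. The name `err_mat` is hypothetical and stands for an $R\times S$ array of moment errors from the step-1 estimate, one column per simulation; the step-2 weighting matrix is the inverse of the implied $\hat{\Omega}$:

```python
# Two-step SMM weighting matrix: estimate Omega-hat from step-1 moment errors
# and use its inverse as the step-2 weighting matrix W
import numpy as np

def two_step_weight_matrix(err_mat):
    R, S = err_mat.shape
    omega_hat = (err_mat @ err_mat.T) / S  # R x R variance-covariance of the moment errors
    return np.linalg.inv(omega_hat)        # W_2step = Omega-hat^{-1}
```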
Binary file added images/mle/Econ381scores_hist.png
