Various notes and Stata programs to aid in power analysis.
I write the notes mainly to explain to myself how to write power_reg and simci. The former can compute parametric power with clustering and stratification. The latter can simulate it. I wrote them mainly because I could not figure out how to do either for the simple (OLS) case with built-in Stata commands.
I only have access to Stata 13.1, so I set that as the minimum version. The commands are really simple, however, so I would not be surprised if they worked with earlier versions. The exception would be the fast option, which relies on a plugin compiled in C against v2.0 of the Stata Plugin Interface (SPI). This might be tied to Stata 13.1. See how to recompile below.
To install, run
net install power_tools, from(https://raw.githubusercontent.com/mcaceresb/stata-power/master/)
To update, run
adoupdate, update
To uninstall, run
ado uninstall power_tools
sysuse auto, clear
tempfile auto
save `auto'
qui forvalues i = 1 / 20 {
    append using `auto'
}
replace price = price + runiform()
local depvar price
local controls mpg rep78
local cluster rep78
local stratum gear_ratio
local categorical foreign
* Parametric power
power_reg `depvar' `controls'
power_reg `depvar' `controls', cluster(`cluster')
power_reg `depvar' `controls', cluster(`cluster') strata(`stratum') nstrata(2)
* Simulate a CI
simci `depvar' `controls', reps(1000)
simci `depvar' `controls', reps(1000) cluster(`cluster')
simci `depvar' `controls' `stratum', reps(1000) cluster(`cluster') ///
    strata(`stratum') nstrata(2)
simci `depvar' `controls' i.`categorical', reps(1000) ///
    strata(`categorical') nstrata(0)
* Simulate MDE given power
simci `depvar' `controls', reps(1000) power(kappa(0.8) direction(neg))
simci `depvar' `controls', reps(1000) cluster(`cluster') ///
    power(kappa(0.8) direction(neg))
simci foreign `controls' `stratum', reps(1000) strata(`stratum') nstrata(2) ///
    power(binary dir(pos))
simci `depvar' `controls', reps(1000) cluster(`cluster') strata(`stratum') ///
    nstrata(2) power(dir(pos))
* Simulate power given MDE
simci foreign `controls', reps(1000) effect(effect(-0.5) ///
    bounds(-0.2 0.2) binary)
simci `depvar' `controls', reps(1000) cluster(`cluster') ///
    effect(effect(-0.5) bounds(-0.2 0.2))
simci `depvar' `controls', reps(1000) strata(`stratum') nstrata(2) ///
    effect(effect(-0.5) bounds(-0.2 0.2))
* Simulate a CI using C plugin (only tested under Linux)
simci `depvar' `controls', reps(1000) fast
* You can check it's actually faster
net install benchmark, from(https://raw.githubusercontent.com/mcaceresb/stata-benchmark/master/)
benchmark, disp reps(10): qui simci `depvar' `controls', reps(1000)
benchmark, disp reps(10): qui simci `depvar' `controls', reps(1000) fast
Note that the fast option depends on a plugin. The only non-standard library is the GNU Scientific Library (GSL), which is statically linked, so there shouldn't be any problems on most systems. However, if any of the dynamically linked libraries are missing, you may need to add their path (assuming they are somewhere in your system) to LD_LIBRARY_PATH; e.g.
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib
export LD_LIBRARY_PATH
before starting Stata. If this fixes the issue, you can add those lines to ~/.bashrc to avoid having to do that every time you log into a session.
The fast option uses a Stata plugin (compiled in C). To compile in Linux/Unix:
git clone https://github.com/mcaceresb/stata-power
cd stata-power
# Run only the line that matches your Stata version
make SPI=3.0 # SPI v3.0, Stata 14 and later
make SPI=2.0 # SPI v2.0, Stata 13 and earlier
The advantage of the plugin is twofold:
- First, C runs much faster than Mata, which is how the function is implemented when fast is not specified.
- Second, C allows parallel loop execution. Since the simulation computes regression coefficients reps times, using N threads should result in an approximately Nx speed improvement. This works even with Stata/IC.
Note that Mata in turn runs faster than Stata's reg, largely because this simulation uses just the regression coefficients; reg computes a lot of additional elements that the program does not use.
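The parallel loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the plugin's actual code: it assumes each replication draws a random placebo treatment and computes only the OLS slope of the outcome on that placebo (no controls, clustering, or stratification), and it uses a small hand-rolled generator rather than GSL so the block stands alone. The names next_u64, placebo_slope, and simulate_slopes are hypothetical.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>
#include <stdint.h>

/* splitmix64: a small deterministic generator, used here only so the
   sketch stands alone (the plugin links against GSL). */
static uint64_t next_u64(uint64_t *state)
{
    uint64_t z = (*state += 0x9E3779B97F4A7C15ULL);
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}

/* One replication (assumed design): assign each observation to a placebo
   treatment with probability 1/2 and return the OLS slope of y on that
   dummy, which equals mean(y | treated) - mean(y | control). Only the
   coefficient is computed; no standard errors or other statistics. */
static double placebo_slope(const double *y, size_t n, uint64_t seed)
{
    uint64_t state = seed;
    double sum1 = 0.0, sum0 = 0.0;
    size_t n1 = 0, n0 = 0;

    for (size_t i = 0; i < n; i++) {
        if (next_u64(&state) >> 63) { sum1 += y[i]; n1++; }
        else                        { sum0 += y[i]; n0++; }
    }
    if (n1 == 0 || n0 == 0) {
        return NAN; /* slope is undefined if one group is empty */
    }
    return sum1 / (double) n1 - sum0 / (double) n0;
}

/* Fill out[0..reps-1] with simulated slopes. Replications are independent,
   so the loop can be split across threads. Each replication derives its
   own seed from (seed + r), so the results do not depend on the number of
   threads or on how iterations are scheduled. */
void simulate_slopes(const double *y, size_t n, long reps,
                     uint64_t seed, double *out)
{
    assert(y != NULL && out != NULL && n >= 2 && reps > 0);

    #pragma omp parallel for schedule(static)
    for (long r = 0; r < reps; r++) {
        uint64_t s = seed + (uint64_t) r;           /* distinct per replication */
        out[r] = placebo_slope(y, n, next_u64(&s)); /* hashed into a stream seed */
    }
}
```

Compile with OpenMP enabled (for example, gcc -fopenmp) to run the loop on several threads; without that flag the pragma is ignored and the loop runs serially with the same results. A confidence interval could then be read off the percentiles of out.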
Building the plugin requires:
- The GNU Scientific Library (GSL)
- OpenMP
- Stata Plugin Interface (SPI version 2.0 for Stata < 14; version 3.0 for Stata >= 14)
I have only tested this on Linux so far. See Stata's plugin documentation for instructions on how to build the plugin on other platforms.
- Add documentation for simci
- Add documentation for power_reg
- Add examples for power_reg
- Compile fast option for Windows and OSX.
- Finish writing fast plugin so it works for clustering and stratification.
- Finish writing fast plugin so it works with effect() and power().