saturated event study + joint tests for cohort-level heterogeneity #762

Draft
wants to merge 1 commit into
base: master
285 changes: 285 additions & 0 deletions pyfixest/did/saturated_twfe.py
@@ -0,0 +1,285 @@
from typing import Optional

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

from pyfixest.estimation.estimation import feols

from .did2s import DID


class SaturatedEventStudy(DID):
    """
    Saturated event study with cohort-specific effect curves.

    Attributes
    ----------
    data : pd.DataFrame
        Dataframe containing the data.
    yname : str
        Name of the outcome variable.
    idname : str
        Name of the unit identifier variable.
    tname : str
        Name of the time variable.
    gname : str
        Name of the variable holding each unit's first treatment period
        (0 for never-treated units).
    cluster : str
        The name of the cluster variable.
    xfml : str
        The formula for the fixed effects.
    att : bool
        Whether to use the average treatment effect.

    """

    def __init__(
        self,
        data: pd.DataFrame,
        yname: str,
        idname: str,
        tname: str,
        gname: str,
        cluster: Optional[str] = None,
        xfml: Optional[str] = None,
        att: Optional[bool] = True,
    ):
        super().__init__(
            data=data,
            yname=yname,
            idname=idname,
            tname=tname,
            gname=gname,
            cluster=cluster,
            xfml=xfml,
            att=att,
        )
        self._estimator = "Saturated Event Study"

        # create a treatment variable
        self._data["ATT"] = (self._data[self._tname] >= self._data[self._gname]) * (
            self._data[self._gname] > 0
        )

    def estimate(self):
        """
        Estimate the model.

        Returns
        -------
        Feols
            The fitted model object returned by `feols`.
        """
        self.mod = _saturated_event_study(
            self._data,
            outcome=self._yname,
            treatment="ATT",
            time_id=self._tname,
            unit_id=self._idname,
        )

        return self.mod

    # !TODO - implement the rest of the methods
    def vcov(self):
        """
        Get the covariance matrix.

        Returns
        -------
        pd.DataFrame
            A DataFrame containing the covariance matrix.
        """
        pass

    def iplot(self):
        """Plot DID estimates."""
        pass

    def tidy(self):
        """Tidy result dataframe."""
        return self.mod.tidy()

    def summary(self):
        """Get summary table."""
        return self.mod.summary()


######################################################################


def _saturated_event_study(
    df: pd.DataFrame,
    outcome: str = "outcome",
    treatment: str = "treated",
    time_id: str = "time",
    unit_id: str = "unit",
    ax: Optional[plt.Axes] = None,
):
    """Fit the cohort-saturated event study; plot cohort curves if `ax` is given."""
    ######################################################################
    # this chunk re-derives the first treated period (gname) internally;
    # should we assume that the data already contains it?
    df = df.merge(
        df.assign(first_treated_period=df[time_id] * df[treatment])
        .groupby(unit_id)["first_treated_period"]
        .apply(lambda x: x[x > 0].min()),
        on=unit_id,
    )
    df["rel_time"] = df[time_id] - df["first_treated_period"]
    df["first_treated_period"] = (
        df["first_treated_period"].replace(np.nan, 0).astype("int")
    )
    df["rel_time"] = df["rel_time"].replace(np.nan, np.inf)
    cohort_dummies = pd.get_dummies(
        df.first_treated_period, drop_first=True, prefix="cohort_dummy"
    )
    df_int = pd.concat([df, cohort_dummies], axis=1)
    ######################################################################
    # formula: one set of relative-time dummies per cohort, period -1 as reference
    ff = f"""
                {outcome} ~
                {'+'.join([f"i(rel_time, {x}, ref = -1.0)" for x in df_int.filter(like = "cohort_dummy", axis = 1).columns])}
                | {unit_id} + {time_id}
                """
    m = feols(ff, df_int, vcov={"CRV1": unit_id})
    # !TODO move this into a separate function / integrate logic into iplot method that overrides feols's plot
    if ax is not None:
        # plot
        res = m.tidy()
        # create a dict with cohort specific effect curves
        res_dict = {}
        for c in cohort_dummies.columns:
            res_cohort = res.filter(like=c, axis=0)
            event_time = (
                res_cohort.index.str.extract(r"\[T\.(-?\d+\.\d+)\]")
                .astype(float)
                .values.flatten()
            )
            res_dict[c] = {"est": res_cohort, "time": event_time}

        cmp = plt.get_cmap("Set1")
        for i, (k, v) in enumerate(res_dict.items()):
            ax.plot(v["time"], v["est"]["Estimate"], marker=".", label=k, color=cmp(i))
            ax.fill_between(
                v["time"], v["est"]["2.5%"], v["est"]["97.5%"], alpha=0.2, color=cmp(i)
            )
        ax.axvline(-1, color="black", linestyle="--")
        ax.axhline(0, color="black", linestyle=":")
    return m


def test_treatment_heterogeneity(
    df: pd.DataFrame,
    outcome: str = "Y_it",
    treatment: str = "W_it",
    unit_id: str = "unit_id",
    time_id: str = "time_id",
    retmod: bool = False,
):
    """
    Test for treatment heterogeneity in the event study design.

    For details, see https://github.com/apoorvalal/TestingInEventStudies

    Parameters
    ----------
    df : pd.DataFrame
        Dataframe containing the data.
    outcome : str
        Name of the outcome variable.
    treatment : str
        Name of the treatment variable.
    unit_id : str
        Name of the unit identifier variable.
    time_id : str
        Name of the time variable.
    retmod : bool
        Whether to return the fitted model object instead of the p-value.

    Returns
    -------
    float
        The p-value of the test. If `retmod` is True, the fitted model
        is returned instead.
    """
    # Get treatment timing info
    df = df.merge(
        df.assign(first_treated_period=df[time_id] * df[treatment])
        .groupby(unit_id)["first_treated_period"]
        .apply(lambda x: x[x > 0].min()),
        on=unit_id,
    )
    df["rel_time"] = df[time_id] - df["first_treated_period"]
    df["first_treated_period"] = (
        df["first_treated_period"].replace(np.nan, 0).astype("int")
    )
    df["rel_time"] = df["rel_time"].replace(np.nan, np.inf)
    # Create dummies but drop TWO cohorts - one serves as base for pooled effects
    cohort_dummies = pd.get_dummies(
        df.first_treated_period, drop_first=True, prefix="cohort_dummy"
    ).iloc[
        :, 1:
    ]  # drop an additional cohort - drops interactions for never treated and baseline

    df_int = pd.concat([df, cohort_dummies], axis=1)

    # Modified formula with base effects + cohort-specific deviations
    ff = f"""
                {outcome} ~
                i(rel_time, ref=-1.0) +
                {'+'.join([f"i(rel_time, {x}, ref = -1.0)" for x in df_int.filter(like = "cohort_dummy", axis = 1).columns])}
                | {unit_id} + {time_id}
                """

    model = feols(ff, df_int, vcov={"CRV1": unit_id})
    P = model.coef().shape[0]

    if retmod:
        return model
    mmres = model.tidy().reset_index()
    mmres[["time", "cohort"]] = mmres.Coefficient.str.split(":", expand=True)
    mmres["time"] = mmres.time.str.extract(r"\[T\.(-?\d+\.\d+)\]").astype(float)
    mmres["cohort"] = mmres.cohort.str.extract(r"(\d+)")
    # indices of coefficients that are deviations from common event study coefs
    event_study_coefs = mmres.loc[~(mmres.cohort.isna()) & (mmres.time > 0)].index
    # Restriction matrix R2 (K x P): row i selects the i-th tested coefficient
    K = len(event_study_coefs)
    R2 = np.zeros((K, P))
    for i, idx in enumerate(event_study_coefs):
        R2[i, idx] = 1

    test_result = model.wald_test(R=R2, distribution="chi2")
    return test_result["pvalue"]
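
Note on the joint test: the restriction matrix has one row per tested coefficient and a single 1 in that coefficient's column, so multiplying it by the coefficient vector picks out exactly the cohort-specific deviations. The following is a minimal, dependency-free sketch of that idea. The helper names (`selection_matrix`, `matvec`, `wald_stat_diagonal`), the numbers, and the diagonal-covariance simplification are illustrative assumptions; none of them come from this PR or from pyfixest's `wald_test`.

```python
def selection_matrix(indices, n_coefs):
    """Return a K x P matrix (list of rows) with a single 1 per row."""
    rows = []
    for idx in indices:
        row = [0.0] * n_coefs
        row[idx] = 1.0  # row i selects coefficient `idx`
        rows.append(row)
    return rows


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def wald_stat_diagonal(R, beta, variances):
    """Wald statistic for H0: R @ beta = 0.

    ASSUMPTION: the coefficient covariance is diagonal, so the statistic
    reduces to a sum of squared z-scores of the selected coefficients.
    """
    selected = matvec(R, beta)
    # Selecting variances with R is valid only because each row has one 1.
    selected_var = matvec(R, variances)
    return sum(b * b / v for b, v in zip(selected, selected_var))


# Hypothetical coefficients: two pooled terms, then two cohort deviations.
beta = [0.5, 0.8, 0.1, -0.2]
variances = [0.04, 0.04, 0.01, 0.01]
deviation_idx = [2, 3]

R = selection_matrix(deviation_idx, len(beta))
stat = wald_stat_diagonal(R, beta, variances)
```

With a full covariance matrix V the statistic is (R b)' (R V R')^-1 (R b), and under the null it is compared against a chi-squared distribution with K degrees of freedom, which is what `distribution="chi2"` requests in the diff.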


def _test_dynamics(
    df,
    outcome="Y",
    treatment="W",
    time_id="time",
    unit_id="unit",
    vcv=None,
):
    # Default to clustering on the unit identifier (avoids a mutable default
    # argument that hard-coded the column name "unit")
    if vcv is None:
        vcv = {"CRV1": unit_id}
    # Fit models
    df = df.merge(
        df.assign(first_treated_period=df[time_id] * df[treatment])
        .groupby(unit_id)["first_treated_period"]
        .apply(lambda x: x[x > 0].min()),
        on=unit_id,
    )
    df["rel_time"] = df[time_id] - df["first_treated_period"]
    df["rel_time"] = df["rel_time"].replace(np.nan, np.inf)
    restricted = feols(f"{outcome} ~ i({treatment}) | {unit_id} + {time_id}", df)
    unrestricted = feols(
        f"{outcome} ~ i(rel_time, ref=0) | {unit_id} + {time_id}", df, vcov=vcv
    )
    # Get the restricted estimate
    restricted_effect = restricted.coef().iloc[0]
    # Create R matrix - each row tests one event study coefficient
    # against restricted estimate
    n_evstudy_coefs = unrestricted.coef().shape[0]
    R = np.eye(n_evstudy_coefs)
    # q vector is the restricted estimate repeated
    q = np.repeat(restricted_effect, n_evstudy_coefs)
    # Conduct Wald test
    pv = unrestricted.wald_test(R=R, q=q, distribution="chi2")["pvalue"]
    return pv
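
Note on treatment timing: all three functions in this file derive it the same way. They multiply the time index by the treatment indicator, take the smallest positive product per unit as the first treated period, and then set relative time to the current period minus that value, with never-treated units getting 0 and `inf` respectively. A plain-Python sketch of that logic follows, assuming hypothetical row keys `"unit"`, `"time"`, and `"treated"` and a hypothetical helper name `add_relative_time`. It is an illustration of the idea, not the code in this PR.

```python
import math


def add_relative_time(rows):
    """Attach first-treated period and relative time to each panel row."""
    # First pass: earliest treated period per unit. As in the diff, a period
    # only counts if time * treated > 0, so time indices must be positive.
    first_treated = {}
    for row in rows:
        if row["treated"] * row["time"] > 0:
            unit = row["unit"]
            first_treated[unit] = min(row["time"], first_treated.get(unit, math.inf))

    # Second pass: relative time; never-treated units get 0 and +inf.
    out = []
    for row in rows:
        start = first_treated.get(row["unit"])
        out.append(
            {
                **row,
                "first_treated_period": 0 if start is None else start,
                "rel_time": math.inf if start is None else row["time"] - start,
            }
        )
    return out


# Hypothetical panel: unit 1 is treated from period 3 on, unit 2 never.
panel = [
    {"unit": u, "time": t, "treated": int(u == 1 and t >= 3)}
    for u in (1, 2)
    for t in (1, 2, 3, 4)
]
result = add_relative_time(panel)
```

One consequence of this construction is that a panel whose time index starts at 0 would never register period 0 as treated, so callers may want to shift such an index before use.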
