
Several minor improvements #804

Merged
merged 10 commits into main from kebatt/minorFixes on Oct 25, 2023
Conversation

kbattocchi (Collaborator) commented Aug 4, 2023:

  • Support direct covariance fitting for DRIV
  • Ensure that groups can be passed to DMLIV and DRIV
  • Dependency cleanup:
    • Enable newer versions of shap, matplotlib, seaborn, and dowhy
    • Drop support for sklearn<1.0 and enable support for sklearn 1.3
  • CI improvements:
    • Run doctests as part of build
    • Don't fail fast when building packages fails on one platform
    • Store test output in an artifact

@@ -526,7 +526,7 @@ def score(self, Y, T, Z, X=None, W=None, sample_weight=None):
The MSE of the final CATE model on the new data.
"""
# Replacing score from _OrthoLearner, to enforce Z to be required and improve the docstring
-        return super().score(Y, T, X=X, W=W, Z=Z, sample_weight=sample_weight)
+        return super().score(Y, T, X=X, W=W, Z=Z, sample_weight=sample_weight, groups=None)
Collaborator:

should it be groups=groups here?

Collaborator Author (kbattocchi):

Yes, good catch. (It doesn't affect the results since groups are never used in scoring, but I'll fix it in the next set of changes).

Collaborator Author (kbattocchi):

Upon further consideration, I've removed groups from the DMLIV and DRIV scoring methods, because they are never used and so there's no point in including them.

The groups argument needs to exist on the nuisance models, because the signatures for fit, predict, and score all need to be compatible for how we do cross-fitting, but there's no need for them to pollute the estimators themselves, and indeed our existing classes like LinearDML do not have groups on their scoring methods.
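The signature-compatibility constraint described above can be sketched in a few lines. This is an illustrative toy (MeanModel and crossfit are hypothetical names, not econml's actual internals): the cross-fitting loop calls every nuisance model with one shared keyword set, so each model must accept groups even when it ignores it.

```python
class MeanModel:
    """Trivial nuisance model: predicts the training mean."""
    def fit(self, Y, T, X=None, W=None, Z=None, groups=None):
        # groups is accepted for signature compatibility; this model ignores it
        self.mean_ = sum(Y) / len(Y)
        return self

    def predict(self, X=None):
        return self.mean_

def crossfit(models, Y, T, folds, groups=None):
    # Every model is called with the same keyword set, including groups,
    # which is why the nuisance signature must carry it even when unused.
    preds = []
    for train_idx, test_idx in folds:
        Y_train = [Y[i] for i in train_idx]
        g_train = [groups[i] for i in train_idx] if groups is not None else None
        fitted = [m.fit(Y_train, T, groups=g_train) for m in models]
        preds.append([m.predict() for m in fitted])
    return preds

Y = [1.0, 2.0, 3.0, 4.0]
folds = [([0, 1], [2, 3]), ([2, 3], [0, 1])]
out = crossfit([MeanModel()], Y, None, folds, groups=[0, 0, 1, 1])
print(out)  # [[1.5], [3.5]]
```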

@@ -318,7 +318,7 @@ def predict(self, X=None):
X = self._transform_X(X, fitting=False)
return self._model_final.predict(X).reshape((-1,) + self.d_y + self.d_t)

-    def score(self, Y, T, X=None, W=None, Z=None, nuisances=None, sample_weight=None):
+    def score(self, Y, T, X=None, W=None, Z=None, nuisances=None, sample_weight=None, groups=None):
Collaborator:

groups=groups?

Collaborator Author (kbattocchi):

This is the method definition, so groups=None is correct.
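As a tiny illustration of the distinction (toy classes, not econml code): groups=None in a def declares the default value, while groups=groups at a call site forwards the caller's value instead of resetting it.

```python
class Base:
    def score(self, Y, groups=None):            # definition: default is None
        return ("base", groups)

class Child(Base):
    def score(self, Y, groups=None):            # definition: groups=None is correct here
        # call site: forward the caller's value, don't hard-code None
        return super().score(Y, groups=groups)

assert Child().score([1], groups=[0, 1]) == ("base", [0, 1])
assert Child().score([1]) == ("base", None)     # default still applies
```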

]

for est in est_list:
est.fit(y, T, Z=Z, X=X, W=W, groups=groups)
Collaborator:

Is there a way to make sure the groups are actually being used here? To avoid problems like when groups is accidentally left as None in the call to super().score() instead of threaded through from the args.
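One way to implement such a check, sketched with a hand-rolled splitter rather than econml's actual machinery (group_kfold and groups_respected are illustrative names): record the folds a splitter produces and assert that no group ever appears on both sides of a train/test split.

```python
def group_kfold(groups, n_splits=2):
    """Minimal hand-rolled group-aware K-fold, for illustration only."""
    unique = sorted(set(groups))
    buckets = [unique[i::n_splits] for i in range(n_splits)]
    for held_out in buckets:
        test = [i for i, g in enumerate(groups) if g in held_out]
        train = [i for i, g in enumerate(groups) if g not in held_out]
        yield train, test

def groups_respected(groups, folds):
    # The property to test: each group is confined to one side of every split.
    for train, test in folds:
        train_groups = {groups[i] for i in train}
        test_groups = {groups[i] for i in test}
        if train_groups & test_groups:   # a group leaked across the split
            return False
    return True

groups = [0, 0, 1, 1, 2, 2, 3, 3]
assert groups_respected(groups, group_kfold(groups))
# A split that ignores groups would fail the check:
assert not groups_respected([0, 0, 1, 1], [([0, 2], [1, 3])])
```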

@@ -526,7 +526,7 @@ def score(self, Y, T, Z, X=None, W=None, sample_weight=None):
The MSE of the final CATE model on the new data.
Collaborator:

Minor, since groups aren't really used for scoring, but they are not included in the docstring as parameters.

Collaborator Author (kbattocchi):

As mentioned in a previous comment, removed groups from scoring on the estimator since they do nothing

@@ -837,7 +837,7 @@ def fit(self, Y, T, *, Z, X=None, W=None, sample_weight=None, freq_weight=None,
sample_weight=sample_weight, freq_weight=freq_weight, sample_var=sample_var, groups=groups,
cache_values=cache_values, inference=inference)

-    def score(self, Y, T, Z, X=None, W=None, sample_weight=None):
+    def score(self, Y, T, Z, X=None, W=None, sample_weight=None, groups=None):
"""
Score the fitted CATE model on a new data set. Generates nuisance parameters
for the new data set based on the fitted residual nuisance models created at fit time.
Collaborator:

groups missing from docstring

Collaborator Author (kbattocchi):

As mentioned, removed groups from this method.

@@ -1151,7 +1151,7 @@ def test_groups(self):
est.fit(y, t, groups=groups)

# test outer grouping
est = LinearDML(model_y=LinearRegression(), model_t=LinearRegression(), cv=GroupKFold(2))
Collaborator:

Is it worth adding some check to verify that a GroupKFold splitter was used under the hood?

for est in ests_list:
with self.subTest(est=est):
# no heterogeneity
Collaborator:

Minor question but is there a benefit to moving this inside the for loop?

Collaborator Author (kbattocchi), Oct 12, 2023:

The test passes :-). The default for fit_cov_directly is True, which means the previous random seed no longer generates results identical to what they were before; that led to a marginal failure on this test, but just slightly reorganizing it made it pass again.

Logically, I think this makes more sense anyway: it's weird to have different loops creating two sets of identical subtests that test different things; if you run the tests locally via unittest you'll see one result per subtest but there won't be any way to tell which was which.

Collaborator:

The diff is hard to parse here for some reason, even though the actual changes are minimal, just like in the econml+dowhy version of the notebook. Not sure why. Different jupyter version?

Collaborator Author (kbattocchi):

That was indeed very weird; fixed.

@@ -793,9 +793,12 @@ def test_groups(self):
est.fit(y, t, W=w, groups=groups)

# test outer grouping
# NOTE: we should ideally use a stratified split with grouping, but sklearn doesn't have one yet
# NOTE: StratifiedGroupKFold has a bug when shuffle is True where it doesn't always stratify properly
Collaborator:

Is this bug worth worrying about for our users, since crossfit uses StratifiedGroupKFold with shuffle=True?

Collaborator Author (kbattocchi):

Hopefully it will be fixed in sklearn and then it will have the right behavior, but until then it's possible that users can run into it (although the buggy behavior only occurs with certain datasets, so hopefully it works most of the time).

However, I don't think there's any good fix on our end: in general we do want to shuffle. For the purposes of this one test we can ignore that, but turning shuffle off wouldn't be an appropriate substitute in general.
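The property the buggy shuffle can violate is stratification, which a test could check directly. A sketch (class_fraction and well_stratified are illustrative helpers, not sklearn or econml code): compare the class balance inside each test fold against the overall balance.

```python
def class_fraction(labels, idx):
    """Fraction of positive labels among the given indices."""
    return sum(labels[i] for i in idx) / len(idx)

def well_stratified(labels, folds, tol=0.2):
    # Each test fold's class balance should be close to the overall balance;
    # a grouped splitter that fails to stratify would trip this tolerance.
    overall = sum(labels) / len(labels)
    return all(abs(class_fraction(labels, test) - overall) <= tol
               for _, test in folds)

labels = [0, 1, 0, 1, 0, 1, 0, 1]            # 50% positives overall
good_folds = [([0, 1, 2, 3], [4, 5, 6, 7]),  # each test fold is 50/50
              ([4, 5, 6, 7], [0, 1, 2, 3])]
bad_folds = [([1, 3, 5, 7], [0, 2, 4, 6]),   # test folds are all-0 / all-1
             ([0, 2, 4, 6], [1, 3, 5, 7])]

assert well_stratified(labels, good_folds)
assert not well_stratified(labels, bad_folds)
```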

fverac linked an issue on Oct 13, 2023 that may be closed by this pull request
kbattocchi force-pushed the kebatt/minorFixes branch 3 times, most recently from d6aa09e to ff63e62, on October 20, 2023 16:11
kbattocchi marked this pull request as ready for review on October 20, 2023 16:43
@@ -1151,7 +1151,7 @@ def test_groups(self):
est.fit(y, t, groups=groups)

# test outer grouping
-    est = LinearDML(model_y=LinearRegression(), model_t=LinearRegression(), cv=GroupKFold(2))
+    est = LinearDML(model_y=LinearRegression(), model_t=LinearRegression())
est.fit(y, t, groups=groups)

Collaborator:

Suggested change:
    assert isinstance(est.splitter, GroupKFold)

What about adding something like this. Just to protect against the case where groups isn't actually used under the hood.

Collaborator:

Though seems like currently we don't save the splitter to our ests as an attribute
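A hypothetical pattern that would enable the suggested assertion (econml does not currently expose this, per the comment above; all names here are illustrative): store the splitter chosen during fit as a fitted attribute, so a test can assert on it directly instead of inferring the behavior indirectly.

```python
class GroupAwareSplitter:
    """Stand-in for a group-respecting splitter such as GroupKFold."""

class PlainSplitter:
    """Stand-in for an ordinary K-fold splitter."""

class Estimator:
    def fit(self, Y, groups=None):
        # choose a group-aware splitter automatically when groups are supplied,
        # and record the choice as a fitted attribute for introspection
        self.splitter_ = GroupAwareSplitter() if groups is not None else PlainSplitter()
        return self

est = Estimator().fit([1, 2, 3], groups=[0, 0, 1])
assert isinstance(est.splitter_, GroupAwareSplitter)   # the suggested test-time check
```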

Signed-off-by: Keith Battocchi <kebatt@microsoft.com>
fverac (Collaborator) left a review:

Looks good

kbattocchi merged commit 5423183 into main on Oct 25, 2023
65 checks passed
kbattocchi deleted the kebatt/minorFixes branch October 25, 2023 05:38
Successfully merging this pull request may close these issues.

Changing covariance logic in DRIV