Scaling ortholearners using Ray #800

v-shaal · 2023-08-02T18:32:25Z

Issue: #793

Added an implementation of Ray-based distributed parallelization to crossfit.
Set the flag use_ray=True to use the Ray implementation, or use_ray=False to use the normal implementation.
Parallelized fit_nuisance via Ray.
Added test cases to compare the Ray and regular implementations.
The current PR implementation is for DML; it can be extended to other estimators that use _OrthoLearner as their base class.
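
The idea in this description can be sketched without Ray itself. The following is a minimal illustration, not the PR's code: it uses the standard library's `concurrent.futures.ThreadPoolExecutor` as a stand-in for Ray remote tasks, and `fit_fold`, `crossfit_sketch`, and `use_parallel` are hypothetical names. Each fold's nuisance fit is independent of the others, so the folds can be dispatched in parallel and the results gathered in fold order.

```python
from concurrent.futures import ThreadPoolExecutor


def fit_fold(train_idx, test_idx, y):
    """Toy 'nuisance model': predict the training-fold mean on the test fold."""
    mean = sum(y[i] for i in train_idx) / len(train_idx)
    return test_idx, [mean for _ in test_idx]


def crossfit_sketch(y, folds, use_parallel=False):
    """Fit every fold serially or in parallel; both paths return the same values."""
    if use_parallel:
        # Stand-in for submitting one remote task per fold and collecting the results.
        with ThreadPoolExecutor() as pool:
            futures = [pool.submit(fit_fold, tr, te, y) for tr, te in folds]
            results = [f.result() for f in futures]
    else:
        results = [fit_fold(tr, te, y) for tr, te in folds]

    # Place each fold's out-of-sample predictions at its test positions.
    nuisance = [None] * len(y)
    for test_idx, preds in results:
        for i, p in zip(test_idx, preds):
            nuisance[i] = p
    return nuisance


y = [1.0, 2.0, 3.0, 4.0]
folds = [([2, 3], [0, 1]), ([0, 1], [2, 3])]
serial = crossfit_sketch(y, folds, use_parallel=False)
parallel = crossfit_sketch(y, folds, use_parallel=True)
```

Comparing `serial` and `parallel` is the same kind of check the added test cases perform between the Ray and regular code paths.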

Signed-off-by: Vishal Verma <vishalmverma27@gmail.com>

kbattocchi

Overall, this looks like a great addition to the library. However, there are a few changes that need to be addressed before it can be added.

First of all, please revert your changes to setup.cfg, merge the latest main back into your branch, and then make those changes to pyproject.toml instead - sorry that we changed this out from under you while your PR was in progress, but the package metadata has been moved there instead.

In addition to my comments on individual files, here are some other thoughts:

To be broadly useful, these changes need to be propagated to at least the main DML subclasses, rather than just OrthoLearner, RLearner, and DML, but really ideally to everything that uses _crossfit.
The coverage report shows that most of the new code in the _OrthoLearner class is never run by any of the tests, since you set use_ray=False for all of the tests that use the class directly. Setting use_ray=True should fix that specific coverage issue, but consider whether additional tests for RLearner or DML would also be useful.
This seems like a potentially very helpful feature, so it's probably worth creating a documentation page or notebook, or at the very least an FAQ entry, describing when/why/how to use it.

kbattocchi · 2023-08-13T23:27:10Z

.github/workflows/ci.yml

-            extras: "[tf,plt]"
+            extras: "[tf,plt,ray]"
          - kind: other
            opts: '-m "cate_api" -n auto'
-            extras: "[tf,plt]"
+            extras: "[tf,plt,ray]"
          - kind: dml
            opts: '-m "dml"'
-            extras: "[tf,plt]"
+            extras: "[tf,plt,ray]"
          - kind: main
            opts: '-m "not (notebook or automl or dml or serial or cate_api or treatment_featurization)" -n 2'
-            extras: "[tf,plt,dowhy]"
+            extras: "[tf,plt,dowhy,ray]"
          - kind: treatment
            opts: '-m "treatment_featurization" -n auto'
-            extras: "[tf,plt]"


I believe ray only needs to be added to the main test kind, since that is where the test_ortho_learner tests are run.

Fixed in latest commit

kbattocchi · 2023-08-13T23:27:57Z

.github/workflows/ci.yml

        - kind: "except-customer-scenarios"
-          extras: "[tf,plt]"
+          extras: "[tf,plt,ray]"
          pattern: "(?!CustomerScenarios)"
          install_graphviz: true
          version: '3.8' # no supported version of tensorflow for 3.9
        - kind: "customer-scenarios"
-          extras: "[plt,dowhy]"
+          extras: "[plt,dowhy,ray]"
          pattern: "CustomerScenarios"
          version: '3.9'


Unless you make any changes to the notebooks to take advantage of the new ray functionality, these changes should not be necessary.

Fixed in latest commit

kbattocchi · 2023-08-13T23:33:12Z

econml/dml/dml.py

-                 random_state=None):
+                 random_state=None,
+                 use_ray=False,
+                 **ray_remote_func_options


Suggested change

- **ray_remote_func_options
+ ray_remote_func_options={}

I think it would be better to make this an explicit dictionary argument, rather than having it implicitly include any other keyword arguments passed to the DML initializer since in the future we might want similar arguments for other compute backends.

(This also applies all the way up the hierarchy, to the RLearner and OrthoLearner initializer arguments)
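
A minimal sketch of the difference, using two hypothetical classes rather than the real DML initializer: with `**ray_remote_func_options`, any unrecognized keyword (including a typo) is silently absorbed as a backend option, whereas an explicit dictionary argument lets Python reject it.

```python
class ImplicitOptions:
    """Hypothetical initializer that collects leftover keywords as backend options."""

    def __init__(self, *, random_state=None, use_ray=False, **ray_remote_func_options):
        self.random_state = random_state
        self.use_ray = use_ray
        self.ray_remote_func_options = ray_remote_func_options


class ExplicitOptions:
    """Hypothetical initializer that takes backend options as one named dictionary."""

    def __init__(self, *, random_state=None, use_ray=False, ray_remote_func_options=None):
        self.random_state = random_state
        self.use_ray = use_ray
        self.ray_remote_func_options = ray_remote_func_options


# A misspelled keyword is silently treated as a backend option.
implicit = ImplicitOptions(random_sate=123)

# The same typo is rejected when options must be passed explicitly.
try:
    ExplicitOptions(random_sate=123)
    typo_rejected = False
except TypeError:
    typo_rejected = True
```

An explicit argument also leaves room for a second, separately named dictionary if another compute backend is added later.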

Fixed in latest commit

kbattocchi · 2023-08-13T23:34:17Z

setup.cfg

@@ -66,6 +66,8 @@ plt =
    matplotlib < 3.6.0
 dowhy = 
    dowhy < 0.9
+ray =


Apologies for the inconvenience but these changes now need to be made to pyproject.toml instead - we've tried to move as much of the static metadata for the project as possible to that file.

kbattocchi · 2023-08-14T01:48:07Z

econml/dml/_rlearner.py

@@ -272,15 +272,17 @@ def _gen_rlearner_model_final(self):
    """

    def __init__(self, *, discrete_treatment, treatment_featurizer, categories,
-                 cv, random_state, mc_iters=None, mc_agg='mean'):
+                 cv, random_state, mc_iters=None, mc_agg='mean', use_ray=False, **ray_remote_func_options):


Suggested change

- cv, random_state, mc_iters=None, mc_agg='mean', use_ray=False, **ray_remote_func_options):
+ cv, random_state, mc_iters=None, mc_agg='mean', use_ray=False, ray_remote_func_options=ray_remote_func_options):

kbattocchi · 2023-08-14T01:50:56Z

econml/_ortho_learner.py

+    return nuisance_temp, model, test_idxs, (score_temp if calculate_scores else None)
+
+
+def _crossfit(model, use_ray, folds, ray_remote_fun_option, *args, **kwargs):


I think it makes more sense for folds to come before the ray arguments (and certainly for the ray arguments to be adjacent), and these changes make the specification match the docstring.

Suggested change

- def _crossfit(model, use_ray, folds, ray_remote_fun_option, *args, **kwargs):
+ def _crossfit(model, folds, use_ray=False, ray_remote_fun_option={}, *args, **kwargs):
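
One thing to watch with this ordering, shown as a minimal sketch with a hypothetical stand-in function rather than the real `_crossfit`: because the defaulted parameters come before `*args`, extra positional arguments fill `use_ray` and `ray_remote_fun_option` first. Callers that pass positional data therefore need to pass both Ray arguments explicitly.

```python
def crossfit_signature_demo(model, folds, use_ray=False, ray_remote_fun_option={}, *args, **kwargs):
    """Hypothetical stand-in that only reports how its arguments were bound."""
    return {"use_ray": use_ray, "options": ray_remote_fun_option, "args": args, "kwargs": kwargs}


# Old-style call: the data arrays land in the ray parameters by position.
old_style = crossfit_signature_demo("model", "folds", "X", "y", W="W")

# Passing both ray arguments explicitly keeps the data in *args.
explicit = crossfit_signature_demo("model", "folds", False, {}, "X", "y", W="W")
```

Making the two Ray parameters keyword-only (placing them after `*args`) would be another way to avoid the ambiguity, at the cost of changing every call site to use keywords.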

kbattocchi · 2023-08-14T01:51:51Z

econml/_ortho_learner.py

@@ -60,6 +120,10 @@ def _crossfit(model, folds, *args, **kwargs):
        function estimates a model of the nuisance function, based on the input
        data to fit. Predict evaluates the fitted nuisance function on the input
        data to predict.
+    use_ray: bool, default False (optional)


Suggested change

- use_ray: bool, default False (optional)
+ use_ray: bool, default False

having a default implies optional

kbattocchi · 2023-08-14T01:52:18Z

econml/_ortho_learner.py

@@ -60,6 +120,10 @@ def _crossfit(model, folds, *args, **kwargs):
        function estimates a model of the nuisance function, based on the input
        data to fit. Predict evaluates the fitted nuisance function on the input
        data to predict.
+    use_ray: bool, default False (optional)
+        Flag to indicate whether to use ray to parallelize the cross-fitting step.
+    ray_remote_fun_option: dict, default None (optional)


Suggested change

- ray_remote_fun_option: dict, default None (optional)
+ ray_remote_fun_option: dict, default {}

Having a default implies optional

kbattocchi · 2023-08-14T01:54:23Z

econml/_ortho_learner.py

-        nuisance, model_list, fitted_inds, scores = _crossfit(Wrapper(model), folds, X, y, W=y, Z=None)
+        use_ray = False
+        ray_remote_fun_option = {}
+        nuisance, model_list, fitted_inds, scores = _crossfit(Wrapper(model),use_ray, folds,ray_remote_fun_option,


Suggested change

- nuisance, model_list, fitted_inds, scores = _crossfit(Wrapper(model),use_ray, folds,ray_remote_fun_option,
+ nuisance, model_list, fitted_inds, scores = _crossfit(Wrapper(model), folds, use_ray, ray_remote_fun_option,


v-shaal · 2023-08-16T18:59:29Z

What has been fixed since the last commit?

1) Fixed ci.yml extras dependencies.
2) Added descriptions of all the added options to the docstrings of DML and RLearner.
3) Addressed the changes suggested for _ortho_learner.py.
4) Removed ray.shutdown(); it can be handled explicitly on a case-by-case basis.
5) Made ray_remote_func_options an explicit dictionary argument.

What has been added?

1) Extended the changes to all estimators using _crossfit.
2) Added test cases that run with_ray and without_ray for the above changes.
3) Added a notebook on how to use this feature.

@kbattocchi, kindly review the latest commit and provide feedback, if any.


fverac · 2023-08-22T19:43:44Z

econml/dml/dml.py

+                 use_ray=False,
+                 ray_remote_func_options=None,
+                 ):
+        if ray_remote_func_options is None:


Can you move this logic into the OrthoLearner fit function and remove it from all subclass __init__ functions? That way we avoid redundant code in all of the subclass __init__ functions and maintain a scikit-learn-like API. If you are interested in more context, see the "Instantiation" section of this page: https://scikit-learn.org/stable/developers/develop.html#apis-of-scikit-learn-objects.

For instance, imagine a user does the following

est = LinearDML(use_ray=some_dict)
est.use_ray = None  # user changes their mind about use_ray
est.fit(…)

We want the logic of converting None to an empty dict in .fit so we can allow for this kind of behavior.
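
The convention described here can be sketched as follows. This is an illustration under assumptions, not econml's code: `SketchEstimator` and `effective_options_` are hypothetical names, and `num_cpus` is used only as an example option. The initializer stores its arguments untouched, and `fit` does the `None`-to-dictionary conversion, so a change made after construction is still honored.

```python
class SketchEstimator:
    """Hypothetical estimator following the scikit-learn convention:
    __init__ only stores its arguments; fit interprets them."""

    def __init__(self, *, use_ray=False, ray_remote_func_options=None):
        self.use_ray = use_ray
        self.ray_remote_func_options = ray_remote_func_options

    def fit(self):
        # Normalize at fit time so attribute changes made after construction are honored.
        options = {} if self.ray_remote_func_options is None else dict(self.ray_remote_func_options)
        self.effective_options_ = options if self.use_ray else None
        return self


est = SketchEstimator(use_ray=True, ray_remote_func_options={"num_cpus": 1})
est.ray_remote_func_options = None  # user changes their mind after construction
est.fit()
```

Had the conversion lived in `__init__`, the reassignment to `None` would have reached `fit` unconverted.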

Noted, that makes sense. I will move the redundant code into the fit function within OrthoLearner.

fverac · 2023-08-22T19:45:47Z

econml/dml/dml.py

@@ -642,6 +657,12 @@ class LinearDML(StatsModelsCateEstimatorMixin, DML):
        If None, the random number generator is the :class:`~numpy.random.mtrand.RandomState` instance used
        by :mod:`np.random<numpy.random>`.

+    use_ray: bool, default False
+        Whether to use Ray to parallelize the cross-fitting step. If True, Ray must be installed.


Can you fix the spacing here by adding a blank line between the argument descriptions? The same applies to SparseLinearDML and KernelDML.


kbattocchi

I've broken the tests out into a new mark and I think things look good, so I'll merge once the checks pass. Thanks for this contribution!

kbattocchi · 2023-08-21T15:44:52Z

econml/_ortho_learner.py

@@ -412,7 +418,6 @@ def _gen_ortho_learner_model_final(self):
                           discrete_instrument=False, categories='auto', random_state=None)
        est.fit(y, X[:, 0], W=X[:, 1:])



Suggested change

+ # Or (for parallelization using ray)

Sorry if my previous comment was unclear: I think including the comment is helpful for understanding why est is being redefined; it's just that it needs to be a comment so that the entire block is valid python code that can be run.

kbattocchi · 2023-08-21T16:07:58Z

econml/_ortho_learner.py

+        if ray_remote_func_options is None:
+            ray_remote_func_options = {}


Consider whether just making the default {} instead of None would make sense. In general, we try not to put any logic in our initializers, because it's possible the user will do something like this:

est = LinearDML()
est.use_ray = True
est.ray_remote_options = None

and then the logic to turn it into {} won't run. So I think it's fine to require it to be an actual dictionary instead of None and skip the extra logic.
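
One caveat worth keeping in mind with a literal `{}` default, sketched here with a hypothetical class: Python evaluates a default value once, when the function is defined, so every instance constructed without an explicit value holds the same dictionary object. That is harmless as long as nothing mutates the dictionary in place.

```python
class SharedDefault:
    """Hypothetical estimator whose options default to a literal {}."""

    def __init__(self, *, use_ray=False, ray_remote_func_options={}):
        self.use_ray = use_ray
        self.ray_remote_func_options = ray_remote_func_options


a = SharedDefault()
b = SharedDefault()
same_object = a.ray_remote_func_options is b.ray_remote_func_options

# Rebinding the attribute affects only one instance...
a.ray_remote_func_options = {"num_cpus": 1}
# ...while in-place mutation of the shared default is visible to later instances.
b.ray_remote_func_options["num_cpus"] = 2
c = SharedDefault()
```

Treating the stored dictionary as read-only, or copying it before use, keeps the simpler `{}` default safe.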


v-shaal added 4 commits August 2, 2023 23:54

Update _ortho_learner.py

1128981

Signed-off-by: Vishal Verma <vishalmverma27@gmail.com>

added ray args dml.py

d5af162

Signed-off-by: Vishal Verma <vishalmverma27@gmail.com>

Update _rlearner.py

bb772c2

Signed-off-by: Vishal Verma <vishalmverma27@gmail.com>

Update test_ortho_learner.py

9b7540d

Signed-off-by: Vishal Verma <vishalmverma27@gmail.com>

v-shaal mentioned this pull request Aug 2, 2023

Scaling Othrolearners using Ray #793

Closed

v-shaal marked this pull request as ready for review August 2, 2023 18:56

Update setup.cfg added ray dependencies

e1d3aba

Signed-off-by: Vishal Verma <vishalmverma27@gmail.com>

v-shaal marked this pull request as draft August 3, 2023 03:17

v-shaal marked this pull request as ready for review August 3, 2023 05:26

v-shaal marked this pull request as draft August 3, 2023 05:42

v-shaal added 2 commits August 3, 2023 14:09

Fixed linting issue in test_ortho_learner.py

b10c804

Signed-off-by: Vishal Verma <vishalmverma27@gmail.com>

Update ci.yml added Ray in extras

274e788

Signed-off-by: Vishal Verma <vishalmverma27@gmail.com>

kbattocchi requested changes Aug 14, 2023

v-shaal added 2 commits August 16, 2023 22:41

Merge remote-tracking branch 'upstream/main' into scaling_ortholearners

b190e8f

# Conflicts: # setup.cfg

v-shaal added 2 commits August 17, 2023 09:53

Fixed Notebook test for new notebook, and default to use_ray=False in…

f09f2e3

… testcases Signed-off-by: Vishal Verma <vishalmverma27@gmail.com>

set cpu to 1 for ray , to avoid OOM error while running ray in local …

6d81b37

…mode for tests. Signed-off-by: Vishal Verma <vishalmverma27@gmail.com>

fverac reviewed Aug 22, 2023

v-shaal and others added 2 commits August 26, 2023 19:31

-updated spacing issue in docstring

f877ae0

-removed redundant code for ray_remote_function and moved to ortholearner's fit Signed-off-by: Vishal Verma <vishalmverma27@gmail.com>

Merge branch 'main' into scaling_ortholearners

d28199f

Signed-off-by: Keith Battocchi <kebatt@microsoft.com>

kbattocchi force-pushed the scaling_ortholearners branch 2 times, most recently from a7c168a to d76dff0 Compare October 25, 2023 05:37

Split ray tests into new mark

3c2eb4b

Signed-off-by: Keith Battocchi <kebatt@microsoft.com>

kbattocchi force-pushed the scaling_ortholearners branch from d76dff0 to 3c2eb4b Compare October 27, 2023 14:04

kbattocchi added 3 commits October 27, 2023 10:12

Merge branch 'main' into scaling_ortholearners

3289831

Signed-off-by: Keith Battocchi <kebatt@microsoft.com>

Fixup test driv

b67bdab

Signed-off-by: Keith Battocchi <kebatt@microsoft.com>

Fix spacing in ortholearner docstring

d97b132

Signed-off-by: Keith Battocchi <kebatt@microsoft.com>

kbattocchi marked this pull request as ready for review October 27, 2023 18:49

kbattocchi approved these changes Oct 27, 2023

kbattocchi added 2 commits October 27, 2023 14:59

Make tests more robust

57b78b1

Signed-off-by: Keith Battocchi <kebatt@microsoft.com>

Fixup OrthoLearner docstring

9fbbe7d

Signed-off-by: Keith Battocchi <kebatt@microsoft.com>

kbattocchi merged commit 01899a8 into py-why:main Oct 27, 2023
72 checks passed
