Reduce sklearn dependency in causalml #686

Merged


@alexander-pv (Collaborator) commented Oct 1, 2023

Proposed changes

Hi!
Some users ran into binary incompatibility errors during package installation because causalml's Cython tree code was linked against specific scikit-learn builds. Others had issues with numpy version support.
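
When triaging reports like these, it helps to know which versions are actually installed. A minimal diagnostic sketch, assuming Python 3.8+ for `importlib.metadata`; the helper name `installed_versions` is hypothetical and not part of causalml:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("numpy", "scikit-learn", "causalml")):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

Pasting this output into an issue makes it easier to tell a build mismatch apart from an unsupported version.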

This PR aims to resolve most of these issues.
In essence, the private Cython tree code is now placed into the package with minimal changes instead of being imported from scikit-learn.
Even though I initially didn't like this idea, I came to the conclusion that it was necessary to make the package easy to use.

I think the code inside causalml/inference/tree/_tree should need updating only rarely, mainly when the scikit-learn tree codebase receives performance improvements. Perhaps these tree structures could also be reused in uplift trees.

Other minor changes:

  • Multiprocess Cython code compilation in setup.py
  • A couple of additional details in CONTRIBUTING.md
  • The scikit-learn and numpy requirements were restored to their previous lower bounds with no upper bound: numpy>=1.18.5, scikit-learn>=0.22.0
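
To illustrate the first bullet, here is a minimal sketch of parallel Cython compilation in a setup.py. It is not the exact code from this PR: the helper names `compile_workers` and `build_ext_modules` and the glob pattern are assumptions for illustration, and only the `nthreads` argument of `cythonize` is the actual Cython feature being shown.

```python
# setup.py (sketch): cythonize .pyx sources with several worker processes.
import multiprocessing as mp
import os
from glob import glob


def compile_workers(max_workers=None):
    """Return how many processes to use for cythonization (at least 1)."""
    n = mp.cpu_count()
    if max_workers is not None:
        n = min(n, max_workers)
    return max(1, n)


def build_ext_modules(pattern="causalml/**/*.pyx"):
    """Cythonize every .pyx file matching `pattern` in parallel."""
    # Imported lazily so compile_workers() works without build dependencies.
    import numpy as np
    from Cython.Build import cythonize
    from setuptools import Extension

    extensions = [
        Extension(
            # Turn "pkg/sub/mod.pyx" into the dotted module name "pkg.sub.mod".
            name=os.path.splitext(path)[0].replace(os.sep, "."),
            sources=[path],
            include_dirs=[np.get_include()],
        )
        for path in glob(pattern, recursive=True)
    ]
    # nthreads > 0 lets cythonize translate the .pyx files in parallel.
    return cythonize(extensions, nthreads=compile_workers())
```

In a real setup.py the result would be passed as `setup(..., ext_modules=build_ext_modules())`. Note that `nthreads` parallelizes the Cython-to-C translation; the C compilation step can be parallelized separately with the `-j` option of `build_ext`.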

Related issues: #579 #581 #619 #628 #671 #679 #680 #682 #684

Types of changes

What types of changes does your code introduce to CausalML?
Put an x in the boxes that apply

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation Update (if none of the other choices apply)

Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.

  • I have read the CONTRIBUTING doc
  • I have signed the CLA
  • Lint and unit tests pass locally with my changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)
  • Any dependent changes have been merged and published in downstream modules


@jeongyoonlee added the "dependencies" label (Pull requests that update a dependency file) on Oct 3, 2023
@jeongyoonlee (Collaborator)

Thanks, @alexander-pv for the contribution - as always!

One quick comment: Have you considered adding scikit-learn/scikit-learn/tree as a subtree (ref) instead of copying the code over?

@jeongyoonlee (Collaborator)

There's no way to import only the tree subfolder as either a submodule or a subtree. :(
Let's take your approach - adding scikit-learn's tree code directly.

Can you fix the build error and resolve the conflict? Once it's cleared, I'll approve and merge it.

@alexander-pv (Collaborator, Author)

Hi! I've read up on git subtrees anyway and agree that neither subtrees nor submodules would help to implement the initial idea. I will update the PR soon.

@alexander-pv (Collaborator, Author)

It looks like everything is OK now.

@jeongyoonlee (Collaborator) left a comment

LGTM.
