
Faster speed for compare_survival #215

Closed · wants to merge 2 commits

Conversation

@raynardj (Contributor) commented on Jul 9, 2021


What does this implement/fix? Explain your changes

  • The change improves the speed of sksurv.compare.compare_survival.
  • It uses more native numpy syntax/operations to replace the Python iteration over a potentially huge cartesian product of groups (a sketch follows this list).
  • The actual scale of the improvement depends on how large n_groups is.
  • Per testing, the function returns the same chisq, pvalue, covariance matrix, and table as before.
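
For illustration, here is a minimal sketch of the kind of change this PR makes. It is not the actual sksurv.compare source and the two function names are hypothetical; the names at_risk, multiplier, total_at_risk, and covar follow the diff snippets further down, and the surrounding loop over event times is omitted.

```python
import numpy


def update_covar_loop(covar, at_risk, multiplier, total_at_risk):
    """Old style: Python double loop over the group x group cartesian product."""
    n_groups = at_risk.shape[0]
    for g1 in range(n_groups):
        temp = at_risk[g1] * multiplier
        covar[g1, g1] += temp
        for g2 in range(n_groups):
            covar[g1, g2] -= temp * at_risk[g2] / total_at_risk
    return covar


def update_covar_vectorized(covar, at_risk, multiplier, total_at_risk):
    """New style: the same update expressed with numpy broadcasting only."""
    temp = at_risk * multiplier
    covar[numpy.diag_indices_from(covar)] += temp
    covar -= numpy.outer(temp, at_risk) / total_at_risk
    return covar
```

Both versions produce the same covar; the vectorized one avoids n_groups ** 2 Python-level iterations per event time.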

@sebp (Owner) left a comment

Thanks for your help!

The algorithm looks good to me; I would just use a couple more numpy functions.

@@ -75,6 +75,7 @@ def compare_survival(y, group_indicator, return_stats=False):
     observed = numpy.zeros(n_groups, dtype=numpy.int_)
     expected = numpy.zeros(n_groups, dtype=numpy.float_)
     covar = numpy.zeros((n_groups, n_groups), dtype=numpy.float_)
+    group_eye = numpy.eye(n_groups, dtype=bool)
@sebp (Owner) commented:

Since you are only using it for indexing covar, using numpy.diag_indices makes more sense to me.

@raynardj (Contributor, Author) replied:

I didn't know such a thing existed. Sure, that uses way less memory.
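
As a quick illustration of the difference (the example values below are made up): numpy.eye allocates a full n_groups x n_groups boolean mask just to pick out the diagonal, while numpy.diag_indices only creates two length-n_groups index arrays.

```python
import numpy

n_groups = 4
covar = numpy.zeros((n_groups, n_groups))
temp = numpy.arange(n_groups, dtype=float)

# Boolean mask: an n_groups x n_groups array used only for indexing.
group_eye = numpy.eye(n_groups, dtype=bool)
a = covar.copy()
a[group_eye] += temp

# diag_indices: two small index arrays of length n_groups instead.
b = covar.copy()
b[numpy.diag_indices(n_groups)] += temp

assert numpy.array_equal(a, b)
```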

- covar[g1, g2] -= temp * at_risk[g2] / total_at_risk
+ temp = at_risk * multiplier
+ covar[group_eye] += temp
+ covar -= (temp[:, None] * at_risk[None, :] / total_at_risk)
@sebp (Owner) commented:

You don't need the brackets, and temp[:, None] * at_risk[None, :] is the same as numpy.outer(temp, at_risk).
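
A quick check of that equivalence, with arbitrary example values:

```python
import numpy

temp = numpy.array([1.0, 2.0, 3.0])
at_risk = numpy.array([10.0, 20.0, 30.0])

# Broadcasting a column against a row gives the same (3, 3) matrix as numpy.outer.
assert numpy.allclose(temp[:, None] * at_risk[None, :], numpy.outer(temp, at_risk))
```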

@codecov (bot) commented on Jul 10, 2021

Codecov Report

Merging #215 (391c11d) into master (c4fb764) will decrease coverage by 0.00%.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##           master     #215      +/-   ##
==========================================
- Coverage   98.33%   98.33%   -0.01%     
==========================================
  Files          37       37              
  Lines        3127     3126       -1     
  Branches      460      458       -2     
==========================================
- Hits         3075     3074       -1     
  Misses         28       28              
  Partials       24       24              
Impacted Files Coverage Δ
sksurv/compare.py 100.00% <100.00%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@raynardj requested a review from sebp on July 10, 2021, 09:56
sebp added a commit that referenced this pull request on Jul 11, 2021
@sebp (Owner) commented on Jul 11, 2021

Thanks! Merged in commit ccc6913

@sebp closed this on Jul 11, 2021
@raynardj deleted the faster-compare-survival branch on July 11, 2021, 12:26
Development
Successfully merging this pull request may close these issues: Accelerate compare_survival function