Support for arbitrary functions #94
Comments
I think that the solution here would be to make new S3 methods for
I like the idea of linear model covariance matrices here, but we have to make sure that there is not a problematic amount of feature creep.
That would be a powerful feature! Would it make sense to widen this idea to categorical features as well? For instance, a Cramér's V or chi-squared matrix. (Actually, it makes me imagine a list-matrix, a 2D version of list columns =P)
Closed in #116!!! 🚀
This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.
I've really enjoyed using `corrr`; it is an excellent package! Thanks for all the great work!

One thing that I think would make it even more useful would be support for arbitrary functions. At the moment we are limited to creating data frames of correlations (of various types). That's useful, but there are a number of different kinds of pairwise statistics that can be calculated for the variables of a data frame. I can see that a separate issue (#42) requests support for covariance as well. But it seems a more robust and elegant solution would be a function that takes an arbitrary function and returns a `cor_df`-like object, with values output from that arbitrary function rather than correlations. There would have to be a few changes, like those already mentioned in the covariance issue. Not all outputs would be on a scale from -1 to 1, for example.

A relatively simple example would be a linear regression using `lm()`. If there were this form of the `correlate()` function, the arbitrary function could output the beta from regressing each of the variables onto one another. (The beta of regressing y on x is not the same as the beta of regressing x on y, so there wouldn't be duplication as in the case of correlations.) If you wanted the p-values of those regressions, you could simply change the arbitrary function to output the p-values instead. Same thing if you wanted the R^2.