Cor sometimes returns values > 1 #17420
Comments
We should probably throw some |
I think we can be smarter than that and I have a fix (soon). At least the fix will fix the cases where the two series are exactly identical. The present version also doesn't vectorize so we can even get a nice speedup. |
Note that sequences being identical is not required for the bug, and it happens with floats as well.
@show cor(1:100, 101:200) > 1 # true
@show cor(1:100, 2*(1:100)) > 1 # true
@show cor(linspace(1, 85, 100), linspace(1, 85, 100)) > 1 # true |
These cases are also fixed with my new version, but it's hard to tell whether that will be the case for all possible vectors. |
I agree with Stefan, as long as you can't guarantee that the result will be in Another interesting test case that produces different results (all previous ones were giving just
a = linspace(1, 85, 100)
b = collect(a)
c = Vector{Float32}(b)
@show cor(a,a) # = 1.0000000000000002
@show cor(a,c) # = 1.0000000885771385
@show cor(c,c) # = 1.0000001f0 |
We should probably have both the smarter, faster algorithm and stick a clamp at the end. |
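To make the "clamp at the end" idea concrete, here is a minimal sketch, assuming a recent Julia 1.x. The name `clamped_cor` is hypothetical and this is not the code from the actual fix. It computes a plain Pearson correlation from centered sums and then clamps the result, so rounding error can no longer push it outside [-1, 1].

```julia
# Hypothetical helper for illustration only; not the implementation in Base
# or in the PR discussed in this thread.
function clamped_cor(x::AbstractVector{<:Real}, y::AbstractVector{<:Real})
    n = length(x)
    n == length(y) || throw(DimensionMismatch("x and y must have the same length"))
    n > 0 || throw(ArgumentError("inputs must be non-empty"))

    # Means of both series.
    mx = sum(x) / n
    my = sum(y) / n

    # Accumulate the centered cross-product and the two centered sums of squares.
    sxy = sxx = syy = 0.0
    for (xi, yi) in zip(x, y)
        dx = xi - mx
        dy = yi - my
        sxy += dx * dy
        sxx += dx * dx
        syy += dy * dy
    end

    # Rounding in the division and square root can leave r slightly above 1
    # (or below -1), so the final step forces it back into the valid range.
    r = sxy / sqrt(sxx * syy)
    return clamp(r, -1.0, 1.0)
end
```

With this sketch, `clamped_cor(1:100, 101:200) <= 1` holds by construction, whatever the rounding error in the intermediate sums. The clamp only hides the last-digit error; it does not make the underlying computation more accurate, which is why the thread asks for a better algorithm as well.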
There is still an issue when cor(repmat(1:17, 1, 17))[2] <= 1.0 I've made a PR to fix it by adding |
This happens due to FP precision, but it can potentially break downstream functions that rely on -1 <= cor <= 1
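As a small illustration of how a downstream function can break, consider code that turns a correlation into an angle with `acos`. The helper name `correlation_angle` is made up for this sketch, and the value of `r` is the one reported for `cor(a, a)` in the comments.

```julia
# Hypothetical downstream helper: it assumes -1 <= r <= 1.
correlation_angle(r) = acos(r)

# Value reported in this thread for cor(a, a); it is one ulp above 1.0.
r = 1.0000000000000002

# acos is not defined for real arguments above 1, so this call throws.
threw_domain_error = try
    correlation_angle(r)
    false
catch err
    err isa DomainError
end

# Clamping the correlation first makes the same call succeed.
safe_angle = correlation_angle(clamp(r, -1.0, 1.0))
```

Here `threw_domain_error` ends up `true`, while the clamped call returns an angle of zero. The same failure appears in expressions such as `sqrt(1 - r^2)`, where the argument becomes slightly negative.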