Optimizing the types used for intermediate calculations #477
Comments
BTW, I was wondering why the required accuracy of Line 48 in ddfc19b
And then I found two typos in the test data. Lines 36 to 37 in ddfc19b
((60.2574, -34.0099, 36.2"6"77), (60.4626, -34.1751, 39.4387), 1.2644), The fixes will allow us to minimize the tolerance (i.e. |
One of the most costly parts of the Lines 201 to 206 in 9f27da0
When |
The accuracy bottleneck in calculating Prior to PR #476, this calculation was done in PR #476 changed the code to calculate Another problem related to Lines 181 to 182 in bf61512
The following is effective against the precompile problem in that it does not call
Edit: I added the workaround for subnormal numbers. The
const DE2000_SINEXP_F32 = [Float32(π/3 * exp(-i)) for i = 0.0:0.25:87.25]
@inline function _de2000_rot(mh::Float32)
    dh2 = ((mh - 275.0f0) * (1.0f0 / 25))^2
    di = reinterpret(UInt32, dh2 + Float32(0x3p20))
    i = di % UInt16 # round(UInt16, dh2 * 4.0)
    i >= UInt16(350) && return 0.0f0 # avoid subnormal numbers
    t = (reinterpret(Float32, di) - Float32(0x3p20)) - dh2 # |t| <= 0.125
    sinexp = @inbounds DE2000_SINEXP_F32[i + 1] # π/3 * exp(-dh2) = (π/3 * exp(-i/4)) * exp(t)
    em1 = @evalpoly(t, 1.0f0, 0.49999988f0, 0.16666684f0, 0.041693877f0, 0.008323605f0) * t
    ex = muladd(sinexp, em1, sinexp)
    ex < eps(0.5f0) && return ex
    sn = @evalpoly(ex^2, -0.16666667f0, 0.008333333f0, -0.00019841234f0, 2.7550889f-6, -2.4529042f-8)
    return muladd(sn * ex, ex^2, ex)
end
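For readers unfamiliar with the `Float32(0x3p20)` constant, here is a minimal sketch of the rounding trick in isolation. The names `MAGIC` and `quarter_split` are hypothetical and are not part of Colors.jl; the sketch assumes a non-negative input well below 2^20, as in the snippet above.

```julia
# Hypothetical helper, for illustration only.
# Float32 values in [2^21, 2^22) are spaced 0.25 apart, so adding 3 * 2^20
# rounds a small non-negative x to the nearest multiple of 0.25.
const MAGIC = Float32(0x3p20)

function quarter_split(x::Float32)
    shifted = x + MAGIC
    # The low 16 bits of the bit pattern now hold round(x * 4).
    i = Int(reinterpret(UInt32, shifted) % UInt16)
    # Remainder relative to the rounded value, so x == i/4 - t and |t| <= 0.125.
    t = (shifted - MAGIC) - x
    return i, t
end

i, t = quarter_split(1.3f0)
# exp(-x) can then be rebuilt from a table entry exp(-i/4) and a short
# polynomial for exp(t), since -x == -i/4 + t.
approx = exp(-i / 4) * exp(t)
```

The index selects a precomputed table entry and the small remainder `t` keeps the polynomial short, which is why the snippet above needs no call to a general-purpose `exp`.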
There is still room for this kind of optimization in many places. However, given the priorities, I would like to close this issue with PR #506.
Currently, many Float64-based calculations are used for color space conversions and color difference calculations, regardless of the input/output type. However, Float32 is accurate enough for practical use in many color and image related applications.
I intend to speed up RGB{N0f8} --> XYZ{Float32} --> Lab{Float32} conversions and colordiff.
Probably the breaking change is that the colordiff for RGB{N0f8} will return Float32 instead of Float64.
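As a rough illustration of the idea, the sketch below computes CIE L* from relative luminance with the intermediate type following the input type. The function name `lightness` is hypothetical and this is not the Colors.jl implementation; it only shows how a type-generic formula keeps Float32 inputs in Float32 throughout.

```julia
# Hypothetical example: CIE L* from relative luminance Y/Yn,
# computed entirely in the float type T of the argument.
function lightness(y::T) where {T<:AbstractFloat}
    δ = T(6) / T(29)
    # Cube root above the threshold, linear segment below it.
    f = y > δ^3 ? cbrt(y) : y / (3 * δ^2) + T(4) / T(29)
    return T(116) * f - T(16)
end

l32 = lightness(0.18f0)  # Float32 in, Float32 out
l64 = lightness(0.18)    # Float64 in, Float64 out
```

For this input the two results differ by far less than a just-noticeable lightness difference, which is consistent with the claim that Float32 is sufficient in practice.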