Improve Lab <--> XYZ conversions #486
Conversation
Benchmark:

```julia
using Colors, BenchmarkTools

rgb_f64 = rand(RGB{Float64}, 1000, 1000);
xyz_f64 = XYZ{Float64}.(rgb_f64);
xyz_f32 = XYZ{Float32}.(xyz_f64);
lab_f64 = Lab{Float64}.(xyz_f64);
lab_f32 = Lab{Float32}.(lab_f64);

function Colors.fxyz2lab(v) # force invalidation
    ka = oftype(v, 841 / 108) # (29/6)^2 / 3 = xyz_kappa / 116
    kb = oftype(v, 16 / 116)  # 4/29
    v > oftype(v, Colors.xyz_epsilon) ? cbrt(v) : muladd(ka, v, kb)
end
```

```julia
julia> @btime convert.(XYZ, $lab_f64);
  7.433 ms (2 allocations: 22.89 MiB) # before
  5.433 ms (2 allocations: 22.89 MiB) # after

julia> @btime convert.(XYZ, $lab_f32);
  11.288 ms (2 allocations: 11.44 MiB) # before
  2.359 ms (2 allocations: 11.44 MiB) # after

julia> @btime convert.(Lab, $xyz_f64);
  31.515 ms (2 allocations: 22.89 MiB) # before
  29.489 ms (2 allocations: 22.89 MiB) # after

julia> @btime convert.(Lab, $xyz_f32);
  33.801 ms (2 allocations: 11.44 MiB) # before
  23.568 ms (2 allocations: 11.44 MiB) # after
```
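As a sanity check on the constants in the invalidation stub above, the comments (`(29/6)^2 / 3` and `4/29`) can be verified with exact rational arithmetic. This is a standalone check, not part of the benchmark:

```julia
# Verify the comments in fxyz2lab: ka = (29/6)^2 / 3 and kb = 4/29,
# using Julia's exact Rational arithmetic so there is no rounding.
@assert (29 // 6)^2 / 3 == 841 // 108   # ka = 841/108
@assert 16 // 116 == 4 // 29            # kb = 16/116 = 4/29
println("fxyz2lab constants check out")
```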
Codecov Report

```diff
@@            Coverage Diff             @@
##           master     #486      +/-   ##
==========================================
+ Coverage   92.82%   93.12%   +0.29%
==========================================
  Files           9        9
  Lines        1004     1018      +14
==========================================
+ Hits          932      948      +16
+ Misses         72       70       -2
```
Are there any related issues to this?

Edit: Oh, I guess you mean you're doing a manual inline? 😄
> This adds separate methods that omit the white points, to facilitate constant folding. It also suppresses the promotion of intermediate variables to `Float64` caused by the white point being an `XYZ{Float64}`.

Right. That's exactly what you presumed.
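The promotion problem being fixed can be illustrated with a toy example (the function names and the white-point value here are hypothetical stand-ins, not the actual Colors.jl internals): a white point passed as a `Float64` runtime argument promotes `Float32` inputs, while a dedicated method with the constant baked in stays in the input's type and lets the compiler fold the constant.

```julia
# Toy illustration (hypothetical names, not Colors.jl code).
# A white point passed as a Float64 argument promotes Float32 inputs:
scale_wp(y, wp_y::Float64) = y / wp_y

# A separate method with the constant baked in stays in the input type,
# and multiplication by a type-converted constant can be folded:
scale_default(y::T) where {T} = y * T(1 / 1.08883)  # 1.08883: stand-in white point

y = Float32(0.5)
@assert typeof(scale_wp(y, 1.08883)) === Float64   # promoted to Float64
@assert typeof(scale_default(y)) === Float32       # stays Float32
```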
```julia
F = promote_type(T, eltype(c))

fy1 = c.l * F(0x1p-7)
fy2 = muladd(c.l, F(3 / 3712), F(16 / 116))
fy = fy1 + fy2 # (c.l + 16) / 116
fx = fy1 + muladd(c.a, F( 0x1p-9), muladd(c.a, F(3 / 64000), fy2)) # fy + c.a / 500
fz = fy1 + muladd(c.b, F(-0x1p-8), muladd(c.b, F(-7 / 6400), fy2)) # fy - c.b / 200
```
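The constant pairs above are exact two-term binary decompositions of the textbook divisors (`0x1p-7` is `1/128`, `0x1p-9` is `1/512`, `0x1p-8` is `1/256`), which is why the `muladd` form loses no accuracy relative to the division version. This can be checked with exact rational arithmetic:

```julia
# Each power-of-two high part plus its low part sums exactly to the
# coefficient used by the division version of the conversion.
@assert 1 // 128 + 3 // 3712  == 1 // 116   # fy: (c.l + 16) / 116
@assert 1 // 512 + 3 // 64000 == 1 // 500   # fx: fy + c.a / 500
@assert 1 // 256 + 7 // 6400  == 1 // 200   # fz: fy - c.b / 200
println("all decompositions are exact")
```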
As mentioned in #427 (comment), this avoidance of divisions is effective for the Lab --> XYZ conversion. The accuracy is mostly equivalent to the division version in environments that support the FMA instruction. Note that the constant folding of
This adds separate methods that omit the white points, to facilitate constant folding. It also suppresses the promotion of intermediate variables to `Float64` caused by the white point being an `XYZ{Float64}`. (cf. issue #477)

A part of this PR has been separated from PR #482.

I'm not sure of the reason, but I avoid `mapc` for now because it makes the precompilation problem worse. This PR does not solve the precompilation problem. 😕