
Improve Lab<-->XYZ conversions #486

Merged · 1 commit into JuliaGraphics:master on Jun 19, 2021
Conversation

@kimikage (Collaborator)

This adds separate methods that omit the white points, to facilitate constant folding.
It also suppresses the promotion of intermediate variables to Float64 caused by the white point being an XYZ{Float64} (cf. issue #477).

Part of this PR was split out from PR #482.

I'm not sure of the reason, but I avoid mapc for now because it makes the precompilation problem worse. This PR does not solve the precompilation problem. 😕
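
The overall shape of the change is roughly the following. This is a minimal sketch with hypothetical names (f, my_xyz2lab), not the PR's actual code; the real kernels live in src/conversions.jl:

using Colors

# Nonlinear component function, as in Colors (cf. fxyz2lab in the benchmark
# below); repeated here so the sketch is self-contained.
f(v) = v > oftype(v, Colors.xyz_epsilon) ? cbrt(v) :
       muladd(oftype(v, 841 / 108), v, oftype(v, 16 / 116))

# General method: `wp` is a runtime argument. Since it is typically an
# XYZ{Float64}, dividing Float32 components by it promotes the whole
# computation to Float64.
function my_xyz2lab(c::XYZ{T}, wp::XYZ) where {T}
    fx, fy, fz = f(c.x / wp.x), f(c.y / wp.y), f(c.z / wp.z)
    Lab{T}(muladd(116, fy, -16), 500 * (fx - fy), 200 * (fy - fz))
end

# Separate method for the default white point: the reciprocals of the
# white-point components are written as constants of the input's own type,
# so the compiler can constant-fold them and nothing promotes to Float64.
function my_xyz2lab(c::XYZ{T}) where {T}
    fx = f(c.x * T(1 / Colors.WP_DEFAULT.x))
    fy = f(c.y * T(1 / Colors.WP_DEFAULT.y))
    fz = f(c.z * T(1 / Colors.WP_DEFAULT.z))
    Lab{T}(muladd(116, fy, -16), 500 * (fx - fy), 200 * (fy - fz))
end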

@kimikage (Collaborator, Author)

Benchmark

using Colors, BenchmarkTools
rgb_f64 = rand(RGB{Float64}, 1000, 1000);
xyz_f64 = XYZ{Float64}.(rgb_f64);
xyz_f32 = XYZ{Float32}.(xyz_f64);
lab_f64 = Lab{Float64}.(xyz_f64);
lab_f32 = Lab{Float32}.(lab_f64);

function Colors.fxyz2lab(v) # force invalidation
    ka = oftype(v, 841 / 108) # (29/6)^2 / 3 = xyz_kappa / 116
    kb = oftype(v, 16 / 116) # 4/29
    v > oftype(v, Colors.xyz_epsilon) ? cbrt(v) : muladd(ka, v, kb)
end
julia> @btime convert.(XYZ, $lab_f64);
  7.433 ms (2 allocations: 22.89 MiB) # before
  5.433 ms (2 allocations: 22.89 MiB) # after

julia> @btime convert.(XYZ, $lab_f32);
  11.288 ms (2 allocations: 11.44 MiB) # before
  2.359 ms (2 allocations: 11.44 MiB) # after

julia> @btime convert.(Lab, $xyz_f64);
  31.515 ms (2 allocations: 22.89 MiB) # before
  29.489 ms (2 allocations: 22.89 MiB) # after

julia> @btime convert.(Lab, $xyz_f32);
  33.801 ms (2 allocations: 11.44 MiB) # before
  23.568 ms (2 allocations: 11.44 MiB) # after

codecov bot commented Jun 16, 2021

Codecov Report

Merging #486 (34d8a0c) into master (4645452) will increase coverage by 0.29%.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##           master     #486      +/-   ##
==========================================
+ Coverage   92.82%   93.12%   +0.29%     
==========================================
  Files           9        9              
  Lines        1004     1018      +14     
==========================================
+ Hits          932      948      +16     
+ Misses         72       70       -2     
Impacted Files       Coverage Δ
src/conversions.jl   99.68% <100.00%> (+0.67%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@johnnychen94 (Member) commented Jun 16, 2021

> I'm not sure of the reason, but I avoid mapc for now because it makes the precompilation problem worse.

Are there any issues related to this?

Edit:

Oh, I guess you mean you're doing a manual inline? 😄

@kimikage (Collaborator, Author)

> Oh, I guess you mean you're doing a manual inline? 😄

Right, that's exactly what you presumed.
The mapc for white-point scaling was manually inlined purely for the sake of constant folding. On the other hand, mapc(fxyz2lab, c) keeps invalidation from working properly. 😵
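
Concretely, the contrast is something like the following sketch (scale_mapc and scale_inlined are hypothetical names; two-argument mapc is ColorTypes' generic map over the components of two colors):

using Colors

# Via mapc: the white point participates as a runtime value, and mixing an
# XYZ{Float32} input with the XYZ{Float64} white point promotes the whole
# computation to Float64.
scale_mapc(c::XYZ, wp::XYZ) = mapc(/, c, wp)

# Manually inlined: each component is multiplied by a reciprocal already
# converted to the input's own type, which the compiler can fold into a
# constant.
scale_inlined(c::XYZ{T}) where {T} =
    XYZ{T}(c.x * T(1 / Colors.WP_DEFAULT.x),
           c.y * T(1 / Colors.WP_DEFAULT.y),
           c.z * T(1 / Colors.WP_DEFAULT.z))

The invalidation side effect of mapc(fxyz2lab, c) is a separate matter; the sketch only illustrates the constant-folding motivation for the manual inlining.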

Comment on lines +334 to +340
F = promote_type(T, eltype(c))

fy1 = c.l * F(0x1p-7)
fy2 = muladd(c.l, F(3 / 3712), F(16 / 116))
fy = fy1 + fy2 # (c.l + 16) / 116
fx = fy1 + muladd(c.a, F( 0x1p-9), muladd(c.a, F(3 / 64000), fy2)) # fy + c.a / 500
fz = fy1 + muladd(c.b, F(-0x1p-8), muladd(c.b, F(-7 / 6400), fy2)) # fy - c.b / 200
@kimikage (Collaborator, Author) commented Jun 16, 2021
As mentioned in #427 (comment), this avoidance of divisions is effective for the Lab-->XYZ conversion.

The accuracy is mostly equivalent to that of the division version in environments that support the FMA instruction.
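
For reference, the paired constants in the snippet above reconstruct the exact rational coefficients, which is easy to check with Julia's Rational type:

julia> 1//128 + 3//3712     # fy coefficient: 0x1p-7 + 3/3712 == 1/116
1//116

julia> 1//512 + 3//64000    # fx term: 0x1p-9 + 3/64000 == 1/500
1//500

julia> 1//256 + 7//6400     # fz term: 0x1p-8 + 7/6400 == 1/200
1//200

The power-of-two parts multiply exactly in binary floating point, and only the small residuals go through muladd, which is presumably why the accuracy stays so close to the division version when FMA is available.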

@kimikage (Collaborator, Author)

Note that the constant folding of WP_DEFAULT is also effective for Luv. I plan to submit a separate PR once PR #489 is merged. I am still tweaking the cube root for Float32.

@kimikage kimikage merged commit 12f95f5 into JuliaGraphics:master Jun 19, 2021
@kimikage kimikage deleted the xyz_lab branch June 19, 2021 02:45