
Move compile timing to inside @time's main timing block. Fixes >100% compilation time reports #41286

Merged: 4 commits merged into JuliaLang:master from the ib/time_tweak branch on Jul 19, 2021

Conversation

@IanButterworth (Member) commented Jun 20, 2021

Fixes #41281

It seems that the case in #41281 where @time reports >100% compilation time is fixed by pre-running the timing sampling lines within @time to ensure they are already compiled.

Perhaps, instead of this PR, there's some type instability or invalidation going on that could be fixed, given that @time 1+1 is already baked into the sysimage? Or perhaps this PR is a reasonable failsafe either way.

Master

julia> @time map(x -> 2x, 1:3);
  0.035300 seconds (52.03 k allocations: 3.258 MiB, 107.03% compilation time)

julia> @time map(x -> 2x, 1:3);
  0.028745 seconds (49.72 k allocations: 3.057 MiB, 99.36% compilation time)

julia> x = rand(2,2);

julia> @time x * x;
  0.651758 seconds (2.54 M allocations: 140.827 MiB, 5.07% gc time, 99.95% compilation time)

julia> @time x * x;
  0.000006 seconds (1 allocation: 112 bytes)

This PR

julia> @time map(x -> 2x, 1:3);
  0.032076 seconds (52.04 k allocations: 3.241 MiB, 98.54% compilation time)

julia> @time map(x -> 2x, 1:3);
  0.029221 seconds (49.72 k allocations: 3.057 MiB, 99.45% compilation time)

julia> x = rand(2,2);

julia> @time x * x;
  0.632906 seconds (2.54 M allocations: 140.827 MiB, 7.51% gc time, 99.94% compilation time)

julia> @time x * x;
  0.000005 seconds (1 allocation: 112 bytes)

If anyone has any other @time calls that result in >100%, I'd happily test them with this.
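For reference, here is a minimal hypothetical sketch of that idea (not the merged diff), written as it would sit inside base/timing.jl where the internal sampling helpers quoted in the review below are in scope; GC bookkeeping and report printing are elided.

```julia
# Hypothetical sketch only, assuming the internal helpers named in the
# base/timing.jl hunks quoted below; not the exact code in this PR.
macro time_sketch(ex)
    quote
        # Compile the compile-time samplers up front so their own first-call
        # compilation cannot be charged to the expression being timed.
        precompile(cumulative_compile_time_ns_before, ())
        precompile(cumulative_compile_time_ns_after, ())

        local stats = gc_num()
        local compile_elapsedtime = cumulative_compile_time_ns_before()
        local elapsedtime = time_ns()
        local val = $(esc(ex))
        elapsedtime = time_ns() - elapsedtime
        compile_elapsedtime = cumulative_compile_time_ns_after() - compile_elapsedtime
        # ... GC diff and time report printing elided ...
        val
    end
end
```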

base/timing.jl Outdated
@@ -207,6 +207,12 @@ macro time(ex)
local stats = gc_num()
local compile_elapsedtime = cumulative_compile_time_ns_before()
Member (reviewer):

I think the order of this and the next line should be swapped, since compile time should be a subset of total time. (same at the end of the block)
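In other words, something like this (a hypothetical fragment of the quoted macro body, mirroring the hunk above, not the exact merged diff), so the compile-time window is nested strictly inside the wall-clock window:

```julia
# hypothetical fragment: wall-clock sampling outermost, compile-time sampling nested inside
local stats = gc_num()
local elapsedtime = time_ns()                                    # start wall-clock timer first
local compile_elapsedtime = cumulative_compile_time_ns_before()  # then start compile timer
local val = $(esc(ex))
compile_elapsedtime = cumulative_compile_time_ns_after() - compile_elapsedtime  # stop compile timer first
elapsedtime = time_ns() - elapsedtime                            # then stop wall-clock timer
```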

IanButterworth (Member, Author):

I agree, but I didn't do that originally because adding the compile time lookups would add a small amount of time to the time report and would make short-duration things look slower. Maybe it's insignificant, and people should be using @btime for very fast things anyway?
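(@btime here is the macro from the separate BenchmarkTools.jl package, not part of Base; a minimal hypothetical usage:)

```julia
# Hypothetical usage; BenchmarkTools.jl is assumed to be installed.
using BenchmarkTools
@btime rand()  # runs many samples, so per-call measurement overhead and compilation are amortized
```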

@IanButterworth (Member, Author) commented Jul 14, 2021

I've made this change. It does seem like the right thing to do, and the overhead seems too small to register:

Master

julia> @time rand();
  0.000000 seconds

This PR

julia> @time rand();
  0.000000 seconds

base/timing.jl Outdated
@@ -207,6 +207,12 @@ macro time(ex)
local stats = gc_num()
local compile_elapsedtime = cumulative_compile_time_ns_before()
local elapsedtime = time_ns()
## ensure time samplers are compiled
compile_elapsedtime = cumulative_compile_time_ns_after() - compile_elapsedtime
Member (reviewer):

Would it be enough just to call cumulative_compile_time_ns_after? Looks like that is the only thing that might not have been compiled yet.

IanButterworth (Member, Author):

Good point. That does fix it.
I went with precompile instead of just calling them because cumulative_compile_time_ns_after disables timing on the thread, which I worried could complicate nested @time calls.

This is after the latest commit

julia> @time map(x -> 2x, 1:3);
  0.054205 seconds (48.70 k allocations: 2.691 MiB, 99.65% compilation time)

Member (reviewer):

With the other ordering, is precompiling even necessary?

IanButterworth (Member, Author):

You're right, I just tested and it's not. Removed!

@chriselrod (Contributor) commented:

Here's an example of >100% compile time:

julia> using Distributed

julia> split = Sys.free_memory() / ((1<<30) * Sys.CPU_THREADS) < 2
false

julia> add_per = split ? Sys.CPU_THREADS÷2 : Sys.CPU_THREADS
36

julia> addprocs(add_per, exeflags="--project=$(Base.active_project())");

julia> @everywhere begin
         @time using DiffEqFlux, StaticArrays, ProgressMeter, UnPack
         using DiffEqFlux: initial_params, sciml_train
       end
  0.000295 seconds (1.37 k allocations: 107.141 KiB, 826.61% compilation time)
      From worker 7:      0.000266 seconds (1.35 k allocations: 104.953 KiB, 938.51% compilation time)
      From worker 17:     0.000272 seconds (1.35 k allocations: 104.953 KiB, 913.14% compilation time)
      From worker 19:     0.000262 seconds (1.35 k allocations: 104.953 KiB, 936.60% compilation time)
      From worker 24:     0.000261 seconds (1.35 k allocations: 104.953 KiB, 941.09% compilation time)
      From worker 26:     0.000255 seconds (1.35 k allocations: 104.953 KiB, 962.57% compilation time)
      From worker 9:      0.000258 seconds (1.35 k allocations: 104.953 KiB, 1004.29% compilation time)
      From worker 29:     0.000255 seconds (1.35 k allocations: 104.953 KiB, 966.03% compilation time)
      From worker 31:     0.000265 seconds (1.35 k allocations: 104.953 KiB, 930.41% compilation time)
      From worker 23:     0.000265 seconds (1.35 k allocations: 104.953 KiB, 929.11% compilation time)
      From worker 15:     0.000262 seconds (1.35 k allocations: 104.953 KiB, 947.68% compilation time)
      From worker 14:     0.000262 seconds (1.35 k allocations: 104.953 KiB, 951.89% compilation time)
      From worker 22:     0.000264 seconds (1.35 k allocations: 104.953 KiB, 938.09% compilation time)
      From worker 13:     0.000266 seconds (1.35 k allocations: 104.953 KiB, 935.63% compilation time)
      From worker 8:      0.000241 seconds (1.35 k allocations: 104.953 KiB, 1043.94% compilation time)
      From worker 21:     0.000240 seconds (1.35 k allocations: 104.953 KiB, 1040.41% compilation time)
      From worker 10:     0.000232 seconds (1.35 k allocations: 104.953 KiB, 1073.77% compilation time)
      From worker 2:      0.000252 seconds (1.35 k allocations: 104.953 KiB, 1023.88% compilation time)
      From worker 3:      0.000258 seconds (1.35 k allocations: 104.953 KiB, 990.59% compilation time)
      From worker 25:     0.000251 seconds (1.35 k allocations: 104.953 KiB, 1014.71% compilation time)
      From worker 30:     0.000246 seconds (1.35 k allocations: 104.953 KiB, 1032.69% compilation time)
      From worker 5:      0.000272 seconds (1.35 k allocations: 104.953 KiB, 914.95% compilation time)
      From worker 20:     0.000263 seconds (1.35 k allocations: 104.953 KiB, 968.01% compilation time)
      From worker 11:     0.000261 seconds (1.35 k allocations: 104.953 KiB, 956.71% compilation time)
      From worker 18:     0.000262 seconds (1.35 k allocations: 104.953 KiB, 953.40% compilation time)
      From worker 4:      0.000272 seconds (1.35 k allocations: 104.953 KiB, 914.89% compilation time)
      From worker 33:     0.000241 seconds (1.35 k allocations: 104.953 KiB, 1043.96% compilation time)
      From worker 27:     0.000244 seconds (1.35 k allocations: 104.953 KiB, 1075.19% compilation time)
      From worker 34:     0.000239 seconds (1.35 k allocations: 104.953 KiB, 1072.98% compilation time)
      From worker 12:     0.000260 seconds (1.35 k allocations: 104.953 KiB, 984.52% compilation time)
      From worker 32:     0.000248 seconds (1.35 k allocations: 104.953 KiB, 1015.97% compilation time)
      From worker 36:     0.000234 seconds (1.35 k allocations: 104.953 KiB, 1133.23% compilation time)
      From worker 28:     0.000228 seconds (1.35 k allocations: 104.953 KiB, 1149.49% compilation time)
      From worker 35:     0.000234 seconds (1.35 k allocations: 104.953 KiB, 1149.36% compilation time)
      From worker 16:     0.000255 seconds (1.35 k allocations: 104.953 KiB, 1025.12% compilation time)
      From worker 6:      0.000249 seconds (1.35 k allocations: 104.953 KiB, 1031.25% compilation time)
      From worker 37:     0.000243 seconds (1.35 k allocations: 104.953 KiB, 1058.20% compilation time)

julia> versioninfo()
Julia Version 1.6.2-pre.2
Commit ff1827d117* (2021-05-02 02:37 UTC)
Platform Info:
  OS: Linux (x86_64-generic-linux)
  CPU: Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-11.0.1 (ORCJIT, cascadelake)
Environment:
  JULIA_NUM_THREADS = 36

@IanButterworth (Member, Author) commented:

@chriselrod that seems like a different issue, perhaps Distributed-related. Could different processes be using the same thread-specific compile timers? I didn't think so, but I don't have another explanation.

This PR is more about the slightly-above-100% reports.

@IanButterworth (Member, Author) commented:

@JeffBezanson I think this is RTM?

@IanButterworth added the backport 1.7 and backport 1.6 (Change should be backported to release-1.6) labels on Jul 14, 2021
@IanButterworth (Member, Author) commented:

Bump

@IanButterworth changed the title from "Ensure @time's inner timing operations are compiled. Fixes >100% compilation time reports" to "Move compile timing to inside @time's main timing block. Fixes >100% compilation time reports" on Jul 16, 2021
@KristofferC merged commit 0dbd3f7 into JuliaLang:master on Jul 19, 2021
KristofferC pushed commits that referenced this pull request on Jul 19, Jul 20, Jul 26, Aug 31, and Sep 3, 2021, each with the message:

…% compilation time reports (#41286)

* ensure `@time`'s inner timing functions are compiled

(cherry picked from commit 0dbd3f7)
@KristofferC removed the backport 1.6 (Change should be backported to release-1.6) label on Sep 7, 2021
@IanButterworth deleted the ib/time_tweak branch on March 4, 2022 at 04:48
staticfloat pushed a commit that referenced this pull request on Dec 23, 2022, with the same cherry-picked message.

Successfully merging this pull request may close these issues: Greater than 100% compilation time reported by @time (#41281).
4 participants