Question and Proposal: Allow single-run execution of precompile file #486

NHDaly · 2021-01-02T23:47:09Z

Hello! This is a combo question and proposal, where I am hoping to reduce the time it takes to build a statically compiled sysimg or executable, by avoiding compiling everything twice, as we currently do.

My understanding of the current process of static compilation (as run from, say, PackageCompiler.create_app()) includes at least the following steps:

1. Run the provided precompile script(s) with --trace-compile=tmpname(), to record as a string representation what functions+types to compile when building the sysimg. This produces one or more files containing several precompile(foo, (Bar, Baz)) statements.
2. Start a new julia session with --output-o, import the Package(s) being compiled, and then load and execute the precompile statements generated by the precompile scripts and provided in any precompile_statements_file(s).
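
To make the first step above concrete, here is a minimal sketch of recording precompile statements with --trace-compile. The workload and the function `f` are hypothetical stand-ins for a real precompile_execution_file, and this is not PackageCompiler's actual implementation, only an illustration of the flag:

```julia
# Hypothetical workload standing in for a precompile_execution_file.
workload = """
f(x) = x + one(x)
f(1)
f(2.0)
"""

trace_file = tempname()

# Run the workload in a child process. Each method instance that gets newly
# compiled there is written to trace_file as a textual precompile statement.
run(`$(Base.julia_cmd()) --startup-file=no --trace-compile=$trace_file -e $workload`)

# The file is the only thing carried over to the second step.
statements = isfile(trace_file) ? readlines(trace_file) : String[]
foreach(println, statements)
```

The second step would then start a fresh process, load the same code, and evaluate these lines while --output-o is active, which is where the compilation cost is paid a second time.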

My concern is that this means that in order to produce a statically compiled binary, we have to pay for compilation latency twice for every function compiled from the precompile_execution_file.

My question is whether we could (maybe optionally) combine these into a single step, running the precompile_execution_file with --output-o directly, and then also loading and executing any provided precompile_statements_files in the same session.

Could you help me understand why this is currently performed in two separate steps, and what we might need to do in order to combine these into one step?

Some problems that we have from the current setup:

- It's slow: we have to pay for compiling all the code twice. For our software this currently takes >1 hour, and we aren't snooping as much as we would like. I'd like us to snoop much more, but we're afraid of making our CI build times too long.
- Since we currently write every method instance to disk as a string and then read it back in a new process, there are currently bugs that cause us to drop some functions.
  - For more details, see:
    - Precompile statement emitter sometimes emits invalid statements julia#28808 (comment)
    - Log precompile _failures_ as well as errors. Pass log level through to helper process. #457
  - If we were to avoid this round-trip through a text file, we could get much better recall via static compilation. Currently around 3,000 of our 15,000 precompile statements are not actually working in our build (😢), and I hope avoiding the round trip could help?
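
As a rough illustration of how statements get lost in the text round trip, the sketch below replays a list of textual precompile statements and counts the ones that do not take effect. The helper `replay_statements` and the sample statements are hypothetical; the only real APIs used are Meta.parse, Core.eval and precompile, which returns true when compilation succeeded:

```julia
# Replay textual precompile statements in module `target` and collect the
# ones that fail to parse, fail to resolve, or for which precompile is not true.
function replay_statements(lines; target = Main)
    ok = 0
    failed = String[]
    for line in lines
        success = try
            Core.eval(target, Meta.parse(line)) === true
        catch
            false  # invalid syntax, or a name that does not resolve in this process
        end
        if success
            ok += 1
        else
            push!(failed, line)
        end
    end
    return ok, failed
end

# Hypothetical statements: one valid, one naming a missing function, one truncated.
statements = [
    "precompile(Tuple{typeof(Base.sum), Vector{Float64}})",
    "precompile(Tuple{typeof(Base.no_such_function_xyz), Int})",
    "precompile(Tuple{typeof(Base.sum), Vector{",
]

ok, failed = replay_statements(statements)
println("succeeded: ", ok, ", failed: ", length(failed))
```

Logging the `failed` list for a real build is one way to see which statements the emitter or the replay step is dropping.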

Some reasons that I can imagine that might motivate why we currently do this in two steps include:
A) In order to avoid method invalidations, perhaps we want to ensure that we aren't eval'ing new definitions (by loading new packages for example) during execution of the process running with --output-o?
- My thinking is that if we were to run --output-o during the main process, we might accidentally invalidate some of the functions we mean to statically compile after we've emitted them by loading some new package halfway through the precompilation script, and then I don't know how --output-o would handle that. Would that cause problems?
B) Perhaps running with --output-o makes julia significantly slower, to the point where it might be faster overall to run once without that flag, record the results to disk, and then run again with the flag, only performing the output? But this seems dubious to me when the compilation itself is the bottleneck in a precompilation script (which it hopefully should be, for a well-written precompilation script).
C) A precompilation script may load other packages in order to trigger all the compilations desired to be statically compiled, but we don't necessarily want to precompile the functions from those other packages.
- To solve this case, I imagine that we could perhaps update julia's --output-o flag to take a list of top-level module names from which to emit object code, and it could ignore methods and/or types coming from outside that list? That should replicate the current behavior.
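
The module allow-list idea in (C) can be approximated at the text level today. This is only a sketch under stated assumptions: `filter_by_module` and the module names MyApp and HelperPkg are hypothetical, it is not a feature of julia's --output-o flag, and it looks only at the module of the called function, not at the argument types:

```julia
# Keep only the precompile statements whose called function is defined in
# one of the allowed top-level modules.
function filter_by_module(lines, allowed)
    keep = String[]
    for line in lines
        m = match(r"typeof\((\w+)\.", line)
        if m !== nothing && m.captures[1] in allowed
            push!(keep, line)
        end
    end
    return keep
end

# Hypothetical recorded statements from a precompile run.
recorded = [
    "precompile(Tuple{typeof(MyApp.run), Int64})",
    "precompile(Tuple{typeof(Base.sum), Vector{Float64}})",
    "precompile(Tuple{typeof(HelperPkg.load), String})",
]

kept = filter_by_module(recorded, ["MyApp"])
foreach(println, kept)
```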

I'm very interested to hear if there are other things I'm missing! :) Sorry if this is rehashing old discussions; I haven't been able to find anything on this when searching.

If this doesn't make sense all the time, perhaps we could support it with a flag, or something?
Anyway, thanks for your time!
Happy 2021!

NHDaly · 2021-01-02T23:47:42Z

CC: @omus, since I know you all are statically compiling some fairly large binaries at Invenia, as well :) 👍

KristofferC · 2021-01-05T13:14:22Z

One issue is that __init__ methods for modules are not run during --output-o, so packages act weirdly in that process. That was a major source of issues with the old PackageCompiler. Another is that you don't really want to carry over any state from the precompile process to the created sysimage. For example, sometimes people leave dangling tasks, which then causes sysimage creation to fail, since tasks cannot be serialized.

NHDaly · 2021-01-11T04:58:44Z

I see. Those are all good points. Thanks, that's quite helpful to understand! :)

NHDaly · 2021-01-14T19:31:49Z

Does this seem like the kind of thing where you could imagine having a flag to control this behavior? We're likely going to get to the point where our precompilation build takes on the order of 2-6 hours, depending on how many data structure specializations we decide to warm up, and it would be way nice to cut this time in half.

Are those problems things we could detect and explain to the user so that they're fixable on the user's side?

Thanks for engaging on this.

NHDaly mentioned this issue Apr 12, 2021

Faster incremental sysimg rebuilds JuliaLang/julia#40414

Closed