Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dune support #1687

Merged
merged 1 commit into from
Jan 1, 2023
Merged

Dune support #1687

merged 1 commit into from
Jan 1, 2023

Conversation

Alizter
Copy link
Collaborator

@Alizter Alizter commented Dec 31, 2022

We switch the opam job to use Dune to build the library. The opam file is now generated by Dune. We also run coqchk in the opam job.

For regular users of make, nothing should change. I plan to switch completely to Dune in the future, but for that I need to do some more work to adapt our scripts.

This also means that if you have Dune >= 3.5 you can build the library using dune build and run tests (just coqchk for now) with dune build @runtest. Dune has a number of advantages as a build system over make such as the ability to compose projects easily (i.e. plop coq and HoTT in the same workspace), cache build commands, etc. Dune doesn't work on timestamps like make but rather the hashed contents (digests) of the contents of the workspace. The build artifacts appear in _build/default allowing for a clean separation from the sources. Editors should be able to work with Dune as I have patched _CoqProject to take into account the _build directory. This might only work if the editor is started in the HoTT directory however.

@Alizter Alizter mentioned this pull request Dec 31, 2022
@Alizter Alizter force-pushed the ps/rr/dune_support branch 2 times, most recently from 81ec160 to 5c9b714 Compare December 31, 2022 16:17
<!-- ps-id: 954ea380-2db6-46f1-b33c-6f3384074297 -->

Signed-off-by: Ali Caglayan <alizter@gmail.com>
@jdchristensen
Copy link
Collaborator

I tested this and it works for me. I did notice that building with dune is ~5% slower than building with make with -j8 and about 8% slower with -j1. Not sure why this would be. But since this PR doesn't eliminate the use of make, that's ok.

@Alizter
Copy link
Collaborator Author

Alizter commented Jan 1, 2023

Very well, merging this now. I will investigate the performance issues, but my suspicion is the calls to coqdep. Unfortunately coqdep is not a very lean program and does lots of questionable and slow things. I have a private experimental branch of Dune which is smart enough to resolve dependencies internally. From initial tests, it gives a vast speedup over the coqdep based build. Until we can plug this 10% gap in performance however, I am not comfortable fully switching to Dune.

@Alizter Alizter merged commit d1072e4 into HoTT:master Jan 1, 2023
@Alizter Alizter deleted the ps/rr/dune_support branch January 1, 2023 20:27
jarlg pushed a commit to jarlg/HoTT that referenced this pull request Feb 7, 2023
<!-- ps-id: 954ea380-2db6-46f1-b33c-6f3384074297 -->

Signed-off-by: Ali Caglayan <alizter@gmail.com>
@Alizter
Copy link
Collaborator Author

Alizter commented Feb 26, 2023

@jdchristensen Would you be able to test with Dune 3.7? I fixed the slow calls of coqdep (they are now called once per coq.theory). There should be no observable difference between the Dune and Makefile at this point.

@jdchristensen
Copy link
Collaborator

I tested building Coq-HoTT commit 8d1c6a7 with Dune 3.7.0 vs GNU Make 4.3. With -j1, the time was almost identical, with dune being 0.3% faster (which is probably in the noise). With -j8, dune was 5% slower. I repeated the -j8 timings and they were stable. So there's been a big improvement to dune's -j1 build speed.

The parallel build time is very sensitive to the build order, since dependencies can prevent 8 jobs from being run at once. I know make is doing things suboptimally, but it's likely that dune by chance is choosing an even worse order. Does dune use any heuristics to decide which of the possible targets to build next? One possible heuristic is to prefer targets with long chains of dependencies relying on them. One could also imagine a build system that records timing info for each build and uses it when scheduling the jobs in the next run.

An example illustrating what I mean is at https://lists.gnu.org/archive/html/bug-make/2009-12/msg00023.html

(BTW, I ran dune with --display=quiet and with stdout and stderr redirected to /dev/null. If you let it write to the screen in the usual way, that slows down the -j8 build by an additional 7%. I also ran make with stdout and stderr redirected to /dev/null.)

@jdchristensen
Copy link
Collaborator

Another issue that might contribute to this is that the dune process uses a few seconds of cpu time. With -j1 that might not slow down the build, because it could happen on a different core. But when all cores are in use, it would slow down the build. make uses 0s of cpu time, to the nearest second. Most of the cpu time dune uses happens in the first two seconds, and then more gradually accumulates. Maybe some inner loops can be optimized? Does ocaml support profiling?

@ejgallego
Copy link
Contributor

Thanks a lot for the measurements @jdchristensen , I think there may have been some regression in Dune in some of the last recent releases, the overheard you are witnessing should not be happening, and I have noticed it in my own projects as well.

We are investigating more.

@ejgallego
Copy link
Contributor

@Alizter points to some problems with the display setup in dune, an interesting option to try too is --no-buffer; dune by defaults buffers output of processes as to avoid races with terminal output.

@jdchristensen
Copy link
Collaborator

I tried with --no-buffer, --display=quiet and -j8, and there was no difference to the timing or the overhead.

I did a bit more investigating, and some of the slowness of dune is definitely at the start. If I do touch start; make -j8 ... and touch start; dune build -j8 ..., and then check the precise timestamps on the start file and the first four files to be built, I get the following:

make:

14:15:27.79 start
14:15:28.68 file1
14:15:28.72 file2
14:15:28.79 file3
14:15:28.93 file4

dune:

14:16:05.30 start
14:16:06.20 file1
14:16:08.10 file2
14:16:08.16 file3
14:16:08.24 file4

Both dune and make get the first file built in about 0.9s. But then dune waits almost 2s before doing the second file, while make gets the second file done within 0.04s of the first file. (The second file is very short.) Each of these files depends on the previous one, so there is no parallelization at this stage. I repeated the timings to make sure they are reproducible.

Since the entire build takes ~40s with make and ~42s with dune, that 2s explains the 5% difference.

With -j1, the behaviour is different. It takes almost 3s for file1 to be built, and then file2 and the others quickly follow. So there is still an extra 2s, but it is before file1 rather than after it.

@jdchristensen
Copy link
Collaborator

I wonder if the delay at the start has something to do with dune having 2770 targets to process, even though there are only 553 .v files. In the first second or two, it jumps up to 1693/2770 done.

On another note, I was hoping that if I made an innocuous change to a file, e.g. just changing a comment, then after it got rebuilt, the hash of its output file would be unchanged, so that its dependencies wouldn't need to be rebuilt. But I'm not observing that. Is this because Coq build artefacts include some information about character positions in a file, or something like that? Even so, I would have hoped that second-generation dependencies then wouldn't need to be built, but they do.

@ejgallego
Copy link
Contributor

Thanks for all the testing @jdchristensen , a few comments:

  • indeed the delay could be due to newer dune being lazier in discovering the build targets; this can be beneficial for some setups, but we need to understand why the overhead seems to have increased for small projects like this. The number of targets HoTT has is not large for Dune
  • regarding the rebuild, that's an interesting problem, and we know about it. There is discussion upstream about what to do about this, newer projects like coq-lsp will try to depend not on the hash of the whole .vo file but only in the hash of the required parts. Where to put location information is indeed a tricky issue, putting them in the .vo files has some advantages, but some downsides. Feel free to open a bug upstream so at least we are aware of your particular case so we are aware of it while discussion how to make Coq builds more incremental.

@jdchristensen
Copy link
Collaborator

I wasn't mentioning 2770 as a large number in absolute terms. But it is almost five times larger than the actual number of source files, so I wondered if there is some repetition causing an issue, especially since around 60% of the targets get handled during the 2s delay at the start. @Alizter probably knows where these phantom targets are coming from.

(Also, @Alizter : I noticed that etc/generate_coqproject.sh doesn't realize we are in a git repo when the build is done using dune, probably because we have cd'd into a subdir. Maybe we should check the return value of git rev-parse --is-inside-work-tree > /dev/null 2>&1?)

Thanks for the other info. I would be really excited if incremental builds managed to avoid building certain things that aren't affected by the changes. Even if just whitespace / comment changes were ignored, that would be helpful. ... Ah, I see that if I change a comment without changing its length, no dependencies get rebuilt. So it does seem like it must be location information that is causing the issue, and certainly dependencies don't need to be rebuilt if only character positions in the file change.

@ejgallego
Copy link
Contributor

The number of targets is large because for each a.v file we generate 4 files: a.v a.vo a.vos a.glob (the source is also a target as it gets copied to the build dir) But indeed, that is not an issue, in the sense that Dune setups a number of rules 1:1 to the number of compiled modules.

Thanks for the other info. I would be really excited if incremental builds managed to avoid building certain things that aren't affected by the changes. Even if just whitespace / comment changes were ignored, that would be helpful. ... Ah, I see that if I change a comment without changing its length, no dependencies get rebuilt. So it does seem like it must be location information that is causing the issue, and certainly dependencies don't need to be rebuilt if only character positions in the file change.

Yes, that'd be great, that kind of increental setup already works with coq-lsp: it will ignore whitespace changes, and a few of other common operations, in addition of having an intra-document cache.

But indeed at some point we should be able to do that for whole projects (coq-lsp will only work inside the same file); please don't hesitate to open bugs upstream to signal these kind of issues.

As of today Coq embeds very little location info into .vo files so it'd be good to know which particular part of the system is causing your issue.

@jdchristensen
Copy link
Collaborator

Ok, I see that in many cases, Coq will produce a bitwise identical .vo file, even if you use different tactics to produce the same proof term. This is great, and makes me a full convert to dune. I also managed to isolate some lines that caused this issue in my testing. I happened to test using Basics/Overture.v, which has lines that cause the misbehaviour. I opened coq/coq#17312 with specific examples.

@ejgallego
Copy link
Contributor

Thanks for the report, indeed it is important to understand how much of these cases happens as we are in the process of maybe adding some more metadata (such as documentation or location) to .vo files.

@jdchristensen
Copy link
Collaborator

@ejgallego By the way, I can't find anywhere in the dune documentation that it mentions that rebuilding is determined by the contents of the dependencies rather than their timestamps. This is a fundamental difference (and huge improvement) from what most people are used to, so I would expect it to be mentioned in the very first page (Overview) as well as in other pages. It's also not in the "dune-build" man page / dune build --help output. In particular, I can't find any information on how dune determines whether there are changes. Is it keeping a database of hashes somewhere? Or is it comparing the files in the source tree to the copies in the _build folder? If it is computing hashes of all files and build artifacts, I wonder whether that contributes to the 2s delay at startup.

For others listening in, I have to also advertise the caching feature of dune (which you have to enable in your config). When you move around in history, make copies of your tree and rebuild, or make a change and then undo it, your next build can be almost instantaneous if most parts have been built before.

@ejgallego
Copy link
Contributor

ejgallego commented Mar 3, 2023

@jdchristensen indeed it seems documentation is non-existent on when rules are rebuilt.

While I guess Dune developers didn't want to expose too much of the internals to users (as to for example allow them some room to change how rules are re-executed) I agree that documentation should say something about it.

It is agreed upon that Dune documentation could use a lot of improvement.

Would you mind opening a bug upstream https://github.com/ocaml/dune ?

As to how rules are reexecuted, as of today that's a mix of factors, basically dune sees rules as pure functions, where the input is all the dependencies that are tracked, then the rule itself is hashes, so things are reexcuted when the rules change, or the dependencies of the rule are changed, this includes the commands involved in the rule action.

@ejgallego
Copy link
Contributor

Indeed the cache is great, and at some point people how to extend the caching to be more distributed so effectively you could checkout a HoTT branch and pull the build from a network cache. Caching is nice as thanks to Dune architecture all you have to do is to check if a rule hash is already in the cached db.

By the way the 2sec delay likely comes from the rule analysis itself, but indeed it has to be some bad codepath due to newer Memoization tech, we are investigating.

@Alizter
Copy link
Collaborator Author

Alizter commented Mar 3, 2023

I mentioned the lack of documentation to the other dune devs by the way. It was agreed that we could probably have a high level sketch about how Dune works.

@Alizter
Copy link
Collaborator Author

Alizter commented Mar 24, 2023

I did some more detailed benchmarks yesterday using a tool called https://github.com/sharkdp/hyperfine. I tested make, dune 3.7 and an experimental version of dune using a different hashing algorithm ocaml/dune#7392 (let's call it xxH dune).

We ran each builder at different levels of parallelism from 1 to 20 in steps of 3. They all seemed to agree that 7 was optimal, so I will only summarise those results (7 isn't special here, it could easily be 8 or 9).

Here is a summary of the results:

Builder make dune xxH dune
Mean (s) 65.519 68.799 66.285
Standard deviation σ 0.325 0.292 0.222
Min 65.068 68.450 66.041
Max 66.031 69.175 66.728

Dune serialises data into a string (using OCaml marshalling) and then hashes it. So optimising the hashing of rules has two critical points. The hashing algorithm in dune today is md5. The xxH hashing algorithm is extremely fast which is evident from how much it was able to catch up to make. I suspect if work is done to avoid marshalling data, then we could probably even overtake make.

#!/bin/sh

hyperfine --prepare 'make clean' --parameter-scan num_threads 1 20 -D 3 'make -j {num_threads}'
git clean -xffdq
hyperfine --prepare 'dune clean' --parameter-scan num_threads 1 20 -D 3 'dune build @install --cache disabled -j {num_threads}'
hyperfine --prepare 'dune clean' --parameter-scan num_threads 1 20 -D 3 '~/dune/_build/install/default/bin/dune build @install --cache disabled -j {num_threads}'

And here is what hyperfine did overnight:

Raw command output
Benchmark 1: make -j 1
  Time (mean ± σ):     183.352 s ±  0.585 s    [User: 145.733 s, System: 37.385 s]
  Range (min … max):   182.592 s … 184.393 s    10 runs

Benchmark 2: make -j 4
  Time (mean ± σ):     71.107 s ±  0.237 s    [User: 219.216 s, System: 56.129 s]
  Range (min … max):   70.575 s … 71.357 s    10 runs

Benchmark 3: make -j 7
  Time (mean ± σ):     65.519 s ±  0.325 s    [User: 321.863 s, System: 79.186 s]
  Range (min … max):   65.068 s … 66.031 s    10 runs

Benchmark 4: make -j 10
  Time (mean ± σ):     68.299 s ±  0.223 s    [User: 438.104 s, System: 101.886 s]
  Range (min … max):   67.994 s … 68.617 s    10 runs

Benchmark 5: make -j 13
  Time (mean ± σ):     68.423 s ±  0.202 s    [User: 525.767 s, System: 122.536 s]
  Range (min … max):   68.105 s … 68.716 s    10 runs

Benchmark 6: make -j 16
  Time (mean ± σ):     68.872 s ±  0.233 s    [User: 595.704 s, System: 138.352 s]
  Range (min … max):   68.546 s … 69.385 s    10 runs

Benchmark 7: make -j 19
  Time (mean ± σ):     68.719 s ±  0.082 s    [User: 638.164 s, System: 146.497 s]
  Range (min … max):   68.565 s … 68.816 s    10 runs

Summary
  'make -j 7' ran
    1.04 ± 0.01 times faster than 'make -j 10'
    1.04 ± 0.01 times faster than 'make -j 13'
    1.05 ± 0.01 times faster than 'make -j 19'
    1.05 ± 0.01 times faster than 'make -j 16'
    1.09 ± 0.01 times faster than 'make -j 4'
    2.80 ± 0.02 times faster than 'make -j 1'
Benchmark 1: dune build @install --cache disabled -j 1
  Time (mean ± σ):     200.104 s ±  1.583 s    [User: 152.155 s, System: 47.968 s]
  Range (min … max):   199.152 s … 204.321 s    10 runs

  Warning: The first benchmarking run for this command was significantly slower than the rest (204.321 s). This could be caused by (filesystem) caches that were not filled until after the first run. You are already using the '--prepare' option which can be used to clear caches. If you did not use a cache-clearing command with '--prepare', you can either try that or consider using the '--warmup' option to fill those caches before the actual benchmark.

Benchmark 2: dune build @install --cache disabled -j 4
  Time (mean ± σ):     76.585 s ±  0.444 s    [User: 229.656 s, System: 69.537 s]
  Range (min … max):   75.908 s … 77.255 s    10 runs

Benchmark 3: dune build @install --cache disabled -j 7
  Time (mean ± σ):     68.799 s ±  0.292 s    [User: 327.959 s, System: 94.780 s]
  Range (min … max):   68.450 s … 69.175 s    10 runs

Benchmark 4: dune build @install --cache disabled -j 10
  Time (mean ± σ):     71.071 s ±  0.333 s    [User: 450.719 s, System: 119.391 s]
  Range (min … max):   70.608 s … 71.672 s    10 runs

Benchmark 5: dune build @install --cache disabled -j 13
  Time (mean ± σ):     70.900 s ±  0.281 s    [User: 539.915 s, System: 143.434 s]
  Range (min … max):   70.534 s … 71.318 s    10 runs

Benchmark 6: dune build @install --cache disabled -j 16
  Time (mean ± σ):     70.976 s ±  0.207 s    [User: 609.650 s, System: 162.421 s]
  Range (min … max):   70.599 s … 71.225 s    10 runs

Benchmark 7: dune build @install --cache disabled -j 19
  Time (mean ± σ):     71.065 s ±  0.192 s    [User: 660.059 s, System: 177.993 s]
  Range (min … max):   70.850 s … 71.366 s    10 runs

Summary
  'dune build @install --cache disabled -j 7' ran
    1.03 ± 0.01 times faster than 'dune build @install --cache disabled -j 13'
    1.03 ± 0.01 times faster than 'dune build @install --cache disabled -j 16'
    1.03 ± 0.01 times faster than 'dune build @install --cache disabled -j 19'
    1.03 ± 0.01 times faster than 'dune build @install --cache disabled -j 10'
    1.11 ± 0.01 times faster than 'dune build @install --cache disabled -j 4'
    2.91 ± 0.03 times faster than 'dune build @install --cache disabled -j 1'
Benchmark 1: ~/dune/_build/install/default/bin/dune build @install --cache disabled -j 1
  Time (mean ± σ):     181.466 s ±  0.987 s    [User: 144.700 s, System: 36.753 s]
  Range (min … max):   180.778 s … 184.128 s    10 runs

Benchmark 2: ~/dune/_build/install/default/bin/dune build @install --cache disabled -j 4
  Time (mean ± σ):     71.918 s ±  0.512 s    [User: 219.390 s, System: 55.240 s]
  Range (min … max):   71.223 s … 72.698 s    10 runs

Benchmark 3: ~/dune/_build/install/default/bin/dune build @install --cache disabled -j 7
  Time (mean ± σ):     66.285 s ±  0.222 s    [User: 315.945 s, System: 77.856 s]
  Range (min … max):   66.041 s … 66.728 s    10 runs

Benchmark 4: ~/dune/_build/install/default/bin/dune build @install --cache disabled -j 10
  Time (mean ± σ):     68.791 s ±  0.343 s    [User: 434.495 s, System: 98.365 s]
  Range (min … max):   68.435 s … 69.435 s    10 runs

Benchmark 5: ~/dune/_build/install/default/bin/dune build @install --cache disabled -j 13
  Time (mean ± σ):     69.613 s ±  0.190 s    [User: 522.590 s, System: 120.124 s]
  Range (min … max):   69.368 s … 69.959 s    10 runs

Benchmark 6: ~/dune/_build/install/default/bin/dune build @install --cache disabled -j 16
  Time (mean ± σ):     69.775 s ±  0.209 s    [User: 587.146 s, System: 135.056 s]
  Range (min … max):   69.539 s … 70.064 s    10 runs

Benchmark 7: ~/dune/_build/install/default/bin/dune build @install --cache disabled -j 19
  Time (mean ± σ):     69.795 s ±  0.584 s    [User: 629.852 s, System: 142.853 s]
  Range (min … max):   69.130 s … 71.323 s    10 runs

Summary
  '~/dune/_build/install/default/bin/dune build @install --cache disabled -j 7' ran
    1.04 ± 0.01 times faster than '~/dune/_build/install/default/bin/dune build @install --cache disabled -j 10'
    1.05 ± 0.00 times faster than '~/dune/_build/install/default/bin/dune build @install --cache disabled -j 13'
    1.05 ± 0.00 times faster than '~/dune/_build/install/default/bin/dune build @install --cache disabled -j 16'
    1.05 ± 0.01 times faster than '~/dune/_build/install/default/bin/dune build @install --cache disabled -j 19'
    1.08 ± 0.01 times faster than '~/dune/_build/install/default/bin/dune build @install --cache disabled -j 4'
    2.74 ± 0.02 times faster than '~/dune/_build/install/default/bin/dune build @install --cache disabled -j 1'

real303m58.545s
user1451m54.891s
sys361m6.161s

jarlg pushed a commit to jarlg/HoTT that referenced this pull request May 2, 2023
<!-- ps-id: 954ea380-2db6-46f1-b33c-6f3384074297 -->

Signed-off-by: Ali Caglayan <alizter@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants