Improved function arity analysis #1397

vouillon · 2023-02-01T14:19:29Z

Addresses #594

The analysis is a bit costly, so we want to see whether this should be enabled by default.

There is also the option of a coarser analysis, which propagates information about what functions return but not through function parameters.
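To make the distinction concrete, here is a small JavaScript sketch of the two ways a function value can reach a call site. It is only an illustration: `call2`, `add`, `makeAdder`, `applyToGuarded`, and `applyToDirect` are hypothetical names, and `call2` is a simplified stand-in for an arity-checked call helper, not the runtime's actual code. Going by the description above, a return-only analysis could plausibly resolve the first call site, while the second needs information propagated through parameters.

```javascript
// Simplified, hypothetical arity-checked call (assumption: not the real runtime helper).
function call2(f, a0, a1) {
  if (f.length === 2) return f(a0, a1); // exact application
  if (f.length === 1) return f(a0)(a1); // unary function that returns a function
  throw new Error("arity not handled in this sketch");
}

function add(x, y) { return x + y; }

// Case 1: the function flows out through a return value.
function makeAdder() { return add; }
const viaReturnGuarded = call2(makeAdder(), 1, 2);
const viaReturnDirect = makeAdder()(1, 2); // valid once the arity is known to be 2

// Case 2: the function flows in through a parameter.
function applyToGuarded(f, x) { return call2(f, x, 10); }
// Valid only if every caller is known to pass a two-argument function.
function applyToDirect(f, x) { return f(x, 10); }

const viaParamGuarded = applyToGuarded(add, 5);
const viaParamDirect = applyToDirect(add, 5);
```

In both cases the direct form skips the arity check, which is only sound when the analysis has established the arity of every function that can reach the call site.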

currently	faster analysis	full analysis
5.24 sec	5.19 sec: 1.01x faster	5.11 sec: 1.03x faster

hhugo · 2023-02-02T21:52:49Z

@vouillon, I've rebased the branch and force-pushed after merging #1384

hhugo · 2023-02-03T13:00:02Z

I don't know how to read the timing in the description. What is it about? Why is the "full analysis" faster than the "faster" one?

vouillon · 2023-02-03T14:52:40Z

> I don't know how to read the timing in the description. What is it about? Why is the "full analysis" faster than the "faster" one?

This is the timing of the generated code, when executing the following command:

node ocamlc.js -c ~/js_of_ocaml/benchmarks/sources/ml/*.ml

So, a more precise analysis yields a significant gain.

Compiling ocamlc.byte takes about 12s. The faster analysis adds about 0.35s. The full one adds about 0.8s.
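As a quick sanity check of these figures, the ratios can be recomputed from the numbers quoted in this thread. The sketch below assumes the approximate values given above (the "about 12s" compile time in particular is a round figure), and the variable names are illustrative.

```javascript
// Figures quoted in this thread, in seconds.
const runtime = { current: 5.24, faster: 5.19, full: 5.11 };
const compile = { baseline: 12, fasterExtra: 0.35, fullExtra: 0.8 };

// Speedup of the generated code relative to the current analysis.
const speedup = (base, t) => base / t;
// Extra compile time as a percentage of the baseline compile time.
const overheadPct = (extra, base) => (100 * extra) / base;

const summary = {
  fasterSpeedup: speedup(runtime.current, runtime.faster).toFixed(2),
  fullSpeedup: speedup(runtime.current, runtime.full).toFixed(2),
  fasterOverheadPct: overheadPct(compile.fasterExtra, compile.baseline).toFixed(1),
  fullOverheadPct: overheadPct(compile.fullExtra, compile.baseline).toFixed(1),
};
console.log(summary);
```

With these inputs the run-time ratios round to 1.01 and 1.03, and the extra compile time comes to roughly 2.9% and 6.7% of the baseline.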

hhugo · 2023-02-03T17:01:27Z

How is the size affected? How many extra direct calls are generated?

hhugo · 2023-02-05T13:47:14Z

We could enable the full analysis with -O3 only

vouillon · 2023-02-07T15:37:19Z

> How is the size affected? How many extra direct calls are generated?

The code is about the same size: about 0.25% larger, since we use a somewhat verbose notation for best performance: a[7].call(null, e1, ..., en). Indeed, a[7](e1, ..., en) would be interpreted as a method call on the array a, which is slower than using caml_calln. I tried patterns like (0||a[7])(...) or (0,a[7])(...) as a workaround, but they still seem slower than using call.
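The difference between these call shapes can be observed directly in JavaScript. The sketch below uses a hypothetical function `receiver` stored at index 7 of an array to report which `this` value each call site supplies. It shows only the semantic difference; it says nothing about speed, which depends on the engine.

```javascript
// Hypothetical stand-in for a closure stored in an array, as in `a[7]` above.
function receiver() {
  "use strict";
  return this; // report which receiver the call site supplied
}
const a = [];
a[7] = receiver;

const asMember = a[7]();         // member call: `this` is the array `a`
const viaCall = a[7].call(null); // explicit receiver: `this` is null
const viaComma = (0, a[7])();    // value of a comma expression: `this` is undefined
const viaOr = (0 || a[7])();     // value of a logical expression: `this` is undefined
```

Only the first form passes the array as the receiver, which is what makes it a method call rather than a plain function call.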

There are not that many additional direct calls, but I think they are more likely to be performed repeatedly.

	currently	faster analysis	full analysis
Code size	1912637	1917311	1917226
Direct calls	25883	+1409	+1854


hhugo · 2023-02-15T16:26:40Z

I've rebased your branch and made fast depend on the profile (o1, o2, o3).

hhugo · 2023-02-15T16:38:48Z

@vouillon, what do you want to do with this?

hhugo · 2023-02-22T01:23:42Z

I've rebased the PR again

hhugo · 2023-03-01T15:37:23Z

@vouillon, this is the last PR I'd like to sort out before the next release (see https://github.com/ocsigen/js_of_ocaml/milestone/10).
Should we merge or not?


vouillon · 2023-03-09T16:40:37Z

@hhugo I think it is ready

hhugo · 2024-04-07T07:04:19Z

compiler/lib/generate.ml

-  let apply_directly = J.call f params J.N in
+  let apply_directly =
+    (* Make sure we are performing a regular call, not a (slower)
+       method call *)


@vouillon, can you provide a reference mentioning this optimization? I did some tests and they seem to show the opposite.

vouillon · 2024-04-08

I don't have any reference, but without this, the improved analysis resulted in slower code.
What I tried is compiling ocamlc.byte (dune exec -- js_of_ocaml --opt 3 $(which ocamlc.byte) -o /tmp/ocamlc.js).
And then running it on some ml source files:

time node /tmp/ocamlc.js -c ./benchmarks/sources/ml/*.ml ./benchmarks/sources/ml/*.ml ./benchmarks/sources/ml/*.ml

This optimization makes close to an 8% performance improvement on my machine.

hhugo · 2024-04-08

Which OCaml version do you use? It seems that ocamlc no longer works when compiled to JS because of ocaml/ocaml#11997 (since OCaml 5.1). The reason is that Reloc_literal can now contain floats, which jsoo doesn't want to marshal.

vouillon · 2024-04-08

I tried with OCaml 4.14.0.

hhugo · 2024-04-08

> This optimization makes close to an 8% performance improvement on my machine.

Results are too noisy on my laptop, but I can indeed see some improvement.

vouillon · 2024-04-08

I'm using python3 -m pyperf system tune to reduce the noise.
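Alongside system tuning, a common way to tame noise is to repeat each measurement and report the median. The sketch below is a hypothetical Node.js micro-benchmark of the two call shapes discussed in this thread; `medianTimeMs`, `table`, `sumMemberCall`, and `sumExplicitCall` are made-up names. A micro-benchmark like this is not representative of the full ocamlc.js workload measured above, and a JIT may well treat both loops identically, so any difference it shows should be read with caution.

```javascript
// Time `fn` over several runs and return the middle sample, in milliseconds.
function medianTimeMs(fn, runs = 9) {
  const samples = [];
  for (let i = 0; i < runs; i++) {
    const start = process.hrtime.bigint();
    fn();
    const end = process.hrtime.bigint();
    samples.push(Number(end - start) / 1e6); // nanoseconds to milliseconds
  }
  samples.sort((x, y) => x - y);
  return samples[Math.floor(samples.length / 2)];
}

// Hypothetical workload: a closure stored in an array, called in two shapes.
const table = [];
table[7] = function (x, y) { return x + y; };

function sumMemberCall(n) {
  let acc = 0;
  for (let i = 0; i < n; i++) acc = table[7](acc, 1); // method-style call
  return acc;
}

function sumExplicitCall(n) {
  let acc = 0;
  for (let i = 0; i < n; i++) acc = table[7].call(null, acc, 1); // plain call
  return acc;
}

const N = 1e6;
console.log("member call :", medianTimeMs(() => sumMemberCall(N)), "ms");
console.log(".call(null) :", medianTimeMs(() => sumExplicitCall(N)), "ms");
```

Using the median rather than the mean keeps a single slow run (for example, one interrupted by another process) from skewing the result.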

vouillon force-pushed the exact-calls branch 2 times, most recently from 456af07 to c09e350 (February 2, 2023 20:44)

vouillon mentioned this pull request Feb 2, 2023

Effects: partial CPS transform #1384

Merged

vouillon force-pushed the exact-calls branch from c09e350 to 230f8e6 (February 2, 2023 21:15)

hhugo force-pushed the exact-calls branch from 230f8e6 to bf20e38 (February 2, 2023 21:51)

vouillon force-pushed the exact-calls branch 2 times, most recently from 74f2fa8 to fb90a0b (February 3, 2023 14:53)

hhugo reviewed Feb 8, 2023

compiler/lib/specialize.ml (outdated)

hhugo reviewed Feb 8, 2023

compiler/lib/global_flow.ml (outdated)

hhugo reviewed Feb 8, 2023

compiler/lib/global_flow.ml

hhugo added this to the 5.1 milestone Feb 14, 2023

hhugo force-pushed the exact-calls branch from fb90a0b to 70e463e (February 15, 2023 16:25)

hhugo force-pushed the exact-calls branch from 70e463e to 665a0b8 (February 22, 2023 01:15)

hhugo marked this pull request as ready for review February 22, 2023 01:15

hhugo force-pushed the exact-calls branch 2 times, most recently from 53f2627 to d960511 (February 28, 2023 11:37)

vouillon added 4 commits March 2, 2023 15:30

Renamed test

6c9db0d

Test: direct calls

b7576d7

Code generation: make sure we are making function calls, not method calls

9b9f8e0

More exact calls

44b755b

hhugo force-pushed the exact-calls branch from d960511 to e4b67f4 (March 2, 2023 15:32)

vouillon commented Mar 5, 2023

compiler/lib/global_flow.ml (outdated)

hhugo removed this from the 5.1 milestone Mar 7, 2023

vouillon and others added 4 commits March 8, 2023 19:19

Global flow analysis: option to perfom a faster but less precise analysis

11f2d20

use profile to comtrol exact calls

8aeb07f

Changes

52667ed

small refactoring

30a6db3

vouillon force-pushed the exact-calls branch from e4b67f4 to 30a6db3 (March 8, 2023 18:21)

hhugo merged commit 8843a98 into master Mar 11, 2023

hhugo deleted the exact-calls branch March 11, 2023 00:02

vouillon mentioned this pull request May 2, 2023

Effects: double translation of functions and dynamic switching between direct-style and CPS code #1461

Merged

hhugo reviewed Apr 7, 2024


vouillon mentioned this pull request Jul 26, 2024

Compiler: compact indirect call #1647

Merged
