Speed benchmarks up by shortening the 10s iteration time #6271
As long as there is no, or almost no, background compilation in the measurement phase, then yes, we can decrease the warmup time. I would be careful about it, though. For example, with the current JMH framework we use, there is no simple way to tell whether there are still some compilations running in the measurement phase. Note that if we want to measure only the performance of Enso, and not the performance of Graal compilations, all the Graal compilations should be done during the warmup phase.

In FastR, I reimplemented the whole benchmarking infrastructure from scratch. I started every benchmark with … This is just an example of an almost reliable way to detect that the warmup was sufficient.

Before we move on with the investigation, I strongly recommend implementing a similar system to recognize a sufficient warmup and to adjust the times accordingly. Not to mention that it would be a very good idea to introduce benchmark versioning if we are about to fiddle with any parameters. More details in #5714.
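The FastR mechanism itself is elided above, so as a purely illustrative sketch: one way a harness could decide that the warmup was sufficient is to keep timing warmup rounds until consecutive rounds agree within a tolerance. Everything below (the class name, the stability heuristic, the parameters) is an assumption, not code from FastR or Enso:

```java
/**
 * Hypothetical warmup helper: runs the workload until two consecutive
 * rounds take roughly the same time, i.e. compilation has likely settled.
 */
final class WarmupUntilStable {
  static void warmup(Runnable workload, int maxRounds, double relativeTolerance) {
    long previous = Long.MAX_VALUE;
    for (int round = 0; round < maxRounds; round++) {
      long start = System.nanoTime();
      workload.run();
      long elapsed = System.nanoTime() - start;
      // treat the warmup as sufficient once the round-to-round change
      // drops below the requested relative tolerance
      if (previous != Long.MAX_VALUE
          && Math.abs(elapsed - previous) < relativeTolerance * previous) {
        return;
      }
      previous = elapsed;
    }
    throw new AssertionError("Workload did not stabilize within " + maxRounds + " warmup rounds");
  }
}
```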
I think that tracing the Truffle compilation and failing if it happens during measurement is a good idea. Here is an implementation for Enso benchmarks:

```
enso$ git diff engine/runtime/src/bench/java/org/enso/interpreter/bench/benchmarks/semantic/ListBenchmarks.java
diff --git engine/runtime/src/bench/java/org/enso/interpreter/bench/benchmarks/semantic/ListBenchmarks.java engine/runtime/src/bench/java/org/enso/interpreter/bench/benchmarks/semantic/ListBenchmarks.java
index ef8b748ab8..481b607551 100644
--- engine/runtime/src/bench/java/org/enso/interpreter/bench/benchmarks/semantic/ListBenchmarks.java
+++ engine/runtime/src/bench/java/org/enso/interpreter/bench/benchmarks/semantic/ListBenchmarks.java
@@ -1,9 +1,12 @@
package org.enso.interpreter.bench.benchmarks.semantic;
-import java.io.ByteArrayOutputStream;
import java.nio.file.Paths;
+import java.util.ArrayList;
+import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;
+import java.util.logging.Handler;
+import java.util.logging.LogRecord;
import org.graalvm.polyglot.Context;
import org.graalvm.polyglot.Value;
import org.openjdk.jmh.annotations.Benchmark;
@@ -18,12 +21,14 @@ import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;
import org.openjdk.jmh.infra.BenchmarkParams;
import org.openjdk.jmh.infra.Blackhole;
+import org.openjdk.jmh.infra.IterationParams;
+import org.openjdk.jmh.runner.IterationType;
@BenchmarkMode(Mode.AverageTime)
@Fork(1)
-@Warmup(iterations = 3)
-@Measurement(iterations = 5)
+@Warmup(iterations = 1, time = 1)
+@Measurement(iterations = 5, time = 1)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@State(Scope.Benchmark)
public class ListBenchmarks {
@@ -33,6 +38,7 @@ public class ListBenchmarks {
private Value self;
private Value sum;
private Value oldSum;
+ private List<LogRecord> messages = new ArrayList<>();
@Setup
public void initializeBenchmark(BenchmarkParams params) throws Exception {
@@ -40,7 +46,21 @@ public class ListBenchmarks {
.allowExperimentalOptions(true)
.allowIO(true)
.allowAllAccess(true)
- .logHandler(new ByteArrayOutputStream())
+ .option("engine.TraceCompilation", "true")
+ .logHandler(new Handler() {
+ @Override
+ public void publish(LogRecord lr) {
+ messages.add(lr);
+ }
+
+ @Override
+ public void flush() {
+ }
+
+ @Override
+ public void close() throws SecurityException {
+ }
+ })
.option(
"enso.languageHomeOverride",
Paths.get("../../distribution/component").toFile().getAbsolutePath()
@@ -113,16 +133,18 @@ public class ListBenchmarks {
}
@Benchmark
- public void mapOverList(Blackhole matter) {
- performBenchmark(matter);
+ public void mapOverList(IterationParams it, Blackhole matter) {
+ performBenchmark(it, matter);
}
@Benchmark
- public void mapOverLazyList(Blackhole matter) {
- performBenchmark(matter);
+ public void mapOverLazyList(IterationParams it, Blackhole matter) {
+ performBenchmark(it, matter);
}
- private void performBenchmark(Blackhole hole) throws AssertionError {
+ private void performBenchmark(IterationParams it, Blackhole hole) throws AssertionError {
+ messages.clear();
+
var newList = plusOne.execute(self, list);
var newSum = sum.execute(self, newList, 0);
@@ -131,6 +153,14 @@ public class ListBenchmarks {
throw new AssertionError("Unexpected result " + result);
}
hole.consume(result);
+ if (it.getType() == IterationType.MEASUREMENT) {
+ if (messages.size() > 0) {
+ for (var lr : messages) {
+ System.err.println(lr.getMessage());
+ throw new AssertionError("Found compilation " + messages.size() + " messages during measurement");
+ }
+ }
+ }
}
}
```

With these changes (especially after changing the number of warmups to 1 and giving it just 1s) my `benchOnly ListBenchmarks` run fails:

```
sbt:runtime> benchOnly ListBenchmarks
Iteration 1:
opt done id=3593 org.graalvm.polyglot.Value<Function>.execute |Tier 2|Time 593( 239+354 )ms|AST 47|Inlined 53Y 0N|IR 3573/ 17618|CodeSize 74181|Addr 0x7f0707edca90|Timestamp 91619721533011|Src n/a
<failure>
java.lang.AssertionError: Found compilation 1 messages during measurement
at org.enso.interpreter.bench.benchmarks.semantic.ListBenchmarks.performBenchmark(ListBenchmarks.java:160)
at org.enso.interpreter.bench.benchmarks.semantic.ListBenchmarks.mapOverLazyList(ListBenchmarks.java:142)
at org.enso.interpreter.bench.benchmarks.semantic.jmh_generated.ListBenchmarks_mapOverLazyList_jmhTest:232
```

This way we can detect results skewed by compilation not being finished before the measurement starts. Of course, this sketch would rather be turned into a support …
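Building on that idea, a minimal sketch of what such a reusable support class could look like; the name `CompilationWatcher` and its API are hypothetical, only the ingredients it combines (the `engine.TraceCompilation` option, `Context.Builder.logHandler`, and JMH's `IterationParams`/`IterationType`) are taken from the diff above:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.logging.Handler;
import java.util.logging.LogRecord;
import org.graalvm.polyglot.Context;
import org.openjdk.jmh.infra.IterationParams;
import org.openjdk.jmh.runner.IterationType;

/** Hypothetical reusable helper: records Truffle compilation traces and
 *  fails the benchmark if any arrive during a measurement iteration. */
final class CompilationWatcher extends Handler {
  private final List<LogRecord> records = new CopyOnWriteArrayList<>();

  /** Enables compilation tracing and installs this handler on the context builder. */
  Context.Builder register(Context.Builder builder) {
    return builder.option("engine.TraceCompilation", "true").logHandler(this);
  }

  /** Clears any traces collected so far; call at the start of each invocation. */
  void reset() {
    records.clear();
  }

  /** Fails the benchmark if compilations were traced during a measurement iteration. */
  void assertNoCompilation(IterationParams it) {
    if (it.getType() == IterationType.MEASUREMENT && !records.isEmpty()) {
      records.forEach(r -> System.err.println(r.getMessage()));
      throw new AssertionError(
          "Found " + records.size() + " compilation messages during measurement");
    }
  }

  @Override
  public void publish(LogRecord lr) {
    records.add(lr);
  }

  @Override
  public void flush() {}

  @Override
  public void close() {}
}
```

A benchmark would then call `register(...)` from its `@Setup` method, `reset()` at the start of each invocation, and `assertNoCompilation(it)` at the end, mirroring what the diff above does inline.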
Regardless of whether we lower the time, we need to detect whether a compilation is running while benchmarking. Being implemented as …
While using the new `IfVsCaseBenchmark` I realized we can specify the time for each `@Warmup` and `@Measurement`. The regular time is 10s, but it seems 1s for each warmup and 3s for a measurement is good enough for the `IfVsCaseBenchmark`. Given it has five warmup iterations and three measurements, we are down from 80s to 11s. Of course, shortening the warmup time too much may decrease the stability of our benchmarks, but the overall speedup seems too attractive not to try it.

The `time` can be lowered from 10s for various benchmarks; a sketch of what this looks like on a benchmark class follows below.

CCing @jdunkerley, @Akirathan
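For illustration, a minimal sketch of a benchmark class with shortened per-iteration times; the class name and workload are made up, only the use of the `time` parameter on `@Warmup` and `@Measurement` reflects what is described above:

```java
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;

@BenchmarkMode(Mode.AverageTime)
@Fork(1)
// five 1s warmup iterations and three 3s measurement iterations,
// instead of the default 10s per iteration
@Warmup(iterations = 5, time = 1)
@Measurement(iterations = 3, time = 3)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@State(Scope.Benchmark)
public class ShortIterationBenchmark {
  @Benchmark
  public long workload() {
    long sum = 0;
    for (int i = 0; i < 1_000; i++) {
      sum += i;
    }
    return sum; // returned values are consumed by JMH, preventing dead-code elimination
  }
}
```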