Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance regression from 1.5.x #11226

Closed
philwebb opened this issue Nov 30, 2017 · 12 comments
Closed

Performance regression from 1.5.x #11226

philwebb opened this issue Nov 30, 2017 · 12 comments
Labels
type: task A general task
Milestone

Comments

@philwebb
Copy link
Member

2.0.0 seems to be slower to start and I'd like to know why!

@philwebb philwebb added this to the 2.0.0.RC1 milestone Nov 30, 2017
@philwebb
Copy link
Member Author

Some timing notes:

GC

2.0.0

java -XX:+PrintGC -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -jar target/*.jar | grep GC                                                                                                                                        
2017-12-18T13:55:08.899+0800: 0.656: [GC (Allocation Failure)  65536K->4743K(251392K), 0.0040314 secs]
2017-12-18T13:55:09.181+0800: 0.938: [GC (Allocation Failure)  70279K->7921K(251392K), 0.0060079 secs]
2017-12-18T13:55:09.430+0800: 1.187: [GC (Allocation Failure)  73457K->8195K(251392K), 0.0061011 secs]
2017-12-18T13:55:09.436+0800: 1.194: [GC (Metadata GC Threshold)  9643K->8065K(316928K), 0.0054321 secs]
2017-12-18T13:55:09.442+0800: 1.199: [Full GC (Metadata GC Threshold)  8065K->7701K(220672K), 0.0215755 secs]
2017-12-18T13:55:10.129+0800: 1.887: [GC (Allocation Failure)  138773K->14315K(220672K), 0.0055671 secs]
2017-12-18T13:55:10.783+0800: 2.540: [GC (Allocation Failure)  145387K->21461K(335360K), 0.0098723 secs]
2017-12-18T13:55:11.693+0800: 3.450: [GC (Metadata GC Threshold)  247050K->29365K(338944K), 0.0139084 secs]
2017-12-18T13:55:11.707+0800: 3.464: [Full GC (Metadata GC Threshold)  29365K->23158K(371712K), 0.0749999 secs]

1.5.x

java -XX:+PrintGC -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -jar target/*.jar | grep GC                                                                                                                                       
2017-12-18T13:55:01.892+0800: 0.518: [GC (Allocation Failure)  65536K->4312K(251392K), 0.0036483 secs]
2017-12-18T13:55:02.135+0800: 0.761: [GC (Allocation Failure)  69848K->8099K(251392K), 0.0061107 secs]
2017-12-18T13:55:02.380+0800: 1.006: [GC (Allocation Failure)  73635K->7604K(251392K), 0.0052713 secs]
2017-12-18T13:55:02.625+0800: 1.251: [GC (Metadata GC Threshold)  66324K->9032K(316928K), 0.0078808 secs]
2017-12-18T13:55:02.633+0800: 1.259: [Full GC (Metadata GC Threshold)  9032K->8723K(250368K), 0.0269571 secs]
2017-12-18T13:55:03.387+0800: 2.013: [GC (Allocation Failure)  139795K->20439K(250368K), 0.0090122 secs]
2017-12-18T13:55:03.842+0800: 2.468: [GC (Allocation Failure)  151511K->22637K(311296K), 0.0090661 secs]
2017-12-18T13:55:04.448+0800: 3.074: [GC (Allocation Failure)  212077K->27815K(313344K), 0.0122303 secs]

@philwebb
Copy link
Member Author

philwebb commented Dec 18, 2017

Spring Security's BCrypt class appears as hot method. Add a NoOpPasswordEncoder bean yields the following results:

NoOp:

Started SampleActuatorUiApplication in 3.632 seconds (JVM running for 4.032)
Started SampleActuatorUiApplication in 3.536 seconds (JVM running for 3.968)
Started SampleActuatorUiApplication in 3.437 seconds (JVM running for 3.846)

BCrypt:

Started SampleActuatorUiApplication in 3.826 seconds (JVM running for 4.237)
Started SampleActuatorUiApplication in 3.777 seconds (JVM running for 4.193)
Started SampleActuatorUiApplication in 3.654 seconds (JVM running for 4.067)

I've raise spring-projects/spring-security#4915

@philwebb
Copy link
Member Author

philwebb commented Dec 22, 2017

Finally had a chunk of time to dedicate to this, breaking out a number of new issues:

@philwebb
Copy link
Member Author

philwebb commented Dec 23, 2017

Benchmark timings with Framework and Security patches and all Boot changes except the BeanInfo update:

SpringBoot15xBenchmark.explodedJarLauncher       avgt   10  2.910 ± 0.079   s/op
SpringBoot15xBenchmark.explodedJarMain           avgt   10  2.007 ± 0.032   s/op
SpringBoot15xBenchmark.fatJar                    avgt   10  2.738 ± 0.025   s/op

SpringBoot20M7Benchmark.explodedJarLauncher       avgt   10  3.610 ± 0.125   s/op
SpringBoot20M7Benchmark.explodedJarMain           avgt   10  2.390 ± 0.052   s/op
SpringBoot20M7Benchmark.fatJar                    avgt   10  3.383 ± 0.035   s/op

SpringBoot20xBenchmark.explodedJarLauncher       avgt   10  2.952 ± 0.048   s/op
SpringBoot20xBenchmark.explodedJarMain           avgt   10  2.288 ± 0.034   s/op
SpringBoot20xBenchmark.fatJar                    avgt   10  2.815 ± 0.040   s/op

And petclinic:

--- 1.5.x
PetclinicLatestBenchmark.devtoolsRestart         avgt   10  1.347 ± 0.066   s/op
PetclinicLatestBenchmark.explodedJarMain         avgt   10  4.400 ± 0.077   s/op
PetclinicLatestBenchmark.fatJar                  avgt   10  5.868 ± 0.093   s/op
PetclinicLatestBenchmark.noverify                avgt   10  5.184 ± 0.071   s/op

--- 2.0.0.M7
PetclinicLatestBenchmark.devtoolsRestart  avgt   10  0.001 ±  0.001   s/op
PetclinicLatestBenchmark.explodedJarMain  avgt   10  5.561 ±  0.159   s/op
PetclinicLatestBenchmark.fatJar           avgt   10  7.616 ±  0.083   s/op
PetclinicLatestBenchmark.noverify         avgt   10  6.936 ±  0.171   s/op

--- 2.0.0.BUILD-SNAPSHOT
PetclinicLatestBenchmark.devtoolsRestart  avgt   10  0.001 ±  0.001   s/op
PetclinicLatestBenchmark.explodedJarMain  avgt   10  5.166 ±  0.080   s/op
PetclinicLatestBenchmark.fatJar           avgt   10  6.766 ±  0.087   s/op
PetclinicLatestBenchmark.noverify         avgt   10  6.001 ±  0.060   s/op

@philwebb
Copy link
Member Author

Relevant info from @dsyer


Here's some interesting performance data for 2.0.0.M7

Boot     Benchmark           (sample)  Mode  Cnt  Score   Error  Units
2.0.0.M7 MainBenchmark.main      actr    ss   10  1.623 ± 0.036   s/op
2.0.0.M7 MainBenchmark.main      actr(*) ss   10  1.469 ± 0.028   s/op
1.5.9    OldBenchmark.old        actr    ss   10  1.386 ± 0.083   s/op

The (*) sample has -Dspring.autoconfigure.exclude=org.springframework.boot.actuate.autoconfigure.metrics.MetricsAutoConfiguration -Dmanagement.endpoints.jmx.enabled=false
I spotted those as hotspots using an Aspect that prints timings for AbstractAutowireCapableBeanFactory.initializeBean()

That takes us about half way back to 1.5.9. Only considering Actuator features. Their initialization probably could be deferred - e.g. WebMvcMetricsFilter eagerly instantiates a bean in Spring MVC that would otherwise be @Lazy (HandlerMappingIntrospector)

Excluding JMX also speeds up 1.5.9, but I don't think there was an option to exclude only Actuator JMX in 1.5.9

TomcatServletWebServerFactory also shows up on that hostspot list

Can't switch that one off, but maybe it shows us a place to look

@philwebb
Copy link
Member Author

philwebb commented Jan 3, 2018

Using https://github.com/dsyer/spring-boot-startup-bench with the following:

cd static; mvn install; java -jar target/benchmarks.jar main snap old

Currently gives me:

Benchmark           (sample)  Mode  Cnt  Score   Error  Units
SnapBenchmark.snap      empt  avgt   10  0.823 ± 0.014   s/op
SnapBenchmark.snap      demo  avgt   10  1.681 ± 0.071   s/op
SnapBenchmark.snap      actr  avgt   10  2.222 ± 0.067   s/op
SnapBenchmark.snap      jdbc  avgt   10  1.926 ± 0.061   s/op
SnapBenchmark.snap      actj  avgt   10  2.526 ± 0.035   s/op
SnapBenchmark.snap      jpae  avgt   10  3.002 ± 0.078   s/op
SnapBenchmark.snap      conf  avgt   10  3.026 ± 0.362   s/op

MainBenchmark.main      empt    ss   10  0.958 ± 0.037   s/op
MainBenchmark.main      demo    ss   10  1.869 ± 0.095   s/op
MainBenchmark.main      actr    ss   10  2.514 ± 0.063   s/op
MainBenchmark.main      jdbc    ss   10  2.107 ± 0.082   s/op
MainBenchmark.main      actj    ss   10  2.797 ± 0.205   s/op
MainBenchmark.main      jpae    ss   10  3.168 ± 0.219   s/op
MainBenchmark.main      conf    ss   10  2.884 ± 0.111   s/op
MainBenchmark.main      erka    ss   10  4.098 ± 0.232   s/op
MainBenchmark.main      busr    ss   10  4.374 ± 0.642   s/op
MainBenchmark.main      zuul    ss   10  5.800 ± 0.460   s/op
MainBenchmark.main      erkb    ss   10  5.872 ± 0.447   s/op
MainBenchmark.main      slth    ss   10  6.006 ± 0.611   s/op

OldBenchmark.old        empt    ss   10  0.747 ± 0.094   s/op
OldBenchmark.old        demo    ss   10  1.761 ± 0.261   s/op
OldBenchmark.old        actr    ss   10  2.190 ± 0.190   s/op
OldBenchmark.old        jdbc    ss   10  1.984 ± 0.093   s/op
OldBenchmark.old        actj    ss   10  2.470 ± 0.140   s/op
OldBenchmark.old        jpae    ss   10  2.887 ± 0.178   s/op
OldBenchmark.old        conf    ss   10  3.185 ± 0.276   s/op
OldBenchmark.old        erka    ss   10  4.083 ± 0.846   s/op
OldBenchmark.old        buna    ss   10  3.076 ± 0.313   s/op
OldBenchmark.old        busr    ss   10  3.944 ± 0.330   s/op
OldBenchmark.old        zuul    ss   10  5.926 ± 0.504   s/op
OldBenchmark.old        erkb    ss    5  5.713 ± 0.387   s/op
OldBenchmark.old        slth    ss   10  5.915 ± 0.671   s/op

(this is with the Spring Security patch applied)

Project 1.5.x Milestone Snapshot
empt 0.747 ± 0.094 0.958 ± 0.037 0.823 ± 0.014
demo 1.761 ± 0.261 1.869 ± 0.095 1.681 ± 0.071
actr 2.190 ± 0.190 2.514 ± 0.063 2.222 ± 0.067
jdbc 1.984 ± 0.093 2.107 ± 0.082 1.926 ± 0.061
actj 2.470 ± 0.140 2.797 ± 0.205 2.526 ± 0.035
jpae 2.887 ± 0.178 3.168 ± 0.219 3.002 ± 0.078
conf 3.185 ± 0.276 2.884 ± 0.111 3.026 ± 0.362

@dsyer
Copy link
Member

dsyer commented Jan 4, 2018

Those results look a bit different to mine so it's a surprise. Also, JPA isn't particularly expensive in Boot 2.0 for me. One observation: your error bars are quite large. This often indicates that you ran the tests on a noisy system (not enough free resources, memory, CPU etc.). If you strip down to just running this process and no IDE/browser etc, does it stabilize?

Also (although it doesn't matter for the "snap" comparison), I forgot to say, the "erk*" samples need a local Eureka. You can run it with Spring Boot CLI (with the Cloud extension), or just a vanilla @EnableEurekaServer running on port 8761.

@dsyer
Copy link
Member

dsyer commented Jan 4, 2018

Here's my run from this morning on different hardware (still Ubuntu 16.04):

$ java -jar target/benchmarks.jar main snap old -p sample=empt,demo,actr,jdbc,actj,jpae,conf
Benchmark           (sample)  Mode  Cnt  Score   Error  Units
SnapBenchmark.snap      empt  avgt   10  0.825 ± 0.060   s/op
SnapBenchmark.snap      demo  avgt   10  1.713 ± 0.055   s/op
SnapBenchmark.snap      actr  avgt   10  2.312 ± 0.129   s/op
SnapBenchmark.snap      jdbc  avgt   10  1.983 ± 0.074   s/op
SnapBenchmark.snap      actj  avgt   10  2.682 ± 0.087   s/op
SnapBenchmark.snap      jpae  avgt   10  2.929 ± 0.073   s/op
SnapBenchmark.snap      conf  avgt   10  2.759 ± 0.108   s/op
MainBenchmark.main      empt    ss   10  0.824 ± 0.061   s/op
MainBenchmark.main      demo    ss   10  1.762 ± 0.054   s/op
MainBenchmark.main      actr    ss   10  2.352 ± 0.087   s/op
MainBenchmark.main      jdbc    ss   10  2.046 ± 0.088   s/op
MainBenchmark.main      actj    ss   10  2.668 ± 0.089   s/op
MainBenchmark.main      jpae    ss   10  3.072 ± 0.087   s/op
MainBenchmark.main      conf    ss   10  2.908 ± 0.118   s/op
OldBenchmark.old        empt    ss   10  0.716 ± 0.077   s/op
OldBenchmark.old        demo    ss   10  1.538 ± 0.060   s/op
OldBenchmark.old        actr    ss   10  1.983 ± 0.082   s/op
OldBenchmark.old        jdbc    ss   10  1.805 ± 0.111   s/op
OldBenchmark.old        actj    ss   10  2.326 ± 0.121   s/op
OldBenchmark.old        jpae    ss   10  2.760 ± 0.137   s/op
OldBenchmark.old        conf    ss   10  2.534 ± 0.096   s/op

Slower overall than my "fast" desktop. Less memory available (might explain the error bars). No difference in the conclusion that 2.0.0 snapshots are marginally better than 2.0.0.M7 (and not on a par with 1.5.9).

Maybe someone else could run that on a Mac and post back some results?

@snicoll
Copy link
Member

snicoll commented Jan 4, 2018

Here's my run on latest MBP with Spring Boot 2.0.0 SNAP b96f862c9eff8

$ java -jar target/benchmarks.jar main snap old -p sample=empt,demo,actr,jdbc,actj,jpae,conf
Benchmark           (sample)  Mode  Cnt  Score   Error  Units
SnapBenchmark.snap      empt    ss   10  0.962 ± 0.092   s/op
SnapBenchmark.snap      demo    ss   10  1.868 ± 0.084   s/op
SnapBenchmark.snap      actr    ss   10  2.412 ± 0.105   s/op
SnapBenchmark.snap      jdbc    ss   10  2.131 ± 0.110   s/op
SnapBenchmark.snap      actj    ss   10  2.649 ± 0.133   s/op
SnapBenchmark.snap      jpae    ss   10  3.178 ± 0.206   s/op
SnapBenchmark.snap      conf    ss   10  2.646 ± 0.079   s/op
MainBenchmark.main      empt    ss   10  0.878 ± 0.033   s/op
MainBenchmark.main      demo    ss   10  1.914 ± 0.101   s/op
MainBenchmark.main      actr    ss   10  2.489 ± 0.135   s/op
MainBenchmark.main      jdbc    ss   10  2.136 ± 0.201   s/op
MainBenchmark.main      actj    ss   10  2.825 ± 0.231   s/op
MainBenchmark.main      jpae    ss   10  3.317 ± 0.279   s/op
MainBenchmark.main      conf    ss   10  3.240 ± 0.757   s/op
OldBenchmark.old        empt    ss   10  0.786 ± 0.089   s/op
OldBenchmark.old        demo    ss   10  1.615 ± 0.098   s/op
OldBenchmark.old        actr    ss   10  2.136 ± 0.234   s/op
OldBenchmark.old        jdbc    ss   10  1.992 ± 0.112   s/op
OldBenchmark.old        actj    ss   10  2.517 ± 0.285   s/op
OldBenchmark.old        jpae    ss   10  2.764 ± 0.198   s/op
OldBenchmark.old        conf    ss   10  2.610 ± 0.254   s/op
java version "1.8.0_144"
Java(TM) SE Runtime Environment (build 1.8.0_144-b01)
Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode)

@philwebb
Copy link
Member Author

philwebb commented Jan 4, 2018

Second run without any background apps:

Benchmark           (sample)  Mode  Cnt  Score   Error  Units
SnapBenchmark.snap      empt  avgt   10  0.854 ± 0.014   s/op
SnapBenchmark.snap      demo  avgt   10  1.716 ± 0.044   s/op
SnapBenchmark.snap      actr  avgt   10  2.300 ± 0.042   s/op
SnapBenchmark.snap      jdbc  avgt   10  1.960 ± 0.031   s/op
SnapBenchmark.snap      actj  avgt   10  2.651 ± 0.072   s/op
SnapBenchmark.snap      jpae  avgt   10  3.070 ± 0.061   s/op
SnapBenchmark.snap      conf  avgt   10  2.787 ± 0.088   s/op
MainBenchmark.main      empt    ss   10  0.893 ± 0.030   s/op
MainBenchmark.main      demo    ss   10  1.886 ± 0.069   s/op
MainBenchmark.main      actr    ss   10  2.498 ± 0.084   s/op
MainBenchmark.main      jdbc    ss   10  2.176 ± 0.095   s/op
MainBenchmark.main      actj    ss   10  2.795 ± 0.121   s/op
MainBenchmark.main      jpae    ss   10  3.068 ± 0.122   s/op
MainBenchmark.main      conf    ss   10  2.773 ± 0.101   s/op
MainBenchmark.main      erka    ss   10  3.678 ± 0.109   s/op
MainBenchmark.main      busr    ss   10  3.986 ± 0.068   s/op
MainBenchmark.main      zuul    ss   10  5.302 ± 0.047   s/op
MainBenchmark.main      erkb    ss   10  5.322 ± 0.077   s/op
MainBenchmark.main      slth    ss   10  5.410 ± 0.050   s/op
OldBenchmark.old        empt    ss   10  0.725 ± 0.008   s/op
OldBenchmark.old        demo    ss   10  1.514 ± 0.023   s/op
OldBenchmark.old        actr    ss   10  1.917 ± 0.017   s/op
OldBenchmark.old        jdbc    ss   10  1.797 ± 0.066   s/op
OldBenchmark.old        actj    ss   10  2.252 ± 0.074   s/op
OldBenchmark.old        jpae    ss   10  2.630 ± 0.140   s/op
OldBenchmark.old        conf    ss   10  2.436 ± 0.066   s/op
OldBenchmark.old        erka    ss   10  3.407 ± 0.084   s/op
OldBenchmark.old        buna    ss   10  2.797 ± 0.101   s/op
OldBenchmark.old        busr    ss   10  3.411 ± 0.061   s/op
OldBenchmark.old        zuul    ss   10  4.891 ± 0.093   s/op
OldBenchmark.old        slth    ss   10  4.794 ± 0.065   s/op

@philwebb
Copy link
Member Author

philwebb commented Jan 9, 2018

Pushing this one back to RC2 for now. I think performance is good enough (especially in the IDE) for RC1. I still can't work out why the benchmark shows a 100ms difference.

1.5.10 SNAPSHOT in IDE

2018-01-08 17:35:56.682  INFO 16454 --- [           main] s.a.ui.SampleActuatorUiApplication       : Started SampleActuatorUiApplication in 2.647 seconds (JVM running for 2.927)
2018-01-08 17:36:24.634  INFO 16464 --- [           main] s.a.ui.SampleActuatorUiApplication       : Started SampleActuatorUiApplication in 2.694 seconds (JVM running for 2.959)
2018-01-08 17:36:35.993  INFO 16469 --- [           main] s.a.ui.SampleActuatorUiApplication       : Started SampleActuatorUiApplication in 2.658 seconds (JVM running for 2.926)

2.0.0 in IDE with Spring Security patch

2018-01-08 17:26:40.757  INFO 16102 --- [           main] s.a.ui.SampleActuatorUiApplication       : Started SampleActuatorUiApplication in 2.737 seconds (JVM running for 3.066)
2018-01-08 17:27:18.883  INFO 16119 --- [           main] s.a.ui.SampleActuatorUiApplication       : Started SampleActuatorUiApplication in 2.696 seconds (JVM running for 3.01)
2018-01-08 17:27:26.456  INFO 16123 --- [           main] s.a.ui.SampleActuatorUiApplication       : Started SampleActuatorUiApplication in 2.732 seconds (JVM running for 3.093)

@philwebb philwebb modified the milestones: 2.0.0.RC1, 2.0.0.RC2 Jan 9, 2018
@dsyer
Copy link
Member

dsyer commented Jan 10, 2018

Snapshots looking much better now ("main" is 2.0.0.M7, "old" is 1.5.9):

Benchmark           (sample)  Mode  Cnt  Score   Error  Units
MainBenchmark.main      empt    ss   10  0.586 ± 0.025   s/op
MainBenchmark.main      demo    ss   10  1.276 ± 0.044   s/op
MainBenchmark.main      actr    ss   10  1.653 ± 0.056   s/op
MainBenchmark.main      jdbc    ss   10  1.437 ± 0.069   s/op
MainBenchmark.main      actj    ss   10  1.910 ± 0.086   s/op
MainBenchmark.main      jpae    ss   10  2.307 ± 0.184   s/op
MainBenchmark.main      conf    ss   10  1.972 ± 0.058   s/op
OldBenchmark.old        empt    ss   10  0.491 ± 0.016   s/op
OldBenchmark.old        demo    ss   10  1.064 ± 0.027   s/op
OldBenchmark.old        actr    ss   10  1.399 ± 0.090   s/op
OldBenchmark.old        jdbc    ss   10  1.295 ± 0.080   s/op
OldBenchmark.old        actj    ss   10  1.637 ± 0.078   s/op
OldBenchmark.old        jpae    ss   10  1.904 ± 0.053   s/op
OldBenchmark.old        conf    ss   10  1.786 ± 0.040   s/op
SnapBenchmark.snap      empt    ss   10  0.552 ± 0.012   s/op
SnapBenchmark.snap      demo    ss   10  1.166 ± 0.086   s/op
SnapBenchmark.snap      actr    ss   10  1.489 ± 0.044   s/op
SnapBenchmark.snap      jdbc    ss   10  1.334 ± 0.052   s/op
SnapBenchmark.snap      actj    ss   10  1.748 ± 0.079   s/op
SnapBenchmark.snap      jpae    ss   10  2.066 ± 0.059   s/op
SnapBenchmark.snap      conf    ss   10  1.794 ± 0.038   s/op

Marginally slower than 1.5.9, but not so noticeable now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: task A general task
Projects
None yet
Development

No branches or pull requests

3 participants