
Optimize v2/groups request performance #5973

Merged: 8 commits merged into master from ad/optimize-fetch-groups on Feb 9, 2018

Conversation

@zen-dog (Contributor) commented on Feb 2, 2018:

Improves the performance of `v2/groups?embed=...` requests by introducing dedicated thread pools for fetching group and app info instead of using the global `ForkJoinPool`.
Additionally, it introduces dedicated execution contexts for the `GroupManagerModule`, `StorageModule`, and `SchedulerActions` components.
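
For context, here is a minimal sketch of what a helper like `NamedExecutionContext.fixedThreadPoolExecutionContext` (used throughout this PR) can look like: a fixed `Executors` pool behind a `ThreadFactory` that names threads `<namePrefix>-thread-<n>`. The daemon flag and counter details are assumptions, not necessarily Marathon's actual implementation:

```
import java.util.concurrent.{Executors, ThreadFactory}
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.ExecutionContext

// Sketch only: a fixed-size pool whose threads are named
// `<namePrefix>-thread-<n>`, e.g. `app-info-module-thread-3`.
object NamedExecutionContext {
  def fixedThreadPoolExecutionContext(numThreads: Int, namePrefix: String): ExecutionContext = {
    val counter = new AtomicInteger(0)
    val factory = new ThreadFactory {
      override def newThread(r: Runnable): Thread = {
        val t = new Thread(r, s"$namePrefix-thread-${counter.incrementAndGet()}")
        t.setDaemon(true) // assumption: daemon threads, so the pool doesn't block JVM shutdown
        t
      }
    }
    ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(numThreads, factory))
  }
}
```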

Benchmark results on master:

Running 1m test @ http://localhost:8080/v2/groups?embed=...
  10 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.17s   354.93ms   2.00s    70.33%
    Req/Sec     0.12      0.33     1.00     87.76%
  237 requests in 1.00m, 217.12MB read
  Socket errors: connect 0, read 0, write 0, timeout 146
Requests/sec:      3.95
Transfer/sec:      3.62MB

This branch:

Running 1m test @ http://localhost:8080/v2/groups?embed=...
  10 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   249.39ms   62.67ms 580.40ms   65.82%
    Req/Sec     4.01      1.82    10.00     85.60%
  2403 requests in 1.00m, 2.15GB read
Requests/sec:     40.00
Transfer/sec:     36.64MB

Summary: ~10x throughput, ~5x lower latency, and ~5x lower latency standard deviation.
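
The numbers above are `wrk` output; the run shape is consistent with an invocation along these lines (the exact `embed` parameters are elided in the original):

```
wrk -c 10 -t 10 -d 1m "http://localhost:8080/v2/groups?embed=..."
```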

JIRA: COPS-2224

Aleksey Dukhovniy added 2 commits on February 2, 2018:
… and reduced the amount of work done on the futures.
… for `StorageModule`, `SchedulerActions` and `GroupManagerModule` components.

@zen-dog requested a review from @meln1k on February 2, 2018.
@mesosphere-ci left a comment:

I'm building your change at jenkins-marathon-pipelines-PR-5973-1.

@mesosphere-ci left a comment:

✗ Build of #5973 failed.

See the logs and test results for details.

Error message:

Stage Compile and Test failed.

(๑′°︿°๑)

@mesosphere-ci left a comment:

I'm building your change at jenkins-marathon-pipelines-PR-5973-2.

@mesosphere-ci left a comment:

✗ Build of #5973 failed.

See the logs and test results for details.

Error message:

Stage Compile and Test failed.

(๑′°︿°๑)

@mesosphere-ci left a comment:

I'm building your change at jenkins-marathon-pipelines-PR-5973-3.

@mesosphere-ci left a comment:

✔ Build of #5973 completed successfully.

See details at jenkins-marathon-pipelines-PR-5973-3.

You can create a DC/OS with your patched Marathon by creating a new pull
request with the following changes in buildinfo.json:

"url": "https://s3.amazonaws.com/downloads.mesosphere.io/marathon/builds/marathon-1.6.0-pre-299-gd8338be.tgz",
"sha1": "327099a92157ef863287b936441bb18399e5b623"

You can run system integration test changes of this PR against Marathon
master by triggering this Jenkins job with the Pull_Request_id 5973.
The job then reports back to this PR.

\\ ٩( ᐛ )و //

@meln1k (Contributor) left a comment:

LGTM!

… to name pooled threads, e.g. `app-info-module-thread-3`
@mesosphere-ci left a comment:

I'm building your change at jenkins-marathon-pipelines-PR-5973-4.

@mesosphere-ci left a comment:

✔ Build of #5973 completed successfully.

See details at jenkins-marathon-pipelines-PR-5973-4.

You can create a DC/OS with your patched Marathon by creating a new pull
request with the following changes in buildinfo.json:

"url": "https://s3.amazonaws.com/downloads.mesosphere.io/marathon/builds/marathon-1.6.0-pre-300-g658e102.tgz",
"sha1": "83869c6e353ee6242927b3ab4d51845de240daee"

You can run system integration test changes of this PR against Marathon
master by triggering this Jenkins job with the Pull_Request_id 5973.
The job then reports back to this PR.

\\ ٩( ᐛ )و //

@mesosphere-ci left a comment:

I'm building your change at jenkins-marathon-pipelines-PR-5973-5.

@mesosphere-ci left a comment:

✔ Build of #5973 completed successfully.

See details at jenkins-marathon-pipelines-PR-5973-5.

You can create a DC/OS with your patched Marathon by creating a new pull
request with the following changes in buildinfo.json:

"url": "https://s3.amazonaws.com/downloads.mesosphere.io/marathon/builds/marathon-1.6.0-pre-301-gf3b44f4.tgz",
"sha1": "a0c3a032e865bf308ed9b306ec4999a69b051c52"

You can run system integration test changes of this PR against Marathon
master by triggering this Jenkins job with the Pull_Request_id 5973.
The job then reports back to this PR.

\\ ٩( ᐛ )و //

@kensipe (Contributor) left a comment:

LGTM

preference for more readable naming of vars but nothing blocking

@@ -21,8 +21,11 @@ class AppInfoModule @Inject() (
healthCheckManager: HealthCheckManager,
marathonSchedulerService: MarathonSchedulerService,
taskFailureRepository: TaskFailureRepository) {

val ec = NamedExecutionContext.fixedThreadPoolExecutionContext(8, "app-info-module")
A contributor commented:

ec? are we trying to stay within a linter line length rule :)

could we make this more readable pls?

@zen-dog (author) replied:

There are a few Akka/Scala-specific and accepted "short names", e.g. `ec` for `ExecutionContext`, `mat` for `Materializer`, etc. It falls into the same category as the loop counter in `for (i = 0; ...)`.

A contributor commented:

Should this thread pool be ideally related to the number of threads available?

Should we make it configurable? (even if via an undocumented environment variable)?

It seems like we should ideally create the execution context in the module and share it with the pods status controller?

A contributor commented:

ec is fine for me for the reason @zen-dog pointed out.
I agree with @timcharper that the number of threads could as well be configurable.


private[this] val log = LoggerFactory.getLogger(getClass)
implicit val ec = NamedExecutionContext.fixedThreadPoolExecutionContext(8, "default-info-service")
A contributor commented:

ditto

@timcharper (Contributor) left a comment:

Nice work!

What do you think about those questions?

@@ -199,10 +200,11 @@ class CoreModuleImpl @Inject() (

// GROUP MANAGER

val groupManagerExecutionContext = NamedExecutionContext.fixedThreadPoolExecutionContext(8, "group-manager-module")
A contributor commented:

Why 8 threads? Did we measure that 8 is a meaningful number? I'm not trying to overcomplicate things, but it seems to me that making these configurable would allow us to also easily run different tests to verify that the number makes sense.

@zen-dog (author) replied on Feb 7, 2018:

8 seems like a low enough number that it shouldn't matter. More important is the fact that we separate the thread pools from each other.

* Returns an execution context backed by a fixed thread pool. All threads in the pool are prefixed with `namePrefix`
* e.g. `slow-io-pool-thread-1`.
*
* @param numThreads number of threads in the pol
A contributor commented:

nit: s/pol/pool

@jeschkies (Contributor) left a comment:

Easily the greatest change ever. I do have a question about the usage of the fixed worker thread pool, but it should not block this PR. Thanks a ton!


private[this] val log = LoggerFactory.getLogger(getClass)
implicit val ec = NamedExecutionContext.fixedThreadPoolExecutionContext(8, "default-info-service")
A contributor commented:

I wonder if this really has an impact if app info and the group manager have their own thread pools. I'd like to understand where we benefit from a fixed worker thread pool and where we should be using the fork join thread pool.

@zen-dog (author) replied:

I tested the scenario with a single shared thread pool, and it adds roughly 20% to response times. Not that much, but I think dividing them still makes sense - Dīvide et Imperā!
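
To illustrate the isolation argument, here is a self-contained sketch (plain `Executors` pools standing in for the dedicated contexts in this PR): saturating one pool with blocking work leaves the other fully responsive, whereas a single shared pool would queue everything behind the blockers.

```
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object PoolIsolationDemo extends App {
  // Two independent fixed pools, analogous to the per-component contexts here.
  val groupManagerEc = ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(8))
  val appInfoEc      = ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(8))

  // Saturate one pool with blocking calls...
  (1 to 8).foreach(_ => Future(Thread.sleep(10000))(groupManagerEc))

  // ...the other pool still answers immediately.
  println(Await.result(Future("still responsive")(appInfoEc), 1.second))
}
```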

@zen-dog (Contributor, Author) commented on Feb 7, 2018:

> Should this thread pool be ideally related to the number of threads available?
> Should we make it configurable? (even if via an undocumented environment variable)?

I don't think there is much value in setting it to `availableProcessors` or the like. We already have Chaos creating a thread for each connection, Akka dispatcher pools, and a bunch of other threads. And back when we used to block everywhere, our thread count would at times grow beyond 1k threads.
But I do agree that it makes sense to make it configurable.

> It seems like we should ideally create the execution context in the module and share it with the pods status controller?

A shared execution context showed ~20% higher response times (for whatever reason).

@mesosphere-ci left a comment:

I'm building your change at jenkins-marathon-pipelines-PR-5973-6.

@mesosphere-ci left a comment:

✔ Build of #5973 completed successfully.

See details at jenkins-marathon-pipelines-PR-5973-6.

You can create a DC/OS with your patched Marathon by creating a new pull
request with the following changes in buildinfo.json:

"url": "https://s3.amazonaws.com/downloads.mesosphere.io/marathon/builds/marathon-1.6.0-pre-311-ge500dcc.tgz",
"sha1": "84c29052bc915820edf9ce47b83b901606253078"

You can run system integration test changes of this PR against Marathon
master by triggering this Jenkins job with the Pull_Request_id 5973.
The job then reports back to this PR.

\\ ٩( ᐛ )و //


trait AppInfoConfig extends ScallopConf {

lazy val appInfoModuleExecutionContextSize = opt[Int](
A contributor commented:

Let's base the default on Runtime.getRuntime().availableProcessors() * 2? (or... just the number of availableProcessors?)
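
A hedged sketch of how the Scallop option above might look with that default; the flag name, description, and `noshort` choice are assumptions, not the actual Marathon config:

```
import org.rogach.scallop.ScallopConf

trait AppInfoConfig extends ScallopConf {
  lazy val appInfoModuleExecutionContextSize = opt[Int](
    "app_info_module_executor_thread_count", // hypothetical flag name
    descr = "Number of threads in the AppInfoModule's fixed thread pool.",
    default = Some(Runtime.getRuntime.availableProcessors() * 2),
    noshort = true
  )
}
```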

@mesosphere-ci left a comment:

I'm building your change at jenkins-marathon-pipelines-PR-5973-7.

@mesosphere-ci left a comment:

✔ Build of #5973 completed successfully.

See details at jenkins-marathon-pipelines-PR-5973-7.

You can create a DC/OS with your patched Marathon by creating a new pull
request with the following changes in buildinfo.json:

"url": "https://s3.amazonaws.com/downloads.mesosphere.io/marathon/builds/marathon-1.6.0-pre-313-ga4ee6d2.tgz",
"sha1": "5127a89c0a07c72c70b6c15fd0192ead1f749abf"

You can run system integration test changes of this PR against Marathon
master by triggering this Jenkins job with the Pull_Request_id 5973.
The job then reports back to this PR.

\\ ٩( ᐛ )و //

@zen-dog merged commit 9c0ad47 into master on Feb 9, 2018.
zen-dog pushed a commit that referenced this pull request Feb 9, 2018
Backport of the original PR #5973

JIRA: COPS-2224
@zen-dog deleted the ad/optimize-fetch-groups branch on April 7, 2018.
jeschkies added a commit that referenced this pull request Feb 8, 2019
Summary:
This replaces the Akka stream with a simple `Future.sequence` and uses
the same `AppInfoBaseData` instance for each pod. This way we utilize the
cached instances and are not limited to eight parallel threads. To
avoid overloading Marathon we use the fixed-size thread pool from #5973.

However, even with #5973 we might allocate a lot of futures and
runnables.

JIRA issues: MARATHON-8563
jeschkies added a commit that referenced this pull request Feb 12, 2019
Summary:
This replaces the Akka stream with a simple `Future.sequence` and uses
the same `AppInfoBaseData` instance for each pod. This way we utilize the
cached instances and are not limited to eight parallel threads. To
avoid overloading Marathon we use the fixed-size thread pool from #5973.

However, even with #5973 we might allocate a lot of futures and
runnables.

For 100 pods with 1 instance each I got the following:

## Master
```
› wrk -c 10 -t 10 -d 2m http://localhost:8080/v2/pods/::status
Running 2m test @ http://localhost:8080/v2/pods/::status
  10 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   136.52ms   44.93ms 859.05ms   93.64%
    Req/Sec     7.71      2.49    20.00     69.81%
  8918 requests in 2.00m, 1.34GB read
Requests/sec:     74.26
Transfer/sec:     11.45MB
```

## This change
```
Running 2m test @ http://localhost:8080/v2/pods/::status
  10 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    62.27ms   23.24ms 220.39ms   70.17%
    Req/Sec    15.94      6.09    50.00     69.12%
  19345 requests in 2.00m, 2.91GB read
Requests/sec:    161.10
Transfer/sec:     24.85MB 
```

JIRA issues: MARATHON-8563
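
A rough sketch of the `Future.sequence` pattern this commit describes, with a generic lookup standing in for `AppInfoBaseData` (the names here are illustrative, not the actual Marathon API):

```
import scala.concurrent.{ExecutionContext, Future}

// Gather every pod's status in parallel on the dedicated pool,
// reusing one lookup instance instead of an Akka stream stage.
def podStatuses[Pod, Status](pods: Seq[Pod])(status: Pod => Future[Status])(
    implicit ec: ExecutionContext): Future[Seq[Status]] =
  Future.sequence(pods.map(status))
```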