[FLINK-25226][doc] Add documentation about the AdaptiveBatchScheduler #18757
Conversation
Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community review your pull request. Automated checks: last check on commit 8f1fbaa (Mon Feb 14 13:57:17 UTC 2022).
Mention the bot in a comment to re-run the automated checks.
Review progress: please see the Pull Request Review Guide for a full explanation of the review process. The bot is tracking the review progress through labels, which are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required. Bot commands: the @flinkbot bot supports the following commands:
Force-pushed from 8f1fbaa to 11e85ed.
Thanks for adding docs for adaptive batch scheduler! @wanglijie95
I have a few comments. Please take a look.
### Limitations

- **ALL-EDGES-BLOCKING batch jobs only**: The first version of Adaptive Batch Scheduler only supports ALL-EDGES-BLOCKING batch jobs only.
ALL-EDGES-BLOCKING -> ALL-EXCHANGES-BLOCKING
And maybe add a link to the config option "execution.batch-shuffle-mode" for reference?
The first version of -> At the moment,
There are two "only"s here; one of them should be removed.
What does this mean? (from the user perspective)
+1 for @zhuzhurk's comment. Just tell users that the adaptive batch scheduler only supports the case where `execution.batch-shuffle-mode` is `ALL-EXCHANGES-BLOCKING`, and link to the config pages.
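For illustration, a minimal flink-conf.yaml sketch of such a setup might look like the following (the `ALL_EXCHANGES_BLOCKING` spelling of the enum value is my assumption; please verify against the config page):

```yaml
# flink-conf.yaml (sketch): the adaptive batch scheduler currently requires
# all exchanges between tasks to be blocking.
execution.batch-shuffle-mode: ALL_EXCHANGES_BLOCKING
```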
@tillrohrmann @dmvk would you help take a look at the English version of the document if it is convenient?
I'm not sure whether ABS should have its own "top level" section under the deployment menu. Would it make sense to incorporate this into the elastic scaling page?
Thanks for the PR @wanglijie95 👍 It's great to see that this feature will get proper documentation 😍. I'm mostly concerned about what audience we are targeting with these docs; I think we should take less advanced users into consideration here, because this is a really cool feature that many people will want to try out.
Also, it would be nice to add a section about how this could be integrated with the external shuffle service (without it, this effort lacks the benefit of being resource efficient).
I left some comments in-line, please take a look.
For the grammar, once you're finished, you can ping @infoverload and she can help to correct it.
Are you also planning a blog post for this? It would be a good opportunity to enhance this with some high-level pictures that could then be reused.
👍
The Adaptive Batch Scheduler can automatically decide parallelisms of job vertices for batch jobs. If a job vertex is not set with a parallelism, the scheduler will decide parallelism for the job vertex according to the size of its consumed datasets. This can bring many benefits:
- Batch job users can be relieved from parallelism tuning
- Automatically tuned parallelisms can be vertex level and can better fit consumed datasets which have a varying volume size every day
- Vertices from SQL batch jobs can be assigned with different parallelisms which are automatically tuned
What's the target audience? Is a regular Flink user supposed to know what a job vertex is? Overall this page feels a bit too low level 🤔.
On the other hand, I don't think that other pages within this section are much better in this regard 🤔
> What's the target audience? Is a regular Flink user supposed to know what a job vertex is? Overall this page feels a bit too low level 🤔.

Thanks for pointing that out. Maybe `stage` is more appropriate?

> On the other hand, I don't think that other pages within this section are much better in this regard 🤔

I'll check the rest of the content.
Or use `operator`, although it is not exactly the same as the job vertex.
+1 for `operator`. It is the concept that users can/must understand. I think adaptively deciding parallelisms does mean adaptively deciding parallelisms for operators. We just do not want to break beneficial operator chaining, so parallelisms are decided per OperatorChain/JobVertex.
#### Set the parallelism of job vertices to `-1`
Adaptive Batch Scheduler will only decide parallelism for job vertices whose parallelism is not specified by users (parallelism is `-1`). So if you want the parallelism of vertices can be decided automatically, you should configure as follows:
- Set `paralleims.default` to `-1`
typo
- Set the parallelism of job vertices to `-1`.
#### Configure to use Adaptive Batch Scheduler
To use Adaptive Batch Scheduler, you need to set the [`jobmanager.scheduler`]({{< ref "docs/deployment/config" >}}#jobmanager-scheduler) to `AdpaptiveBatch`. In addition, there are several optional config options that might need adjustment when using Adaptive Batch Scheduler:
typo AdpaptiveBatch
- [`jobmanager.scheduler.adaptive-batch.data-volume-per-task`]({{< ref "docs/deployment/config" >}}#jobmanager-scheduler-adaptive-batch-data-volume-per-task): The size of data volume to expect each task instance to process
- [`jobmanager.scheduler.adaptive-batch.source-parallelism.default`]({{< ref "docs/deployment/config" >}}#jobmanager-scheduler-adaptive-batch-source-parallelism-default): The default parallelism of source vertices
#### Set the parallelism of job vertices to `-1`
Why don't we set the defaults automatically when the ABS is enabled? Are there cases where we can't assume that this is what the user wants?
If users explicitly configure `parallelism.default` (with a value > 0) in flink-conf, but we override this value with `-1`, I think this may give the users a feeling that the configuration does not take effect. Maybe we can check the value of `parallelism.default` and then print an `ERROR` or `WARNING` log if the value is > 0?
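Putting the pieces discussed above together, a minimal flink-conf.yaml sketch for enabling the scheduler might look like the following (the option names are taken from this doc draft and may differ in the released version; the values of the optional knobs are illustrative placeholders, not recommendations):

```yaml
# flink-conf.yaml (sketch, not an authoritative configuration)
jobmanager.scheduler: AdaptiveBatch                    # enable the adaptive batch scheduler
execution.batch-shuffle-mode: ALL_EXCHANGES_BLOCKING   # required by the scheduler (see Limitations)
parallelism.default: -1                                # let the scheduler decide vertex parallelisms

# Optional knobs mentioned in the doc draft (placeholder values):
jobmanager.scheduler.adaptive-batch.data-volume-per-task: 1g
jobmanager.scheduler.adaptive-batch.source-parallelism.default: 4
```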
### Performance tuning

1. It's recommended to use `Sort Shuffle` and set [`taskmanager.network.memory.buffers-per-channel`]({{< ref "docs/deployment/config" >}}#taskmanager-network-memory-buffers-per-channel) to `0`. This can decouple the network memory consumption from parallelism, so for large scale jobs, the possibility of "Insufficient number of network buffers" error can be decreased.
Would it make sense to link this with a blog post?
+1 to add a link to "https://flink.apache.org/2021/10/26/sort-shuffle-part1.html" (or maybe "https://flink.apache.org/2021/10/26/sort-shuffle-part1.html#motivation-behind-the-sort-based-implementation", which explains the benefits of `Sort Shuffle`, including saving network buffers).
+1 for this. I will add it.
### Performance tuning

1. It's recommended to use `Sort Shuffle` and set [`taskmanager.network.memory.buffers-per-channel`]({{< ref "docs/deployment/config" >}}#taskmanager-network-memory-buffers-per-channel) to `0`. This can decouple the network memory consumption from parallelism, so for large scale jobs, the possibility of "Insufficient number of network buffers" error can be decreased.
2. It's not recommended to configure an excessive value for [`jobmanager.scheduler.adaptive-batch.max-parallelism`]({{< ref "docs/deployment/config" >}}#jobmanager-scheduler-adaptive-batch-max-parallelism), otherwise it will affect the performance. Because this option can affect the number of subpartitions produced by upstream tasks, excessive number of subpartitions may degrade the performance of hash shuffle and the performance of network transmission due to small packets.
What is an excessive value in this context?
I think the maximum parallelism should be set to the parallelism you expect to need to process the data in the worst case; a value larger than that (the expected worst-case value) can be considered an "excessive value". I will revise the description in this part.
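For reference, a sketch of how the tuning recommendations above could look in flink-conf.yaml (the max-parallelism value is an illustrative placeholder; the sort-shuffle option name is my assumption based on the sort shuffle blog post linked earlier):

```yaml
# flink-conf.yaml (sketch): performance-tuning hints from the discussion above
taskmanager.network.sort-shuffle.min-parallelism: 1       # assumed option to enable sort-based blocking shuffle for all jobs
taskmanager.network.memory.buffers-per-channel: 0         # decouple network memory consumption from parallelism
jobmanager.scheduler.adaptive-batch.max-parallelism: 128  # placeholder; size for the worst-case data volume, do not over-provision
```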
Thanks for creating this PR @wanglijie95. I think it is already really good. One thing that I am missing is what David said: an explanation for less advanced users would be really cool. I think it could go along the lines of how the batch scheduler works and what benefits it brings.
Good idea! +1 to add this doc as an
Yes, a blog post is in our plan, maybe shortly after the 1.15 release.
Force-pushed from 11e85ed to 633a085.
Thanks for your comments @zhuzhurk @dmvk @tillrohrmann, this goes a long way towards perfecting this document. I've updated the document and am looking forward to your further feedback. @dmvk Currently RSS does not support ABS (mainly, it does not support one input gate consuming a subpartition range), so I think the part about integrating with external shuffle services can be added after RSS is adapted to ABS. When the blog post is published in the future, if the RSS adaptation has finished, users can also be recommended to use it.
Force-pushed from 604773d to 138555c.
Thanks for addressing the comments! The change looks good to me.
Force-pushed from 138555c to 5a93c1c.
Thanks for addressing all the comments. @wanglijie95
The doc now looks good to me.
Force-pushed from db3252d to 989d228.
@flinkbot run azure
Force-pushed from 989d228 to 6063dec.
What is the purpose of the change
Add documentation about the AdaptiveBatchScheduler
Verifying this change
Document change without any test coverage.
Does this pull request potentially affect one of the following parts:
- `@Public(Evolving)`: (no)

Documentation