You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Right now, prefer-partitioning only works if you have statistics and the number of partitions is greater than preferred-write-partitioning-min-number-of-partitions (default to 50). However, we know that stats are not always guaranteed to be present in which case partitioned writes will go through from an inefficient route. But with scaling, we could enable prefer-partitioning for any number of partitions thus we don't have to rely on statistics.
We will introduce a new local exchanger (ScalePartitionLocalExchanger) that will split the page into different partitions and assign those pages to their respective table-writer operators. Additionally, it will keep track of the partition level physical written bytes coming from the table writer operator and use that to scale the parallelism for a particular skewed partition in a round-robin fashion.
Approach (global scaling):
For prefer partitioning, the distribution across workers happens through the PartitionedOutputOperator which is hard to scale. So, we still haven't figured out what do to with that.
cc @trinodb/maintainers
The text was updated successfully, but these errors were encountered:
gaurav8297
changed the title
Scale task writers based on throughput for partitioned tables
Scale task writers based on throughput for partitioned tables with skewness
Sep 28, 2022
Problem
Improve the performance of partitioned writes specifically in case writers/partitions are skewed.
Right now,
prefer-partitioning
only works if you have statistics and the number of partitions is greater thanpreferred-write-partitioning-min-number-of-partitions
(default to 50). However, we know that stats are not always guaranteed to be present in which case partitioned writes will go through from an inefficient route. But with scaling, we could enableprefer-partitioning
for any number of partitions thus we don't have to rely on statistics.Approach (local scaling):(Addressed in #14718)We will introduce a new local exchanger (
ScalePartitionLocalExchanger
) that will split the page into different partitions and assign those pages to their respective table-writer operators. Additionally, it will keep track of the partition level physical written bytes coming from the table writer operator and use that to scale the parallelism for a particular skewed partition in a round-robin fashion.Approach (global scaling):
For prefer partitioning, the distribution across workers happens through the
PartitionedOutputOperator
which is hard to scale. So, we still haven't figured out what do to with that.cc @trinodb/maintainers
The text was updated successfully, but these errors were encountered: