[YSQL] Add Support for PostgreSQL parallelism for YbSeqScan plan node #18095

Closed
1 task done
Tracked by #17984
sushantrmishra opened this issue Jul 5, 2023 · 0 comments
Labels
area/ysql Yugabyte SQL (YSQL) kind/enhancement This is an enhancement of an existing feature priority/medium Medium priority issue

sushantrmishra commented Jul 5, 2023

Jira Link: DB-7135

Description

Add Support for PostgreSQL parallelism for YbSeqScan plan node.

Warning: Please confirm that this issue does not contain any sensitive information

  • I confirm this issue does not contain any sensitive information.
@sushantrmishra sushantrmishra added area/ysql Yugabyte SQL (YSQL) status/awaiting-triage Issue awaiting triage labels Jul 5, 2023
@yugabyte-ci yugabyte-ci added kind/bug This issue is a bug priority/medium Medium priority issue kind/enhancement This is an enhancement of an existing feature and removed kind/bug This issue is a bug status/awaiting-triage Issue awaiting triage labels Jul 5, 2023
andrei-mart added a commit that referenced this issue Nov 29, 2023
Summary:
Enable Postgres' parallel query feature and implement parallel scans of
YB tables in the YBSeqScan, IndexScan, and IndexOnlyScan nodes.

The feature is in preview mode, that is, it is disabled by default. To enable it:
```
set yb_parallel_range_rows to 10000;
```
This parameter indicates the estimated number of rows per parallel worker. Scans of smaller tables are not parallelized, and the default of 0 effectively disables the feature. The parameter defines the minimum number of parallel workers, while `max_parallel_workers_per_gather` acts as the maximum.
```
set yb_enable_base_scans_cost_model to true;
```
This setting is required because the parallel query cost improvements are factored
into the Yugabyte costing functions. Also make sure the target tables are large and
that ANALYZE has been run on them. Because the planner accounts for parallelization
overhead, it selects a parallel plan only if it estimates the target table to be
large; the default estimate of 1000 rows is not sufficient.
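
As a rough illustration of how the two limits described above might interact, the
sketch below derives a worker count from a row estimate. It is one reading of this
description, not the planner's actual code: the helper name
`estimate_parallel_workers` and the floor-division rounding are assumptions, and
only the two setting names come from this change and from PostgreSQL.

```python
def estimate_parallel_workers(estimated_rows: int,
                              yb_parallel_range_rows: int,
                              max_parallel_workers_per_gather: int) -> int:
    """Hypothetical helper: guess how many parallel workers a scan would get.

    Assumption: one worker per yb_parallel_range_rows estimated rows, rounded
    down, capped by max_parallel_workers_per_gather.
    """
    # The default of 0 effectively disables the feature.
    if yb_parallel_range_rows <= 0:
        return 0
    # Scans of tables smaller than one range are not parallelized.
    if estimated_rows < yb_parallel_range_rows:
        return 0
    wanted = estimated_rows // yb_parallel_range_rows
    # max_parallel_workers_per_gather acts as the upper bound.
    return min(wanted, max_parallel_workers_per_gather)
```

Under these assumptions, a table still at the default 1000-row estimate gets no
workers with `yb_parallel_range_rows` set to 10000, which matches the advice to
run ANALYZE first.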

The feature depends on DocDB's ability to return key ranges for a
parallel scan, implemented in D26978. Those key ranges are stored in a
shared memory buffer, from which the parallel workers take them one at
a time.
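
The hand-out of key ranges described above can be pictured with a small sketch.
It only models the idea: threads and a lock stand in for PostgreSQL backend
processes and shared memory, and the names `SharedKeyRanges`, `take`, and
`run_workers`, as well as the `(lower, upper)` byte-pair layout of a range, are
hypothetical and not taken from the Yugabyte source.

```python
import threading
from typing import List, Optional, Tuple

# Hypothetical layout of one key range: (lower bound, upper bound).
KeyRange = Tuple[bytes, bytes]


class SharedKeyRanges:
    """Stand-in for the shared buffer: hands out each range exactly once."""

    def __init__(self, ranges: List[KeyRange]) -> None:
        self._ranges = list(ranges)
        self._next = 0
        self._lock = threading.Lock()

    def take(self) -> Optional[KeyRange]:
        """Return the next unclaimed range, or None when all are taken."""
        with self._lock:
            if self._next >= len(self._ranges):
                return None
            key_range = self._ranges[self._next]
            self._next += 1
            return key_range


def run_workers(ranges: List[KeyRange], num_workers: int) -> List[List[KeyRange]]:
    """Start num_workers workers and record which ranges each one claimed."""
    shared = SharedKeyRanges(ranges)
    claimed: List[List[KeyRange]] = [[] for _ in range(num_workers)]

    def worker(worker_id: int) -> None:
        while True:
            key_range = shared.take()
            if key_range is None:
                break
            # A real worker would scan the rows in this range here.
            claimed[worker_id].append(key_range)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return claimed
```

Taking ranges one at a time, rather than splitting them evenly up front, lets a
worker that finishes early pick up more work.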

Transaction consistency between parallel workers is ensured by the main
backend sharing its session and transaction context with them.
Jira: DB-7135

Test Plan:
ybd --java-test org.yb.pgsql.TestPgRegressParallel
ybd --cxx-test pggate_test_select --gtest_filter PggateTestSelect.TestSelectHashRanges
ybd --cxx-test pggate_test_select --gtest_filter PggateTestSelect.TestSelectScanRanges

Reviewers: sergei, timur, jason, pjain, tnayak

Reviewed By: pjain, tnayak

Subscribers: ybase, smishra, yql, bogdan

Differential Revision: https://phorge.dev.yugabyte.com/D28398
jasonyb pushed a commit that referenced this issue Mar 14, 2024