-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[YSQL] for non-prepared statements, optimize pg_statistic system table lookups #5051
Labels
Comments
kmuthukk
added
area/ysql
Yugabyte SQL (YSQL)
kind/enhancement
This is an enhancement of an existing feature
labels
Jul 11, 2020
m-iancu
added a commit
that referenced
this issue
Jul 24, 2020
…e lookups #5051 Summary: Allow caching of negative (not found) results for pg_statistic and pg_statistic_ext tables since we do not change those in YSQL yet. Additionally, update the debug utilities for YSQL by removing the general variable 'yb_debug_mode' and adding: yb_debug_report_error_stacktrace: - Add stacktrace information to every YSQL error. - Effectively replaces (yb_debug_mode). yb_debug_log_catcache_events: - Log cache misses and cache refresh events. yb_debug_log_internal_restarts: - Log automatic statement (or transaction) restarts - e.g. read-restarts and schema mismatch restarts. yb_debug_log_docdb_requests: - Log protobufs of all outgoing DocDB requests. Test Plan: In ysqlsh, run the following queries and check that no 'Catalog cache miss' entries are found in the tserver's postgresql log. ``` -- setup tables create table t1(k int primary key, v int); create table t2(k int primary key, v int); -- Execute the query once to load up the caches select * from t1, t2 where t1.k = t2.k; -- enable cache events log and re-run the query. set yb_debug_log_catcache_events = true; select * from t1, t2 where t1.k = t2.k; select * from t1, t2 where t1.k = t2.k; select * from t1, t2 where t1.k = t2.k; -- reset (disable) the cache events log. reset yb_debug_log_catcache_events; ``` Reviewers: neha, neil, alex, dmitry Reviewed By: dmitry Subscribers: kannan, yql Differential Revision: https://phabricator.dev.yugabyte.com/D8936
m-iancu
added a commit
that referenced
this issue
Jul 28, 2020
…pg_statistic system table lookups Summary: Allow caching of negative (not found) results for pg_statistic and pg_statistic_ext tables since we do not change those in YSQL yet. Additionally, update the debug utilities for YSQL by removing the general variable 'yb_debug_mode' and adding: yb_debug_report_error_stacktrace: - Add stacktrace information to every YSQL error. - Effectively replaces (yb_debug_mode). yb_debug_log_catcache_events: - Log cache misses and cache refresh events. yb_debug_log_internal_restarts: - Log automatic statement (or transaction) restarts - e.g. read-restarts and schema mismatch restarts. yb_debug_log_docdb_requests: - Log protobufs of all outgoing DocDB requests. Test Plan: Jenkins: rebase: 2.2 In ysqlsh, run the following queries and check that no 'Catalog cache miss' entries are found in the tserver's postgresql log. ``` -- setup tables create table t1(k int primary key, v int); create table t2(k int primary key, v int); -- Execute the query once to load up the caches select * from t1, t2 where t1.k = t2.k; -- enable cache events log and re-run the query. set yb_debug_log_catcache_events = true; select * from t1, t2 where t1.k = t2.k; select * from t1, t2 where t1.k = t2.k; select * from t1, t2 where t1.k = t2.k; -- reset (disable) the cache events log. reset yb_debug_log_catcache_events; ``` Reviewers: neha, neil, dmitry, alex Reviewed By: alex Subscribers: yql, kannan Differential Revision: https://phabricator.dev.yugabyte.com/D9034
Fixed by c41c3c9. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
When not using a prepared statement that's repeatedly used for re-execution, if the query planner doesn't have an obvious plan based on indexes or primary key for a given statement, it ends up going to the
pg_statistic
table.For example, if you have two tables
t1
andt2
both with two columns k (primary key) and v, then:this join:
SELECT * from t1, t2 WHERE t1.k = 'k-123'
doesn't seem to need the help of
pg_statistic
table, whereas this one does:SELECT * from t1, t2 WHERE t1.k = t2.k
And for such queries, if a "prepared" statement is not being used, then every execute ends up doing extra lookups on the pg_statistic table.
For the above example, we see two calls to
pg_statistic
each time the statement is prepared (one fort1
and one fort2
). Given that this Postgres system table lives in yb-master, this incurs additional RPCs calls from yb-tserver to yb-master leaders and also increases the likelihood of yb-master becoming a bottleneck.Test program which illustrates the issue:
https://gist.github.com/kmuthukk/611813ab4685179038654486d2ae1c0a
The workaround would be to use a prepared statement; but since we do find a lot of use of direct "execute" without a separate prepare step in the real-world, we should optimize/cache these lookups on each server rather than go to yb-master each time.
For python specifically, the psycopg2 driver doesn't have a cursor.prepare() API. But to work around you can use Postgres' explicity PREPARE statement.
For example, changing:
to:
addresses the issue.
The text was updated successfully, but these errors were encountered: