support explain analyze format = zip to collect metrics to investigate slow query #20419

Open
zz-jason opened this issue Oct 13, 2020 · 4 comments
Labels
feature/accepted This feature request is accepted by product managers type/feature-request Categorizes issue or PR as related to a new feature.

Comments

@zz-jason
Member

Feature Request

Is your feature request related to a problem? Please describe:

To investigate a slow query, typically we need the following information:

  • SQL Query
  • Slow Log
  • create table statements for related tables
  • explain analyze ..., usually contained in slow log
  • TiDB, TiKV, PD related logs (more detail than slow log), usually used when the issue is hard to investigate
  • CPU, heap, goroutine, lock profile, etc

Collecting this information is not easy: it involves a lot of discussion with users and explanation. If we could collect everything needed in one step, it would greatly reduce both the time spent gathering information and the time needed to find the cause of the slowness.

Describe the feature you'd like:

Support an explain analyze format = zip ... statement that collects all the needed information and puts it into a zip file. Users can then provide us with the zip file to help investigate the root cause.
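
As a rough illustration of the packaging step only, the following Go sketch bundles already-collected artifacts into an in-memory zip archive using the standard archive/zip package. The function names (buildDiagnosticsZip, listZipEntries) and the entry names used with them are hypothetical and not part of TiDB; how each artifact would actually be gathered is outside the scope of this sketch.

```go
package main

import (
	"archive/zip"
	"bytes"
	"sort"
)

// buildDiagnosticsZip packs named artifacts (for example the SQL text, the
// slow log excerpt, or the explain analyze output) into an in-memory zip.
// Hypothetical helper for illustration; not an existing TiDB function.
func buildDiagnosticsZip(artifacts map[string][]byte) ([]byte, error) {
	// Sort the names so the archive layout is deterministic.
	names := make([]string, 0, len(artifacts))
	for name := range artifacts {
		names = append(names, name)
	}
	sort.Strings(names)

	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	for _, name := range names {
		w, err := zw.Create(name)
		if err != nil {
			return nil, err
		}
		if _, err := w.Write(artifacts[name]); err != nil {
			return nil, err
		}
	}
	// Close flushes the central directory; the archive is invalid without it.
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// listZipEntries returns the entry names stored in a zip archive, in order.
func listZipEntries(data []byte) ([]string, error) {
	zr, err := zip.NewReader(bytes.NewReader(data), int64(len(data)))
	if err != nil {
		return nil, err
	}
	names := make([]string, 0, len(zr.File))
	for _, f := range zr.File {
		names = append(names, f.Name)
	}
	return names, nil
}
```

A caller could pass a map such as {"query.sql": ..., "explain_analyze.txt": ...} and return the resulting bytes to the client, which is one possible way to let users hand over a single file.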

Describe alternatives you've considered:

N/A

Teachability, Documentation, Adoption, Migration Strategy:

N/A

@zz-jason zz-jason added the type/feature-request Categorizes issue or PR as related to a new feature. label Oct 13, 2020
@ghost

ghost commented Oct 13, 2020

For slow queries, it may be better to also include the information from information_schema.statements_summary. I would also add bind_info to the list of sources.

@zz-jason
Member Author

Both information_schema.statements_summary and bind_info are helpful for investigation.

@zz-jason zz-jason added the feature/accepted This feature request is accepted by product managers label Oct 13, 2020
@zz-jason
Member Author

@breeswish It would be better if we could support collecting these metrics in TiDB Dashboard.

@breezewish
Member

breezewish commented Oct 13, 2020

The idea is great. Tracing and metrics can be supplied as well. My only concern is the implementation of some parts. For example:

  • It's very hard to identify "related logs" for non-TiDB components now. Maybe "Print Txn ID and Query ID in log to trace the whole lifetime of a Txn / SQL" (#17845) can help a little, but it is still pretty hard for region-related things, since region operations in TiKV are not easy to link with a specific TiDB query. If the collected logs are known to be incomplete, they may simply be regarded as unreliable, which reduces their usefulness. DBAs may still need to collect logs in other ways so as not to miss anything.
  • AFAIK CPU / heap / etc. profiles cannot be collected in parallel for multiple queries. This may cause the EXPLAIN ANALYZE statement to wait on things unrelated to execution.
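
To make the profiling concern concrete for the CPU case, here is a minimal Go sketch relying on the documented behavior of runtime/pprof: StartCPUProfile returns an error if CPU profiling is already enabled in the process. The function name startTwoCPUProfiles is hypothetical, and the sketch says nothing about how TiDB would actually schedule profile collection or about other profile types.

```go
package main

import (
	"bytes"
	"runtime/pprof"
)

// startTwoCPUProfiles tries to start two CPU profiles in the same process,
// roughly what would happen if two statements each requested a CPU profile
// at the same time. Hypothetical helper for illustration only.
func startTwoCPUProfiles() (first, second error) {
	var a, b bytes.Buffer

	first = pprof.StartCPUProfile(&a)
	if first != nil {
		// Profiling was already active before this call; nothing to show.
		return first, nil
	}
	defer pprof.StopCPUProfile()

	// CPU profiling is process-wide, so a second start while the first is
	// still running is rejected with an error.
	second = pprof.StartCPUProfile(&b)
	return first, second
}
```

Because the second request fails rather than running alongside the first, concurrent statements would presumably have to queue for the profiler or share one profile, which is where the extra waiting could come from.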

As a reference, CockroachDB may have something similar:
https://www.cockroachlabs.com/docs/stable/admin-ui-statements-page.html#diagnostics
