Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable COUNT in view queries #6820

Merged
merged 1 commit into from
Aug 6, 2024
Merged

Conversation

jepett0
Copy link
Collaborator

@jepett0 jepett0 commented Jul 18, 2024

#6394
KIKIMR-21731

  • Add ModuleResolver from KQP host.

    It seems that COUNT and other aggregate functions are implemented as imported modules. See, for example, AST of the following query:

    SELECT COUNT(*) FROM `/Root/episodes`

    Note the following line:

    (import aggregate_module '"/lib/yql/aggregate.yql")

    Aggregate module count_traits_factory is implemented here.

  • Add ExpandApplyForLambdas transformation step after RewriteReadFromView.

    The same is done to rewrite reads from views in YT. The necessity could be well understood by studying the differences in:

    • transformations of

      SELECT COUNT(*) FROM `/Root/episodes`

      (Note how Apply of a lambda disappears due to some graph transformation between the first and the second debug printout of the TExprNode graph of the pure SELECT.)

    • transformations of the same query written inside a view

Both (ModuleResolver addition and ExpandApplyForLambdas step repetition) are necessary to enable aggregate functions in views.

Copying default KQP host's translation settings is done to make view compilation more similar to a pure SELECT compilation. It is not a necessary part of the fix.

This comment was marked as outdated.

This comment was marked as outdated.

This comment was marked as outdated.

@jepett0 jepett0 marked this pull request as ready for review July 18, 2024 13:17
@@ -0,0 +1,9 @@
CREATE VIEW `/Root/count_episodes` WITH (security_invoker = TRUE) AS
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do these tests check? Do you have a validation for the results of these queries?

Copy link
Collaborator Author

@jepett0 jepett0 Jul 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, these tests are a non-standard way to test views. They are functional tests, but written in C++. A test case is a single folder in ydb/core/kqp/ut/view/input/cases (like count_rows folder, or count_episodes folder). Each test case is executed in the following order:

  1. execute create_view.sql
  2. execute etalon_query.sql
  3. execute select_from_view.sql
  4. execute drop_view.sql

Results of step (2) and step (3) are compared as YSON strings here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This particular test (count_episodes) is a way to test COUNT inside a view, but with a bit more complex query than just SELECT COUNT(*) FROM some_table

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's an intresting approach. Thank you for exlanation :)

This comment was marked as outdated.

This comment was marked as outdated.

This comment was marked as outdated.

dorooleg
dorooleg previously approved these changes Jul 22, 2024
- add ModuleResolver from KQP host
- use TKqpTranslationSettingsBuilder like KQP Host does to compile queries
Copy link

github-actions bot commented Aug 5, 2024

2024-08-05 13:36:59 UTC Pre-commit check for 85daa8e has started.
2024-08-05 13:39:42 UTC Check linux-x86_64-release-asan is running...
🔴 2024-08-05 15:34:05 UTC Some tests failed, follow the links below.

Test history | Ya make output

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
9758 9715 0 4 24 15

🟢 2024-08-05 15:35:14 UTC Build successful.
🟢 2024-08-05 15:35:42 UTC ydbd size 5.4 GiB changed* by -201.8 KiB, which is <= 0 Bytes vs main: OK

ydbd size dash main: 6533a28 merge: 85daa8e diff diff %
ydbd size 5 826 345 728 Bytes 5 826 139 096 Bytes -201.8 KiB -0.004%
ydbd stripped size 1 463 285 456 Bytes 1 463 202 576 Bytes -80.9 KiB -0.006%

*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation

Copy link

github-actions bot commented Aug 5, 2024

2024-08-05 13:38:37 UTC Pre-commit check for 85daa8e has started.
2024-08-05 13:48:45 UTC Check linux-x86_64-release-clang14 is running...
🟢 2024-08-05 13:55:09 UTC Build successful.

Copy link

github-actions bot commented Aug 5, 2024

2024-08-05 13:39:32 UTC Pre-commit check for 85daa8e has started.
2024-08-05 13:49:28 UTC Check linux-x86_64-relwithdebinfo is running...
🟡 2024-08-05 15:46:32 UTC Some tests failed, follow the links below. Going to retry failed tests...

Test history | Ya make output

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
38497 33175 0 3 5307 12

2024-08-05 15:51:00 UTC Failed tests rerun (try 2) linux-x86_64-relwithdebinfo is running...
🟢 2024-08-05 15:58:35 UTC Tests successful.

Test history | Ya make output

TESTS PASSED ERRORS FAILED SKIPPED MUTED?
18 (only retried tests) 9 0 0 1 8

🟢 2024-08-05 16:01:46 UTC Build successful.
🟢 2024-08-05 16:02:24 UTC ydbd size 8.1 GiB changed* by -258.6 KiB, which is <= 0 Bytes vs main: OK

ydbd size dash main: 6533a28 merge: 85daa8e diff diff %
ydbd size 8 666 494 576 Bytes 8 666 229 768 Bytes -258.6 KiB -0.003%
ydbd stripped size 471 487 400 Bytes 471 471 720 Bytes -15.3 KiB -0.003%

*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation

cluster,
query,
SessionCtx->Config().BindingsMode,
GUCSettings
Copy link
Collaborator Author

@jepett0 jepett0 Aug 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GUCSettings are needed here. That is why we pass them in CreateKikimrDataSource.

We cannot, unfortunately, pass the KQP translation settings builder itself from the KQP host to the Kikimr DataSource, because it needs a query to be built.

@@ -1,7 +1,8 @@
#pragma once

#include <ydb/core/kqp/provider/yql_kikimr_results.h>
Copy link
Collaborator Author

@jepett0 jepett0 Aug 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I remove unnecessary includes and add only the explicitly needed ones. It needs to be done to eliminate dependency loop between kqp/provider and kqp/host

@@ -0,0 +1,43 @@
CREATE VIEW `/Root/aggregates_and_window` WITH (security_invoker = TRUE) AS
SELECT
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test might be an overkill, but it uses SUM, AVG and COUNT aggregate functions as well as a WINDOW clause

@jepett0 jepett0 requested a review from dorooleg August 5, 2024 13:56
@jepett0 jepett0 merged commit d7a9935 into ydb-platform:main Aug 6, 2024
11 of 13 checks passed
jepett0 added a commit to jepett0/ydb that referenced this pull request Aug 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants