Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(hogql): Allow chaining for lazy tables #28253

Merged
merged 10 commits into from
Feb 4, 2025

Conversation

danielbachhuber
Copy link
Contributor

@danielbachhuber danielbachhuber commented Feb 4, 2025

Fixes #26598
See #26954

Changes

Adds support for joining A => B and A => C via B and then querying SELECT a.c.field when using experiments_optimized lazy joins.

How did you test this code?

Added a couple of failing test cases and fixed them.

""")
node = cast(ast.SelectQuery, resolve_types(node, self.context, dialect="clickhouse"))

def test_resolve_field_through_nested_joins(self):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Gilbert09 I'd expect this test to fail but it passes for some reason. Do you see anything obviously wrong with it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parsing the HogQL will work fine - you need to actually run it against clickhouse for it to fail I believe

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, that was it:

db = create_hogql_database(team_id=self.team.pk)
context = HogQLContext(
team_id=self.team.pk,
enable_select_queries=True,
database=db,
)
print_ast(parse_select("SELECT events.distinct_id FROM subscriptions"), context, dialect="clickhouse")
Thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this sufficient? print_ast won't run the query

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@EDsCODE Yes: the test fails without 8e2dd3c and passes with.

I added an ExperimentTrendsQuery test too: 2d10c40

@danielbachhuber danielbachhuber requested review from Gilbert09 and a team February 4, 2025 15:22
@danielbachhuber danielbachhuber marked this pull request as ready for review February 4, 2025 15:23
Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Summary

This PR fixes lazy join chaining in HogQL by ensuring correct field requests when joining multiple tables, particularly for experiment-optimized joins where fields from intermediate joins are needed for subsequent joins.

  • Added test cases in /posthog/hogql/database/test/test_database.py validating both linear (A->B->C) and nested (A->B and A->C via B) join chains
  • Refactored table key expression parsing in /posthog/warehouse/models/join.py by extracting logic into __parse_table_key_expression method
  • Added test case test_query_runner_with_data_warehouse_subscriptions_table in /posthog/hogql_queries/experiments/test/test_experiment_trends_query_runner.py to verify experiment-optimized joins
  • Fixed issue where intermediate join fields (like 'email') weren't being properly requested for subsequent joins

3 file(s) reviewed, 2 comment(s)
Edit PR Review Bot Settings | Greptile

Comment on lines +109 to +110
left = self.__parse_table_key_expression(self.source_table_key, join_to_add.from_table)
right = self.__parse_table_key_expression(self.joining_table_key, join_to_add.to_table)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: Using instance variables directly here instead of the override parameters like in join_function. Should use _source_table_key and _joining_table_key for consistency

@@ -210,3 +195,14 @@ def _join_function_for_experiments(
)

return _join_function_for_experiments

def __parse_table_key_expression(self, table_key: str, table_name: str) -> ast.Expr:
expr = parse_expr(table_key)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: Consider validating table_key and table_name are not empty strings before parsing to avoid cryptic parse errors

Copy link
Member

@EDsCODE EDsCODE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may not be following here but I don't think these changes affect the tests? The test passes before and and after for me

Edit: nvm!

@danielbachhuber danielbachhuber merged commit 88a7113 into master Feb 4, 2025
98 checks passed
@danielbachhuber danielbachhuber deleted the hogql/lazy-chaining-fix branch February 4, 2025 20:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

HogQL lazy join chaining bug
3 participants