Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use correct configuration property for orc bloom filter #10343

Merged
merged 4 commits into from
Jan 13, 2022

Conversation

Praveen2112
Copy link
Member

@Praveen2112 Praveen2112 commented Dec 17, 2021

Fixes #9792

Copy link
Contributor

@ksobolew ksobolew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, that's kind of embarrassing... Difficulty in writing tests for this is only a partial excuse :(

@Praveen2112 Praveen2112 marked this pull request as draft December 17, 2021 15:47
@Praveen2112
Copy link
Member Author

The thing is that hive/orc doesn't provide a easier way to expose whether bloom filter is written or not. I'll take a stab at it again to check if it is possible or not.

@Praveen2112 Praveen2112 force-pushed the praveen/bug/orc_bloom_filter branch 8 times, most recently from 4ff453a to c6f629e Compare December 19, 2021 05:21
@raunaqmorarka
Copy link
Member

Is it possible to write a queryrunner test where the predicate can result in no data getting read only due to bloom filter (something similar to assertRowGroupPruning in TestParquetPageSkipping).

@Praveen2112 Praveen2112 force-pushed the praveen/bug/orc_bloom_filter branch 2 times, most recently from 349ad2b to f38b12a Compare December 19, 2021 10:09
@Praveen2112
Copy link
Member Author

@raunaqmorarka Thanks for the pointers. Last time when I checked with the query runner, the query stats where misleading (maybe I could have a different API). Now I'm able to test if the bloom filter is created or not. Thanks a bunch !!

@ksobolew I have added integration tests for it.

@Praveen2112 Praveen2112 marked this pull request as ready for review December 19, 2021 10:10
@sopel39 sopel39 requested a review from skrzypo987 December 22, 2021 10:42
Copy link
Member

@skrzypo987 skrzypo987 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do I understand correctly that this never worked?

@Praveen2112
Copy link
Member Author

Do I understand correctly that this never worked?

Yes.

@findepi
Copy link
Member

findepi commented Jan 4, 2022

Do I understand correctly that this never worked?

Yes.

@Praveen2112 let's consider fixing release notes for 339

Writing ORC bloom filter is broken for Hive connector
@Praveen2112 Praveen2112 force-pushed the praveen/bug/orc_bloom_filter branch from f38b12a to 0e85567 Compare January 11, 2022 13:31
@Praveen2112
Copy link
Member Author

@findepi Updated release notes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

ORC bloom filter in Trino working?
7 participants