KQL parser generates deeply nested boolean clauses #89473
Labels: bug, impact:high, loe:large, PR sent, SharedUX/fix-it-week
PR elastic/elasticsearch#66204 was recently backported to 7.x - it changes the limit on nested boolean clauses in ES queries. The new default max depth is 20.
This affects alerting, which creates fairly elaborate queries with deeply nested booleans, and we opened PR #89345 to fix this. We assumed at the time this was just a problem with the `node_builder` code we use to programmatically add to a KQL AST, to add our own filtering on an ES query.

The problem with `node_builder` was that it took some KQL like `a:(B or C or D)`, but then treated it like `a:B or (a:C or a:D)` in terms of the ES query generated. Specifically, it took something that was "linear" and returned it in an equivalent but nested form. This happens specifically with `and` and `or` boolean clauses. We fixed `node_builder` to linearize this, so it would be treated as `(a:B or a:C or a:D)`, flattening out the "recursive" aspect of the expression interpretation.

That worked, and seemed to fix the deeply nested queries, but then some tests failed that were comparing output generated with `node_builder` against a pure KQL string parsed by the KQL PEG parser. At that point, I realized that the KQL PEG parser is itself generating these deeply nested "recursive" ASTs.

Here's the test where I figured that out:
kibana/x-pack/plugins/alerts/server/authorization/alerts_authorization.test.ts
Lines 630 to 634 in ba1e795
The argument to `expect()` is the new output of the "fixed" `node_builder`, and `esKuery.fromKueryExpression()` is the output from the KQL PEG parser. The output is no longer "equal", with the KQL PEG parser output looking like the pre-fix `node_builder` output. Some further experimenting with `esKuery.fromKueryExpression()` made it clear that the parser also generates the deeply nested booleans, just like the pre-fix `node_builder` code. In fact, it seems likely to me that `node_builder` was structured to return the same shapes as the KQL parser, even though it didn't really have to.

I'm not sure how bad this is, but presumably a customer with a KQL query like `a:(B or C or D or ... Z)` (25 or'd clauses) will hit the nesting limit. Seems like something we need to fix.
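
To make the nested-versus-linear distinction concrete, here is a minimal sketch in TypeScript. It does not use Kibana's real KQL node types or the actual `node_builder` API: the `AstNode` shape and the `buildNested`, `flatten`, and `boolDepth` helpers are all hypothetical, and the way depth is counted is an assumption that may not match exactly how Elasticsearch measures nesting.

```typescript
// Hypothetical, simplified AST. These are NOT Kibana's actual KQL node types.
type Leaf = { kind: 'leaf'; field: string; value: string };
type Bool = { kind: 'and' | 'or'; children: AstNode[] };
type AstNode = Leaf | Bool;

const leaf = (field: string, value: string): AstNode => ({ kind: 'leaf', field, value });

// Builds the right-nested shape described in this issue:
// a:B or (a:C or (a:D or ...)).
function buildNested(field: string, values: string[]): AstNode {
  if (values.length === 0) throw new Error('need at least one value');
  const leaves: AstNode[] = values.map((v) => leaf(field, v));
  // Fold from the right so each 'or' wraps one leaf plus the rest of the chain.
  return leaves.reduceRight((acc, l): AstNode => ({ kind: 'or', children: [l, acc] }));
}

// Merges a boolean child into its parent whenever both use the same operator,
// turning a:B or (a:C or a:D) into (a:B or a:C or a:D).
function flatten(node: AstNode): AstNode {
  if (node.kind === 'leaf') return node;
  const children: AstNode[] = [];
  for (const child of node.children.map(flatten)) {
    if (child.kind !== 'leaf' && child.kind === node.kind) {
      children.push(...child.children);
    } else {
      children.push(child);
    }
  }
  return { kind: node.kind, children };
}

// Counts levels of boolean nesting; a leaf contributes 0.
function boolDepth(node: AstNode): number {
  if (node.kind === 'leaf') return 0;
  return 1 + Math.max(0, ...node.children.map(boolDepth));
}

// 25 or'd values, as in the a:(B or C or D or ... Z) example.
const values = Array.from({ length: 25 }, (_, i) => String.fromCharCode(66 + i));
const nested = buildNested('a', values);
const flat = flatten(nested);

console.log(boolDepth(nested)); // 24
console.log(boolDepth(flat)); // 1
```

Under these assumptions, 25 or'd values produce 24 levels of nesting in the recursive form, which is over a limit of 20, while the flattened form has a single level. Because `flatten` only merges a child that uses the same operator as its parent, mixed `and`/`or` expressions keep their grouping.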