Support for nested query syntax within query string query DSL #11322

Closed
tuespetre opened this issue May 24, 2015 · 38 comments
Labels
>feature :Search/Search Search-related issues that do not fall into other categories Team:Search Meta label for search team

Comments

@tuespetre

I understand that issue #9611 was closed regarding this:

Nested fields need to be queries with nested queries/filters, because multiple documents can match and you need to be able to specify how these multiple scores should be reduced to a single score.
-- @clintongormley

Proposal

I propose that, when a field name within a query string query is parsed and it does not match a field mapping, an attempt should be made to match the field name to a nested object mapper. If the attempt is successful, the query text for that field name should then be parsed as a query string query using the same settings as the root-level query string query. The query resulting from that parsing will in turn be used to create a ToParentBlockJoinQuery (a nested query) that uses the same default scoring mode that would be applied when manually submitting a nested query ("avg").

The syntax

The acceptable syntax for a nested query within a query string query is similar to this:

nestedPath:"<query string query>"

This means that any constructs you would use in a query string query are valid:

children:"children.first:peggy"
children:"children.first:\"peggy\""
children:"children.first:(peggy ruby)"
children:"children.first:peggy AND children.last:sue"
children:"children.first:pegyg~ +children.last:su?"

Note that the nested query MUST be surrounded with quotes. I wanted to use parentheses instead, but unfortunately the Lucene QueryParser class does not treat the field names the way I wanted: children:(children.first:peggy) comes out as a TermQuery on children.first, and the children field name is discarded.
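To make the intended translation concrete, here is a minimal Python sketch of how one quoted clause in the proposed syntax could map onto the existing nested query DSL. It is not the code from the pull request: the regular expression, the function name `nested_clause_to_dsl`, and the choice to unescape only `\"` are assumptions made for illustration.

```python
import re

# Hypothetical pattern for the proposed syntax: path:"<query string query>".
# Backslash-escaped characters (such as \") are allowed inside the quoted body.
NESTED_CLAUSE = re.compile(r'^(?P<path>[\w.]+):"(?P<body>(?:[^"\\]|\\.)*)"$')


def nested_clause_to_dsl(clause, score_mode="avg"):
    """Translate one path:"<query>" clause into a nested query body."""
    match = NESTED_CLAUSE.match(clause.strip())
    if match is None:
        raise ValueError(f"not a nested clause: {clause!r}")
    # Assumption: only the \" escapes required by the outer quotes are undone;
    # every other escape is passed through to the inner query string untouched.
    inner = match.group("body").replace('\\"', '"')
    return {
        "nested": {
            "path": match.group("path"),
            "score_mode": score_mode,
            "query": {"query_string": {"query": inner}},
        }
    }
```

For example, `nested_clause_to_dsl('children:"children.first:peggy AND children.last:sue"')` yields a nested query on the `children` path whose inner query string is `children.first:peggy AND children.last:sue`.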

Other considerations

  • Support for specifying scoring modes within the query string query settings based on nested object paths is a possibility.
  • Support for inner hits may also be a possibility, in a similar fashion to scoring modes.

Support for nested queries in query strings at all would be an enhancement, but these options could provide additional enhancements. Example of how they may look:

{
    "query_string" : {
        "query" : "children:\"children.first:peggy\"",
        "nested": [
            {
                "path": "children",
                "score_mode": "max",
                "inner_hits": {
                    <inner_hits_options>
                }
            }
        ]
    }
}

Pull request

For the basic functionality, I have already made the necessary modifications (three changed files, one changed test file to add a test with several assertions) on the 'master' branch of my local clone of the repository. I would like to submit a pull request; please advise as to how you would like that to be done (if I need to rebase onto another branch, etc.)

@tuespetre
Author

I've since worked around this in other ways (a simple regex to parse out nested field expressions on my end and submit them properly to ES). It was fun to mess around with this, but I fully support axing it now. It would just be more complexity to maintain; perhaps the query string documentation could hint at some kind of better solution for developers who may look for this functionality.

@clintongormley
Contributor

thanks @tuespetre

@radenui

radenui commented Jan 22, 2016

Hi @tuespetre,

I'm very interested in the workaround you used. Did you manage to make it work with Kibana?
Thanks!

@tuespetre
Author

@radenui

I wrote the following drop-in helper class (written in C#, but should be easily portable to other languages): https://gist.github.com/tuespetre/f6951bb665c79abbb7c8

You basically use the class to create new URIs by performing some function against the existing query string (remove this filter, replace that filter, add this filter, etc.) When you specifically need to allow users to perform a 'proper' nested query, you can just use the helper to extract the filters on the nested properties out and build up a separate query string, which you would then submit as a nested query string query in your request to Elasticsearch.

I'm using it to offer both 'customer service representative friendly' interfaces (where the query string built up by the 'friendly' controls is stored in a hidden input) and 'technical user friendly' interfaces (where the query string is spit out into a visible text box that you can also type in, à la GitHub Issues).
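The extraction step can be sketched in Python roughly as follows. This is not a port of the C# gist: the function names `split_nested` and `build_request_query` are hypothetical, and the sketch assumes whitespace-separated `field:value` clauses with no quoted phrases, grouping, or boolean operators.

```python
import re

# Matches a simple field:value clause such as children.first:peggy.
# Assumption: clauses are whitespace-separated, with no quoted phrases,
# grouping, or boolean operators. A real helper would need a proper parser.
CLAUSE = re.compile(r'^(?P<field>[\w.]+):(?P<value>\S+)$')


def split_nested(query, nested_path):
    """Separate clauses on nested_path.* fields from all other clauses."""
    nested, other = [], []
    for token in query.split():
        match = CLAUSE.match(token)
        if match and match.group("field").startswith(nested_path + "."):
            nested.append(token)
        else:
            other.append(token)
    return " ".join(nested), " ".join(other)


def build_request_query(query, nested_path):
    """Build a bool query that routes nested clauses through a nested query."""
    nested_qs, other_qs = split_nested(query, nested_path)
    must = []
    if other_qs:
        must.append({"query_string": {"query": other_qs}})
    if nested_qs:
        must.append({
            "nested": {
                "path": nested_path,
                # AND makes every extracted clause apply to the same nested object.
                "query": {"query_string": {"query": nested_qs,
                                           "default_operator": "AND"}},
            }
        })
    return {"bool": {"must": must}}
```

Setting `default_operator` to `AND` on the inner query string is a design choice in this sketch: it requires all extracted clauses to match within a single nested object, which is usually the point of a nested query.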

@rmm5t

rmm5t commented May 10, 2016

I actually quite like this proposal. Is it something that would be considered by the elasticsearch team or is this something that's not likely to ever be a feature? I'd love if the query string syntax allowed for nested query combinations.

I wanted it to be parentheses instead

Agreed. I think this syntax would be much better served by parentheses instead of quotations.

@jsangari-ssat

Hi,

Is the syntax proposed here for query_string supported in ES? I am using version 2.2 and am having a hard time getting it to work.

@alexgarel

alexgarel commented Sep 19, 2016

Hello, I also think this should be supported. query_string remains a nice helper, and being able to use nested objects with it would be great.

@tuespetre
Author

@alexgarel and everyone:

I think it would be more beneficial to keep something this niche and complex out of the core elasticsearch, and offer your own query DSL 'layer' that can be translated into a 'proper' ES query on the backend. By brushing up on regular expressions (or even parsing!) a little bit you can put together some pretty cool UX affordances specific to your application.

@rmm5t

rmm5t commented Sep 19, 2016

...keep something this niche and complex out of the core elasticsearch...

...By brushing up on regular expressions (or even parsing!) a little bit...

So, is it niche and complex or is it as simple as adding a few items to the elasticsearch grammar?

Personally, I agree that you can add a custom syntax on top (with regexes or otherwise), but I also would like this discussion to remain open, because I think having a conversation about making the query string syntax more robust isn't necessarily a bad thing. Having every elasticsearch application implement yet another hack on top of the query string syntax to accomplish this isn't necessarily a great use of global man-hours.

I'm mostly interested to better understand if the elasticsearch team is interested in a Pull Request for this feature. So far, we don't have an answer to that question.

@alexgarel

alexgarel commented Sep 19, 2016

@tuespetre
OK, I understand. That's the approach we have chosen, but we have not fully implemented it yet. If someone needs it, we have a (GPL) Lucene query parser in Python.

@tuespetre
Author

@rmm5t I had submitted a PR (#11339) but as @clintongormley points out it's just a fragile thing to have in the core application, and as I found out when working the PR initially, it can't really be done with a pleasant syntax -- it comes out feeling very verbose and awkward, especially being unable to hijack the parenthesis for it. With a small handful of regular expressions I was able to implement a much nicer syntax specific to the particular needs of our application without feeling like I had to 'settle' for something subpar.

@rmm5t

rmm5t commented Sep 19, 2016

I had submitted a PR (#11339) but as @clintongormley points out it's just a fragile thing to have in the core application, and as I found out when working the PR initially, it can't really be done with a pleasant syntax

@tuespetre Interesting point. That PR was tagged for discussion (which, respectfully, never really happened amongst the elasticsearch team, aside from @clintongormley's willingness to comment and chime in). Then it was closed solely because you closed this particular issue after building a workaround, not because a discussion really happened.

I agree with your first assessment that the double-quoted syntax isn't ideal. I understand there are problems with the clearer parentheses syntax, but I suspect those can probably be overcome.

If the core query string syntax and implementation are "fragile," maybe that's something that should be addressed and potentially refactored as well. To be clear, I'm not trying to make light of this; I'm sure a refactor would be a tricky endeavor.

Proposal

Overall, I'd really just like to see an ability to narrow a query string search to one particular embedded object. I'd like to see a syntax that looked like this:

children:(gender:male AND age:>=18 AND age:<=25)

Otherwise, there's no way to use the query string syntax and (in this particular US-centric example) find parents who have children who should be signed up for the US Selective Service System.
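For comparison, this is roughly what the same search requires today in the explicit nested query DSL, sketched here as a Python dict builder. It assumes `children` is mapped as a nested type with a keyword-like `gender` field and a numeric `age` field; the function name `children_in_age_range` is hypothetical.

```python
def children_in_age_range(gender, min_age, max_age):
    """Build a nested query matching parents with at least one child that
    satisfies both the gender and the age conditions at the same time."""
    return {
        "nested": {
            "path": "children",
            "query": {
                "bool": {
                    "must": [
                        # Fields inside a nested query use the full path.
                        {"term": {"children.gender": gender}},
                        {"range": {"children.age": {"gte": min_age,
                                                    "lte": max_age}}},
                    ]
                }
            },
        }
    }
```

Calling `children_in_age_range("male", 18, 25)` produces the body that the one-line `children:(...)` syntax would stand in for.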

@traut

traut commented Jul 20, 2017

Can we resurrect this issue, please?

@clintongormley
Contributor

Yeah, I think we need to think more about whether to expose this. Opening for more discussion

@czjxy881
Contributor

czjxy881 commented Oct 9, 2017

+1

@tuespetre
Author

@rmm5t good points, your 'wish syntax' looks nice!

@buchanae

If I could comment on my experience as a user:

It took me an hour or so to figure out that this didn't exist. I'd like to build a dashboard with a search bar, where the syntax is defined by Elasticsearch/Lucene's query string syntax. Having this would make that project substantially easier.

As an engineer: this seems like a great candidate for something that could grow, mature, and harden outside of the core. If a service/library can be built using Lucene's parser and submit JSON-style nested Elasticsearch queries on the backend, we could figure out the details with a non-core prototype.

children:(gender:male AND age:>=18 AND age:<=25)

I like that.

My initial idea, inspired by jq: children[].gender:male. About 2 seconds of thought went into that, so potentially full of holes :)

@alexgarel

@buchanae sorry to repeat myself, but you can look at our (GPL) Lucene query parser in Python. It's still far from perfect but may help.

@albogdano

albogdano commented Jan 12, 2018

I'd also like to +1 this and share my experience. I'm a long-time Elasticsearch user and I've recently hit the "field mapping explosion" limitation. Our system allows users to define their own objects with any number of custom fields, which leads to a mapping explosion. Currently, from what I read in the forum, the only way to solve this is to use nested key/value objects inside an array field:

nested: [{k: FIELD1, v: TERM1}, ...]

This led me to this issue. I'm trying to seamlessly combine normal queries and queries to nested objects in a single query string query. I think this feature would make it easier for people to solve the problem of "too many custom fields".

EDIT: I've implemented this as a Lucene query string syntax extension, by detecting and rewriting queries which contain special nested fields. Link to code
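For illustration only (this is not the linked implementation), a query against such a key/value array could be built along these lines in Python. The sketch assumes the array field is literally named `nested` as in the example above, that it is mapped as a nested type, and that `k` is a keyword field; the function name `custom_field_query` is hypothetical.

```python
def custom_field_query(field_name, term, path="nested"):
    """Match documents that have a nested key/value entry whose key is
    field_name and whose value matches term."""
    return {
        "nested": {
            "path": path,
            "query": {
                "bool": {
                    "must": [
                        # Exact match on the key, analyzed match on the value.
                        {"term": {f"{path}.k": field_name}},
                        {"match": {f"{path}.v": term}},
                    ]
                }
            },
        }
    }
```

Because both conditions sit inside one nested query, the key and the value must belong to the same array entry, which is what a flat object mapping cannot guarantee.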

albogdano added a commit to Erudika/para-search-elasticsearch that referenced this issue Jan 18, 2018
@clintongormley clintongormley added :Search/Search Search-related issues that do not fall into other categories and removed :Query DSL labels Feb 14, 2018
@cbuescher
Member

/cc @elastic/es-search-aggs

@cont-korzh

+1

@jonasbergqvist

jonasbergqvist commented Jun 28, 2018

+1

@jimczi jimczi removed the discuss label Sep 7, 2018
@jimczi jimczi self-assigned this Sep 7, 2018
@imranansarij2ee

+1

@waswrongassembled

+1

@ChristopherSnay

+1

@theoJA

theoJA commented May 13, 2019

+1

@prashantalhat

+1

@jimczi jimczi removed their assignment Mar 13, 2020
@mrka124

mrka124 commented Mar 16, 2020

+1

@ffery

ffery commented Apr 15, 2020

+1

@rjernst rjernst added the Team:Search Meta label for search team label May 4, 2020
@chethan-uc

+1

@4ndygu

4ndygu commented Jul 21, 2021

+1

@adjivas

adjivas commented Oct 8, 2021

+1

@SimarFromCowbell

+1 would like this added

@tumbledwyer

+1
I have this issue: I'm trying to run a nested query from Logstash using the elasticsearch filter, which only supports query strings, not the regular DSL.
I can accomplish this in KQL like this:
myNestedObject:{ nestedProperty: "The value I'm looking for" }

@vaimer

vaimer commented Sep 22, 2022

+1 it will be very useful

@javanna
Member

javanna commented Oct 13, 2022

Closing as duplicate of #16551.

@javanna javanna closed this as completed Oct 13, 2022
@mldyh

mldyh commented Sep 6, 2023

+1

@heidi-holappa

+1
