Support field collapsing + rescore #7484
Comments
@wli-chwy, are you working with Elasticsearch or OpenSearch 1.x or 2.x?
@wli-chwy as you see from the elastic issue, there's interest in doing something like this, but it's never been executed. We need to investigate the feasibility and hopefully involve folks from the community on this. I searched through the code to see just how deep this goes and found the exception here:
Are you able to share more about your use case, including mappings and queries, here? I also see #6846, which may be related to your issue. Tagging some folks who may be able to offer deeper insights or correct my thinking here: @nknize, @msfroh. Is there anyone else who could provide some guidance on this?
@wli-chwy I believe you did try collapsing after reranking, which is typically what users do. How did that work for you? Is there more you can say about that case to help us figure out what our options could be?
There might be a couple of options we can think about with regards to collapse + rescore:
@macohen it was slow. We have to play a guessing game: we need to guess how many pre-collapse items it takes to fill one page. If we guess too low, we need to make another call. If we guess too high, we waste data transfer. All in all, it added about 40% more latency (about 100 ms) in the application layer.
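To make the guessing game concrete, here is a minimal Python sketch of application-layer collapsing. It is an illustration only, not code from this thread: `fetch` is a hypothetical callable standing in for a rescored search request, and the `parent_id` field and `oversample` factor are assumptions.

```python
from typing import Any, Callable, Dict, List, Tuple

Hit = Dict[str, Any]


def collapse_page(
    fetch: Callable[[int, int], List[Hit]],  # hypothetical: fetch(offset, size) -> hits, best first
    page_size: int,
    collapse_field: str = "parent_id",       # assumed field name
    oversample: float = 3.0,                 # the "guess": how many raw hits per page slot
) -> Tuple[List[Hit], int]:
    """Fill one page of collapsed hits; return (page, number_of_fetch_calls)."""
    seen = set()
    page: List[Hit] = []
    offset = 0
    calls = 0
    batch = max(page_size, int(page_size * oversample))
    while len(page) < page_size:
        hits = fetch(offset, batch)
        calls += 1
        for hit in hits:
            key = hit[collapse_field]
            if key not in seen:          # keep only the first (best-ranked) hit per group
                seen.add(key)
                page.append(hit)
                if len(page) == page_size:
                    break
        if len(hits) < batch:            # the index has no more hits to offer
            break
        offset += batch
    return page, calls


# Toy data: 40 ranked hits, four variants per parent.
ranked = [{"id": i, "parent_id": i // 4} for i in range(40)]


def fake_fetch(offset: int, size: int) -> List[Hit]:
    return ranked[offset:offset + size]


low_guess_page, low_guess_calls = collapse_page(fake_fetch, page_size=5, oversample=2.0)
high_guess_page, high_guess_calls = collapse_page(fake_fetch, page_size=5, oversample=4.0)
```

With the toy data, the lower guess needs a second call to fill the page, while the higher guess fills it in one call but transfers twice as many hits per call.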
With #9405, I added a Functionally, it's not that different from the application layer collapsing that @macohen suggested above (and @wli-chwy replied adds latency), but it avoids the round-trip between the application and the cluster, keeping the work on the coordinator node. I don't know if keeping it in the cluster cuts the latency enough, but it may be worth trying out.
Is there any specific integration needed to use the LTR results in the CollapseResponseProcessor? cc: @msfroh
Nope
@wli-chwy, do you think this CollapseResponseProcessor can help?
I think this strongly depends on why documents are collapsed and how judgments are created. Business examples from retail: if you collapse t-shirts across sizes, you might want to re-rank before collapsing, because the size variants contain different content that is critical for the search. If you collapse smartphones across sellers, you might want to re-rank after collapsing, because the seller might not be relevant to the search. Interestingly, in ES the integration of collapsing and re-ranking was done the other way round from what is suggested here (first collapsing, then re-ranking): elastic/elasticsearch#27243
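The difference between the two orderings can be shown with a small Python sketch. This is a deliberate simplification under stated assumptions: the documents, the `parent_id` field, and both score columns are made up, and the re-ranker simply replaces the first-pass score rather than combining scores over a rescore window.

```python
from typing import Any, Dict, List

Doc = Dict[str, Any]


def collapse(docs: List[Doc], field: str, score_key: str) -> List[Doc]:
    """Keep the best-scoring doc per group, ordered by that score (descending)."""
    best: Dict[Any, Doc] = {}
    for doc in docs:
        key = doc[field]
        if key not in best or doc[score_key] > best[key][score_key]:
            best[key] = doc
    return sorted(best.values(), key=lambda d: d[score_key], reverse=True)


def rescore_then_collapse(docs: List[Doc]) -> List[Doc]:
    # Every variant is re-ranked, so the re-ranker picks each group's representative.
    return collapse(docs, "parent_id", "rescore")


def collapse_then_rescore(docs: List[Doc]) -> List[Doc]:
    # The first-pass score picks each representative; only those are re-ranked.
    representatives = collapse(docs, "parent_id", "score")
    return sorted(representatives, key=lambda d: d["rescore"], reverse=True)


# Hypothetical catalog: two t-shirt sizes and one hoodie.
docs = [
    {"id": "tee-S", "parent_id": "tee", "score": 3.0, "rescore": 0.2},
    {"id": "tee-L", "parent_id": "tee", "score": 2.0, "rescore": 0.9},
    {"id": "hoodie-M", "parent_id": "hoodie", "score": 2.5, "rescore": 0.5},
]

rescore_first = [d["id"] for d in rescore_then_collapse(docs)]
collapse_first = [d["id"] for d in collapse_then_rescore(docs)]
```

On this toy data the two orderings disagree on both the representative variant and the group order: re-ranking first surfaces `tee-L` at the top, while collapsing first keeps `tee-S` and ranks it below the hoodie.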
Is there a planned release date for this feature? We're planning to use it, are very interested in the 'collapse then rescore' functionality, and would appreciate any updates.
Is your feature request related to a problem? Please describe.
OpenSearch returns the error "cannot use `collapse` in conjunction with `rescore`" if I have both a collapse and a rescore clause in the query. In the ecommerce space, my existing query relies on collapse (collapsing on the same parent ID) to deduplicate variations of the same product. Because of this limitation, I cannot use the learning to rank plugin, which needs rescore to re-rank results and improve my search relevancy.
Describe the solution you'd like
No error when using `collapse` in conjunction with `rescore`. `rescore` should happen first to ensure the correct ranking of the face-out item.
Describe alternatives you've considered
N/A
Additional context
The same request in Elasticsearch: elastic/elasticsearch#27243
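For readers unfamiliar with the request shape, the kind of search body described in this issue can be sketched in Python as below. This is an illustration, not the reporter's actual query: the `title` and `parent_id` fields, the `keywords` parameter, and the model name are hypothetical, and the `sltr` clause assumes the learning to rank plugin's rescore query format.

```python
import json
from typing import Any, Dict


def build_search_body(user_query: str, model_name: str, window_size: int = 100) -> Dict[str, Any]:
    """Build a search body that combines collapse with an LTR rescore."""
    return {
        "query": {"match": {"title": user_query}},       # hypothetical first-pass query
        "collapse": {"field": "parent_id"},              # deduplicate product variants
        "rescore": {
            "window_size": window_size,
            "query": {
                "rescore_query": {
                    # Assumed LTR plugin clause; model and params are placeholders.
                    "sltr": {"params": {"keywords": user_query}, "model": model_name}
                }
            },
        },
    }


body = build_search_body("blue t-shirt", "my_ltr_model")
payload = json.dumps(body)  # this JSON would be sent as the _search request body
```

Per the report above, a body with both the `collapse` and `rescore` keys is what triggers the error.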