-
Notifications
You must be signed in to change notification settings - Fork 2.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Azure Search Implementation #9775
Conversation
One immediate question: Just calling
I thought this was weird and @brjohnstmsft also confirms:
How do we want to handle this in the Python SDK? Are there field names that are safe to use that we could adapt these to? Return a list of actual item objects instead of dicts? Sidecar somehow? |
@bryevdv While I can't answer from the Python perspective, I can share prior art from Track 1 .NET and Track 2 Java. In those cases, what's returned as the results for each page is not a collection of documents, it is a collection of results, each of which has properties specifically for score, highlight, and the document itself. Accessing it might look like this in Track 1 C#: var page = await searchIndexClient.Documents.SearchAsync<Hotel>("hello");
var results = page.Results; // page also has facets, coverage, count, etc.
foreach (var result in results)
{
Console.WriteLine($"Relevance score: {result.Score}");
Console.WriteLine($"Result has {result.Highlights.Count} highlighted phrases.");
Console.WriteLine($"Matching document: {result.Document}"); // Assuming Hotel has an intelligent implementation of ToString().
} |
@brjohnstmsft That's definitely possibility. FWIW Cosmos takes the "add dict keys that can't be valid field names" approach. Would like to know what @johanste thinks about this situation, though. |
@bryevdv You could keep everything in the dict, if you're able to leave the "@search." qualifier on the special keys. I'm not sure whether that's a better developer experience than the alternatives though. |
How prominent is the application usage of these fields? Making it significantly easier to get the values make sense if they are almost always used, but if it is in the less-than-25% case (picking a random number here), I think it is ok to keep them in the dict with the "@search" prefix. After all, that is how we expect people to get to the data that we are almost certain that they will use (e.g. the actual document data). |
@johanste Although we don’t have telemetry to measure the usage of those properties, I can speculate based on what they are used for. The document score is probably rarely used, except for trying to debug why the ranking of results isn’t what was expected. The value of the score is only meaningful in the context of a single set of results, and is a relative value, not an absolute value. So there is no meaningful calculation you can do with it. On the other hand, hit highlights are very useful for rendering search results in an application. We get enough customer questions and feedback about highlighting that I would say it is a common feature, which in my mind would justify making it a dedicated property. We will also be adding a third annotation to this part of the response very soon. It will contain extracted features that will be useful for machine learning based ranking models. As this is a new and experimental feature, we don’t really know how much traction it’s going to get at this point. That said, making it a property of its own would make it more discoverable. |
sdk/search/azure-search-index/azure/search/index/_index_operation.py
Outdated
Show resolved
Hide resolved
sdk/search/azure-search-index/azure/search/index/_credential.py
Outdated
Show resolved
Hide resolved
sdk/search/azure-search-index/azure/search/index/_generated/_version.py
Outdated
Show resolved
Hide resolved
sdk/search/azure-search-index/azure/search/index/_generated/models/_models.py
Outdated
Show resolved
Hide resolved
sdk/search/azure-search-index/azure/search/index/_generated/aio/_configuration_async.py
Outdated
Show resolved
Hide resolved
Per discussion with @brjohnstmsft and others, re-organized this to be a single "azure-search" package. |
Can one of the admins verify this patch? |
There is more work to do before code freeze for preview 1 but I'd prefer to merge this sooner and build smaller PRs off of this starting point. |
sdk/search/azure-search/samples/async_samples/sample_authentication_async.py
Outdated
Show resolved
Hide resolved
sdk/search/azure-search/samples/async_samples/sample_simple_query_async.py
Outdated
Show resolved
Hide resolved
sdk/search/azure-search/samples/async_samples/sample_get_document_async.py
Outdated
Show resolved
Hide resolved
sdk/search/azure-search/samples/async_samples/sample_filter_query_async.py
Outdated
Show resolved
Hide resolved
sdk/search/azure-search/samples/async_samples/sample_crud_operations_async.py
Outdated
Show resolved
Hide resolved
sdk/search/azure-search/samples/async_samples/sample_autocomplete_async.py
Outdated
Show resolved
Hide resolved
__doc__ = AutocompleteRequest.__doc__ | ||
|
||
|
||
class SearchQuery(_QueryBase): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to have a constructor that takes optional select
, order_by
and a list of filter
tuples?
E.g.
q = SearchQuery(select=['a', 'b'], order_by=['b desc'])
?
Or are we just adding too many options on how to create this puppy...
:dedent: 4 | ||
:caption: Get a auto-completions. | ||
""" | ||
if not isinstance(query, AutocompleteQuery): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should accept str
like objects as well, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two arguments are mandatory, the search text and a suggester. I'd personally prefer to avoid the complication of dealing with combinations that don't make sense.
:dedent: 4 | ||
:caption: Get search suggestions. | ||
""" | ||
if not isinstance(query, SuggestQuery): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should accept str
like objects as well, no?
Very early WIP, current status: sync implementation sans paging (Or tests, or samples, I have been iterating in a notebook)
cc @johanste @lmazuel would appreciate a quick look for any early problems before I move on to add paging, tests, and async.