-
Notifications
You must be signed in to change notification settings - Fork 89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Configure Csv delimiter #716
Conversation
Co-authored-by: Paul Sanders <psanders1@gmail.com>
Co-authored-by: Paul Sanders <psanders1@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left some comments, if you apply the test ones please apply to both of the test cases :)
@@ -0,0 +1,54 @@ | |||
id;title;album;artist;genre;country;released;duration;released-timestamp;duration-float |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could reduce this csv file to a few lines :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes! But thank @sanders41, the tests have been missing from the package build since #367. In case you wonder
parameters = {} | ||
if primary_key: | ||
parameters["primaryKey"] = primary_key | ||
if csv_delimiter: | ||
parameters["csvDelimiter"] = csv_delimiter | ||
if primary_key is None and csv_delimiter is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like that much to see these if
conditions because they usually hide a code that will grow indefinitely, WDYT about this?
parameters = {} | |
if primary_key: | |
parameters["primaryKey"] = primary_key | |
if csv_delimiter: | |
parameters["csvDelimiter"] = csv_delimiter | |
if primary_key is None and csv_delimiter is None: | |
parameters = { "csvDelimiter": csv_delimiter, "primaryKey": primary_key } | |
parameters = dict((k, v) for k, v in parameters.items() if v) | |
if primary_key is None and csv_delimiter is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will work, but I would argue it is less readable and not obvious what you are doing. If you do go this direction you could clean it up some:
parameters = {k: v for k, v in parameters.items() if v}
With either direction it could also be worth it to short circuit so parameters don't get created and run if not needed:
def _build_url(
self,
primary_key: Optional[str] = None,
csv_delimiter: Optional[str] = None,
) -> str:
if primary_key is None and csv_delimiter is None:
return f"{self.config.paths.index}/{self.uid}/{self.config.paths.document}"
parameters = { "csvDelimiter": csv_delimiter, "primaryKey": primary_key }
parameters = {k: v for k, v in parameters.items() if v}
return f"{self.config.paths.index}/{self.uid}/{self.config.paths.document}?{parse.urlencode(parameters)}"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with sanders41. In general, I prefer to avoid using the magic one-liners of Python to avoid complex understanding.
And we should not have to add a new one. But if we do, I will change it for your code!
index = empty_index("csv-delimiter") | ||
response = index.add_documents_csv(songs_csv_custom_separator, csv_delimiter=";") | ||
assert isinstance(response, TaskInfo) | ||
assert response.task_uid is not None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this really needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really, but all the other tests are like that, so I find it weird to remove it for this one.
assert isinstance(response, TaskInfo) | ||
assert response.task_uid is not None | ||
task = index.wait_for_task(response.task_uid) | ||
assert task.status == "succeeded" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also the task status, if the wait did not work your test will fail anyway right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's possible for the task to complete, but the status to be failed right? I think this is what she is testing, that it was successful?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I got the idea, and my point is the assertion has no practical value because if the task fails, the assert task.details["receivedDocuments"] == 53
will fail and also the following assertions after get_documents
.
We have 6 assertions in this test case, but only the last two are checking what the use case wants to assert (+ the receivedDocuments assertion), which is to verify if the documents were indexed properly.
I know test code is supposed to be explicit instead of implicit, but in this case, I struggle to find good reasons to keep those assertions everywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you read this:
def test_add_documents_csv_with_delimiter(empty_index, songs_csv_custom_separator):
"""Tests adding new documents to a clean index."""
index = empty_index("csv-delimiter")
response = index.add_documents_csv(songs_csv_custom_separator, csv_delimiter=";")
task = index.wait_for_task(response.task_uid)
assert task.details["receivedDocuments"] == 53
documents = index.get_documents().results
assert documents[1].country == "Europe"
assert documents[4].artist == "Elton John"
Could you let me know if you missed the removed assertions?
709: Changes related to the next Meilisearch release (v1.1.0) r=bidoubiwa a=meili-bot Related to this issue: meilisearch/integration-guides#251 This PR: - gathers the changes related to the next Meilisearch release (v1.1.0) so that this package is ready when the official release is out. - should pass the tests against the [latest pre-release of Meilisearch](https://github.com/meilisearch/meilisearch/releases). - might eventually contain test failures until the Meilisearch v1.1.0 is out.⚠️ This PR should NOT be merged until the next release of Meilisearch (v1.1.0) is out. _This PR is auto-generated for the [pre-release week](https://github.com/meilisearch/integration-guides/blob/main/resources/pre-release-week.md) purpose._ Done: - #714 - #716 726: Update version for the next release (v0.26.0) r=bidoubiwa a=meili-bot This version makes this package compatible with Meilisearch v1.1 🎉 Check out the changelog of [Meilisearch v1.1](https://github.com/meilisearch/meilisearch/releases/tag/v1.1.0) for more information on the changes. ##⚠️ Breaking changes * Change error names from MeiliSerach to Meilisearch (#720) `@sanders41` ## 🚀 Enhancements - Add the ability to provide a specific `csv-delimiter` when adding and updating documents in CSV format (#716) `@alallema` - New method `client.multi_search()` provides the possibility to make multiple requests at once (#714) `@alallema` Example: ```python client.multi_search( [ {'indexUid': 'movies', 'q': 'pooh', 'limit': 5}, {'indexUid': 'movies', 'q': 'nemo', 'limit': 5}, {'indexUid': 'movie_ratings', 'q': 'us'} ] ) ``` Thanks again to `@alallema` and `@sanders41!` 🎉 Co-authored-by: meili-bot <74670311+meili-bot@users.noreply.github.com> Co-authored-by: alallema <amelie@meilisearch.com> Co-authored-by: Amélie <alallema@users.noreply.github.com>
Add the ability to provide a CSV delimiter for adding and updating CSV documents as in the specification
SDK requirements: meilisearch/integration-guides#251