-
Notifications
You must be signed in to change notification settings - Fork 89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Configure Csv delimiter #716
Changes from 6 commits
a9eae23
bf25a80
b1b9113
a221f94
e53b5f7
5e6feaf
cb2212a
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,54 @@ | ||
id;title;album;artist;genre;country;released;duration;released-timestamp;duration-float | ||
702481615;Armatage Shanks;Dookie: The Ultimate Critical Review;Green Day;Rock;Europe;2005;;1104537600; | ||
888221515;Old Folks;Six Classic Albums Plus Bonus Tracks;Harold Land;Jazz;Europe;2013;6:36;1356998400;6.36 | ||
1382413601;คำขอร้อง;สำเนียงคนจันทร์ / เอาเถอะถ้าเห็นเขาดีกว่า;อิทธิพล บำรุงกุล;"Folk; World; & Country";Thailand;;;; | ||
190889300;Track 1;Summer Breeze;Dreas;Funk / Soul;US;2008;18:56;1199145600;18.56 | ||
813645611;Slave (Alternative Version);Honky Château;Elton John;Rock;Europe;;2:53;;2.5300000000000002 | ||
394018506;Sex & Geld;Trackz Für Den Index;Mafia Clikk;Hip Hop;Germany;2006;5:02;1136073600;5.02 | ||
1522481803;Pisciaunella;Don Pepp U Pacce;Giovanni Russo (2);"Folk; World; & Country";Italy;1980;;315532800; | ||
862296713;不知;Kiss 2001 Hong Kong Live Concert;Various;Electronic;Hong Kong;2002-04-13;;1018656000; | ||
467946423;Rot;Be Quick Or Be Dead Vol. 3;Various;Electronic;Serbia;2013-06-20;1:00;1371686400;1 | ||
1323854803;"Simulation Project 1; ツキハナ「Moonflower」";Unlimited Dream Company;Amun Dragoon;Electronic;US;2018-04-10;2:44;1523318400;2.44 | ||
235115704;Doctor Vine;The Big F;The Big F;Rock;US;1989;5:29;599616000;5.29 | ||
249025232;"Ringel; Ringel; Reihe";Kinderlieder ABC - Der Bielefelder Kinderchor Singt 42 Lieder Von A-Z;Der Bielefelder Kinderchor;Children's;Germany;1971;;31536000; | ||
710094000;Happy Safari = Safari Feliz;Safari Swings Again = El Safari Sigue En Su Swing;Bert Kaempfert & His Orchestra;Jazz;Argentina;1977;2:45;220924800;2.45 | ||
538632700;Take Me Up;Spring;Various;Electronic;US;2000;3:06;946684800;3.06 | ||
1556505508;Doin To Me ( Radio Version );Say My Name;Netta Dogg;Hip Hop;US;2005;;1104537600; | ||
1067031900;Concerto For Balloon & Orchestra / Concerto For Synthesizer & Orchestra;Concerto For Balloon & Orchestra And Three Overtures;Stanyan String & Wind Ensemble;Classical;US;1977;;220924800; | ||
137251914;"I Love The Nightlife (Disco 'Round) (Real Rapino 7"" Mix)";The Adventures Of Priscilla: Queen Of The Desert - Original Motion Picture Soundtrack;Various;Stage & Screen;US;1994;3:31;757382400;3.31 | ||
554983904;Walking On The Moon;Certifiable (Live In Buenos Aires);The Police;Rock;Malaysia;2008-11-00;;1225497600; | ||
557616002;Two Soldiers;Jerry Garcia / David Grisman;David Grisman;"Folk; World; & Country";US;2014-04-00;4:24;1396310400;4.24 | ||
878936809;When You Gonna Learn;Live At Firenze 93;Jamiroquai;Funk / Soul;France;2004;13:01;1072915200;13.01 | ||
368960707;Sardo D.O.C.;Sardinia Pride Compilation Vol.2;Various;Hip Hop;Italy;2012-06-22;4:41;1340323200;4.41 | ||
1416041312;Sympathy For The Devil;Under Cover;Ozzy Osbourne;Rock;Russia;2005;7:11;1104537600;7.11 | ||
1260509200;Grosse Overturen;noxious effects garanty;Nocif (3);Rock;France;1990;;631152000; | ||
1466381324;Πρινιώτης;Μουσικά Πατήματα Της Κρήτης;Αντώνης Μαρτσάκης;"Folk; World; & Country";Greece;2019;;1546300800; | ||
256009724;Here I Stand And Face The Rain (Demo);Hunting High And Low;a-ha;Electronic;UK & Europe;2010-07-23;;1279843200; | ||
565253008;Born Free;At His Best Goldfinger;The Royal Philharmonic Orchestra;Blues;Japan;1976;;189302400; | ||
492519701;Others;Where Did She Go;Stephan Sulke;Rock;US;1965;2:43;-157766400;2.43 | ||
1268208902;Electric Fires;All That Divides;Black Peaks;Rock;UK;2018-10-05;3:22;1538697600;3.22 | ||
1077074411;Your Summer Dream;Surfer Girl;The Beach Boys;Pop;Australia;2012-09-24;;1348444800; | ||
1275072979;Untitled;Die Legenden Der Albae: Tobender Sturm;Johannes Steck;Non-Music;Germany;2014;6:49;1388534400;6.49 | ||
826446602;Salvaje;Introglicerina;Seguridad Social;Rock;France;1990;;631152000; | ||
180155008;Man;Bajzel;Bajzel;Rock;Poland;2007;2:37;1167609600;2.37 | ||
949980604;Ebb Tide;21Channel Sound;David Rose & His Orchestra;Pop;US;1962;;-252460800; | ||
1241480200;Things;Things / Is There A Missing Piece;Donny Mann;Funk / Soul;Canada;1973;3:15;94694400;3.15 | ||
38038907;Waving My Arms In The Air;Barrett;Syd Barrett;Rock;UK & Europe;2004;2:07;1072915200;2.07 | ||
359975318;The Hand;The Lost Patrol. Virginia City;Max Steiner;Stage & Screen;Europe;1995;2:29;788918400;2.29 | ||
810688602;We Don't Need Another Hero;Split Tape;Rottingrex;Rock;France;2014;;1388534400; | ||
1225622907;Our Place In The Sun;Unity Drum;Dissident (2);Rock;US;2018-07-13;;1531440000; | ||
406467204;Riders On The Storm (Album Edit);U.S. Dance Party - Vol. 4;Various;Electronic;US;1995;6:35;788918400;6.35 | ||
158441000;Take Time To Know Her;Take Time To Know Her;Ninjaman;Reggae;US;1988;;567993600; | ||
992096712;Suddenly;Midnight Soul;Various;Funk / Soul;US;1997;3:51;852076800;3.51 | ||
244054503;Straight From My Heart;Huang Chung;Wang Chung;Electronic;Italy;1982;2:48;378691200;2.48 | ||
894880317;Черное Знамя (Live In Paris);Ва-Банкъ MP3;Ва-Банкъ;Rock;Russia;2005;3:48;1104537600;3.48 | ||
767521501;L'Inganno;Eclissi;A Piedi Nudi;Rock;Russia;2006;4:56;1136073600;4.5600000000000005 | ||
785246800;Got It Covered;Songs Of The Next Great Depression;Barry Melton;"Folk; World; & Country";Germany;1991;2:41;662688000;2.41 | ||
499471800;Collateral;Collateral;Mickmag & Justbob;Electronic;Netherlands;2013-09-16;5:39;1379289600;5.39 | ||
1251220502;Can't Wait;Rosebud e.p.;Poncho+Casino Tart;Funk / Soul;Japan;2016-10-09;3:24;1475971200;3.24 | ||
360552304;Сюита В Фа-Миноре;Своя Игра;Арсенал;Rock;Russia;2007;;1167609600; | ||
1232342909;Had It All;Sunday Morning Record;The Band Of Heathens;"Folk; World; & Country";;2013-09-17;4:23;1379376000;4.23 | ||
1254489900;"Hollywood Bowl; Los Angeles; CA; U.S.A. August 23; 1964";The Lost Hollywood Bowl Films;The Beatles;Non-Music;Japan;2018;;1514764800; | ||
1584961721;Man In My Life;Fear Of Flying;Mya;Pop;;;;; | ||
1190631208;Little Bit Of Nothing;All By Myself;Kevin Klimek;Rock;US;;2:57;;2.57 | ||
1254307002;Big Jim Salter;Niagara;Stone The Crows;Rock;Luxembourg;1990;15:41;631152000;15.41 |
Original file line number | Diff line number | Diff line change | ||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
@@ -434,6 +434,7 @@ def add_documents_csv( | |||||||||||||||||||||
self, | ||||||||||||||||||||||
str_documents: str, | ||||||||||||||||||||||
primary_key: Optional[str] = None, | ||||||||||||||||||||||
csv_delimiter: Optional[str] = None, | ||||||||||||||||||||||
) -> TaskInfo: | ||||||||||||||||||||||
"""Add string documents from a CSV file to the index. | ||||||||||||||||||||||
|
||||||||||||||||||||||
|
@@ -443,6 +444,8 @@ def add_documents_csv( | |||||||||||||||||||||
String of document from a CSV file. | ||||||||||||||||||||||
primary_key (optional): | ||||||||||||||||||||||
The primary-key used in index. Ignored if already set up. | ||||||||||||||||||||||
csv_delimiter: | ||||||||||||||||||||||
One ASCII character used to customize the delimiter for CSV. Comma used by default. | ||||||||||||||||||||||
|
||||||||||||||||||||||
Returns | ||||||||||||||||||||||
------- | ||||||||||||||||||||||
|
@@ -455,7 +458,7 @@ def add_documents_csv( | |||||||||||||||||||||
MeiliSearchApiError | ||||||||||||||||||||||
An error containing details about why Meilisearch can't process your request. Meilisearch error codes are described here: https://docs.meilisearch.com/errors/#meilisearch-errors | ||||||||||||||||||||||
""" | ||||||||||||||||||||||
return self.add_documents_raw(str_documents, primary_key, "text/csv") | ||||||||||||||||||||||
return self.add_documents_raw(str_documents, primary_key, "text/csv", csv_delimiter) | ||||||||||||||||||||||
|
||||||||||||||||||||||
def add_documents_ndjson( | ||||||||||||||||||||||
self, | ||||||||||||||||||||||
|
@@ -489,6 +492,7 @@ def add_documents_raw( | |||||||||||||||||||||
str_documents: str, | ||||||||||||||||||||||
primary_key: Optional[str] = None, | ||||||||||||||||||||||
content_type: Optional[str] = None, | ||||||||||||||||||||||
csv_delimiter: Optional[str] = None, | ||||||||||||||||||||||
alallema marked this conversation as resolved.
Show resolved
Hide resolved
|
||||||||||||||||||||||
) -> TaskInfo: | ||||||||||||||||||||||
"""Add string documents to the index. | ||||||||||||||||||||||
|
||||||||||||||||||||||
|
@@ -499,7 +503,10 @@ def add_documents_raw( | |||||||||||||||||||||
primary_key (optional): | ||||||||||||||||||||||
The primary-key used in index. Ignored if already set up. | ||||||||||||||||||||||
type: | ||||||||||||||||||||||
The type of document. Type available: 'csv', 'json', 'jsonl' | ||||||||||||||||||||||
The type of document. Type available: 'csv', 'json', 'jsonl'. | ||||||||||||||||||||||
csv_delimiter: | ||||||||||||||||||||||
One ASCII character used to customize the delimiter for CSV. | ||||||||||||||||||||||
Note: The csv delimiter can only be used with the Content-Type text/csv. | ||||||||||||||||||||||
|
||||||||||||||||||||||
Returns | ||||||||||||||||||||||
------- | ||||||||||||||||||||||
|
@@ -512,7 +519,7 @@ def add_documents_raw( | |||||||||||||||||||||
MeiliSearchApiError | ||||||||||||||||||||||
An error containing details about why Meilisearch can't process your request. Meilisearch error codes are described here: https://docs.meilisearch.com/errors/#meilisearch-errors | ||||||||||||||||||||||
""" | ||||||||||||||||||||||
url = self._build_url(primary_key) | ||||||||||||||||||||||
alallema marked this conversation as resolved.
Show resolved
Hide resolved
|
||||||||||||||||||||||
url = self._build_url(primary_key=primary_key, csv_delimiter=csv_delimiter) | ||||||||||||||||||||||
response = self.http.post(url, str_documents, content_type) | ||||||||||||||||||||||
return TaskInfo(**response) | ||||||||||||||||||||||
|
||||||||||||||||||||||
|
@@ -601,6 +608,7 @@ def update_documents_csv( | |||||||||||||||||||||
self, | ||||||||||||||||||||||
str_documents: str, | ||||||||||||||||||||||
primary_key: Optional[str] = None, | ||||||||||||||||||||||
csv_delimiter: Optional[str] = None, | ||||||||||||||||||||||
) -> TaskInfo: | ||||||||||||||||||||||
"""Update documents as a csv string in the index. | ||||||||||||||||||||||
|
||||||||||||||||||||||
|
@@ -609,7 +617,9 @@ def update_documents_csv( | |||||||||||||||||||||
str_documents: | ||||||||||||||||||||||
String of document from a CSV file. | ||||||||||||||||||||||
primary_key (optional): | ||||||||||||||||||||||
The primary-key used in index. Ignored if already set up | ||||||||||||||||||||||
The primary-key used in index. Ignored if already set up. | ||||||||||||||||||||||
csv_delimiter: | ||||||||||||||||||||||
One ASCII character used to customize the delimiter for CSV. Comma used by default. | ||||||||||||||||||||||
|
||||||||||||||||||||||
Returns | ||||||||||||||||||||||
------- | ||||||||||||||||||||||
|
@@ -622,13 +632,14 @@ def update_documents_csv( | |||||||||||||||||||||
MeiliSearchApiError | ||||||||||||||||||||||
An error containing details about why Meilisearch can't process your request. Meilisearch error codes are described here: https://docs.meilisearch.com/errors/#meilisearch-errors | ||||||||||||||||||||||
""" | ||||||||||||||||||||||
return self.update_documents_raw(str_documents, primary_key, "text/csv") | ||||||||||||||||||||||
return self.update_documents_raw(str_documents, primary_key, "text/csv", csv_delimiter) | ||||||||||||||||||||||
|
||||||||||||||||||||||
def update_documents_raw( | ||||||||||||||||||||||
self, | ||||||||||||||||||||||
str_documents: str, | ||||||||||||||||||||||
primary_key: Optional[str] = None, | ||||||||||||||||||||||
content_type: Optional[str] = None, | ||||||||||||||||||||||
csv_delimiter: Optional[str] = None, | ||||||||||||||||||||||
alallema marked this conversation as resolved.
Show resolved
Hide resolved
|
||||||||||||||||||||||
) -> TaskInfo: | ||||||||||||||||||||||
"""Update documents as a string in the index. | ||||||||||||||||||||||
|
||||||||||||||||||||||
|
@@ -640,6 +651,9 @@ def update_documents_raw( | |||||||||||||||||||||
The primary-key used in index. Ignored if already set up. | ||||||||||||||||||||||
type: | ||||||||||||||||||||||
The type of document. Type available: 'csv', 'json', 'jsonl' | ||||||||||||||||||||||
csv_delimiter: | ||||||||||||||||||||||
One ASCII character used to customize the delimiter for CSV. | ||||||||||||||||||||||
Note: The csv delimiter can only be used with the Content-Type text/csv. | ||||||||||||||||||||||
|
||||||||||||||||||||||
Returns | ||||||||||||||||||||||
------- | ||||||||||||||||||||||
|
@@ -652,7 +666,7 @@ def update_documents_raw( | |||||||||||||||||||||
MeiliSearchApiError | ||||||||||||||||||||||
An error containing details about why Meilisearch can't process your request. Meilisearch error codes are described here: https://docs.meilisearch.com/errors/#meilisearch-errors | ||||||||||||||||||||||
""" | ||||||||||||||||||||||
url = self._build_url(primary_key) | ||||||||||||||||||||||
url = self._build_url(primary_key=primary_key, csv_delimiter=csv_delimiter) | ||||||||||||||||||||||
response = self.http.put(url, str_documents, content_type) | ||||||||||||||||||||||
return TaskInfo(**response) | ||||||||||||||||||||||
|
||||||||||||||||||||||
|
@@ -1530,8 +1544,13 @@ def __settings_url_for(self, sub_route: str) -> str: | |||||||||||||||||||||
def _build_url( | ||||||||||||||||||||||
self, | ||||||||||||||||||||||
primary_key: Optional[str] = None, | ||||||||||||||||||||||
csv_delimiter: Optional[str] = None, | ||||||||||||||||||||||
) -> str: | ||||||||||||||||||||||
if primary_key is None: | ||||||||||||||||||||||
parameters = {} | ||||||||||||||||||||||
if primary_key: | ||||||||||||||||||||||
parameters["primaryKey"] = primary_key | ||||||||||||||||||||||
if csv_delimiter: | ||||||||||||||||||||||
parameters["csvDelimiter"] = csv_delimiter | ||||||||||||||||||||||
if primary_key is None and csv_delimiter is None: | ||||||||||||||||||||||
Comment on lines
+1549
to
+1554
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't like that much to see these
Suggested change
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This will work, but I would argue it is less readable and not obvious what you are doing. If you do go this direction you could clean it up some: parameters = {k: v for k, v in parameters.items() if v} With either direction it could also be worth it to short circuit so parameters don't get created and run if not needed: def _build_url(
self,
primary_key: Optional[str] = None,
csv_delimiter: Optional[str] = None,
) -> str:
if primary_key is None and csv_delimiter is None:
return f"{self.config.paths.index}/{self.uid}/{self.config.paths.document}"
parameters = { "csvDelimiter": csv_delimiter, "primaryKey": primary_key }
parameters = {k: v for k, v in parameters.items() if v}
return f"{self.config.paths.index}/{self.uid}/{self.config.paths.document}?{parse.urlencode(parameters)}" There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree with sanders41. In general, I prefer to avoid using the magic one-liners of Python to avoid complex understanding. |
||||||||||||||||||||||
return f"{self.config.paths.index}/{self.uid}/{self.config.paths.document}" | ||||||||||||||||||||||
primary_key = parse.urlencode({"primaryKey": primary_key}) | ||||||||||||||||||||||
return f"{self.config.paths.index}/{self.uid}/{self.config.paths.document}?{primary_key}" | ||||||||||||||||||||||
return f"{self.config.paths.index}/{self.uid}/{self.config.paths.document}?{parse.urlencode(parameters)}" | ||||||||||||||||||||||
sanders41 marked this conversation as resolved.
Show resolved
Hide resolved
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -197,6 +197,20 @@ def test_add_documents_csv(empty_index, songs_csv): | |
assert index.get_primary_key() == "id" | ||
|
||
|
||
def test_add_documents_csv_with_delimiter(empty_index, songs_csv_custom_separator): | ||
"""Tests adding new documents to a clean index.""" | ||
index = empty_index("csv-delimiter") | ||
response = index.add_documents_csv(songs_csv_custom_separator, csv_delimiter=";") | ||
assert isinstance(response, TaskInfo) | ||
assert response.task_uid is not None | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is this really needed? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Not really, but all the other tests are like that, so I find it weird to remove it for this one. |
||
task = index.wait_for_task(response.task_uid) | ||
assert task.status == "succeeded" | ||
alallema marked this conversation as resolved.
Show resolved
Hide resolved
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Also the task status, if the wait did not work your test will fail anyway right? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's possible for the task to complete, but the status to be failed right? I think this is what she is testing, that it was successful? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I got the idea, and my point is the assertion has no practical value because if the task fails, the We have 6 assertions in this test case, but only the last two are checking what the use case wants to assert (+ the receivedDocuments assertion), which is to verify if the documents were indexed properly. I know test code is supposed to be explicit instead of implicit, but in this case, I struggle to find good reasons to keep those assertions everywhere. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. When you read this: def test_add_documents_csv_with_delimiter(empty_index, songs_csv_custom_separator):
"""Tests adding new documents to a clean index."""
index = empty_index("csv-delimiter")
response = index.add_documents_csv(songs_csv_custom_separator, csv_delimiter=";")
task = index.wait_for_task(response.task_uid)
assert task.details["receivedDocuments"] == 53
documents = index.get_documents().results
assert documents[1].country == "Europe"
assert documents[4].artist == "Elton John" Could you let me know if you missed the removed assertions? |
||
assert task.details["receivedDocuments"] == 53 | ||
documents = index.get_documents().results | ||
assert documents[1].country == "Europe" | ||
assert documents[4].artist == "Elton John" | ||
|
||
|
||
def test_update_documents_csv(index_with_documents, songs_csv): | ||
"""Tests updating a single document with csv string.""" | ||
index = index_with_documents() | ||
|
@@ -208,6 +222,20 @@ def test_update_documents_csv(index_with_documents, songs_csv): | |
assert index.get_primary_key() == "id" | ||
|
||
|
||
def test_update_documents_csv_with_delimiter(index_with_documents, songs_csv_custom_separator): | ||
"""Tests adding new documents to a clean index.""" | ||
index = index_with_documents() | ||
response = index.update_documents_csv(songs_csv_custom_separator, csv_delimiter=";") | ||
assert isinstance(response, TaskInfo) | ||
assert response.task_uid is not None | ||
task = index.wait_for_task(response.task_uid) | ||
assert task.status == "succeeded" | ||
alallema marked this conversation as resolved.
Show resolved
Hide resolved
|
||
assert task.details["receivedDocuments"] == 53 | ||
document = index.get_document("813645611") | ||
assert document.country == "Europe" | ||
assert document.artist == "Elton John" | ||
|
||
|
||
def test_add_documents_json(empty_index, small_movies_json_file): | ||
"""Tests adding new documents to a clean index.""" | ||
index = empty_index() | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could reduce this csv file to a few lines :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes! But thank @sanders41, the tests have been missing from the package build since #367. In case you wonder