How to Raw Insert compressed CSV? #223
Comments
For example, inserting into T002 table:
with uncompressed data:
and compressed:
My apologies, this was recently broken by the PR that moved the query from an HTTP query param to the POST body. When inserting binary data, the query itself was not properly compressed. I'm working on releasing a fix.
This should work correctly now in the new 0.6.8 version. Here's the new test case for reference:

    # test_client and table_context are fixtures from the project's test suite
    data_file = f'{Path(__file__).parent}/movies.csv.gz'
    with open(data_file, mode='rb') as movies_file:
        data = movies_file.read()
    with table_context('test_gzip_movies', ['movie String', 'year UInt16', 'rating Decimal32(3)']):
        insert_result = test_client.raw_insert('test_gzip_movies', None, data, fmt='CSV', compression='gzip',
                                               settings={'input_format_allow_errors_ratio': .2,
                                                         'input_format_allow_errors_num': 5}
                                               )
        assert 248 == insert_result.written_rows

I also added a compression parameter to the tools.insert_file method, so if you are inserting directly from the file system, this should work:

    import clickhouse_connect
    from clickhouse_connect.driver.tools import insert_file

    client = clickhouse_connect.get_client()
    data_file = 'movies.csv.gz'
    # Pass the client created above (the original snippet referenced an undefined test_client)
    insert_result = insert_file(client, 'movies_table', data_file)

The

Thanks for the report and the detailed information to reproduce.
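If the CSV is built in memory rather than read from an existing .gz file, the same raw_insert call from the test case above should apply. Here is a minimal sketch, assuming clickhouse-connect 0.6.8 or later; the helper names gzip_csv_rows and insert_gzipped_csv are hypothetical and not part of the library, and client is expected to be whatever clickhouse_connect.get_client() returns.

```python
import csv
import gzip
import io


def gzip_csv_rows(rows):
    """Serialize rows to CSV text and return the gzip-compressed bytes.

    Hypothetical helper, not part of clickhouse-connect.
    """
    buffer = io.StringIO()
    # Use '\n' line endings instead of the csv module default of '\r\n'
    csv.writer(buffer, lineterminator='\n').writerows(rows)
    return gzip.compress(buffer.getvalue().encode('utf-8'))


def insert_gzipped_csv(client, table, rows):
    """Send rows as gzip-compressed CSV through client.raw_insert.

    The argument order mirrors the test case quoted in this thread:
    table name, column names (None), the payload, then fmt and compression.
    """
    payload = gzip_csv_rows(rows)
    return client.raw_insert(table, None, payload, fmt='CSV', compression='gzip')
```

With a real connection this would be called as insert_gzipped_csv(clickhouse_connect.get_client(), 'movies_table', rows). Keeping the compression step separate makes it easy to check the payload locally with gzip.decompress before sending anything to the server.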
Great, it's working now :) Is it possible to insert gzip-compressed data using the
I forgot about that question. You can insert data using the command function by the same mechanism that the raw_insert method uses, by creating an
You cannot use
You can achieve somewhat similar results using external data with the command function and sending the data to the ClickHouse server as HTTP multi-part form data for processing. Unfortunately, the
Otherwise you might want to stick with
Hi,
I'm trying to insert CSV that is compressed with gzip, but I'm getting this error:
When I decompress the data using Python and insert it, the insert works flawlessly.
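A workaround of this kind might look roughly like the sketch below. It assumes a clickhouse-connect client object and a gzip file on disk; the helper name insert_after_decompressing is hypothetical, and this is not necessarily the exact code used in the report.

```python
import gzip


def insert_after_decompressing(client, table, gz_path, fmt='CSV'):
    """Decompress a gzip file locally, then insert the plain bytes.

    Hypothetical helper: no compression argument is passed, so the
    server receives ordinary uncompressed CSV.
    """
    with gzip.open(gz_path, 'rb') as gz_file:
        raw_csv = gz_file.read()
    return client.raw_insert(table, None, raw_csv, fmt=fmt)
```

This reads the whole file into memory, so it is only reasonable for files that fit comfortably in RAM.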
Steps to reproduce
ClickHouse version: 23.3.1.2823
clickhouse-connect version: 0.6.6
Inserting into ReplacingMergeTree engine table with the client:
and the command:
Also, is it possible to insert gzip-compressed data using the command method? If it is, how?

Thanks