[BUG] High CPU/Memory usage for large-payload insert queries #499

Open

zghong opened this issue Dec 25, 2024 · 2 comments · May be fixed by #500
zghong commented Dec 25, 2024

Describe the bug
When running large-payload insert queries (e.g., an offline data import at about 3GB/s), chproxy hits a severe performance bottleneck: CPU and memory utilization rise sharply, and in some cases the online chproxy service goes down.

[screenshots: chproxy CPU/memory utilization during the test]

To Reproduce
In an offline data-import scenario, large amounts of data are batch-inserted into ClickHouse via chproxy. With each row at roughly 100 bytes, an INSERT statement covering 500,000 rows comes to about 50MB. Sending this SQL to chproxy via POST (30 concurrent processes, each performing 100 insert operations) reproduces the issues above.

CREATE TABLE IF NOT EXISTS test_db.test_tb
(
    `id` UInt32,
    `partCode` UInt16,
    `shopId` UInt64,
    `processDate` Date,
    `vTagType` LowCardinality(String),
    `shop_first_order_time` SimpleAggregateFunction(anyLast, Date),
    `shop_latest_order_time` SimpleAggregateFunction(anyLast, Date),
    `pct` SimpleAggregateFunction(anyLast, Float64),
    `valid_ord_montary` SimpleAggregateFunction(anyLast, Float64)
)
ENGINE = AggregatingMergeTree()
PARTITION BY vTagType
ORDER BY (shopId, partCode, id);

#!/bin/bash

output_file="insert_data.sql"
table_name="test_db.test_tb"
batch_size=10000
total_records=500000

echo -n "INSERT INTO $table_name VALUES" > $output_file
# generate data and write to file
for ((batch=0; batch<total_records/batch_size; batch++)); do
    values=""
    for ((i=1; i<=batch_size; i++)); do
        id=$((batch * batch_size + i))
        partCode=$((RANDOM % 65536))
        shopId=$(( ((RANDOM * 32768 + RANDOM) * 32768 + RANDOM) % 1000000000000 ))  # combine $RANDOM calls; a single $RANDOM only reaches 32767
        processDate=$(date -d "$((RANDOM % 365)) days ago" +"%Y-%m-%d")
        vTagType="type_$((RANDOM % 10))"
        shop_first_order_time=$(date -d "$((RANDOM % 365)) days ago" +"%Y-%m-%d")
        shop_latest_order_time=$(date -d "$((RANDOM % 365)) days ago" +"%Y-%m-%d")
        pct=$(awk -v min=0 -v max=100 'BEGIN{srand(); print min+rand()*(max-min)}')
        valid_ord_montary=$(awk -v min=0 -v max=1000 'BEGIN{srand(); print min+rand()*(max-min)}')

        values+="($id, $partCode, $shopId, '$processDate', '$vTagType', '$shop_first_order_time', '$shop_latest_order_time', $pct, $valid_ord_montary),"
    done

    # remove the last comma
    values=${values%,}
    # append to file
    if (( batch == (total_records/batch_size - 1) )); then
        echo -n "$values;" >> $output_file  # End the last batch with a semicolon
    else
        echo -n "$values," >> $output_file  # End other batches with a comma
    fi

    echo "Batch $((batch + 1)) of $((total_records / batch_size)) written to $output_file"
done

echo "All insert statements have been written to $output_file"

Expected behavior
In v1.15.0 and earlier versions, the same stress test did not show these problems; performance was better, with low CPU and memory utilization.
[screenshots: CPU/memory utilization under v1.15.0 for the same test]

Environment information

  • ClickHouse version 24.3.13.40: 2 shards with 2 replicas per shard, 4 nodes in total (16C48G).
  • chproxy version 1.26.4: 1 node (4C16G).

zghong commented Dec 25, 2024

The flame graph indicates that readAndRestoreRequestBody and getFullQueryFromBody are heavily involved in I/O. I believe these two functions cause unnecessary performance degradation, even though neither retries nor throughput limits are set in the configuration (#296, #299).

[screenshot: flame graph of chproxy during the test]
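
To make the concern concrete, here is a hypothetical sketch (not chproxy's actual code; bufferBody and firstKeyword are invented names) of the pattern the flame graph suggests: if the proxy reads the whole request body into memory so it can be restored later (e.g., for a retry) and then scans that buffer to extract the query, every in-flight 50MB INSERT costs at least one full in-memory copy, so 30 concurrent requests pin on the order of 1.5GB plus the CPU spent copying and scanning.

// Not chproxy's code: a hypothetical illustration of why buffering the full
// request body makes per-request CPU/memory cost scale with payload size.
package sketch

import (
	"bytes"
	"io"
	"net/http"
	"strings"
)

// bufferBody (hypothetical) reads the entire body into memory and replaces
// r.Body with a reader over that copy so the body can be re-read later.
// Memory cost: O(len(body)) per in-flight request, ~50MB in this test.
func bufferBody(r *http.Request) ([]byte, error) {
	body, err := io.ReadAll(r.Body)
	if err != nil {
		return nil, err
	}
	r.Body = io.NopCloser(bytes.NewReader(body))
	return body, nil
}

// firstKeyword (hypothetical) classifies the query by scanning the buffered
// body; for a large INSERT this walks megabytes of VALUES data (and makes
// another full copy via string conversion) just to find the leading keyword.
func firstKeyword(body []byte) string {
	fields := strings.Fields(string(body))
	if len(fields) == 0 {
		return ""
	}
	return strings.ToUpper(fields[0])
}

If the body were streamed to the upstream instead of fully buffered when retries and per-query limits are disabled, the O(payload) allocation per request would disappear.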

zghong commented Dec 25, 2024

Looks like this is a duplicate of #428

zghong linked a pull request (#500) on Dec 25, 2024 that will close this issue.