
Implement multipart partition object upload #99

Merged
13 commits merged into main on Sep 9, 2022

Conversation

@gruuya (Contributor) commented Sep 6, 2022

Instead of loading up the entire file into memory and uploading it in one big chunk, perform a multipart upload of small buffered file chunks.
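A minimal sketch of the general approach (not the exact code from this PR), assuming the object_store crate's `put_multipart` API from around that time, which hands back an `AsyncWrite` that buffers data and uploads it in parts as it is written:

```rust
use object_store::{path::Path, ObjectStore};
use tokio::fs::File;
use tokio::io::{AsyncReadExt, AsyncWriteExt};

/// Stream a local file to the object store in fixed-size chunks so that only
/// one chunk needs to live in memory at a time, rather than the whole file.
async fn multipart_upload(
    store: &dyn ObjectStore,
    local_path: &str,
    dest: &Path,
) -> Result<(), Box<dyn std::error::Error>> {
    let (_multipart_id, mut writer) = store.put_multipart(dest).await?;

    let mut file = File::open(local_path).await?;
    let mut chunk = vec![0u8; 5 * 1024 * 1024]; // 5 MiB read buffer

    loop {
        let read = file.read(&mut chunk).await?;
        if read == 0 {
            break; // assume we've hit EOF
        }
        writer.write_all(&chunk[..read]).await?;
    }
    writer.shutdown().await?; // completes the multipart upload
    Ok(())
}
```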

@gruuya gruuya requested a review from mildbyte September 6, 2022 08:09
@gruuya gruuya self-assigned this Sep 6, 2022
@gruuya gruuya linked an issue Sep 6, 2022 that may be closed by this pull request
@gruuya (Contributor, Author) commented Sep 7, 2022

As of the latest implementation, this is the memory profile for the dataset-addition example from the docs (13:08:21.452Z marks the start of the actual multipart upload of the partition files):
[image: memory profile]

However, when running

CREATE EXTERNAL TABLE area1 STORED AS PARQUET LOCATION '/home/ubuntu/area1.parquet';
CREATE TABLE area1 AS SELECT * FROM staging.area1;

via a psql terminal (using area1.parquet, which is about 2.45 GB in size), I'm seeing:
[image: memory profile]

src/context.rs Outdated
@@ -304,6 +311,8 @@ pub async fn plan_to_object_store(
}
writer.close().map_err(DataFusionError::from).map(|_| ())?;

warn!("Starting upload of partition objects");
Contributor

Suggested change
warn!("Starting upload of partition objects");
info!("Starting upload of partition objects");

src/context.rs Outdated
}

let part_size = part_buffer.len();
warn!("Uploading part with {} bytes", part_size);
Contributor

Suggested change
warn!("Uploading part with {} bytes", part_size);
info!("Uploading part with {} bytes", part_size);

(or even debug?)

src/lib.rs Outdated
Comment on lines 1 to 2
extern crate core;

Contributor

Suggested change
extern crate core;

src/context.rs Outdated
Ok(size) => {
if size == 0 && part_buffer.is_empty() {
// We've reached EOF and there are no pending writes to flush
// TODO: as per the docs size = 0 doesn't actually guarantee that we've reached EOF
Contributor

Is this still true? Weird, since there's no other way to signal EOF (unless it returns a special Err type)?

Contributor Author

Yes, this is still true. I added a comment with a potential workaround using stream_len (a nightly-only experimental API at the moment).

I've also tried to get the true file size from metadata beforehand and keep track of the total read bytes, but they never matched somehow.
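For context, a minimal sketch of the kind of read loop under discussion (the chunk size and the `upload_part` helper are hypothetical stand-ins); it shows why a read returning 0 bytes is treated as EOF even though, per the `std::io::Read` docs, a return of 0 does not strictly guarantee end of file:

```rust
use std::fs::File;
use std::io::{self, Read};

const PART_SIZE: usize = 5 * 1024 * 1024; // e.g. the S3 minimum part size

fn upload_file_in_parts(path: &str) -> io::Result<()> {
    let mut file = File::open(path)?;
    let mut chunk = vec![0u8; 1024 * 1024];
    let mut part_buffer: Vec<u8> = Vec::with_capacity(PART_SIZE);

    loop {
        let size = file.read(&mut chunk)?;
        if size == 0 && part_buffer.is_empty() {
            // We've reached EOF and there are no pending writes to flush.
            // Per the Read docs, size == 0 usually (but not always) means EOF.
            break;
        }
        part_buffer.extend_from_slice(&chunk[..size]);

        // Ship a part once enough bytes are buffered, or on the final read
        if part_buffer.len() >= PART_SIZE || size == 0 {
            upload_part(&part_buffer)?;
            part_buffer.clear();
        }
    }
    Ok(())
}

// Hypothetical stand-in for the actual part upload
fn upload_part(_part: &[u8]) -> io::Result<()> {
    Ok(())
}
```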

@gruuya (Contributor, Author) commented Sep 7, 2022

For comparison, the current main branch shows the following memory profiles for the two examples above:
supply_chains
[image: memory profile]

area1
[image: memory profile]

@gruuya (Contributor, Author) commented Sep 7, 2022

Repeating the profiles once more (as a sanity check) for the current branch:
supply_chains
[image: memory profile]

area1
[image: memory profile]

@gruuya (Contributor, Author) commented Sep 9, 2022

Another set of memory profiles, now with the upload task semaphore and the object store multipart race fix (a sketch of the semaphore pattern follows the profiles below):
supply_chains
[image: memory profile]

area1
[image: memory profile]
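For reference, this is the general shape of bounding the number of concurrent upload tasks with a tokio `Semaphore`; the limit of 4 and the `upload_partition` helper are illustrative, not the actual code from this PR:

```rust
use std::sync::Arc;
use tokio::sync::Semaphore;

#[tokio::main]
async fn main() {
    // Allow at most 4 partition uploads to run at the same time.
    let semaphore = Arc::new(Semaphore::new(4));
    let mut handles = Vec::new();

    for partition in 0..16u32 {
        // Wait for a permit before spawning the next upload task.
        let permit = semaphore.clone().acquire_owned().await.unwrap();
        handles.push(tokio::spawn(async move {
            // The permit is held for the duration of the upload and
            // released when it is dropped at the end of the task.
            let _permit = permit;
            upload_partition(partition).await;
        }));
    }

    for handle in handles {
        handle.await.unwrap();
    }
}

// Stand-in for the real partition upload.
async fn upload_partition(partition: u32) {
    println!("uploading partition {partition}");
}
```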

@gruuya gruuya merged commit a311282 into main Sep 9, 2022
@gruuya gruuya deleted the partition-multipart-upload-cu-2v11y42 branch September 9, 2022 10:43

Successfully merging this pull request may close these issues.

Workaround for having to load Parquet files in-memory before uploading them