concurrent file merging in passthrough mode to reduce initial pull time #1987
base: main
Conversation
cc @ktock
```diff
-	if _, err := w.Write(ip); err != nil {
-		sf.gr.putBuffer(b)
+	if _, err := w.Write(buffer); err != nil {
```
Needs to guarantee, check and test exhaustiveness
The operation of `verifyOneChunk` is within `processBatchChunks`:
```go
func (sf *file) processBatchChunks(args *batchWorkerArgs) error {
	...
		if _, err := sf.fr.ReadAt(bufStart, chunk.offset); err != nil && err != io.EOF {
			return fmt.Errorf("failed to read data at offset %d: %w", chunk.offset, err)
		}
		if err := sf.gr.verifyOneChunk(sf.id, bufStart, chunk.digestStr); err != nil {
			return fmt.Errorf("chunk verification failed at offset %d: %w", chunk.offset, err)
		}
	}
	return nil
}
```
fs/reader/reader.go (Outdated)
```go
buffer := make([]byte, batchSize)

var wg sync.WaitGroup
errChan := make(chan error, len(batchChunks))
```
use errgroup
done
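For reference, a minimal runnable sketch of the `errgroup` pattern referred to here. `runBatchWorkers` and the task closures are hypothetical stand-ins for the PR's worker functions (e.g. `processBatchChunks`), not its actual code:

```go
package main

import (
	"fmt"

	"golang.org/x/sync/errgroup"
)

// runBatchWorkers fans out one goroutine per task and returns the first error
// encountered, replacing the sync.WaitGroup + buffered error channel shown in
// the diff above. Each task stands in for a call like sf.processBatchChunks(args).
func runBatchWorkers(tasks []func() error) error {
	var eg errgroup.Group
	for _, task := range tasks {
		eg.Go(task) // each task runs in its own goroutine
	}
	// Wait blocks until every goroutine returns and yields the first non-nil error.
	return eg.Wait()
}

func main() {
	tasks := []func() error{
		func() error { fmt.Println("worker 0 done"); return nil },
		func() error { fmt.Println("worker 1 done"); return nil },
	}
	if err := runBatchWorkers(tasks); err != nil {
		fmt.Println("merge failed:", err)
	}
}
```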
```go
w.Abort()
return fmt.Errorf("failed to read data: %w", err)
batchSize := batchEnd - batchStart
buffer := make([]byte, batchSize)
```
Needs mutex for all shared variables
It seems there is no need to lock the buffer. I have added a diagram and explanation to the PR description: before concurrent reading begins, the buffer area is divided and each worker is assigned its own region, so the workers' operations on the buffer never conflict with one another.
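For illustration, a minimal runnable sketch of the idea described above (the names here are hypothetical, not the PR's identifiers): each worker writes only into its own sub-slice of the shared buffer, so the writes cannot overlap and the buffer itself needs no mutex.

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	const workerCount = 4
	buffer := make([]byte, 16)
	regionSize := len(buffer) / workerCount

	var wg sync.WaitGroup
	for i := 0; i < workerCount; i++ {
		region := buffer[i*regionSize : (i+1)*regionSize] // worker-private view of the shared buffer
		wg.Add(1)
		go func(id byte, region []byte) {
			defer wg.Done()
			for j := range region {
				region[j] = id // stands in for reading a chunk into this region
			}
		}(byte(i), region)
	}
	wg.Wait()
	fmt.Println(buffer) // [0 0 0 0 1 1 1 1 2 2 2 2 3 3 3 3]
}
```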
To optimize the time taken for the initial image pull, you can use the `merge_buffer_size` and `merge_worker_count` configuration options. The `merge_buffer_size` specifies the size of the buffer used for reading the image, with a default value of 400MB. The `merge_worker_count` determines the level of concurrency for reading the image, with a default value of 10.

By concurrently reading chunks and caching them for batch writing, you can significantly enhance the performance of the initial image pull in passthrough mode.
> significantly enhance the performance

Do you have benchmarks?
I have pushed the test image to Docker Hub: `abushwang/cmerge:514M-c7717cb6`. The image information is as follows; the `sha512sum` command is executed on a large file whose size is intentionally chosen not to be aligned to 512 bytes, in order to cover scenarios that exercise the boundaries of the buffer.
```
nerdctl inspect abushwang/cmerge:514M-c7717cb6
...
        "Cmd": [
            "sh",
            "-c",
            "sha512sum /data/514MB"
        ],
...
```
- Execute `sha512sum` locally to record the original SHA value of the file:

```
sha512sum 514MB
c7717cb60f2b1400def55bb3451ab58004d7c6c8ecc2c0982a9828d86eb7826d730f711a01c9af982f447a0958b50933dbba4ec5d7f8f33de0b299385c19c94c 514M
```
- Compare the time taken before and after optimization:

Before:

```
time nerdctl --insecure-registry run --rm -it --snapshotter stargz $images_name
...
9.30.17.238:5000/ocs9:514-sha512-esgz: resolved |++++++++++++++++++++++++++++++++++++++|
manifest-sha256:ba9f6fcca13ffcbc05bfbb0a7197d1ef9541e45fcfcb50cb778a3c4930240c70: done |++++++++++++++++++++++++++++++++++++++|
config-sha256:42d6f47a2f5eea7e5b65730666273aa258999886989841edac42d9023bf3c77d: done |++++++++++++++++++++++++++++++++++++++|
elapsed: 0.1 s total: 0.0 B (0.0 B/s)
c7717cb60f2b1400def55bb3451ab58004d7c6c8ecc2c0982a9828d86eb7826d730f711a01c9af982f447a0958b50933dbba4ec5d7f8f33de0b299385c19c94c /data/514MB

real	0m8.584s
user	0m0.274s
sys	0m0.068s
```
After:

```
time nerdctl --insecure-registry run --rm -it --snapshotter stargz $images_name
...
9.30.17.238:5000/ocs9:514-sha512-esgz: resolved |++++++++++++++++++++++++++++++++++++++|
manifest-sha256:ba9f6fcca13ffcbc05bfbb0a7197d1ef9541e45fcfcb50cb778a3c4930240c70: done |++++++++++++++++++++++++++++++++++++++|
config-sha256:42d6f47a2f5eea7e5b65730666273aa258999886989841edac42d9023bf3c77d: done |++++++++++++++++++++++++++++++++++++++|
elapsed: 0.1 s total: 0.0 B (0.0 B/s)
c7717cb60f2b1400def55bb3451ab58004d7c6c8ecc2c0982a9828d86eb7826d730f711a01c9af982f447a0958b50933dbba4ec5d7f8f33de0b299385c19c94c /data/514MB

real	0m4.891s
user	0m0.288s
sys	0m0.037s
```
From the test results, we can conclude that:
- The SHA value of the file remains unchanged after optimization, ensuring the integrity of the file.
- There is a significant performance improvement compared to before optimization.
Signed-off-by: abushwang <abushwangs@gmail.com>
In passthrough mode, the initial pull of an image requires merging chunks into a file. This process can be time-consuming, especially for large files.
This PR optimizes the merging process by introducing two configuration options: `merge_buffer_size` and `merge_worker_count`. By concurrently reading chunks and caching them for batch writing, we enhance the performance of the initial image pull in passthrough mode. The principle of concurrent reading is illustrated in the diagram below.

The `merge_worker_count` corresponds to the number of workers in the diagram, while `merge_buffer_size` corresponds to the size of the buffer in the diagram. Before concurrent reading begins, the buffer area is divided and each worker is assigned a specific region. This approach allows all workers to concurrently read the contents of the chunks into the buffer.
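For illustration only, here is a minimal, self-contained sketch of the flow in the diagram. The names `mergeFile`, `src`, and `dst` are hypothetical, not the PR's identifiers; only the ideas of batching by buffer size, assigning each worker a disjoint buffer region, and flushing with a single batched write come from the description above.

```go
package merge

import (
	"io"

	"golang.org/x/sync/errgroup"
)

// mergeFile sketches the diagram's flow: the file is processed in batches of
// bufferSize bytes; within each batch, up to workerCount goroutines read
// disjoint regions of one shared buffer concurrently, and the filled buffer is
// then flushed with a single batched write.
func mergeFile(src io.ReaderAt, dst io.Writer, fileSize, bufferSize int64, workerCount int) error {
	buffer := make([]byte, bufferSize)
	for batchStart := int64(0); batchStart < fileSize; batchStart += bufferSize {
		batchEnd := min64(batchStart+bufferSize, fileSize)
		batch := buffer[:batchEnd-batchStart]

		// Divide the batch into one region per worker.
		regionSize := (int64(len(batch)) + int64(workerCount) - 1) / int64(workerCount)
		var eg errgroup.Group
		for w := 0; w < workerCount; w++ {
			start := int64(w) * regionSize
			if start >= int64(len(batch)) {
				break
			}
			end := min64(start+regionSize, int64(len(batch)))
			region := batch[start:end] // worker-private, non-overlapping view
			offset := batchStart + start
			eg.Go(func() error {
				// Each worker reads only into its own region, so the shared
				// buffer needs no locking.
				if _, err := src.ReadAt(region, offset); err != nil && err != io.EOF {
					return err
				}
				return nil
			})
		}
		if err := eg.Wait(); err != nil {
			return err
		}
		// Single batched write once every worker has filled its region.
		if _, err := dst.Write(batch); err != nil {
			return err
		}
	}
	return nil
}

func min64(a, b int64) int64 {
	if a < b {
		return a
	}
	return b
}
```

In the actual PR, the reads go through the stargz reader and each chunk is also verified against its digest before the batched write, as discussed in the review above.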