rust: Add a composefs-oci crate #286
rust/composefs-oci/src/repo.rs
Outdated
anyhow::bail!("Requested fsverity, but target does not support it");
}
let meta = RepoMetadata {
version: String::from("0.5"),
You said 0.1 above.
};
let mut digest = Digest::new();
composefs::fsverity::fsverity_digest_from_fd(tmpfile.as_file().as_fd(), &mut digest)
.context("Computing fsverity digest")?;
In the case where the filesystem doesn't support fs-verity, this will re-read the file to compute the fs-verity digest. We already have fs-verity digest support in libcomposefs that supports streaming, so we could have computed this digest while writing to the tmpfile above.
I guess that would be unnecessary in the case where the target fs supports fs-verity, as then we compute it twice, but we can avoid that if self.has_verity().
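To make the suggestion concrete, here is a minimal sketch of hashing while writing, so the tmpfile never has to be re-read. Everything in it is an assumption for illustration: `Fnv1a` is a toy stand-in for the streaming fs-verity digest in libcomposefs (it is neither sha256 nor fs-verity), and `DigestingWriter`, `copy_with_digest` and the `has_verity` flag are hypothetical names, not part of the crate under review.

```rust
use std::io::{self, Read, Write};

/// Toy streaming digest (64-bit FNV-1a). Stand-in only: a real version
/// would wrap the streaming fs-verity digest instead.
#[derive(Clone, Copy)]
pub struct Fnv1a(u64);

impl Fnv1a {
    pub fn new() -> Self {
        Fnv1a(0xcbf29ce484222325)
    }
    pub fn update(&mut self, data: &[u8]) {
        for &b in data {
            self.0 ^= u64::from(b);
            self.0 = self.0.wrapping_mul(0x100000001b3);
        }
    }
    pub fn finish(self) -> u64 {
        self.0
    }
}

/// Forwards writes to `inner` and feeds the same bytes to the digest.
pub struct DigestingWriter<W: Write> {
    inner: W,
    digest: Fnv1a,
}

impl<W: Write> DigestingWriter<W> {
    pub fn new(inner: W) -> Self {
        DigestingWriter { inner, digest: Fnv1a::new() }
    }
    pub fn into_parts(self) -> (W, u64) {
        (self.inner, self.digest.finish())
    }
}

impl<W: Write> Write for DigestingWriter<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let n = self.inner.write(buf)?;
        // Hash only the bytes the inner writer actually accepted.
        self.digest.update(&buf[..n]);
        Ok(n)
    }
    fn flush(&mut self) -> io::Result<()> {
        self.inner.flush()
    }
}

/// Copies `src` to `dst`. When the target supports fs-verity
/// (`has_verity`), the userspace digest is skipped and `None` is returned.
pub fn copy_with_digest<R: Read, W: Write>(
    src: &mut R,
    mut dst: W,
    has_verity: bool,
) -> io::Result<Option<u64>> {
    if has_verity {
        io::copy(src, &mut dst)?;
        return Ok(None);
    }
    let mut writer = DigestingWriter::new(dst);
    io::copy(src, &mut writer)?;
    Ok(Some(writer.into_parts().1))
}
```

With this shape the data is touched once on the non-verity path, and the verity path does no userspace hashing at all.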
It's a good point. I think though a better optimization will be to move the fsverity computation (whether userspace or kernel) into a worker thread pool, so we leave the main thread's job to basically reading from the network and saving data as quickly as possible. This way we avoid intermixing network and CPU bound work on the same thread. (OTOH on modern processors with sha256 acceleration maybe it doesn't matter, I haven't measured)
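A rough sketch of that split, using only the standard library: the calling thread hands completed objects to a small set of worker threads over a channel and collects digests as they finish. The names `digest_in_workers` and `toy_digest` are hypothetical, the `Vec<Vec<u8>>` input stands in for data arriving from the network, and the hash is a toy FNV-1a placeholder rather than fs-verity.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Toy stand-in for the real (userspace or kernel) fs-verity computation.
pub fn toy_digest(data: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for &b in data {
        h ^= u64::from(b);
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

/// Sends each object to a pool of `n_workers` threads and returns the
/// digests in the same order as the input objects.
pub fn digest_in_workers(objects: Vec<Vec<u8>>, n_workers: usize) -> Vec<u64> {
    let (job_tx, job_rx) = mpsc::channel::<(usize, Vec<u8>)>();
    // std's Receiver is single-consumer, so workers share it behind a mutex.
    let job_rx = Arc::new(Mutex::new(job_rx));
    let (res_tx, res_rx) = mpsc::channel::<(usize, u64)>();

    let workers: Vec<_> = (0..n_workers.max(1))
        .map(|_| {
            let job_rx = Arc::clone(&job_rx);
            let res_tx = res_tx.clone();
            thread::spawn(move || loop {
                // The lock guard is dropped at the end of this statement,
                // so the lock is not held while hashing.
                let job = job_rx.lock().unwrap().recv();
                match job {
                    Ok((idx, data)) => res_tx.send((idx, toy_digest(&data))).unwrap(),
                    Err(_) => break, // job channel closed: no more work
                }
            })
        })
        .collect();
    drop(res_tx); // only the workers hold result senders now

    // "Main thread" side: in the real flow this loop would be reading
    // from the network and saving data as fast as possible.
    let n = objects.len();
    for (idx, data) in objects.into_iter().enumerate() {
        job_tx.send((idx, data)).unwrap();
    }
    drop(job_tx);

    let mut digests = vec![0u64; n];
    for (idx, d) in res_rx {
        digests[idx] = d;
    }
    for w in workers {
        w.join().unwrap();
    }
    digests
}
```

Whether this helps in practice depends on the measurements discussed in this thread; the sketch only shows the separation of network-bound and CPU-bound work.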
Honestly, I don't know what the cost of sha256 is these days. Sometimes modern CPUs are fast enough that doing computation on memory is basically free compared to the memory bandwidth cost, but I don't know if that is true with sha256. And in this case it's even compared to I/O as well. If the file is large, then we may actually have to read it back from disk (rather than from cache).
I did some tests with sha256sum on my (pretty fast) machine on a 1GB file. cat to /dev/null takes ~270 msec uncached, and ~65 msec cached (real time). Running sha256sum instead takes 460 msec uncached (although this time varies more), and 440 msec cached.
I guess we can say that sha256 computation is the more costly operation compared to disk I/O. A cached read+sha is about 7x slower than a cached read (440 msec vs. ~65 msec). And total sha256 time is almost the same for uncached and cached reads, meaning that the disk read times were small (or at least pipelined) compared to the computation time.
Given the above we might want something like a threadpool that does fixed block size sha256 computation across n_cpu threads. Then we could just feed it block by block, getting parallelization even across a single large file.
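As a sketch of the block-by-block idea, the following splits one buffer into fixed-size blocks and hashes groups of blocks on scoped threads. The assumptions are loud: `BLOCK_SIZE`, `toy_digest` and `block_digests_parallel` are hypothetical names, the hash is a toy FNV-1a placeholder rather than sha256, and the result is a flat list of per-block digests, not the actual fs-verity Merkle tree layout.

```rust
use std::thread;

/// Hypothetical block size for the sketch.
pub const BLOCK_SIZE: usize = 4096;

/// Toy stand-in for a per-block sha256.
pub fn toy_digest(data: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for &b in data {
        h ^= u64::from(b);
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

/// Hashes every `BLOCK_SIZE` block of `data`, spreading contiguous groups
/// of blocks across up to `n_threads` threads. Digests are returned in
/// block order.
pub fn block_digests_parallel(data: &[u8], n_threads: usize) -> Vec<u64> {
    let blocks: Vec<&[u8]> = data.chunks(BLOCK_SIZE).collect();
    let n_threads = n_threads.max(1);
    // Ceiling division; at least 1 so `chunks` never gets a zero size.
    let per_thread = ((blocks.len() + n_threads - 1) / n_threads).max(1);

    thread::scope(|s| {
        // Spawn all groups first, then join in order to keep block order.
        let handles: Vec<_> = blocks
            .chunks(per_thread)
            .map(|group| {
                s.spawn(move || group.iter().map(|b| toy_digest(b)).collect::<Vec<u64>>())
            })
            .collect();
        handles
            .into_iter()
            .flat_map(|h| h.join().unwrap())
            .collect()
    })
}
```

A caller could pick the thread count with `std::thread::available_parallelism()`. A streaming variant that accepts blocks as they arrive would need a queue rather than a slice, which this sketch leaves out.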
I'm not sure I understand why you want this? During the import, can't you just compute the dump-format file line by line in combination with the object files and pass that to mkcomposefs? Why do you need the file metadata "on-disk"?
This allows sharing them easily, as we have multiple crates. Signed-off-by: Colin Walters <walters@verbum.org>
3b2f2b2 to 4318f72
The high level goal of this crate is to be an opinionated generic storage layer using composefs, with direct support for OCI. Note not just OCI *containers* but also including OCI artifacts too. This crate is intended to be the successor to the "storage core" of both ostree and containers/storage. Signed-off-by: Colin Walters <walters@verbum.org>
For now I think it makes sense actually to keep this repository with just "core" functionality.
The high level goal of this crate is to be an opinionated
generic storage layer using composefs, with direct support
for OCI. Note not just OCI containers but also including
OCI artifacts too.
This crate could be the successor to
the "storage core" of both ostree and containers/storage.
This is an initial sketch! It'd be good to sync on the design/goals; it started to "feel" right but obviously there's a whole lot going on. In particular, when I was going through the current containers/storage composefs code...there's a few things here where I am not sure it has the right architecture.
//
etc. and symlink traversal.
What I want to do next here is teach mkcomposefs to optionally honor a user.cfs.meta xattr. This way we can unpack a tarball to disk, but instead of e.g. physically setting sensitive file metadata like owner uid, gid, suid bits, and security-related xattrs, we can do something much like what ostree's bare-user mode does and store them in an xattr. We can also skip making specials like device nodes and FIFOs "physically" and just make a zero-sized regular file with the xattr.
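To illustrate the idea, here is a sketch of what a value stored in such an xattr could look like. The layout is entirely hypothetical (the proposal above does not define one): a fixed 20-byte big-endian record holding uid, gid, mode and rdev. `CfsMeta`, `encode` and `decode` are made-up names, security-related xattrs are left out for brevity, and actually writing the xattr to the file is not shown.

```rust
/// Hypothetical payload for the proposed `user.cfs.meta` xattr.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CfsMeta {
    pub uid: u32,
    pub gid: u32,
    /// Full st_mode, including the file type and suid/sgid bits.
    pub mode: u32,
    /// Device number for device nodes, 0 otherwise.
    pub rdev: u64,
}

impl CfsMeta {
    pub const XATTR_NAME: &'static str = "user.cfs.meta";
    pub const ENCODED_LEN: usize = 20;

    /// Serializes the record as big-endian uid, gid, mode, rdev.
    pub fn encode(&self) -> [u8; 20] {
        let mut out = [0u8; 20];
        out[0..4].copy_from_slice(&self.uid.to_be_bytes());
        out[4..8].copy_from_slice(&self.gid.to_be_bytes());
        out[8..12].copy_from_slice(&self.mode.to_be_bytes());
        out[12..20].copy_from_slice(&self.rdev.to_be_bytes());
        out
    }

    /// Parses a record; returns `None` if the length is wrong.
    pub fn decode(buf: &[u8]) -> Option<CfsMeta> {
        if buf.len() != Self::ENCODED_LEN {
            return None;
        }
        let u32_at = |i: usize| {
            let mut a = [0u8; 4];
            a.copy_from_slice(&buf[i..i + 4]);
            u32::from_be_bytes(a)
        };
        let mut r = [0u8; 8];
        r.copy_from_slice(&buf[12..20]);
        Some(CfsMeta {
            uid: u32_at(0),
            gid: u32_at(4),
            mode: u32_at(8),
            rdev: u64::from_be_bytes(r),
        })
    }
}
```

Under this scheme a character device owned by root would be unpacked as an empty regular file owned by the unprivileged user, with the real uid, gid, mode and device number recoverable from the xattr value.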