-
Notifications
You must be signed in to change notification settings - Fork 553
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rbd: Add timeout for cryptsetup commands #4912
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks reasonable to me. Did you manage to test this somehow, and see how everything gets corrected by a different container?
we need to ensure that the CLI command we start is killed and doesn't continue to execute, please test it the changes to ensure that it works as expected and also provide some test results on how this is tested |
internal/util/cryptsetup.go
Outdated
const cryptsetupPBKDFMemoryLimit = 32 << 10 // 32768 KiB | ||
const ( | ||
// Maximum time to wait for cryptsetup commands to complete. | ||
CryptSetupExecutionTimeout = 3 * time.Minute |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets keep the timeout which is almost less than the GRPC timeout between the sidecar and the cephcsi so that we try and if it didn't work as will kill the process before the next GRPC kicks in on any other nodes/same node.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea. Let's keep them a minute apart then? Accounting for any delays that might occur in sending gRPC request and also to keep some buffer period after sending SIGKILL?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minute might be too much, we can reduce it by 30 seconds to be on safer side, @nixpanic thoughts?
Trying but not able to yet. I am thinking of having a wrapper to cryptsetup that sleeps for some time and then tries to execute the intended command. The key is to kill the process in midst of operations. I will test it and report. |
I also think the only way reproduce this, is by adding a wrapper for the command. It helps to make the command do something that can be verified after aborting, maybe something like this (untested): #!/bin/sh
#
# Assume a 1GB block-device, write unencrypted data at 500MB offset.
#
# run normal crypsetup, it will be quick and exits cleanly
cryptsetup.real $@
# run dd in a loop, small I/O with sync to make it slower
# only runs the loop if /tmp/testing exists
while [ -e /tmp/testing ]
do
# set correct device and verify parameters
dd if=/dev/urandom of=/dev/rbd... bs=1 count=100m offset=500m
done Make sure to name the script |
Hey @black-dragon74 , I'll mark this as a draft for now, otherwise I keep checking it for updates. Please move it back as ready for review once you have tested it. Thanks! |
a68d41f
to
3802a3a
Compare
943698e
to
911d52f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renaming the type and functions make it better, please take care of that.
The suggestion with the interface and moving to a separate package are optional and can be done in a follow-up PR if you prefer.
911d52f
to
37c0f44
Compare
f42438d
to
22bcec3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just need to start context before obtaining locks and use the same context for get/put calls to kms.
internal/rbd/encryption.go
Outdated
// Create a new luks wrapper | ||
timeoutCtx, cancel := context.WithTimeout(ctx, cryptsetup.ExecutionTimeout) | ||
luks := cryptsetup.NewLUKSWrapper(timeoutCtx) | ||
defer cancel() | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be done before obtaining locks at line 481.
Pass the same context to get/put cryptopassphrase calls as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
22bcec3
to
2ad0ad8
Compare
@Mergifyio rebase |
This PR modifies the execCryptSetupCommand so that the process is killed in an event of lock timeout. Useful in cases where the volume lock is released but the command is still running. Signed-off-by: Niraj Yadav <niryadav@redhat.com>
✅ Branch has been successfully rebased |
2ad0ad8
to
ddf7ec4
Compare
Rakshits concern has been addressed
@Mergifyio queue |
✅ The pull request has been merged automaticallyThe pull request has been merged automatically at 1c02e69 |
/test ci/centos/k8s-e2e-external-storage/1.31 |
/test ci/centos/mini-e2e-helm/k8s-1.31 |
/test ci/centos/k8s-e2e-external-storage/1.29 |
/test ci/centos/mini-e2e/k8s-1.31 |
/test ci/centos/mini-e2e-helm/k8s-1.29 |
/test ci/centos/k8s-e2e-external-storage/1.30 |
/test ci/centos/mini-e2e/k8s-1.29 |
/test ci/centos/mini-e2e-helm/k8s-1.30 |
/test ci/centos/mini-e2e/k8s-1.30 |
/test ci/centos/upgrade-tests-cephfs |
/test ci/centos/upgrade-tests-rbd |
Describe what this PR does
This PR introduces a new wrapper around LUKS utility functions that allow us to have context aware subsequent operations,
util.NewLuksWrapperWithContext
. This context is shared across the functions exposed by this wrapper.A sample use case is to have a context that times out after certain period of time.
How to Test
We can simulate a timeout by replacing the cryptsetup binary with a wrapper script that introduces some delay.
mv /usr/sbin/cryptsetup /usr/sbin/cryptsetup_orig
/usr/sbin/cryptsetup
and paste the following:chmod +x /usr/sbin/cryptsetup