Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libct/cg/sd/v1: do not update non-frozen cgroup after frozen failed. #3804

Conversation

jiusanzhou
Copy link
Contributor

@jiusanzhou jiusanzhou commented Apr 3, 2023

In code we have frozen the cgroup to avoid the processes get
an occasional "permission denied" error, while the systemd's application of device
rules is done disruptively. When the processes in the container can not
be frozen over 2 seconds (which defined in fs/freezer.go),
we still update the cgroup which resulting the container get an occasional
"permission denied" error in some cases.

This PR make sure to return error directly without updating cgroup, when freeze fails.

Fixes: #3803

@jiusanzhou jiusanzhou force-pushed the bugfix/skip-update-while-frozen-faield branch from 68168b7 to fad60b8 Compare April 3, 2023 13:57
Copy link
Contributor

@kolyshkin kolyshkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall the fix makes sense to me -- we should not update a non-frozen container.

Will take a deeper look later.

@jiusanzhou jiusanzhou force-pushed the bugfix/skip-update-while-frozen-faield branch from 6bacdf5 to 446bf56 Compare April 6, 2023 08:47
@kolyshkin
Copy link
Contributor

kolyshkin commented Apr 6, 2023

@jiusanzhou you need to rebase and fix linter warnings (the ones from validate and commit CI jobs look legit).

Update: ah, I see you've rebased already.

@kolyshkin
Copy link
Contributor

For the commit subject, I suggest something like

libct/cg/sd/v1: do not update non-frozen cgroup

For the description, you can get some text from #3803 to describe what happens and then go on to say how the commit is fixing this.

@jiusanzhou jiusanzhou force-pushed the bugfix/skip-update-while-frozen-faield branch 2 times, most recently from ac81a17 to 53a8bd4 Compare April 7, 2023 14:04
@jiusanzhou
Copy link
Contributor Author

@kolyshkin thank you for your advice, CI checking is green now. Please take another look.

@jiusanzhou jiusanzhou force-pushed the bugfix/skip-update-while-frozen-faield branch from 53a8bd4 to cbfa85b Compare April 17, 2023 07:44
@jiusanzhou
Copy link
Contributor Author

@cyphar PTAL

@jiusanzhou jiusanzhou force-pushed the bugfix/skip-update-while-frozen-faield branch from cbfa85b to 01caf55 Compare April 27, 2023 03:47
@jiusanzhou jiusanzhou changed the title bugfix: avoid an occasional "permission denied" error while frozen failed. #3803 libct/cg/sd/v1: do not update non-frozen cgroup after frozen failed. Apr 27, 2023
@jiusanzhou jiusanzhou force-pushed the bugfix/skip-update-while-frozen-faield branch from 01caf55 to ad9e9b9 Compare April 29, 2023 15:02
@zvier
Copy link
Contributor

zvier commented Jun 4, 2023

any update?

Copy link
Contributor

@kolyshkin kolyshkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kolyshkin kolyshkin added backport/1.1-todo A PR in main branch which needs to be backported to release-1.1 area/systemd area/cgroupv1 labels Jun 6, 2023
@kolyshkin
Copy link
Contributor

@thaJeztah @AkihiroSuda @cyphar PTAL

In code we have frozen the cgroup to avoid the processes get
an occasional "permission denied" error, while the systemd's application of device
rules is done disruptively. When the processes in the container can not
be frozen over 2 seconds (which defined in fs/freezer.go),
we still update the cgroup which resulting the container get an occasional
"permission denied" error in some cases.

Return error directly without updating cgroup, when freeze fails.

Fixes: opencontainers#3803

Signed-off-by: Zoe <hi@zoe.im>
@jiusanzhou jiusanzhou force-pushed the bugfix/skip-update-while-frozen-faield branch from ad9e9b9 to 62963fe Compare June 12, 2023 02:10
@kolyshkin
Copy link
Contributor

@opencontainers/runc-maintainers PTAL

@AkihiroSuda
Copy link
Member

Can we have a test?

@kolyshkin
Copy link
Contributor

Can we have a test?

I'm not sure we can. To test the branch of code being added, we need cgroup freeze to fail, and I don't know how to do that reliably.

@AkihiroSuda AkihiroSuda merged commit 92c71e7 into opencontainers:main Jun 27, 2023
@kolyshkin
Copy link
Contributor

@jiusanzhou can you open a PR against release-1.1 branch?

@jiusanzhou
Copy link
Contributor Author

1.1 backport: #3921

@kolyshkin kolyshkin added backport/1.1-done A PR in main branch which has been backported to release-1.1 and removed backport/1.1-todo A PR in main branch which needs to be backported to release-1.1 labels Jul 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/cgroupv1 area/systemd backport/1.1-done A PR in main branch which has been backported to release-1.1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Still got an occasional "permission denied" error while cgroup frozen failed.
4 participants