Skip to content
This repository has been archived by the owner on Dec 21, 2021. It is now read-only.

Move restart policy to systemd #263

Merged
merged 14 commits into from
Aug 30, 2021
Merged

Conversation

siegfriedweber
Copy link
Member

@siegfriedweber siegfriedweber commented Aug 17, 2021

Description

Fixes #207
Tested by stackabletech/agent-integration-tests#61

Changed

  • Handling of service restarts moved from the Stackable agent to systemd.

Removed

  • Check removed if a service starts up correctly within 10 seconds. systemd manages restarts now and the Stackable agent cannot detect if a service is in a restart loop.

Review Checklist

  • Code contains useful comments
  • (Integration-)Test cases added (or not applicable)
  • Documentation added (or not applicable)
  • Changelog updated (or not applicable)

@siegfriedweber siegfriedweber requested a review from a team August 17, 2021 14:56
@siegfriedweber siegfriedweber self-assigned this Aug 17, 2021
@siegfriedweber siegfriedweber force-pushed the move_restart_policy_to_systemd branch from 4310fa5 to 2fc6ef8 Compare August 19, 2021 07:21
Copy link
Member

@soenkeliebau soenkeliebau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall, I've left a few questions that I'd like to clarify, but most of those will probably simply be resolveable.

As far as I can tell this would not notice if a service enters into a permanent restart loop, right? It would update the restart count on the pod, but never set a failed condition or something similar?

src/provider/states/pod/starting.rs Show resolved Hide resolved
src/provider/states/pod/running.rs Show resolved Hide resolved
src/provider/states/pod/running.rs Show resolved Hide resolved
src/provider/systemdmanager/service.rs Show resolved Hide resolved
@siegfriedweber siegfriedweber force-pushed the move_restart_policy_to_systemd branch from 2fc6ef8 to e48e1ea Compare August 23, 2021 10:39
razvan
razvan previously approved these changes Aug 23, 2021
Copy link
Member

@razvan razvan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't look closely at the code but the corresponding integration tests are green.

@siegfriedweber
Copy link
Member Author

As far as I can tell this would not notice if a service enters into a permanent restart loop, right? It would update the restart count on the pod, but never set a failed condition or something similar?

Right. If the restart policy of the pod is set to Always or OnFailure then this is the desired behavior, I think. The Stackable Agent also does not have a notion what a "restart loop" is. We regard a restart every 10 seconds as a restart loop but a restart every 10 minutes probably not.

@siegfriedweber siegfriedweber force-pushed the move_restart_policy_to_systemd branch 3 times, most recently from a205512 to abeac58 Compare August 26, 2021 14:25
soenkeliebau
soenkeliebau previously approved these changes Aug 30, 2021
Copy link
Member

@soenkeliebau soenkeliebau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

The running state is no longer responsible for deciding if a service
should be restarted. This is done by systemd. Therefore the transition
from the running state to the starting state was removed.

The service_state function was added to the SystemdManager to check if a
service is running or terminated. This function is used for the
transition from the running to the terminated state and for patching the
container state.
When a systemd service is started then the agent does not wait anymore
ten seconds to check if it was successfully started because systemd
manages restarts now and the agent cannot detect if the service is in a
restart loop.
@siegfriedweber siegfriedweber merged commit 98bf547 into main Aug 30, 2021
@siegfriedweber siegfriedweber deleted the move_restart_policy_to_systemd branch August 30, 2021 11:24
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Move restart behavior of services from agent to systemd
4 participants