Restart/Reconnect containers connected via 'network_mode: service' automatically when main service is restarted #6626

Closed
DavHau opened this issue Apr 1, 2019 · 31 comments

Comments

@DavHau

DavHau commented Apr 1, 2019

Is your feature request related to a problem? Please describe.
When running the following docker-compose.yml:

version: "3.7"

services:
  
  mother:
    image: alpine
    command: "sleep 999999"
    restart: always

  child:
    image: alpine
    command: "sleep 888888"
    network_mode: "service:mother"

If the mother container is restarted for any reason (crash / manual restart), the child container loses its network forever.

$ docker-compose restart mother
$ docker-compose exec child ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever

The child container is fully disconnected from the world. It will not reattach to mother's network. It will be unable to communicate with other containers and the internet. Does it make any sense at all to continue running the child container in this state?

Describe the solution you'd like
Whenever a service is restarted which has other services connected to it via 'network_mode: service', then reconnect those other services or restart them if reconnecting is technically unfeasible.

Describe alternatives you've considered
A workaround using a healthcheck and autoheal is described here: #6329 (comment)

In discussions of other issues related to 'network_mode: service', it is suggested to use a user-defined network instead. But as far as I know there are container compositions which require 'network_mode: service', for example when putting multiple containers behind a VPN. Please correct me if I'm wrong.
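
For readers without access to the linked comment, here is a minimal sketch of what a healthcheck-plus-autoheal workaround could look like. It is not necessarily what that comment does. It assumes the third-party willfarrell/autoheal image and its default "autoheal" label, the BusyBox wget shipped in alpine, and example.com as a placeholder probe target.

```yaml
services:
  mother:
    image: alpine
    command: "sleep 999999"
    restart: always

  child:
    image: alpine
    command: "sleep 888888"
    network_mode: "service:mother"
    restart: always
    labels:
      # Assumed label convention of willfarrell/autoheal
      - "autoheal=true"
    healthcheck:
      # Probe a host outside the shared namespace, so that losing the
      # mother's network makes the check fail (placeholder target)
      test: ["CMD-SHELL", "wget -q --spider http://example.com || exit 1"]
      interval: 30s
      timeout: 10s
      retries: 3

  # Third-party helper that restarts containers reported as unhealthy
  autoheal:
    image: willfarrell/autoheal
    restart: always
    environment:
      - AUTOHEAL_CONTAINER_LABEL=autoheal
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
```

With these assumed settings the child is only restarted after three failed probes, so recovery can take more than a minute and a full restart is still required.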

@stale

stale bot commented Oct 9, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Oct 9, 2019
@hectorj

hectorj commented Oct 10, 2019

AFAIK this is still an issue.

@stale

stale bot commented Oct 10, 2019

This issue has been automatically marked as not stale anymore due to the recent activity.

@stale stale bot removed the stale label Oct 10, 2019
@ndeloof
Contributor

ndeloof commented Oct 10, 2019

When you use the "service" network_mode (i.e. sharing a network namespace between containers), losing connectivity on restart is really the expected behaviour. It is comparable to using the "host" network and having the node shut down and the service restarted elsewhere on the cluster.

Such usage only makes sense for highly coupled containers (typically: containers in a Kubernetes Pod), not for services that need to communicate with each other reliably. Automatically restarting the dependent service would help you hide the networking constraints of your architecture, but this is just cheating. Better to get your architecture to embrace the risk of a dependent service being restarted or scaled up/down. For this purpose, use your compose file to define an explicit network connecting the services together.

@stale

stale bot commented Apr 7, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Apr 7, 2020
@DavHau
Author

DavHau commented Apr 8, 2020

@ndeloof

when you use "service" network_mode (i.e. sharing network namespace between containers), losing connectivity on restart is really the expected behaviour.

Intuitively I would not call this expected behaviour. Let me give you some real-world examples: in my home network, if my network connection depends on a cable being plugged into my machine and I unplug this cable and then plug it back in, I expect my machine to reconnect. Or if my network connection depends on some other machine, e.g. my router, and I restart that machine, I expect my network to be back up again after the restart.

Comparable to using "host" network and getting the node shut down and service restarted elsewhere on cluster.

I agree with this comparison, as it demonstrates how useless this kind of behaviour is. This is why you would never configure your cluster to run a service without a vital resource being present. Therefore I think it would be a good idea to also stop doing that in docker compose. When using "service" network_mode, the services are highly coupled, so one cannot live without the other. In a mother/child configuration the child is strongly dependent on the mother, and it never makes sense to have the child running without a mother. There is no single good reason not to also stop the child if the mother is gone or cannot reunite with the child after being restarted.

For this purpose, use your compose file to define an explicit network connecting services together.

As I already stated in the original issue, there are some container configurations where creating an explicit network is not sufficient and you have to share the network adapter itself instead, for example when forcing any kind of container to connect via a VPN container. Therefore your suggestion doesn't solve the problem.

@stale

stale bot commented Apr 8, 2020

This issue has been automatically marked as not stale anymore due to the recent activity.

@stale

stale bot commented Jan 1, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Jan 1, 2021
@stale

stale bot commented Jan 9, 2021

This issue has been automatically closed because it had not recent activity during the stale period.

@stale stale bot closed this as completed Jan 9, 2021
@SirDavidLudwig

Amazing this issue still exists

@sambartik

I agree with @DavHau. There should be at least an option to make this behaviour possible.

@rakbladsvalsen

rakbladsvalsen commented Sep 25, 2021

Now that compose is transitioning to v2, maybe it would be worth checking this issue again? @ndeloof

There are quite a lot of use cases where automatically restarting the child's network stack is quite useful, as explained above. As of now, child containers will literally be deprived of all network connectivity once the container providing the network stack dies.

The healthcheck workaround provided in the first comment is a rather brutal and completely ineffective approach, since child containers do not need to be brutally restarted: the only thing failing is their network stack, not the service itself nor whatever the container provides. Once the mother (or network-providing container) is restarted, the network stack of the child containers should be restarted/updated as well, thus avoiding brutal, taxing (hopefully not) and long reloads for important services.

To make things worse I've seen quite a lot of images using healthchecks that only do probes internally. If some container offers some service at localhost:8000, chances are it's just using plain curl -f localhost:8000, which won't fail even if the container providing the network stack fails. This wouldn't be too much of an issue if they did something like curl -f localhost:8000 && curl -f google.com, but I for one don't support the idea of restarting completely fine-working containers just because their network stack malfunctioned for a brief moment.

@ndeloof
Contributor

ndeloof commented Sep 27, 2021

If child depends on the mother service, as defined here by network_mode (but it could also be by any other shared namespace, or by an explicit depends_on), it would make sense to me that restarting the mother service restarts all the dependent services. (Pull requests are welcome on v2 :P)

That being said, to connect services together you would do better to define a network shared between the services. The only scenario I can imagine that requires a shared network namespace is one of the services accessing the other as localhost, without the ability for you to change this behavior.
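
As an illustration of the shared-network suggestion, a minimal sketch based on the compose file from the original post follows. The network name backend is a hypothetical placeholder.

```yaml
services:
  mother:
    image: alpine
    command: "sleep 999999"
    restart: always
    networks:
      - backend

  child:
    image: alpine
    command: "sleep 888888"
    # No network_mode here: the child has its own network namespace
    # and reaches the other service by its name, "mother"
    networks:
      - backend

networks:
  # Hypothetical user-defined network shared by both services
  backend: {}
```

In this layout each container keeps its own interface on the shared network, so restarting mother does not remove the child's network. It does not cover the VPN case raised earlier in the thread, where the namespace itself has to be shared.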

@BasePointer

The only scenario I can imagine to require shared network namespace is for one of the services to access the other as localhost without the ability for you to change this behavior.

opening this issue again as the described situation is the one I am in

@glitch452

For another use case for a feature like this, check out the gluetun project (https://github.com/qdm12/gluetun).

It's a VPN container that routes all the network traffic in the namespace through a VPN tunnel. So, any containers connected via the network_mode:container-name have their network traffic routed through the tunnel. This is great for applications which do not support proxy routing at the application level.

A feature that allows the child network to be reconnected automatically if the parent is restarted would be fantastic for those containers that depend on gluetun for a secure connection to somewhere else.
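
A rough sketch of such a setup, for readers unfamiliar with it, is shown below. The gluetun settings (image name, NET_ADMIN capability, tun device) are assumptions recalled from that project's documentation and should be checked there. The app service and port 8000 are hypothetical placeholders, and the VPN provider configuration is deliberately left out.

```yaml
services:
  gluetun:
    image: qmcgaw/gluetun
    cap_add:
      - NET_ADMIN
    devices:
      - /dev/net/tun:/dev/net/tun
    # VPN provider settings and credentials go here (see the gluetun docs)
    ports:
      # Ports of the dependent app are published on the container that
      # owns the network namespace, not on the app itself
      - "8000:8000"
    restart: always

  # Hypothetical application whose traffic should go through the tunnel
  app:
    image: alpine
    command: "sleep 888888"
    network_mode: "service:gluetun"
    restart: always
```

If gluetun restarts on its own (crash or restart policy), app ends up in the state described in the original post: still running, but with only a loopback interface.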

@kraftzwerg

I want to push this.

In my situation I use a VPN container and connect several other containers via network_mode: "container:mycontainer".
Sometimes I have to restart the VPN, to change the server or just for maintenance, and after that I have to manually restart all the child containers. I know that I can write everything into the same compose file, but then I lose flexibility.

A good behavior would be an option like:
restart: on_network
The child container would then restart if it loses the network connection. As a next step, this check should be done at configurable intervals, to prevent countless container restarts.

Kind Regards

@aaomidi

aaomidi commented Apr 30, 2023

Could we keep this issue open?

@melyux

melyux commented Jul 25, 2023

Can someone reopen this issue?

@gionag

gionag commented Sep 11, 2023

+1

@ndeloof
Contributor

ndeloof commented Sep 11, 2023

network_mode implies an explicit depends_on between services, and as such the "mother" service does already restart the depending services:

$ cat compose.yaml 
services:
  mother:
    image: nginx
  app:
    image: nginx
    network_mode: "service:mother"

$ docker compose up -d
[+] Building 0.0s (0/0)                                    docker:desktop-linux
[+] Running 3/3
 ✔ Network chose_default     Created                                       0.0s 
 ✔ Container chose-mother-1  Started                                       0.0s 
 ✔ Container chose-app-1     Started                                       0.0s 
$ docker compose restart mother
[+] Restarting 2/2
 ✔ Container chose-mother-1  Started                                       0.3s 
 ✔ Container chose-app-1     Started                                       0.0s 

if this is not what you get, please open a new issue with details on your configuration
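
The dependency implied by network_mode can also be written out explicitly. The sketch below assumes a Compose version that supports the long depends_on syntax with the restart attribute (added around Compose 2.17, going by the Compose specification). Given the output above it may well be redundant.

```yaml
services:
  mother:
    image: nginx

  app:
    image: nginx
    network_mode: "service:mother"
    depends_on:
      mother:
        condition: service_started
        # Ask Compose to restart app when Compose itself updates or
        # restarts mother; this does not cover restarts performed by
        # the container runtime (e.g. after a crash)
        restart: true
```

As with the example above, this only takes effect for operations run through docker compose.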

@gionag

gionag commented Sep 11, 2023

Just tested: in my setup, restarting the mother doesn't trigger a restart of the child...

@ndeloof
Contributor

ndeloof commented Sep 11, 2023

@gionag did you try my example? Which version of compose are you running?

@GHOSCHT

GHOSCHT commented Sep 11, 2023

As far as I understand the reasoning of the others they might mean that in case of a container crash (restart: always) or something similar (like manual docker container restart) the children aren't restarted. A restart only happens with the explicit compose restart command.

@raphamotta

network_mode implies an explicit depends_on between services, and as such the "mother" service does already restart the depending services:

$ cat compose.yaml 
services:
  mother:
    image: nginx
  app:
    image: nginx
    network_mode: "service:mother"

$ docker compose up -d
[+] Building 0.0s (0/0)                                    docker:desktop-linux
[+] Running 3/3
 ✔ Network chose_default     Created                                       0.0s 
 ✔ Container chose-mother-1  Started                                       0.0s 
 ✔ Container chose-app-1     Started                                       0.0s 
$ docker compose restart mother
[+] Restarting 2/2
 ✔ Container chose-mother-1  Started                                       0.3s 
 ✔ Container chose-app-1     Started                                       0.0s 

if this is not what you get, please open a new issue with details on your configuration

Tested it. If I need to, I don't know, update the mother container with a newer image, add an environment variable, or recreate the container (with the same name), I have to attach the mother network again (I'm using Portainer).

@ndeloof
Contributor

ndeloof commented Sep 11, 2023

Obviously this only applies when Compose recreates the mother container. Any other scenario, where the user recreates the container or the container restarts after a crash, isn't managed by Compose.

@aaomidi

aaomidi commented Sep 12, 2023

Obviously this only applies when compose recreates the mother container. Any other scenario where user re-creates container or container restarts after a crash isn't managed by Compose

Hence the bug. If we think this doesn't belong in compose, then a bug in the docker runtime should probably track it?

@ndeloof since you're part of the docker organization, where do you think this should be tracked? At the end of the day I think we all want to see this bug/feature fixed/implemented.

I think part of the disconnect here is on what the purpose of docker-compose is. If I'm interpreting your comment correctly, compose is only intended to reconcile what's actually running in the docker runtime when it's directly invoked.

Others potentially expect the conditions & restrictions that are specified in a compose file to be used to continuously reconcile the state the container runtime is in. E.g. if something causes the docker runtime to put a container out of its intended state, then compose kicks in and reconciles the changes.

Maybe what we're asking here for is runtime continuous dependencies between different containers, vs at-the-time-of-command dependencies between the container definitions.
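
To make the "continuous" variant concrete, here is a sketch of how it could be approximated today with a helper container. It is a hypothetical workaround, not a Compose feature. It assumes the official docker:cli image, access to the Docker socket, the com.docker.compose.service labels that Compose sets on its containers, and a single Compose project on the host using these service names.

```yaml
services:
  mother:
    image: alpine
    command: "sleep 999999"
    restart: always

  child:
    image: alpine
    command: "sleep 888888"
    network_mode: "service:mother"
    restart: always

  # Hypothetical helper: restarts the child whenever the mother starts
  watcher:
    image: docker:cli
    restart: always
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    entrypoint: ["sh", "-c"]
    command:
      - |
        # Stream "start" events of containers belonging to service "mother"
        docker events \
          --filter type=container \
          --filter event=start \
          --filter label=com.docker.compose.service=mother |
        while read -r line; do
          # Restart every container of service "child" so that it
          # re-joins the mother's network namespace
          docker ps -aq --filter label=com.docker.compose.service=child |
            xargs -r docker restart
        done
```

This only helps when the same mother container is restarted. If the mother is recreated with a new container ID, the child still has to be recreated as well, for example with docker compose up.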

@ndeloof
Contributor

ndeloof commented Sep 12, 2023

If we think this doesn't belong in compose, then a bug in the docker runtime should probably track it?

Definitely not within Compose's scope as long as the events don't take place under its control.
I also don't think this should be reported to the docker runtime: as you replace a resource, invalidating those which depend on it, it is your responsibility to manage the reconciliation. This is what Compose offers when you use docker compose up to recreate the container.
Is there any reason you want to do this on your own?

@ghost

ghost commented Apr 3, 2024

Would love this as well. I have a VPN container and all dependents on it lose network if this container is restarted/crashed etc.

@ndeloof
Contributor

ndeloof commented Apr 3, 2024

@Fossil01 the engine is not aware of the relations between services declared in Compose, so it can't manage such a "cascade" restart.

@melyux

melyux commented Apr 3, 2024

It probably should be aware of such things.

@ndeloof
Contributor

ndeloof commented Apr 4, 2024

@melyux this should be discussed on github.com/moby/moby
my 2 cents: engine already manages restart policy "on failure", maybe it could also manage shared-namespace source being restarted
