Make activity check method configurable #43

Open
lpeabody opened this issue Jan 28, 2019 · 13 comments

@lpeabody
Contributor

Currently the calculation for active and dangling projects is based on the logs generated by containers with the virtual host label.

Our CI setup accommodates two use cases:

  1. Web sandboxes for QA.
  2. Artifact generation and deployment.

In case 2, the web service never generates any logs. Nonetheless, we use the CLI container constantly, yet once the dangling period has elapsed the project's containers are removed, even though we're actively using the CLI container for the project.

It would be nice to be able to configure the cleanup policy on a per-sandbox basis and choose between cli service and web service activity as the method for determining project inactivity.

@lmakarov
Member

@lpeabody yeah, I've been running into this myself recently.

Unfortunately, there is no good way of knowing that something is happening inside cli, since no logs will be output there unless there is a failure or an access log entry produced by php-fpm (which would map to a web request made to web).

It's not documented, but you can use the io.docksal.permanent=true label on the web container to mark an environment as permanent. However, now that I'm thinking about it, that won't solve this problem: the environment would still be hibernated after inactivity; it just won't be garbage collected after an extended period of inactivity.
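
For reference, a minimal sketch of setting that label through a compose override (assuming the stock web service name, e.g. in the project's .docksal/docksal.yml):

version: "2.1"

services:
  web:
    labels:
      - io.docksal.permanent=true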

We could add another label to disable/change the global hibernation policy, but I'd rather figure out a way to know whether or not something important is going on inside cli (or other containers).

@lmakarov lmakarov self-assigned this Jan 28, 2019
@lpeabody
Contributor Author

Being hibernated is less of an issue for us, as our infrastructure permits preserving sandbox environments between pipeline runs (we use Bitbucket Pipelines, and yes docksal/ci-agent#39 is still very much on my radar!). The reason for preserving is that, at the end of the day, it saves us build minutes.

First use case (without preserving):

  1. Clone sandbox.
  2. Install dependencies from scratch.
  3. Build assets.
  4. Fetch existing artifact branch in the deployment directory (can be quite large on a pull into a fresh deployment repo).
  5. Create build artifact in deployment directory.
  6. Commit and push artifact in deployment directory.
  7. Process complete.

Second use case (with preserving):

  1. git pull
  2. Install dependencies (may be nothing to install, increasing speed).
  3. Build assets (may be nothing to build, increasing speed).
  4. Fetch existing artifact branch (99% of the time already has the latest from the artifact branch, nothing to pull).
  5. Create artifact commit.
  6. Push artifact commit.
  7. Process complete.

So, permanent will at least guarantee the sandbox is never deleted from the file system, keeping the benefits of the second use case and speeding up the build fairly significantly; essentially, we only ever have to run fin up to make sure the containers are up and running (which is pretty quick, yay containers).

The only downside I see is that, well, the sandbox is permanent, and someone will have to remember to physically clean it up and wipe it from the server when the project is sunset. I'd rather not depend on people consciously remembering to do this when projects wrap up, because I know most people won't ;)

If the goal is to allow vhost-proxy to reasonably identify sandboxes that should be cleaned up over time, then this is a use case that will need to be solved. Happy to contribute where I can in this space, as this is a feature I badly want to see working; I fear becoming a pack rat otherwise :)

@lpeabody
Contributor Author

Could you add a log entry in cli every time fin exec is run?

@lpeabody
Contributor Author

moby/moby#8662 (comment)

Looks like an interesting approach; I wonder if this is something that could be experimented with?

@lmakarov
Member

It is possible to forward stdout and stderr output from a command executed inside cli to the container logs like this:

$ fin exec 'composer --version 2>&1 | sudo tee /proc/1/fd/1'
Composer version 1.6.3 2018-01-31 16:28:17
$ fin logs cli
...
cli_1  | Composer version 1.6.3 2018-01-31 16:28:17

tee writes into /proc/1/fd/1 (stdout of PID 1 in the container, which Docker uses for logging) as well as printing the output on the screen.

@achekulaev
Member

@lmakarov I guess @lpeabody meant whether we could alter fin to write something to cli's stdout upon fin exec.

@lmakarov
Member

So, I don't think we'd want to stream the output of every exec command to container logs.
Writing a log entry with the command details before and (possibly) after it finishes is an option.

This can either be done in fin OR we could add a shim script inside cli (and any other container we'd want this to work in) to execute commands with logging to docker logs.

I'm leaning towards the 2nd option, as that would move the complexity from fin to individual images. This way, fin exec would just use a standard interface with any Docksal image.
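
A minimal sketch of what such a shim could look like (the name and path are assumptions; it reuses the sudo tee /proc/1/fd/1 trick from above):

#!/usr/bin/env bash
# Hypothetical /usr/local/bin/logged-exec shim inside cli: log the command to
# the container's main process stdout before and after running it, so the
# activity shows up in `docker logs cli`.
echo "[exec] started: $*" | sudo tee /proc/1/fd/1 >/dev/null
"$@"
status=$?
echo "[exec] finished (exit ${status}): $*" | sudo tee /proc/1/fd/1 >/dev/null
exit ${status}

fin exec would then invoke the command through this shim instead of calling it directly.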

@achekulaev
Member

It would be enough to just log that fin exec was executed, so vhost-proxy knows something is happening in cli.

@lmakarov
Member

This would still require refactoring in vhost-proxy, as it only looks at the logs of a single "primary" container:

# Filter: primary containers with a defined virtual host
local running_projects=$(docker ps \
--filter "label=io.docksal.project-root" \
--filter "label=io.docksal.virtual-host" \
--format '{{ .ID }}:{{ .Label "com.docker.compose.project" }}')

We may need to introduce another label that identifies which containers within a project should be considered when determining project status.

@lpeabody
Contributor Author

A couple of scenarios I've come up with (all in regard to designating a specific service to be monitored by vhost-proxy; this does not cover updating fin exec to ping the log):

Scenario 1: Make activity monitored service configurable (any service)

I think ideally we would add a label such as io.docksal.activity-monitored, and then have an environment variable in the project configuration, ACTIVITY_MONITORED_SERVICE=serviceid, with the value specifying which service the label gets attached to. That service would also have the io.docksal.project-root label added to it to maintain cleanup compatibility.

In vhost-proxy, refactor the filter method to use label=io.docksal.activity-monitored in place of label=io.docksal.virtual-host.

The dilemma here is that Docker Compose does not make it easy to attach labels to arbitrary service names during docker-compose up (though docker-compose run does, see docker/compose#6159). So, to support arbitrary service names, we'd have to generate a compose override file for the service with that label attached and include it in COMPOSE_FILE.
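
For illustration only, a rough sketch of how such an override could be generated (the file name is made up; ACTIVITY_MONITORED_SERVICE is the hypothetical variable from above):

# Sketch: generate an override that attaches the labels to the service named
# in ACTIVITY_MONITORED_SERVICE (expanded now), leaving ${PROJECT_ROOT} for
# docker-compose to interpolate later, then include the file in COMPOSE_FILE.
cat > "${HOME}/.docksal/stacks/overrides-activity-monitored.yml" <<EOF
version: "2.1"
services:
  ${ACTIVITY_MONITORED_SERVICE}:
    labels:
      - io.docksal.activity-monitored
      - io.docksal.project-root=\${PROJECT_ROOT}
EOF
export COMPOSE_FILE="${COMPOSE_FILE}:${HOME}/.docksal/stacks/overrides-activity-monitored.yml"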

Scenario 2: Designate either cli or web to be the monitored service

All of the above, except replace ACTIVITY_MONITORED_SERVICE with MONITOR_CLI_SERVICE=true, and leave web as the default (current functionality).

Create ~/.docksal/stacks/overrides-monitored-service.yml:

version: "2.1"

services:
  cli:
    labels:
      - io.docksal.project-root=${PROJECT_ROOT}
      - io.docksal.activity-monitored
      - io.docksal.permanent=${SANDBOX_PERMANENT:-false}
  web:
    labels:
      - io.docksal.virtual-host=${VIRTUAL_HOST},*.${VIRTUAL_HOST},${VIRTUAL_HOST}.*
      - io.docksal.cert-name=${VIRTUAL_HOST_CERT_NAME:-none}
      - io.docksal.project-root=${PROJECT_ROOT}

and update ~/.docksal/stacks/services.yml:

...
  web:
    hostname: web
    image: ${WEB_IMAGE:-docksal/web:2.1-apache2.4}
    volumes:
      - project_root:/var/www:ro,nocopy  # Project root volume (read-only)
    labels:
      - io.docksal.virtual-host=${VIRTUAL_HOST},*.${VIRTUAL_HOST},${VIRTUAL_HOST}.*
      - io.docksal.cert-name=${VIRTUAL_HOST_CERT_NAME:-none}
      - io.docksal.project-root=${PROJECT_ROOT}
      - io.docksal.permanent=${SANDBOX_PERMANENT:-false}
      - io.docksal.activity-monitored
    environment:
      - APACHE_DOCUMENTROOT=/var/www/${DOCROOT:-docroot}
      - APACHE_BASIC_AUTH_USER
      - APACHE_BASIC_AUTH_PASS
    dns:
      - ${DOCKSAL_DNS1}
      - ${DOCKSAL_DNS2}
...

And of course update fin to check the value of MONITOR_CLI_SERVICE and add the override file to COMPOSE_FILE if true.
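
Roughly something like this inside fin (a sketch only; the actual variable handling in fin will differ):

# Sketch: append the override to COMPOSE_FILE when MONITOR_CLI_SERVICE is enabled.
if [[ "${MONITOR_CLI_SERVICE}" == "true" ]]; then
	export COMPOSE_FILE="${COMPOSE_FILE}:${HOME}/.docksal/stacks/overrides-monitored-service.yml"
fi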

Final lunchbreak thoughts

I think scenario 2 is the best bet; it's entirely manageable. Once docker/compose#6159 lands, we could refactor to make the monitored service arbitrary (thinking about scalability down the line, for when/if Docksal can support projects that don't have cli or web services).

What are your thoughts? Chances are I missed something, but I think I checked the most important boxes.

@lmakarov
Member

@lpeabody I feel like there should be a single primary container in the stack that is used to determine the stack activity status.

ATM, that is whichever container has the following two labels:

io.docksal.project-root
io.docksal.virtual-host

# Filter: primary containers with a defined virtual host
local running_projects=$(docker ps \
--filter "label=io.docksal.project-root" \
--filter "label=io.docksal.virtual-host" \
--format '{{ .ID }}:{{ .Label "com.docker.compose.project" }}')

In the past, this used to be done by name (web). Then we changed to labels to allow for "web-less" stacks or even individual containers.

You can assign these two labels to any container in your stack and it will be picked up for monitoring by vhost-proxy.

See our node stack as an example of using this approach:
https://github.com/docksal/docksal/blob/develop/stacks/stack-node.yml#L10-L15

We could replace the two-label combination with a single label (e.g. io.docksal.activity-monitored or io.docksal.primary).

Now, I'm not sure what's going to happen if you have more than a single "primary" container in your stack defined using either of those approaches. This needs to be tested.
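
A quick way to check that (a sketch using throwaway containers just to exercise the filter; in a real stack both containers would also share the same com.docker.compose.project value):

# Start two containers that both carry the "primary" labels...
docker run -d --name primary-test-1 \
	--label io.docksal.project-root=/tmp/p1 \
	--label io.docksal.virtual-host=p1.docksal \
	busybox sleep 3600
docker run -d --name primary-test-2 \
	--label io.docksal.project-root=/tmp/p1 \
	--label io.docksal.virtual-host=p1.docksal \
	busybox sleep 3600

# ...and confirm the filter returns both of them.
docker ps \
	--filter "label=io.docksal.project-root" \
	--filter "label=io.docksal.virtual-host" \
	--format '{{ .ID }}:{{ .Label "com.docker.compose.project" }}'

docker rm -f primary-test-1 primary-test-2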

@lpeabody
Contributor Author

To be clear, I'm not advocating for more than one container to represent whether or not the project is active. My thinking is that it makes sense to have a single container determine project status (aligned with your thinking above): indicate that container with a meaningful label name and filter on that label, with the caveat that io.docksal.project-root must be included alongside it to maintain backwards compatibility with removing sandbox directories after the dangling timeout.

The reason I think io.docksal.virtual-host should not be used is that it doesn't really make sense to attach it to a cli service that isn't hosting any web content. It probably wouldn't hurt to have it; I just try to keep things as lean as possible. The stack-node example you linked to is fine because the cli container actually needs to be web accessible in that instance, but when the service is only used to compile and push artifacts it's unnecessary.

I see the cleanup logic kind of going like this:

In proxyctl stop:

# Filter: primary containers
local running_projects=$(docker ps \
	--filter "label=io.docksal.primary" \
	--format '{{ .ID }}:{{ .Label "com.docker.compose.project" }}')

And proxyctl cleanup:

# Filter: primary containers with a project root
projects=$(docker ps -a \
	--filter "label=io.docksal.project-root" \
	--filter "label=io.docksal.primary" \
	--format '{{ .ID }}:{{ .Label "com.docker.compose.project" }}:{{ .Label "io.docksal.project-root" }}:{{ .Label "io.docksal.permanent" }}')

TL;DR: Definitely think io.docksal.primary/io.docksal.activity-monitored should be unique to a single container. io.docksal.virtual-host should not be used when determining activity.

@lmakarov
Member

ok, that makes sense.

I would even remove --filter "label=io.docksal.project-root" from the proxyctl cleanup filter and just skip containers that do not have io.docksal.project-root set to a valid directory.

This way, io.docksal.primary is the single label used to determine which container holds the additional configuration via the extra labels (like io.docksal.permanent or io.docksal.project-root).
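
Something along these lines in proxyctl cleanup (a sketch of the skip logic, not the actual implementation):

# Sketch: list primary containers, then skip any whose io.docksal.project-root
# label does not point at a valid directory.
projects=$(docker ps -a \
	--filter "label=io.docksal.primary" \
	--format '{{ .ID }}:{{ .Label "com.docker.compose.project" }}:{{ .Label "io.docksal.project-root" }}:{{ .Label "io.docksal.permanent" }}')

for project in $projects; do
	project_root=$(echo "$project" | cut -d: -f3)
	[[ -d "$project_root" ]] || continue
	# ...existing cleanup logic for this project...
done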
