[REF292] containerization of live_data_server #1

KedoKudo · 2022-03-02T22:20:00Z

Original gitlab story: REF292

This PR adds the Dockerfile and an example docker-compose file for the containerization of the live-data-server.
Corresponding documentation on how to build and run the application as a container is added to the readme.md as well.

jmborr

I searched for string systemctl anywhere in the package and found no hits. Do you know how script third-party/systemctl.py is consumed by the app?

jmborr · 2022-03-03T19:24:17Z

Dockerfile

+
+# ENV variables that need to be updated/supplied either from CLI or
+# docker-compose.yml
+ENV DATABASE_NAME=livedata


These environment variables should not be hardcoded in the image but rather initialized at run-time when the container starts. Is there a reason why you put them in the Dockerfile?

short version:
Defining the required variable inside the dockerfile is the standard practice.

long version:
The ENV in a docker file is intended to define the env vars that can be later overwritten via docker-compose or CLI.
Therefore the actual values here does not matter as they will definitely be different when live_data_server image is integrated into other system.
If these values are not defined here, the image won't be able to recognize these are env variables, and there is a good chance that the runtime container will have some strange issue with these vars.

Defining the required variable inside the dockerfile is the standard practice.

The standard practice for sensitive information like the database admin account is to avoid encoding it the source, like we're doing in web_reflectivity. You can place those variables in an .envrc file automatically read by command direnv, or you can manually export them in the terminal where you build your images.

These are not sensitive information since they are already visible from the source code.
During the deployment, the actual Authentication information is stored locally, so there is no reason to make things extra cryptic.

@peterfpeterson please advise.

Being a bit reductionist: having a password in the source today doesn't mean that it should stay there forever.

If we were going to redeploy this for something other than the test environment, I think we'd need the ability to change where the container looks for the database and the login at runtime so there is only a configuration change (not a rebuild) to go from remote testing environment to production environment. I really don't have a strong preference, but if we leave them hard-coded with the test information now and we work on this repo "later" we'll need to remove them then.

Github doesn't let you reveal secrets once they're set. :(

For the sake of consistency with other projects, let's not put the secrets in the repo and have them as CI/dev provided variables.

These are not secrets (even if we remove them from dockerfile, they are still visible from the settings.py, which means we can't hide them).
Also, defining them in dockerfile basically tell future devs that these are the environment vars you have to specify when spinning up a container based on the image.
Currently, live_data_server does not need to talk to the data base during the build, and the db action only happens

Github doesn't let you reveal secrets once they're set. :(

bummer, I thought the behavior was the same as GitLab's regarding environment variables.
I tried adding a DATABASE_USER and you're right, once set there's no way to see its value.

These are not secrets (even if we remove them from dockerfile, they are still visible from the settings.py, which means we can't hide them).

I'm looking at settings.py and these variables are being referenced but I don't see their actual values hardcoded in the source. What do you mean they're visible?

The values stored in the current dockerfiles are dummy ones, which will work locally.
You were suggesting removing the definition completely, which is what I am against as it is pointless:

defining them in the dockerfile makes it easier for future devs to know that these are the necessary vars that need to be provided when pinning up a container

the values of these env vars in the dockerfile are not important, but it allows one to quickly spin things up for checking.

an important point I think is missing at the moment: when spinning up the image as a new container, docker will update/overwrite these var with whatever is provided.

jmborr · 2022-03-03T19:26:44Z

docker-compose.yml

+      - db-data:/var/lib/postgresql/data/
+
+  livedata:
+    image: live_data:dev


This I don't understand. Can you clarify the meaning of this line for me? I though you'd be building the image with the Dockerfile instead of what it looks like importing an already created image.

This docker-file is intended as a way to demonstrate how to integrate the live_data_server into an existing system (i.e. it will need some env var defined as well as a predefined db that has the necessary table entries).
Most dev's would build the image locally, and the tag :dev is a common tag used to describe the local developed image.
Adding the building info here doesn't really help as we will not use docker-compose to build the image.

If you don't plan on building the image with docker-compose then you'll need to pass the secret database account at docker build time. I suggest to store the docker-build command as a Makefile target because it will probably be a lengthy command.

You don't need to pass the db info during the building time.
A simple docker build -t TAGNAME . is sufficient.

It is as you say provided you leave the database admin account secrets in the Dockerfile. If you remove then, then you need to pass then as arguments to docker build.

I am a little bit lost at the moment, so here is what I recommend:

leave everything as it is (this is just for documentation purpose according to the original enabler).

the actual artifact needed is the image, which is already in Gitlab

build a local image of the live_data_server with: docker build -t live_data:dev .

spin up the virtual network with docker-compose so that we can verify that the live_data server can talk to the external database via docker-compose up

if needed, go to localhost:9999/admin, and you should get 400 error as the livedata server does not have a valid admin page enabled

this is the current state of this app, and we are not refactoring/modernizing this web app according to the original Gitlab enabler.

I'm happy with everything except that initialization of variables DATABASE_USER, DATABASE_PASS, DATABASE_PORT, and DATABASE_HOST should be removed from the .env and DockerFile files.
@peterfpeterson , sorry but it's up to you to break the tie 😄

KedoKudo · 2022-03-03T19:38:13Z

The production deployment of live_data_server is via apachectl, which is a background process.
Docker does not want background process running inside a container, therefore we need to used a fake/alt systemctl to bypass this constrain.
Currently we are using the simple runtime server provided by django since the image is (currently) intended for development only.
However, it is possible that someone will try to do a 100% mimic of the actual production server, which will require using systemctl to launch the server in background, which is why this systemctl is added to bypass the constrain from docker.

KedoKudo added 6 commits March 2, 2022 13:21

add files for containerization

b48e4ea

add ENV to dockerfile

94ad9bc

fix env vars

2a704ec

add start script

e534d8b

ensure db is ready before live data

36f7fce

add instruction on the usage of the container

557eed3

KedoKudo requested a review from rosswhitfield March 2, 2022 22:21

jmborr requested changes Mar 3, 2022

View reviewed changes

peterfpeterson merged commit 94365ff into main Mar 7, 2022

peterfpeterson deleted the REF292_containerization branch March 7, 2022 16:01

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[REF292] containerization of live_data_server #1

[REF292] containerization of live_data_server #1

KedoKudo commented Mar 2, 2022

jmborr left a comment

jmborr Mar 3, 2022

KedoKudo Mar 3, 2022

jmborr Mar 4, 2022

KedoKudo Mar 4, 2022

peterfpeterson Mar 4, 2022

peterfpeterson Mar 4, 2022

KedoKudo Mar 4, 2022

jmborr Mar 4, 2022

jmborr Mar 4, 2022

KedoKudo Mar 4, 2022

jmborr Mar 3, 2022

KedoKudo Mar 3, 2022

jmborr Mar 4, 2022

KedoKudo Mar 4, 2022

jmborr Mar 4, 2022

KedoKudo Mar 4, 2022

jmborr Mar 4, 2022

KedoKudo commented Mar 3, 2022

[REF292] containerization of live_data_server #1

[REF292] containerization of live_data_server #1

Conversation

KedoKudo commented Mar 2, 2022

jmborr left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

KedoKudo commented Mar 3, 2022