Add a healthcheck. #142

Status: Open. Wants to merge 3 commits into base: main.


Contributor

@holdenk holdenk commented Dec 3, 2022

See #124.
We can expand the healthcheck to do more than just check the localhost HTTP endpoint, but I figured this was a good start.

Dockerfile Outdated
@@ -3,6 +3,9 @@ ARG NGINX_NAME="nginx-${NGINX_VERSION}"

FROM debian AS build

HEALTHCHECK --interval=30s --timeout=3s --start-period=60s \
Contributor


@holdenk this is in the wrong place... 😸 Because this Dockerfile uses a multi-stage build, this HEALTHCHECK belongs after FROM nginx (not the FROM debian AS build where it is currently proposed).

Contributor Author


Ah, gotcha. Yeah, a health check in the build container is probably not very useful.
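
A quick way to confirm that the instruction ended up in the final image, rather than in the discarded build stage, is to inspect the built image's config. The sketch below assumes the Docker CLI is available; `has_healthcheck` and the image name in the usage comment are hypothetical and not part of this PR.

```shell
# Succeeds only if the given image's config carries a HEALTHCHECK.
# `has_healthcheck` is a hypothetical helper name, not part of this repo.
has_healthcheck() {
  hc="$(docker inspect --format '{{json .Config.Healthcheck}}' "$1")" || return 2
  # An image built without a HEALTHCHECK prints "null" here.
  [ -n "$hc" ] && [ "$hc" != "null" ]
}

# Usage (image name is a placeholder):
#   has_healthcheck my-gateway-image:latest && echo "healthcheck present"
```

In a multi-stage build only the last stage's metadata reaches the final image, so a HEALTHCHECK placed under FROM debian AS build would make this check fail.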

Dockerfile Outdated (resolved)
holdenk and others added 2 commits December 4, 2022 09:12
Co-authored-by: Michael Vorburger ⛑️ <mike@vorburger.ch>
Signed-off-by: Holden Karau <holden@pigscanfly.ca>
Contributor

@vorburger vorburger left a comment


LGTM! (But I haven't tested it in production.)

@@ -85,6 +85,9 @@ RUN echo "Cloning nginx and building $NGINX_VERSION (rev $NGINX_COMMIT from '$NG

FROM nginx:${NGINX_VERSION}

HEALTHCHECK --interval=30s --timeout=3s --start-period=60s \
Contributor


I'm wondering what a suitable interval could be... probably something LOWER than the health check that the Orchestrator runs against Nodes? (I don't know what intervals those are set to; see my question yesterday on Slack.) I'm currently using 7s on my livenessProbe, but chose that somewhat arbitrarily. The 30s seems relatively high to me, but I'm sure more knowledgeable reviewers might have some thoughts about this.
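
For a rough sense of what the interval means in practice: Docker marks a container unhealthy after `--retries` consecutive failed checks, and since this HEALTHCHECK does not set `--retries`, the default of 3 applies. The sketch below estimates detection time as interval times retries, ignoring the timeout and the start period; `time_to_unhealthy` is a hypothetical helper.

```shell
# Rough estimate, in seconds, of how long a persistent failure takes
# to flip the container to "unhealthy": interval * retries.
# `time_to_unhealthy` is a hypothetical helper name.
time_to_unhealthy() {
  interval_s="$1"
  retries="${2:-3}"   # Docker's default for --retries is 3
  echo $(( interval_s * retries ))
}

time_to_unhealthy 30   # the --interval=30s proposed in this PR
time_to_unhealthy 7    # the 7s period mentioned for the livenessProbe
```

Under these assumptions the two calls print 90 and 21, so the proposed interval reacts to a failure in about a minute and a half.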

@@ -85,6 +85,9 @@ RUN echo "Cloning nginx and building $NGINX_VERSION (rev $NGINX_COMMIT from '$NG

FROM nginx:${NGINX_VERSION}

HEALTHCHECK --interval=30s --timeout=3s --start-period=60s \
CMD (curl -f http://localhost/ipfs/QmXjYBY478Cno4jzdCcPy4NcJYFrwHZ51xaCP8vUwN9MGm/) || exit 1
Contributor

@vorburger vorburger Dec 4, 2022


Wait, does this actually work? If I run curl -f http://192.168.1.xxx:31080/ipfs/QmXjYBY478Cno4jzdCcPy4NcJYFrwHZ51xaCP8vUwN9MGm/ from the CLI, it has exit code 23, see https://everything.curl.dev/usingcurl/returns... which would fail this HEALTHCHECK, I think.

Contributor Author


Weird, when I run this inside the container, what I see is:

root@DEN:/usr/src/app# curl -f http://localhost/ipfs/QmXjYBY478Cno4jzdCcPy4NcJYFrwHZ51xaCP8vUwN9MGm/
<html>
<head><title>301 Moved Permanently</title></head>
<body>
<center><h1>301 Moved Permanently</h1></center>
<hr><center>nginx</center>
</body>
</html>
root@DEN:/usr/src/app# echo $?
0

Contributor

@vorburger vorburger Dec 4, 2022


Duh, yeah, please ignore that... sorry. But while looking into this, I noticed that as-is it would actually pass (exit code 0) whether or not it "really worked" (e.g. even if it's a wrong IPFS CID, or there is a problem resolving it, it's still always just a 301). I've had a closer look at the curl options; what do you think about doing the following:

curl --fail --silent --output /dev/null --location --insecure http://localhost/ipfs/QmXjYBY478Cno4jzdCcPy4NcJYFrwHZ51xaCP8vUwN9MGm

The --fail is (your) -f just perhaps more explicit in the long form.

The --silent --output /dev/null are with an eye towards #114.

The --location --insecure follows that 301 redirect, and lets us "test" that it "really worked".
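
To see what the check actually observes, it can help to print the final HTTP status next to curl's exit code. This is a minimal sketch, assuming curl is installed; `probe` is a hypothetical helper, and the flags are the ones proposed above plus `--write-out`.

```shell
# Print the final HTTP status and curl's exit code for one URL.
# `probe` is a hypothetical helper name, not part of this PR.
probe() {
  status="$(curl --fail --silent --output /dev/null --location --insecure \
    --write-out '%{http_code}' "$1")"
  rc=$?   # exit status of the curl call inside the command substitution
  printf 'http=%s exit=%s\n' "$status" "$rc"
  return "$rc"
}

# Usage, e.g. inside the container:
#   probe http://localhost/ipfs/QmXjYBY478Cno4jzdCcPy4NcJYFrwHZ51xaCP8vUwN9MGm/
```

With --fail, an HTTP status of 400 or above at the end of the redirect chain makes curl exit with code 22, whereas a bare 301 that is not followed still exits 0, which is why the original check could not tell success from failure.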

Contributor


Suggested change
- CMD (curl -f http://localhost/ipfs/QmXjYBY478Cno4jzdCcPy4NcJYFrwHZ51xaCP8vUwN9MGm/) || exit 1
+ CMD (curl --fail --silent --output /dev/null --location --insecure http://localhost/ipfs/QmXjYBY478Cno4jzdCcPy4NcJYFrwHZ51xaCP8vUwN9MGm/) || exit 1

@holdenk LGTY?

Contributor


@holdenk do you want to click Accept / Apply for this?
