Dockerized standalone Fleet Server #2359

jsoriano · 2023-02-20T20:08:41Z

What is the problem this PR solves?

Allow running Fleet Server standalone, without check-in, in production builds.

How does this PR solve the problem?

Standalone mode, previously used only in development mode, can now also be used in release builds.
The Elasticsearch version check is skipped in standalone mode.
Migrations are not run in standalone mode.
In standalone mode, health is determined by checking that the service can read the policies index.

After this, fleet-server only needs an Elasticsearch service account to start.

How to test this PR locally

fleet-server -c ./fleet-server.yml -E output.elasticsearch.ssl.verification_mode=none -E fleet.agent.checkin=false -E output.elasticsearch.service_token=....

Or run make release-docker and use the built Docker image in your Docker or Kubernetes environment.

Design Checklist

I have ensured my design is stateless and will work when multiple fleet-server instances are behind a load balancer.
I have or intend to scale test my changes, ensuring it will work reliably with 100K+ agents connected.
I have included fail safe mechanisms to limit the load on fleet-server: rate limiting, circuit breakers, caching, load shedding, etc.

Checklist

I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
I have made corresponding changes to the default configuration files
I have added tests that prove my fix is effective or that my feature works
I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Related issues

jsoriano · 2023-02-20T20:12:09Z

internal/pkg/server/fleet.go

+	if cfg.Fleet.Agent.Checkin {
+		sm := policy.NewSelfMonitor(cfg.Fleet, bulker, pim, cfg.Inputs[0].Policy.ID, f.reporter)
+		if f.standAlone {


For now I am assuming that "check in" and "standalone" mode are two different things, but I wonder whether there is any use case for running standalone with check-in (maybe only for development?), or for running under Agent without check-in.

More about this in #2359 (comment)

I do not think there is any value in supporting check-in at all in standalone mode, but for that we probably need a change in Kibana to remove the restriction of requiring a healthy Fleet Server agent. I can make a PR that introduces that change with a feature flag if you need it for testing.

I agree with @nchaulet: registration + check-in was added just to get around the Kibana restrictions.
If we wanted to release a true "stand-alone" server, it should not even need to enroll itself.

We can then fully remove standAloneSetup and standAloneCheckin?

> I can make a PR that introduces that change with a feature flag if you need it for testing.

@nchaulet that would be great. I guess this would solve part of the "Add Fleet Server" UI Components in https://github.com/elastic/ingest-dev/issues/1530. Instead of using a feature flag we could maybe use the check mentioned there if it is already available.

jsoriano · 2023-02-20T20:12:59Z

fleet-server.yml

@@ -3,12 +3,15 @@ output:
  elasticsearch:
    hosts: '${ELASTICSEARCH_HOSTS:https://localhost:9200}'
    service_token: '${ELASTICSEARCH_SERVICE_TOKEN}'
-    ssl.ca_trusted_fingerprint: '${ELASTICSEARCH_CA_TRUSTED_FINGERPRINT}'
+    # ssl.ca_trusted_fingerprint: '${ELASTICSEARCH_CA_TRUSTED_FINGERPRINT}'


Why is this mandatory now?

By default, Elasticsearch runs over HTTPS.

jsoriano · 2023-02-20T20:17:02Z

@michel-laterman please take a look at this draft to check whether I am going in the right direction 🙂 Thanks!

elasticmachine · 2023-02-20T20:17:46Z

💚 Build Succeeded

Build stats

Start Time: 2023-03-06T16:22:24.994+0000
Duration: 13 min 23 sec

Test stats 🧪

Test	Results
Failed	0
Passed	605
Skipped	1
Total	606

💚 Flaky test report

Tests succeeded.

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

/test : Re-trigger the build.

amitkanfer · 2023-02-21T07:57:13Z

@jsoriano wouldn't it be best to have a new "standalone" config that toggles the new checkin flag you're adding here? I guess that in the future we'll need to control more configs than just this one...
WDYT?

alexsapran · 2023-02-21T11:27:07Z

We discussed this with @joshdover.

We could use the internal Docker registry and publish the container image under the employees path.

After some quick searching in GH I noticed how it's done in the endpoint-dev repo for some Docker-related builds: setting an NS (namespace) to docker.elastic.co/employees/ instead of docker.elastic.co/.

jsoriano · 2023-02-21T12:15:24Z

> @jsoriano wouldn't it be best to have a new "standalone" config that toggles the new checkin flag you're adding here? I guess that in the future we'll need to control more configs than just this one... WDYT?

Yeah, I guess this is related to my other question in #2359 (comment). It depends on the use cases we support: if we want to disable check-in for all standalone Fleet Servers, then we can have a single "standalone" config that also disables check-in.

But we currently also have an -agent-mode flag that disables standalone mode, and I think we may have cases where we want to enable or disable check-in independently of whether we run in agent mode or in standalone mode:

| Use case | `-agent-mode` (no standalone) | Check in |
| --- | --- | --- |
| Current Fleet Server managed by Agent | true | true |
| Current standalone Fleet Server for development | false | true |
| Standalone Fleet Server on premise | false | true |
| Standalone managed Fleet Server (this PR) | false | false |

This is why, in principle, I chose to make this an explicit setting.
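
To make the combinations in the table concrete, here is a small sketch that treats the two settings as independent switches. The `Mode` type and its field names are hypothetical and do not mirror the actual fleet-server configuration structs.

```go
package main

// Mode models the two settings from the table as independent booleans.
// Both field names are hypothetical.
type Mode struct {
	AgentMode bool // true when running under Elastic Agent (not standalone)
	Checkin   bool // true when Fleet Server checks in as an agent
}

// Describe maps a flag combination to the use cases listed in the table.
func (m Mode) Describe() string {
	switch {
	case m.AgentMode && m.Checkin:
		return "managed by Agent"
	case !m.AgentMode && m.Checkin:
		return "standalone with check-in (development or on premise)"
	case !m.AgentMode && !m.Checkin:
		return "standalone managed, no check-in"
	default:
		// Agent mode without check-in does not appear in the table.
		return "agent mode without check-in (not covered by the table)"
	}
}
```

Keeping the switches separate leaves the fourth combination representable, even though no current use case needs it.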

jsoriano · 2023-02-21T12:16:52Z

> We discussed this with @joshdover.
>
> We could use the internal Docker registry and publish the container image under the employees path.
>
> After some quick searching in GH I noticed how it's done in the endpoint-dev repo for some Docker-related builds: setting an NS (namespace) to docker.elastic.co/employees/ instead of docker.elastic.co/.

@alexsapran sounds good. There is a separate issue, #2352, about publishing the image being added here. Let's keep that discussion there.

alexsapran · 2023-02-21T12:36:25Z

> > We discussed this with @joshdover.
> >
> > We could use the internal Docker registry and publish the container image under the employees path.
> >
> > After some quick searching in GH I noticed how it's done in the endpoint-dev repo for some Docker-related builds: setting an NS (namespace) to docker.elastic.co/employees/ instead of docker.elastic.co/.
>
> @alexsapran sounds good. There is a separate issue, #2352, about publishing the image being added here. Let's keep that discussion there.

Makes sense, thanks for the link. I was not aware of that other issue; I will cross-post my comment there.

michel-laterman · 2023-02-21T20:19:25Z

Dockerfile

@@ -0,0 +1,22 @@
+ARG GO_VERSION
+FROM golang:${GO_VERSION}-buster AS builder


Why are we using this instead of the same image that we base https://github.com/elastic/fleet-server/blob/main/Dockerfile.build on?

The two Dockerfiles have different purposes. Dockerfile.build is used to build the release packages for all platforms, and writes the generated packages to the working copy. This one only builds the binary for the platform it is executed on, without writing to the working copy, and builds the Docker image from it.

I could make a common Dockerfile for both to import, but I don't think they would share much code. We would also need to somehow handle the different platforms here.

It would be good, though, to use make release here, to ensure that we are building the same release artifacts in both cases.

I will give this some more thought. I am mainly using this Dockerfile now to have an image for testing on Kubernetes. We could have a more final Dockerfile later.

I have changed the build command in the Dockerfile to use make release-linux/amd64 instead, so that we build with the same options as other release builds.

We will have to revisit this if we support other OSs or architectures.

mergify · 2023-02-27T15:56:29Z

This pull request now has conflicts. Could you fix it @jsoriano? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b fleet-server-standalone-build upstream/fleet-server-standalone-build
git merge upstream/main
git push upstream fleet-server-standalone-build

jsoriano · 2023-03-01T21:07:32Z

Opening for review. @michel-laterman @nchaulet please take another look.

juliaElastic · 2023-03-02T09:30:56Z

This is great! It would be nice to add the example start command to the README.

nchaulet · 2023-03-02T15:04:28Z

Looks great to me, I tested this locally and it worked well!

michel-laterman

lgtm; the only other thing I can think of is to add instructions in the readme's dev section on how to run Kibana so that you can run a stand-alone fleet-server.

michel-laterman · 2023-03-02T17:45:51Z

internal/pkg/policy/standalone_test.go

+// you may not use this file except in compliance with the Elastic License.
+
+//go:build !integration
+// +build !integration


(nitpick) I don't think we need the // +build directives

👍 Removed from all files.

jsoriano · 2023-03-03T10:13:09Z

> It would be nice to add the example start command to the README.

> the only other thing I can think of is to add instructions in the readme's dev section on how to run Kibana so that you can run a stand-alone fleet-server.

Added docs about building the docker image for standalone and about the experimental flag to be able to use the Fleet UI. @juliaElastic @michel-laterman please take another look.

juliaElastic

LGTM, thanks for adding the commands to the README.

joshdover

We should make it very clear in migration.go that no new migrations should be added going forward. Even better if we make the application crash when a new one is added.
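
One way the "crash if a new migration is added" idea could look is a startup guard on the number of registered migrations. This is only a sketch under assumptions: `maxMigrations`, `assertMigrationsFrozen` and `mustBeFrozen` are hypothetical names, the count of 2 is a placeholder, and the way migration.go actually registers migrations may differ.

```go
package main

import "fmt"

// maxMigrations is the number of migrations that existed when the list
// was frozen. The value 2 is a placeholder for illustration only.
const maxMigrations = 2

// assertMigrationsFrozen returns an error if more migrations are
// registered than existed at freeze time.
func assertMigrationsFrozen(names []string) error {
	if len(names) > maxMigrations {
		return fmt.Errorf("found %d migrations, at most %d are allowed: no new migrations may be added",
			len(names), maxMigrations)
	}
	return nil
}

// mustBeFrozen crashes the process at startup when the guard fails.
func mustBeFrozen(names []string) {
	if err := assertMigrationsFrozen(names); err != nil {
		panic(err)
	}
}
```

A unit test asserting the same count would catch the problem earlier, at CI time, without relying on a runtime crash.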

internal/pkg/server/fleet.go

README.md

jsoriano added 5 commits February 20, 2023 18:50

Allow standalone mode in release builds

239e9cd

Remove self-monitor

e65d110

Comment-out fingerprint

96ec861

Set check-in to true by default

d8f4704

Add Dockerfile

15cffed

jsoriano requested review from nchaulet and michel-laterman February 20, 2023 20:08

Remove redundant if

8acc020

jsoriano commented Feb 20, 2023

Remove unreachable code

90c152e

michel-laterman reviewed Feb 21, 2023

nchaulet mentioned this pull request Feb 22, 2023

[Fleet] Add featureFlag to support standalone fleet server elastic/kibana#151865

Merged

alexsapran mentioned this pull request Feb 22, 2023

Add CI pipeline to publish docker images for standalone mode #2352

Closed

jsoriano added 3 commits February 27, 2023 12:57

Merge remote-tracking branch 'origin/main' into fleet-server-standalo…

9fac041

…ne-build

Remove checkin code for standalone

4916bda

Fix changelog

715d446

jsoriano added 5 commits March 1, 2023 13:26

Merge remote-tracking branch 'origin/main' into fleet-server-standalo…

7f10c60

…ne-build

Merge remote-tracking branch 'origin/main' into fleet-server-standalo…

e153f9d

…ne-build

Disable version compatibility check with Elasticsearch and migrations

c78b1a9

Implement a self monitor for stand-alone

aafd2c8

Fix linting

2388fe9

jsoriano self-assigned this Mar 1, 2023

jsoriano added 4 commits March 1, 2023 20:20

Remove unused code

7e35e46

Leftover

b76d153

Fix linting

8334f45

Merge remote-tracking branch 'origin/main' into fleet-server-standalo…

b6d5736

…ne-build

jsoriano marked this pull request as ready for review March 1, 2023 21:07

jsoriano requested review from a team as code owners March 1, 2023 21:07

jsoriano requested a review from AndersonQ March 1, 2023 21:07

michel-laterman approved these changes Mar 2, 2023

jsoriano and others added 5 commits March 3, 2023 10:37

Use make release to build dockr image, add to Jenkinsfile

228d848

Add docs

fbe3c41

Merge remote-tracking branch 'origin/main' into fleet-server-standalo…

8101d00

…ne-build

Include example commands

6c53bd3

Remove build tags

30cf152

juliaElastic approved these changes Mar 6, 2023

joshdover reviewed Mar 6, 2023

jsoriano and others added 3 commits March 6, 2023 15:46

Update README.md

13c2536

Co-authored-by: Josh Dover <1813008+joshdover@users.noreply.github.com>

Add comments about version check and migrations

996513f

Merge remote-tracking branch 'origin/main' into fleet-server-standalo…

a95b146

…ne-build

jsoriano merged commit 94cccad into elastic:main Mar 6, 2023

jsoriano deleted the fleet-server-standalone-build branch March 6, 2023 17:37
