[Fleet]: Fleet Server stuck in updating state after installation. #1934

Closed
ghost opened this issue Sep 29, 2022 · 8 comments
Labels
blocker bug Something isn't working impact:medium QA:Validated Validated by the QA Team Team:Fleet Label for the Fleet team

Comments

@ghost

ghost commented Sep 29, 2022

Kibana version: 8.4.3 BC2 Kibana Cloud environment

Host OS and Browser version: Windows

Build details:
VERSION: 8.4.3 BC2
Build: 55572
Commit: 1ceb607762eaafa726c61d6eee5b95359142d4c4

Preconditions:

  • 8.4.3 BC2 Kibana Cloud environment should be available.

Steps to reproduce:

  1. On a fresh Kibana setup, navigate to the Fleet tab.
  2. Click 'Add Fleet Server'.
  3. Generate a Fleet Server policy with a host URL, e.g. https://10.10.10.10:8220
  4. Copy the generated command and run it in the CLI.
  5. Fleet Server is installed.
  6. Observe that Fleet Server is stuck in the updating state.

Expected Result:
Fleet Server should not be stuck in the updating state after installation and should be healthy.

Screenshot:
image

Elastic-agent logs:
elastic-agent-diagnostics-2022-09-29T11-31-36Z-00.zip

@ghost ghost added bug Something isn't working Team:Fleet Label for the Fleet team impact:high Short-term priority; add to current release, or definitely next. labels Sep 29, 2022
@dikshachauhan-qasource dikshachauhan-qasource added impact:medium and removed impact:high Short-term priority; add to current release, or definitely next. labels Sep 29, 2022
@dikshachauhan-qasource

Secondary Review is done.

@cmacknz
Member

cmacknz commented Sep 29, 2022

There are no agent logs in the attached diagnostics. @joshdover where do we need to look to find the fleet server logs associated with this?

@ghost
Author

ghost commented Sep 30, 2022

Hi @cmacknz
I noticed that the logs attached earlier were incorrect. The correct logs are attached below:

Elastic-agent log:
elastic-agent-diagnostics-2022-09-30T04-12-58Z-00.zip

Please let us know if anything else is required from our end.
Thanks

@joshdover
Contributor

I don't think this is a Fleet Server problem; the logs for FS look normal to me. There are a couple of other interesting things in the logs.

It looks like the agent is in a bootloop due to this error:

{"log.level":"error","@timestamp":"2022-09-30T03:55:41.968Z","log.origin":{"file.name":"status/reporter.go","file.line":326},"message":"Elastic Agent status changed to \"error\": \"app filebeat--8.4.3--36643631373035623733363936343635-dfb4be0f: \\\"filebeat_monitoring\\\" failed to prepare monitor for \\\"Filebeat\\\": failed to create a directory \\\"\\\": mkdir : The system cannot find the path specified.\"","ecs.version":"1.6.0"}
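For anyone triaging a similar diagnostics bundle, here is a minimal sketch of pulling the error-level entries out of newline-delimited JSON logs like the line above. The `errorMessages` helper and the `logEntry` type are hypothetical names made up for this example; only the `log.level` and `message` keys are taken from the quoted line.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// logEntry models only the two fields used here; other keys are ignored.
type logEntry struct {
	Level   string `json:"log.level"`
	Message string `json:"message"`
}

// errorMessages is a hypothetical helper: it returns the "message" field of
// every line whose log.level is "error". Blank lines are skipped.
func errorMessages(logText string) ([]string, error) {
	var out []string
	sc := bufio.NewScanner(strings.NewReader(logText))
	// Raise the per-line limit, since agent log lines can be long.
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024)
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" {
			continue
		}
		var e logEntry
		if err := json.Unmarshal([]byte(line), &e); err != nil {
			return nil, fmt.Errorf("parse log line: %w", err)
		}
		if e.Level == "error" {
			out = append(out, e.Message)
		}
	}
	return out, sc.Err()
}

func main() {
	// The log line quoted in this issue, copied verbatim.
	sample := `{"log.level":"error","@timestamp":"2022-09-30T03:55:41.968Z","log.origin":{"file.name":"status/reporter.go","file.line":326},"message":"Elastic Agent status changed to \"error\": \"app filebeat--8.4.3--36643631373035623733363936343635-dfb4be0f: \\\"filebeat_monitoring\\\" failed to prepare monitor for \\\"Filebeat\\\": failed to create a directory \\\"\\\": mkdir : The system cannot find the path specified.\"","ecs.version":"1.6.0"}`

	msgs, err := errorMessages(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, m := range msgs {
		fmt.Println(m)
	}
}
```

Decoding the line also removes one level of escaping, which makes the empty directory name in the message easier to spot.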

There are 8 elastic-agent log files and all of them have exactly the same contents, right down to the timestamps in milliseconds. This seems like a bug of sorts.
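One way to confirm that the files really are byte-for-byte identical is to group them by content hash. This is only a sketch: `groupByDigest` is a hypothetical helper, and the file names passed on the command line are whatever the diagnostics archive contains.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"os"
	"sort"
)

// groupByDigest maps a SHA-256 hex digest to the sorted names of all inputs
// that have exactly that content.
func groupByDigest(files map[string][]byte) map[string][]string {
	groups := make(map[string][]string)
	for name, data := range files {
		sum := sha256.Sum256(data)
		key := hex.EncodeToString(sum[:])
		groups[key] = append(groups[key], name)
	}
	for _, names := range groups {
		sort.Strings(names)
	}
	return groups
}

func main() {
	// Pass the extracted log files as arguments.
	files := make(map[string][]byte)
	for _, p := range os.Args[1:] {
		data, err := os.ReadFile(p)
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
		files[p] = data
	}
	for digest, names := range groupByDigest(files) {
		if len(names) > 1 {
			fmt.Printf("%s: identical content in %v\n", digest[:12], names)
		}
	}
}
```

If all eight files end up in a single group, the duplication is in the written files themselves and not in how they were read.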

The agent policy is empty in the diagnostic, but I think that's due to the agent not checking in with Fleet Server (itself) yet. There's this error generated while capturing the diag itself:

Error 1: unable to gather diagnostics data: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing open \\\\.\\pipe\\elastic-agent-system: The system cannot find the file specified."
Error 2: unable to gather metrics data: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing open \\\\.\\pipe\\elastic-agent-system: The system cannot find the file specified."
Error 3: unable to gather config data: no fleet config retrieved yet

@cmacknz
Member

cmacknz commented Oct 3, 2022

Thanks, having logs definitely helps. This error seems to be coming from here in the agent. It looks like the path of the directory it is trying to create is empty (\\\"\\\").

// Prepare executes steps in order for monitoring to work correctly
func (b *SidecarMonitor) Prepare(spec program.Spec, pipelineID string, uid, gid int) error {
	endpoint := MonitoringEndpoint(spec, b.operatingSystem, pipelineID, true)
	drop := monitoringDrop(endpoint)

	if err := os.MkdirAll(drop, 0775); err != nil {
		return errors.New(err, fmt.Sprintf("failed to create a directory %q", drop))
	}

@michalpristas any idea what is going on here?
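To illustrate the failure mode: `os.MkdirAll` returns an error when it is given an empty path, which matches the shape of the logged message. The sketch below is not the agent's code and not the actual fix. `prepareDrop` and `errEmptyDrop` are hypothetical names, and the explicit empty-path check is only an assumption about how a caller could surface the real problem (an unset drop path) earlier.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// errEmptyDrop is a hypothetical sentinel error used only in this sketch.
var errEmptyDrop = errors.New("monitoring drop directory path is empty")

// prepareDrop is a hypothetical stand-in for the directory-creation step.
// It rejects an empty path up front instead of passing it to os.MkdirAll.
func prepareDrop(drop string) error {
	if drop == "" {
		return errEmptyDrop
	}
	if err := os.MkdirAll(drop, 0o775); err != nil {
		return fmt.Errorf("failed to create a directory %q: %w", drop, err)
	}
	return nil
}

func main() {
	// Calling MkdirAll directly with "" fails; the exact text is OS-specific.
	fmt.Println("MkdirAll with empty path:", os.MkdirAll("", 0o775))

	// With the guard, the caller gets a more descriptive error.
	fmt.Println("prepareDrop with empty path:", prepareDrop(""))

	// A non-empty path under a temporary directory succeeds.
	tmp, err := os.MkdirTemp("", "drop-example")
	if err != nil {
		fmt.Println("temp dir:", err)
		return
	}
	defer os.RemoveAll(tmp)
	fmt.Println("prepareDrop with real path:", prepareDrop(filepath.Join(tmp, "monitoring", "drop")))
}
```

A guard like this does not fix whatever leaves the path empty; it only turns the confusing `mkdir :` message into one that names the missing value.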

@michalpristas
Contributor

michalpristas commented Oct 3, 2022

This was fixed, but the fix did not make it into BC2.
image
* top (cd40) is the fix, bottom (5f09) is the release

Fix PR: elastic/elastic-agent#1371
Issue: elastic/elastic-agent#1361

@cmacknz cmacknz added the blocker label Oct 3, 2022
@cmacknz
Member

cmacknz commented Oct 3, 2022

Closing as fixed by elastic/elastic-agent#1372

@cmacknz cmacknz closed this as completed Oct 3, 2022
@amolnater-qasource amolnater-qasource added the QA:Ready For Testing Code is merged and ready for QA to validate label Oct 4, 2022
@ghost ghost added QA:Validated Validated by the QA Team and removed QA:Ready For Testing Code is merged and ready for QA to validate labels Oct 5, 2022
@ghost
Author

ghost commented Oct 5, 2022

Hi Team,
We have revalidated this issue on the latest 8.4.3 BC3 Kibana Cloud environment and found it fixed.

  • Fleet Server is healthy after installation.

Build Details:
Version: 8.4.3 BC3
BUILD: 55572
COMMIT: 1ceb607762eaafa726c61d6eee5b95359142d4c4

Screenshot:

image
Hence, marking this as QA:Validated.
Thanks

@ghost ghost added QA:Validated Validated by the QA Team and removed QA:Validated Validated by the QA Team labels Oct 5, 2022