Marshal heartbeat duration as int64,string to prevent int64 -> float conversion #34280

Merged
emilioalvap merged 3 commits into elastic:main from emilio-fix-state-marshal
Jan 26, 2023

Conversation

emilioalvap
Collaborator

@emilioalvap emilioalvap commented Jan 16, 2023

What does this PR do?

Changes the marshalling of the state's duration to a string-encoded int64, per recommendation. Fixes #34218.

Why is it important?

When duration_ms gets large enough, it starts being marshalled in float format, which is incompatible with the int64 type. This prevents states from being recovered from ES, so a new state is generated every ~20 min.
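
A minimal sketch of this failure mode, assuming the value passes through a float64 somewhere between heartbeat and ES (the PR does not say where). `formatViaFloat` and `parseAsInt64` are hypothetical helpers for illustration, not heartbeat code, and 1200000 ms (20 minutes) is just a sample value.

```go
package main

import (
	"fmt"
	"strconv"
)

// formatViaFloat (hypothetical) converts an int64 to float64 and renders it
// with Go's shortest 'g' formatting, which switches to exponent notation
// for larger magnitudes.
func formatViaFloat(ms int64) string {
	return strconv.FormatFloat(float64(ms), 'g', -1, 64)
}

// parseAsInt64 (hypothetical) tries to read the text back as a base-10 int64.
func parseAsInt64(s string) (int64, error) {
	return strconv.ParseInt(s, 10, 64)
}

func main() {
	text := formatViaFloat(1200000)
	fmt.Println(text) // 1.2e+06

	// The exponent form is not valid int64 syntax, so parsing fails.
	if _, err := parseAsInt64(text); err != nil {
		fmt.Println("cannot parse as int64:", err)
	}
}
```

The point is only that a float rendering of a whole number is not guaranteed to be readable as an integer again.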

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding changes to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

How to test this PR locally

@emilioalvap emilioalvap added bug Team:obs-ds-hosted-services Label for the Observability Hosted Services team v8.6.0 labels Jan 16, 2023
@botelastic botelastic bot added needs_team Indicates that the issue/PR needs a Team:* label and removed needs_team Indicates that the issue/PR needs a Team:* label labels Jan 16, 2023
@mergify
Contributor

mergify bot commented Jan 16, 2023

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @emilioalvap? 🙏.
For such, you'll need to label your PR with:

  • The upcoming major version of the Elastic Stack
  • The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-v8./d.0 is the label to automatically backport to the 8./d branch. /d is the digit

@elasticmachine
Collaborator

elasticmachine commented Jan 16, 2023

💚 Build Succeeded

Build stats

  • Start Time: 2023-01-26T16:37:34.416+0000

  • Duration: 44 min 51 sec

Test stats 🧪

Test Results
Failed 0
Passed 1889
Skipped 25
Total 1914

💚 Flaky test report

Tests succeeded.

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /package : Generate the packages and run the E2E tests.

  • /beats-tester : Run the installation tests with beats-tester.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

@@ -60,7 +60,7 @@ type State struct {
 	ID string `json:"id"`
 	// StartedAt is the start time of the state, should be the same for a given state ID
 	StartedAt time.Time `json:"started_at"`
-	DurationMs int64 `json:"duration_ms"`
+	DurationMs int64 `json:"duration_ms,string"`
Collaborator Author

Adding this modifier seems easier than changing the struct to a map representation
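
For readers unfamiliar with the modifier, here is a small sketch of what the `,string` option of Go's `encoding/json` does to an int64 field. The `state` type and the `encode`/`decode` helpers are simplified stand-ins for illustration, not the real heartbeat `State` struct.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// state is a cut-down, hypothetical version of the struct in the diff.
type state struct {
	ID string `json:"id"`
	// ",string" makes encoding/json write the integer as a quoted JSON
	// string and expect a quoted string when decoding.
	DurationMs int64 `json:"duration_ms,string"`
}

// encode marshals a state to its JSON text.
func encode(s state) (string, error) {
	b, err := json.Marshal(s)
	return string(b), err
}

// decode unmarshals JSON text back into a state.
func decode(doc string) (state, error) {
	var s state
	err := json.Unmarshal([]byte(doc), &s)
	return s, err
}

func main() {
	// 2^53 + 1 cannot be represented exactly as a float64.
	out, err := encode(state{ID: "abc", DurationMs: 9007199254740993})
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // {"id":"abc","duration_ms":"9007199254740993"}

	back, err := decode(out)
	if err != nil {
		panic(err)
	}
	fmt.Println(back.DurationMs) // 9007199254740993
}
```

Because the value travels as a JSON string, a JSON number parser along the way has no opportunity to turn it into a float.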

@emilioalvap emilioalvap marked this pull request as ready for review January 16, 2023 19:42
@emilioalvap emilioalvap requested a review from a team as a code owner January 16, 2023 19:42
@elasticmachine
Collaborator

Pinging @elastic/uptime (Team:Uptime)

@emilioalvap emilioalvap requested a review from andrewvc January 16, 2023 19:42
@emilioalvap emilioalvap changed the title from "Marshal heartbeat state as map to prevent int64 -> float conversion" to "Marshal heartbeat duration as int64,string to prevent int64 -> float conversion" Jan 16, 2023
@andrewvc
Contributor

What does ES return as the number gets larger here? A number in scientific notation, or just the full number?

I know we don't test ES itself in our tests here. Have you manually tested this with ES yet? I was certain I fixed and tested this in my last PR, but perhaps I missed something when testing it myself.

@emilioalvap
Collaborator Author

@andrewvc, I tested with ES: the event source is sent with the duration in scientific notation once it gets large enough.
We already fixed the indexing part, where ES was rejecting documents.

This PR is an additional step forward. The field is now indexed correctly, but _source in ES is still stored with scientific notation. So the error now happens when loading previous states from ES: heartbeat can't parse the scientific notation as int64 and ends up generating a new state.
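
The loading failure described here can be reproduced with plain `encoding/json`. This is a sketch under assumptions: `storedState` and `loadOrNew` are hypothetical names, and the real heartbeat loader may be structured differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// storedState mirrors the field as it was before this PR (no ",string").
type storedState struct {
	DurationMs int64 `json:"duration_ms"`
}

// loadOrNew (hypothetical) decodes a stored _source document. If decoding
// fails it returns a fresh state and false, mimicking the observed
// behaviour of starting a new state.
func loadOrNew(source string) (storedState, bool) {
	var s storedState
	if err := json.Unmarshal([]byte(source), &s); err != nil {
		return storedState{}, false
	}
	return s, true
}

func main() {
	// A JSON number in exponent form cannot be decoded into an int64 field.
	_, ok := loadOrNew(`{"duration_ms":1.2e+06}`)
	fmt.Println(ok) // false

	// The same value written with plain digits decodes fine.
	s, ok := loadOrNew(`{"duration_ms":1200000}`)
	fmt.Println(s.DurationMs, ok) // 1200000 true
}
```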

Contributor

@andrewvc andrewvc left a comment

LGTM, pending changelog addition

@emilioalvap emilioalvap added the backport-v8.6.0 Automated backport with mergify label Jan 26, 2023
@mergify
Contributor

mergify bot commented Jan 26, 2023

This pull request is now in conflicts. Could you fix it? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b emilio-fix-state-marshal upstream/emilio-fix-state-marshal
git merge upstream/main
git push upstream emilio-fix-state-marshal

@emilioalvap emilioalvap force-pushed the emilio-fix-state-marshal branch from 3704dbc to 98ab15d Compare January 26, 2023 16:37
@emilioalvap emilioalvap merged commit 762ad18 into elastic:main Jan 26, 2023
mergify bot pushed a commit that referenced this pull request Jan 26, 2023
…conversion (#34280)

* Add test initial state and build tag

* Keep int64 precision on marshal

* Add changelog

(cherry picked from commit 762ad18)
emilioalvap added a commit that referenced this pull request Jan 26, 2023
…conversion (#34280) (#34401)

* Add test initial state and build tag

* Keep int64 precision on marshal

* Add changelog

(cherry picked from commit 762ad18)

Co-authored-by: Emilio Alvarez Piñeiro <95703246+emilioalvap@users.noreply.github.com>
@emilioalvap emilioalvap added v8.7.0 and removed v8.6.0 labels Feb 10, 2023
chrisberkhout pushed a commit that referenced this pull request Jun 1, 2023
…conversion (#34280)

* Add test initial state and build tag

* Keep int64 precision on marshal

* Add changelog
Labels
backport-v8.6.0 Automated backport with mergify bug Team:obs-ds-hosted-services Label for the Observability Hosted Services team v8.7.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Heartbeat] Monitor state serialization is transforming int64 to float
3 participants