Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changed collection method of the second counter value in the Windows Perfmon module of Metricbeat #32305

Merged
merged 16 commits into from
Sep 1, 2022

Conversation

jorgeta
Copy link
Contributor

@jorgeta jorgeta commented Jul 11, 2022

What does this PR do?

There are two ways of collecting performance counters in Windows, as explained here.

The current Metricbeat implementation uses the one where PdhCollectQueryDataEx is called to retrieve the second raw value that is required to calculate the displayable value.

This PR changes this collection method to a simpler one, the one that calls PdhCollectQueryData twice, with a sleep of one second in between.

The function randSeq was implemented in an earlier version in order to support the current implementation, but is not used anymore. The deletion of this function is therefore just considered as cleanup.

Why is it important?

Collection of the counter "Processor(_Total) % Processor Time" gave false 100 % spikes. Testing on different computers and servers showed that when the sampling frequency, or period, was larger than one second, one would obtain these false 100 % values. They have been checked for being false by comparing with Windows own Performance Monitor. When running several instances of Metricbeat alongside each other, spikes are measured for the instances

  • uses the current collection method and
  • the period is larger than one second.

This is part of one configuration file used when testing, where the sampling frequency/period is set to 2 seconds:

metricbeat.modules:

  • module: windows
    enabled: true
    metricsets:
    • perfmon
      period: 2s
      perfmon.ignore_non_existent_counters: true
      perfmon.group_measurements_by_instance: true
      perfmon.queries:
    • object: Processor
      instance:
      • '_Total'
        counters:
      • name: '% Processor Time'

In the two figures below (Figfure (1) and Figure (2)), the current implementation (named "thread"), is compared with the implementation of this PR (named "sleep"), for different sampling frequencies/periods. The names "sleep1s" and "thread1s" indicates that the collection interval, the time between doing a PdhCollectQueryData call, is hardcoded to 1s in the version.

In Figure (1), each point represents a maximum value over a period of 60 seconds, and explains why the different versions with different frequencies show different values, except for sleep1s-freq1s and thread1s-freq1s, which provide almost identical values (as expected). What this figure shows is that the thread1s-freq2s and thread1s-freq10s produce false 100 % spikes.

Zooming in on one of the 100 % spikes has been done in Figure (2), where thread1s-freq10s provides one 100 % value that is most certainly false. In addition, thread1s-freq2s produces 0 % values that does not seem correct either.

Figure (1)
thread1s-vs-sleep1s

Figure (2)
thread1s-vs-sleep1s-zoom

The figures above show tests performed using Metricbeat 7.17.2, but they have also been reproduced using Metricbeat 8.3.1. They have also been reproduced on different computers and servers, a database server for example. The code in question is the same in both versions, so this is also expected.

As one also can see from the figures above, and also generally in all the experiments done, the sleep method always produces values that seem correct and that conform with what Windows Performance Monitor is also observing. The problem only arises when the sampling frequency/period is larger than the collection interval (tests have also been made with larger collection intervals, versions of Metricbeat called "sleep2s" and "thread2s", etc.), for the current collection method. It therefore seems like there is some bug concerning the sampling frequency/period of Metricbeat in combination with the creating of the thread. This PR does not explain what this issue is precisely, but simply provides an alternative where this problem does not arise.

In order to get a closer look at what was going on, tests were run where the raw values were fetched after calling the PdhCollectQueryData function. When using either the sleep method, or the thread method with a sampling frequency/period that is not larger than the collection interval (1 second), these raw values were collected at the correct times, and the values could be used to verify the fetched displayable value. Below are some examples of logs of two consecutive samples for these cases. An entry saying "Raw values: \Processor(_Total)\% Processor Time:[{_Total 5870585175781 133017943902193928 { 0}}]])", means that the FirstValue of the raw value is 5870585175781 and the SecondValue is 133017943902193928. The displayable value is calculated using taking two consecutive raw values, and then take one minus the difference of the FirstValues divided by the difference of the SecondValues, and at the end multiply by 100 % to get the percent value. The formula is also found here. In the below examples, this adds up in all cases.

sleep1s-freq1s

First sample:
{"@timestamp":"2022-07-09T00:53:10.219+0200","message":"Raw values: \Processor(_Total)\% Processor Time:[{_Total 5870585175781 133017943902193928 { 0}}]])"}
{"@timestamp":"2022-07-09T00:53:11.223+0200","message":"Formatted values: \Processor(_Total)\% Processor Time:[{_Total 1.5947397042254785 { 0}}]])"}
{"@timestamp":"2022-07-09T00:53:11.223+0200","message":"Raw values: \Processor(_Total)\% Processor Time:[{_Total 5870595048828 133017943912226976 { 0}}]])"}
Consecutive sample:
{"@timestamp":"2022-07-09T00:53:11.224+0200","message":"Raw values: \Processor(_Total)\% Processor Time:[{_Total 5870595058593 133017943912239452 { 0}}]])"}
{"@timestamp":"2022-07-09T00:53:12.237+0200","message":"Formatted values: \Processor(_Total)\% Processor Time:[{_Total 4.005576157731239 { 0}}]])"}
{"@timestamp":"2022-07-09T00:53:12.237+0200","message":"Raw values: \Processor(_Total)\% Processor Time:[{_Total 5870604785156 133017943922371877 { 0}}]])"}

sleep1s-freq10s

First sample:
{"@timestamp":"2022-07-08T23:46:47.100+0200","message":"Raw values: \Processor(_Total)\% Processor Time:[{_Total 5831721962890 133017904071006917 { 0}}]])"}
{"@timestamp":"2022-07-08T23:46:48.116+0200","message":"Formatted values: \Processor(_Total)\% Processor Time:[{_Total 2.633179634240257 { 0}}]])"}
{"@timestamp":"2022-07-08T23:46:48.116+0200","message":"Raw values: \Processor(_Total)\% Processor Time:[{_Total 5831731855468 133017904081167029 { 0}}]])"}
Consecutive sample:
{"@timestamp":"2022-07-08T23:46:57.092+0200","message":"Raw values: \Processor(_Total)\% Processor Time:[{_Total 5831819648437 133017904170920927 { 0}}]])"}
{"@timestamp":"2022-07-08T23:46:58.101+0200","message":"Formatted values: \Processor(_Total)\% Processor Time:[{_Total 2.156219690300587 { 0}}]])"}
{"@timestamp":"2022-07-08T23:46:58.101+0200","message":"Raw values: \Processor(_Total)\% Processor Time:[{_Total 5831829521484 133017904181011550 { 0}}]])"}

thread1s-freq1s

First sample:
{"@timestamp":"2022-07-09T00:18:04.609+0200","message":"Raw values: \Processor(_Total)\% Processor Time:[{_Total 5850016337890 133017922846099090 { 0}}]])"}
{"@timestamp":"2022-07-09T00:18:05.619+0200","message":"Formatted values: \Processor(_Total)\% Processor Time:[{_Total 3.05911239227572 { 0}}]])"}
{"@timestamp":"2022-07-09T00:18:05.619+0200","message":"Raw values: \Processor(_Total)\% Processor Time:[{_Total 5850026123046 133017922856193031 { 0}}]])"}
Consecutive sample:
{"@timestamp":"2022-07-09T00:18:05.619+0200","message":"Raw values: \Processor(_Total)\% Processor Time:[{_Total 5850026132812 133017922856196130 { 0}}]])"}
{"@timestamp":"2022-07-09T00:18:06.646+0200","message":"Formatted values: \Processor(_Total)\% Processor Time:[{_Total 3.3303272308510157 { 0}}]])"}
{"@timestamp":"2022-07-09T00:18:06.646+0200","message":"Raw values: \Processor(_Total)\% Processor Time:[{_Total 5850036044921 133017922866449717 { 0}}]])"}

It is, however, not the case when using the current implementation (thread method), with a sampling frequency. Both raw value samples are identical, and hence one does not obtain the fetched displayable value when plugging them into the formula. Here is one example of such logs:

thread1s-freq10s

First sample:
{"@timestamp":"2022-07-08T23:19:38.104+0200","message":"Raw values: \Processor(_Total)\% Processor Time:[{_Total 5815925039062 133017887781042404 { 0}}]])"}
{"@timestamp":"2022-07-08T23:19:38.105+0200","message":"Formatted values: \Processor(_Total)\% Processor Time:[{_Total 0.5117438823267872 { 0}}]])"}
{"@timestamp":"2022-07-08T23:19:38.105+0200","message":"Raw values: \Processor(_Total)\% Processor Time:[{_Total 5815925039062 133017887781042404 { 0}}]])"}
Consecutive sample:
{"@timestamp":"2022-07-08T23:19:48.095+0200","message":"Raw values: \Processor(_Total)\% Processor Time:[{_Total 5816021699218 133017887880951263 { 0}}]])"}
{"@timestamp":"2022-07-08T23:19:48.097+0200","message":"Formatted values: \Processor(_Total)\% Processor Time:[{_Total 5.3058896133528215 { 0}}]])"}
{"@timestamp":"2022-07-08T23:19:48.097+0200","message":"Raw values: \Processor(_Total)\% Processor Time:[{_Total 5816021699218 133017887880951263 { 0}}]])"}

Looking into the implementation history, there does not seem to be any explanation of why the current collection method was chosen - especially with this PR's simpler alternative.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Author's Checklist

  • [ ]

How to test this PR locally

Related issues

Use cases

Screenshots

Logs

@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Jul 11, 2022
@cla-checker-service
Copy link

cla-checker-service bot commented Jul 11, 2022

💚 CLA has been signed

@elasticmachine
Copy link
Collaborator

elasticmachine commented Jul 11, 2022

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Start Time: 2022-08-30T07:50:59.923+0000

  • Duration: 49 min 5 sec

Test stats 🧪

Test Results
Failed 0
Passed 3719
Skipped 873
Total 4592

💚 Flaky test report

Tests succeeded.

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /package : Generate the packages and run the E2E tests.

  • /beats-tester : Run the installation tests with beats-tester.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

@mtojek mtojek added the Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team label Jul 18, 2022
@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Jul 18, 2022
@gtback
Copy link
Member

gtback commented Jul 20, 2022

cla/check

@jorgeta jorgeta marked this pull request as ready for review July 20, 2022 14:08
@jorgeta jorgeta requested a review from a team as a code owner July 20, 2022 14:08
@elasticmachine
Copy link
Collaborator

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

@mergify
Copy link
Contributor

mergify bot commented Jul 21, 2022

This pull request is now in conflicts. Could you fix it? 🙏
To fixup this pull request, you can check out it locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b main upstream/main
git merge upstream/main
git push upstream main

@cmacknz
Copy link
Member

cmacknz commented Jul 21, 2022

/test

@mergify
Copy link
Contributor

mergify bot commented Jul 26, 2022

This pull request is now in conflicts. Could you fix it? 🙏
To fixup this pull request, you can check out it locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b main upstream/main
git merge upstream/main
git push upstream main

@cmacknz
Copy link
Member

cmacknz commented Jul 26, 2022

/test

@cmacknz
Copy link
Member

cmacknz commented Jul 26, 2022

The tests won't run until the conflict is resolved it seems.

@jorgeta jorgeta requested review from a team as code owners July 31, 2022 15:04
@jorgeta jorgeta requested review from belimawr and removed request for a team July 31, 2022 15:04
@cmacknz
Copy link
Member

cmacknz commented Aug 2, 2022

/test

@belimawr
Copy link
Contributor

belimawr commented Aug 3, 2022

@jorgeta Thank you so much for this contribution and the super detailed explanation!

I started to look at your code, but the PR is bring many commits that were not made by you, making it pretty hard to review/understand what has changed.

Could you look into it and fix your branch/PR to contain only your commits?

You don't need to worry about keeping your PR up-to-date with main, as long as there are no conflicts and the branch was based on a recent version of main, it will be fine to merge.

Copy link
Contributor

@belimawr belimawr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requesting the changes I explained on my previous commit.

@belimawr belimawr self-assigned this Aug 3, 2022
@mergify
Copy link
Contributor

mergify bot commented Aug 8, 2022

This pull request is now in conflicts. Could you fix it? 🙏
To fixup this pull request, you can check out it locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b main upstream/main
git merge upstream/main
git push upstream main

accidentally added another PR's changelog description

Co-authored-by: Craig MacKenzie <craig.mackenzie@elastic.co>
@mergify
Copy link
Contributor

mergify bot commented Aug 16, 2022

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @jorgeta? 🙏.
For such, you'll need to label your PR with:

  • The upcoming major version of the Elastic Stack
  • The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)
    To fixup this pull request, you need to add the backport labels for the needed
    branches, such as:
  • backport-v8./d.0 is the label to automatically backport to the 8./d branch. /d is the digit

@cmacknz
Copy link
Member

cmacknz commented Aug 16, 2022

/test

@cmacknz
Copy link
Member

cmacknz commented Aug 16, 2022

Some of the lint/check steps are failing for metricbeat: https://beats-ci.elastic.co/blue/organizations/jenkins/Beats%2Fbeats/detail/PR-32305/23/pipeline

[2022-08-16T17:12:20.283Z] + make -C metricbeat check
[2022-08-16T17:12:20.283Z] make: Entering directory '/var/lib/jenkins/workspace/PR-32305-23-2c910f50-e5a2-4893-9423-401bf7fece6f/src/github.com/elastic/beats/metricbeat'
[2022-08-16T17:12:20.283Z] Installing mage v1.13.0.
[2022-08-16T17:12:20.546Z] /var/lib/jenkins/workspace/PR-32305-23-2c910f50-e5a2-4893-9423-401bf7fece6f/.magefile cleaned
[2022-08-16T17:12:20.546Z] mage check
[2022-08-16T17:12:59.336Z] >> fmt - go-licenser: Adding missing headers
[2022-08-16T17:12:59.337Z] >> fmt - goimports: Formatting Go code
[2022-08-16T17:12:59.337Z] >> fmt - autopep8: Formatting Python code
[2022-08-16T17:13:00.766Z] module/windows/perfmon/reader.go
[2022-08-16T17:13:32.895Z] Generated fields.yml for metricbeat to /var/lib/jenkins/workspace/PR-32305-23-2c910f50-e5a2-4893-9423-401bf7fece6f/src/github.com/elastic/beats/metricbeat/fields.yml
[2022-08-16T17:13:32.895Z] Generated fields.yml for metricbeat to /var/lib/jenkins/workspace/PR-32305-23-2c910f50-e5a2-4893-9423-401bf7fece6f/src/github.com/elastic/beats/metricbeat/fields.yml
[2022-08-16T17:13:32.895Z] >> Building metricbeat.yml for linux/amd64
[2022-08-16T17:13:32.895Z] >> Building metricbeat.reference.yml for linux/amd64
[2022-08-16T17:13:32.895Z] >> Building metricbeat.docker.yml for linux/amd64
[2022-08-16T17:13:32.895Z] Generated fields.yml for metricbeat to /var/lib/jenkins/workspace/PR-32305-23-2c910f50-e5a2-4893-9423-401bf7fece6f/src/github.com/elastic/beats/metricbeat/build/fields/fields.all.yml
[2022-08-16T17:15:14.311Z] >> check: Checking source code for common problems
[2022-08-16T17:15:24.335Z] Error: some files are not up-to-date. Run 'make update' then review and commit the changes. Modified: [metricbeat/module/windows/perfmon/reader.go]
[2022-08-16T17:15:24.335Z] ../dev-tools/make/mage.mk:21: recipe for target 'check' failed
[2022-08-16T17:15:24.335Z] make: *** [check] Error 1
[2022-08-16T17:15:24.335Z] make: Leaving directory '/var/lib/jenkins/workspace/PR-32305-23-2c910f50-e5a2-4893-9423-401bf7fece6f/src/github.com/elastic/beats/metricbeat'
script returned exit code 2

@jorgeta
Copy link
Contributor Author

jorgeta commented Aug 17, 2022

Some of the lint/check steps are failing for metricbeat: https://beats-ci.elastic.co/blue/organizations/jenkins/Beats%2Fbeats/detail/PR-32305/23/pipeline

[2022-08-16T17:12:20.283Z] + make -C metricbeat check
[2022-08-16T17:12:20.283Z] make: Entering directory '/var/lib/jenkins/workspace/PR-32305-23-2c910f50-e5a2-4893-9423-401bf7fece6f/src/github.com/elastic/beats/metricbeat'
[2022-08-16T17:12:20.283Z] Installing mage v1.13.0.
[2022-08-16T17:12:20.546Z] /var/lib/jenkins/workspace/PR-32305-23-2c910f50-e5a2-4893-9423-401bf7fece6f/.magefile cleaned
[2022-08-16T17:12:20.546Z] mage check
[2022-08-16T17:12:59.336Z] >> fmt - go-licenser: Adding missing headers
[2022-08-16T17:12:59.337Z] >> fmt - goimports: Formatting Go code
[2022-08-16T17:12:59.337Z] >> fmt - autopep8: Formatting Python code
[2022-08-16T17:13:00.766Z] module/windows/perfmon/reader.go
[2022-08-16T17:13:32.895Z] Generated fields.yml for metricbeat to /var/lib/jenkins/workspace/PR-32305-23-2c910f50-e5a2-4893-9423-401bf7fece6f/src/github.com/elastic/beats/metricbeat/fields.yml
[2022-08-16T17:13:32.895Z] Generated fields.yml for metricbeat to /var/lib/jenkins/workspace/PR-32305-23-2c910f50-e5a2-4893-9423-401bf7fece6f/src/github.com/elastic/beats/metricbeat/fields.yml
[2022-08-16T17:13:32.895Z] >> Building metricbeat.yml for linux/amd64
[2022-08-16T17:13:32.895Z] >> Building metricbeat.reference.yml for linux/amd64
[2022-08-16T17:13:32.895Z] >> Building metricbeat.docker.yml for linux/amd64
[2022-08-16T17:13:32.895Z] Generated fields.yml for metricbeat to /var/lib/jenkins/workspace/PR-32305-23-2c910f50-e5a2-4893-9423-401bf7fece6f/src/github.com/elastic/beats/metricbeat/build/fields/fields.all.yml
[2022-08-16T17:15:14.311Z] >> check: Checking source code for common problems
[2022-08-16T17:15:24.335Z] Error: some files are not up-to-date. Run 'make update' then review and commit the changes. Modified: [metricbeat/module/windows/perfmon/reader.go]
[2022-08-16T17:15:24.335Z] ../dev-tools/make/mage.mk:21: recipe for target 'check' failed
[2022-08-16T17:15:24.335Z] make: *** [check] Error 1
[2022-08-16T17:15:24.335Z] make: Leaving directory '/var/lib/jenkins/workspace/PR-32305-23-2c910f50-e5a2-4893-9423-401bf7fece6f/src/github.com/elastic/beats/metricbeat'
script returned exit code 2

Is this related to the importing of the fmt package? I tried to make some changes unrelated to the change of the collection method, so that the github actions linting would be approved - but maybe I should leave that? Although then the checks would fail in several places.

@cmacknz
Copy link
Member

cmacknz commented Aug 17, 2022

It looks like it is just a file that needs to be formatted, if you run make -C metricbeat check and then commit the changes it will fix it. This is what it wants changed when I run that command myself:

diff --git a/metricbeat/module/windows/perfmon/reader.go b/metricbeat/module/windows/perfmon/reader.go
index 04063eb084..645938f882 100644
--- a/metricbeat/module/windows/perfmon/reader.go
+++ b/metricbeat/module/windows/perfmon/reader.go
@@ -24,8 +24,8 @@ import (
 	"fmt"
 	"regexp"
 	"strings"
-	"unicode"
 	"time"
+	"unicode"
 
 	"github.com/elastic/beats/v7/metricbeat/helper/windows/pdh"
 
@@ -132,7 +132,7 @@ func (re *Reader) getValues() (map[string][]pdh.CounterValue, error) {
 	if err != nil {
 		return nil, err
 	}
-	
+
 	// Collect the displayable value.
 	val, err = re.query.GetFormattedCounterValues()
 	if err != nil {

@cmacknz
Copy link
Member

cmacknz commented Aug 18, 2022

/test

Copy link
Contributor

@efd6 efd6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nit.

metricbeat/module/windows/perfmon/reader.go Outdated Show resolved Hide resolved
Minor change to error string

Co-authored-by: Dan Kortschak <90160302+efd6@users.noreply.github.com>
@narph
Copy link
Contributor

narph commented Aug 22, 2022

Looking into the implementation history, there does not seem to be any explanation of why the current collection method was chosen - especially with this PR's simpler alternative.

@jorgeta, we had this implementation for some time and we switched last year to the PdhCollectQueryDataEx alternative #26886 as it helped with some of the issues we had at that time (missing counter values when configuring a large number of counter queries ).

Have you tested this implementation while configuring a higher number of counters? Are you getting a constant results?

@jorgeta
Copy link
Contributor Author

jorgeta commented Aug 24, 2022

Looking into the implementation history, there does not seem to be any explanation of why the current collection method was chosen - especially with this PR's simpler alternative.

@jorgeta, we had this implementation for some time and we switched last year to the PdhCollectQueryDataEx alternative #26886 as it helped with some of the issues we had at that time (missing counter values when configuring a large number of counter queries ).

Have you tested this implementation while configuring a higher number of counters? Are you getting a constant results?

Hi, @narph - thanks for your feedback. We have tested the implementation for a different amount of counters; from having just the one, and up to 90 "windows.perfmon.*" counters, also along with additional 54 "system.*" (that is, tests have collected between 1 and 144 counters in total). I could try to test with a larger number - if this is not considered large.

So, as far as I can see, the implementation in this PR is still different from the one that was before PdhCollectQueryDataEx was added. Seems like in the earlier implementation, the PhdCollectQueryData was only called once per collection. This means that if one is collecting counters every 15 seconds, then PdhGetFormattedCounterValueDouble would use the raw values collected 15 seconds ago along with the ones that were just collected, in order to compute the displayable value. When PhdCollectQueryDataEx was added, this problem was solved. This PR only wants to call PhdCollectQueryData twice with one second of sleep in between for every time a (rate) counter is collected, contrary to what I can see must have been happening in the pre-PhdCollectQueryDataEx implementation.

@mergify
Copy link
Contributor

mergify bot commented Aug 24, 2022

This pull request is now in conflicts. Could you fix it? 🙏
To fixup this pull request, you can check out it locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b main upstream/main
git merge upstream/main
git push upstream main

Copy link
Contributor

@belimawr belimawr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm no Windows expert, but it looks good to me. Just the small changelog fix that's needed.

CHANGELOG.next.asciidoc Outdated Show resolved Hide resolved
Co-authored-by: Tiago Queiroz <contato@tiago.eti.br>
@cmacknz
Copy link
Member

cmacknz commented Aug 31, 2022

@fearful-symmetry can you review this as well?

@jorgeta sorry this is taking so long, this is a tricky piece of code with a history of issues as I'm sure you know.

Copy link
Contributor

@belimawr belimawr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All good from my side, but, please, wait for @fearful-symmetry or someone with more expertise on this area than me.

Copy link
Contributor

@fearful-symmetry fearful-symmetry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I don't have much windows knowledge, but just based on the bits I do understand, this LGTM.

@cmacknz cmacknz added v8.5.0 QA:Needs Validation Needs validation by the QA Team labels Sep 1, 2022
@cmacknz
Copy link
Member

cmacknz commented Sep 1, 2022

Thanks everyone, merging this with 8.5 as the target version. We can have our QA team check this on a few different versions of Windows as a final sanity check.

@cmacknz cmacknz merged commit f0bc6c8 into elastic:main Sep 1, 2022
@jorgeta
Copy link
Contributor Author

jorgeta commented Sep 2, 2022

Thanks everyone!

@cmacknz Would it be possible to backport the changes to version 7? Upgrading to a new version in our company is rather a huge project that is somewhat further down the line, and we would hope to have this fix available as soon as possible.

@cmacknz cmacknz added the backport-7.17 Automated backport to the 7.17 branch with mergify label Sep 6, 2022
@cmacknz
Copy link
Member

cmacknz commented Sep 6, 2022

Sure, this is effectively a bug fix so we can backport to 7.17

mergify bot pushed a commit that referenced this pull request Sep 6, 2022
…Perfmon module of Metricbeat (#32305)

* changed collection method of the second counter value required to create a displayable value

Co-authored-by: Craig MacKenzie <craig.mackenzie@elastic.co>
Co-authored-by: Dan Kortschak <90160302+efd6@users.noreply.github.com>
Co-authored-by: Tiago Queiroz <contato@tiago.eti.br>
(cherry picked from commit f0bc6c8)

# Conflicts:
#	metricbeat/module/windows/perfmon/reader.go
cmacknz added a commit that referenced this pull request Sep 9, 2022
…er value in the Windows Perfmon module of Metricbeat (#32992)

* Changed collection method of the second counter value in the Windows Perfmon module of Metricbeat (#32305)

* changed collection method of the second counter value required to create a displayable value

Co-authored-by: Craig MacKenzie <craig.mackenzie@elastic.co>
Co-authored-by: Dan Kortschak <90160302+efd6@users.noreply.github.com>
Co-authored-by: Tiago Queiroz <contato@tiago.eti.br>
(cherry picked from commit f0bc6c8)

# Conflicts:
#	metricbeat/module/windows/perfmon/reader.go

* Resolve merge conflicts.

Co-authored-by: Jørgen Taule <36794556+jorgeta@users.noreply.github.com>
Co-authored-by: Craig MacKenzie <craig.mackenzie@elastic.co>
chrisberkhout pushed a commit that referenced this pull request Jun 1, 2023
…Perfmon module of Metricbeat (#32305)

* changed collection method of the second counter value required to create a displayable value

Co-authored-by: Craig MacKenzie <craig.mackenzie@elastic.co>
Co-authored-by: Dan Kortschak <90160302+efd6@users.noreply.github.com>
Co-authored-by: Tiago Queiroz <contato@tiago.eti.br>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-7.17 Automated backport to the 7.17 branch with mergify QA:Needs Validation Needs validation by the QA Team Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team v8.5.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants