Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #1827 - multi instances in win_perf_counters #2352

Merged
merged 6 commits into from
Mar 22, 2017

Conversation

bullshit
Copy link
Contributor

@bullshit bullshit commented Feb 1, 2017

Required for all PRs:

  • CHANGELOG.md updated (we recommend not updating this until the PR has been approved by a maintainer)
  • Sign CLA (if not already signed)
  • README.md updated (if adding a new plugin)

This PR fixes #1827. If you are using multiple instances identifier like "w3wp#1" the phd.dll returns only 2 characters in the SzName var. To be sure it is an error from the dll it self, i try to use phd.dll with c++
c++ source

Screenshot

Instance=chrome#4 Instance=chrome
single instance multi instance

As a fix i look if the configured instance name contains a "#". If this is true i look if the instance name starts with returned instance name.

@bullshit bullshit changed the title Fix #1827 - multi instances in win perf counters Fix #1827 - multi instances in win_perf_counters Feb 21, 2017
@bullshit
Copy link
Contributor Author

bullshit commented Mar 6, 2017

Is there anything i can do, to support the contributors to merge this PR?

@sparrc sparrc added this to the 1.3.0 milestone Mar 6, 2017
@@ -265,6 +265,11 @@ func (m *Win_PerfCounters) Gather(acc telegraf.Accumulator) error {
} else if metric.instance == s {
// Catch if we set it to total or some form of it
add = true
} else if strings.Contains(metric.instance, "#") && strings.HasPrefix(metric.instance, s) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this function iterating over all processes on the system? If so and if metric.instance = "chrome#4", what about processes named, for instance, cha? Would it be misnamed by this?

Copy link
Contributor Author

@bullshit bullshit Mar 20, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No this function iterates over all process that are retrieved for the specific counter

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about the possibility of a process prefix collision, can it happen if matching all instances? Should we test that s is length 2?

Copy link
Contributor Author

@bullshit bullshit Mar 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my opinion it is impossible, because telegraf creates for every configured instance and WPC combination a unique query and handle to fetch the counter data.

for example:

[[inputs.win_perf_counters.object]]
  ObjectName = "Process"
  Counters = ["% Processor Time"]
  Instances = ["chrome","chrome#4"]
  Measurement = "win_chrome_cpu"
Valid: \Process(chrome)\% Processor Time
Valid: \Process(chrome#4)\% Processor Time
win_chrome_cpu,instance=chrome,objectname=Process,host=winston-pc Percent_Processor_Time=0.16016027331352234 1490095720000000000
win_chrome_cpu,instance=chrome#4,objectname=Process,host=winston-pc Percent_Processor_Time=0 1490095720000000000

for the instance "chrome" there are no problems, but because phd returns for the query \Process(chrome#4)\% Processor Time only ch as instance name the if statment at win_perf_counters.go#L265 isn't working. Thats the reason why i check if the configured instance name contains an hash and, just to be sure, the returned name from phd starts like the configured name. If phd.dll will be fixed in the future my changes won't be executed because of the previous else if statment on line win_perf_counters.go#L265. I don't see any reason why there should be a process prefix collision. A check if the length is 2 would be possible but what if there are cases where phd returns maybe more than 2 characters. I think a check with hasPrefix is the best solution. I hope this answers your questions.

@@ -265,6 +265,11 @@ func (m *Win_PerfCounters) Gather(acc telegraf.Accumulator) error {
} else if metric.instance == s {
// Catch if we set it to total or some form of it
add = true
} else if strings.Contains(metric.instance, "#") && strings.HasPrefix(metric.instance, s) {
// FIX: #1827
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this line

@@ -265,6 +265,11 @@ func (m *Win_PerfCounters) Gather(acc telegraf.Accumulator) error {
} else if metric.instance == s {
// Catch if we set it to total or some form of it
add = true
} else if strings.Contains(metric.instance, "#") && strings.HasPrefix(metric.instance, s) {
// FIX: #1827
// phd.dll only returns first 2 characters of the instance name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please expand on this a bit, when does it only return 2 characters of the instance name?

Copy link
Contributor Author

@bullshit bullshit Mar 20, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you are using multiple instances identifier like "w3wp#1" the phd.dll returns only 2 characters in the SzName var. To be sure it is an error from the dll it self, i try to use phd.dll with c++

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, so change the comment to: "If you are using a multiple instance identifier such as "w3wp#1" phd.dll returns only the first 2 characters of the identifier."

@danielnelson
Copy link
Contributor

If anyone running windows can test and report back it would be very helpful.

@bullshit
Copy link
Contributor Author

bullshit commented Mar 20, 2017

Currently, i'm running my changes in a production environment since january on 14 windows servers without any troubles

@danielnelson
Copy link
Contributor

Can you update the CHANGELOG.md before I merge?

@danielnelson danielnelson merged commit 616b66f into influxdata:master Mar 22, 2017
@discoduck2x
Copy link

@danielnelson @bullshit is this now included on the telegraf download page for win? or do i have to wait for next proper version release (our build myself) ?

@danielnelson
Copy link
Contributor

It will be in the upcoming 1.3.0 release. We have nightly builds available though, this link should always have the latest build: https://dl.influxdata.com/telegraf/nightlies/telegraf-nightly_windows_amd64.zip

ssorathia pushed a commit to ssorathia/telegraf that referenced this pull request Mar 25, 2017
calerogers pushed a commit to calerogers/telegraf that referenced this pull request Apr 5, 2017
vlamug pushed a commit to vlamug/telegraf that referenced this pull request May 30, 2017
@dudusakharovich
Copy link

@danielnelson - I have the same issue, where telegraf can't retrieve data from all "w3wp" instances.
This is my configuration:

[[inputs.win_perf_counters.object]]
ObjectName = "Process"
Counters = ["% Processor Time","Handle Count","Private Bytes","Thread Count","Virtual Bytes","Working Set"]
Instances = ["*"]
Measurement = "win_proc"
IncludeTotal=true

I can only see data retrieved for one "w3wp" instance instead of 3 (w3wp, w3wp#1 & w3wp#2).
What is wrong with my configuration ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Telegraf issues with Windows multi instance processes
5 participants