
[Windows][Perfmon] There is more data to return than will fit in the supplied buffer #20139

Closed · aevalov opened this issue Jul 22, 2020 · 4 comments · Fixed by #20630

Labels: Team:Platforms (Label for the Integrations - Platforms team)

aevalov commented Jul 22, 2020

In the windows/perfmon module, when using a wildcard query for Hyper-V metrics, Metricbeat does not collect samples at the configured period.

Elasticsearch and Metricbeat 7.8.
Metricbeat is running on Windows Server 2019. The server has the Hyper-V role and 40 VMs.

My configuration:

- module: windows
  metricsets: ["perfmon"]
  enabled: true
  period: 10s
  perfmon.ignore_non_existent_counters: true
  perfmon.group_measurements_by_instance: true
  perfmon.queries:
  - object: "Hyper-V Hypervisor Logical Processor"
    instance: "_Total"
    counters:
    - name: "% Total Run Time"
  - object: "Memory"
    counters:
    - name: "Available MBytes"
  - object: "LogicalDisk"
    instance: "*"
    counters:
    - name: "Avg. Disk sec/Transfer"
    - name: "Current Disk Queue Length"
  - object: "Network Interface"
    instance: "*"
    counters:
    - name: "Bytes Total/sec"
  - object: "Hyper-V Hypervisor Virtual Processor"
    instance: "*"
    counters:
    - name: "% Total Run Time"
  - object: "Hyper-V Dynamic Memory VM"
    instance: "*"
    counters:
    - name: "Physical Memory"
    - name: "Current Pressure"  
  - object: "Hyper-V Virtual Network Adapter"
    instance: "*"
    counters:
    - name: "Bytes Sent/sec"
    - name: "Bytes Received/sec" 
  - object: "Hyper-V Virtual Storage Device"
    instance: "*"
    counters:
    - name: "Write Bytes/sec"
    - name: "Read Bytes/sec" 
    - name: "Queue Length"
    - name: "Normalized Throughput" 

It works fine for simple counters like \Hyper-V Hypervisor Logical Processor(_Total)\% Total Run Time.
However, for VM-specific counters like \Hyper-V Hypervisor Virtual Processor(*)\% Total Run Time, I don't see samples in Elasticsearch at the configured period. They are collected at random intervals, without any notable errors.

An example from Kibana is below. Note that the sample intervals do not match the configured 10s:

Time                        | windows.perfmon.object               | event.dataset
Jul 22, 2020 @ 16:24:08.965 | Hyper-V Hypervisor Virtual Processor | windows.perfmon
Jul 22, 2020 @ 16:22:28.950 | Hyper-V Hypervisor Virtual Processor | windows.perfmon
Jul 22, 2020 @ 16:19:38.961 | Hyper-V Hypervisor Virtual Processor | windows.perfmon
Jul 22, 2020 @ 16:18:58.946 | Hyper-V Hypervisor Virtual Processor | windows.perfmon
Jul 22, 2020 @ 16:18:28.946 | Hyper-V Hypervisor Virtual Processor | windows.perfmon
Jul 22, 2020 @ 16:18:18.955 | Hyper-V Hypervisor Virtual Processor | windows.perfmon
Jul 22, 2020 @ 16:15:58.945 | Hyper-V Hypervisor Virtual Processor | windows.perfmon
Jul 22, 2020 @ 16:13:48.956 | Hyper-V Hypervisor Virtual Processor | windows.perfmon
Jul 22, 2020 @ 16:12:58.958 | Hyper-V Hypervisor Virtual Processor | windows.perfmon
Jul 22, 2020 @ 16:10:58.954 | Hyper-V Hypervisor Virtual Processor | windows.perfmon

The debug logs show an error:

2020-07-22T14:59:41.188+0300	INFO	module/wrapper.go:259	Error fetching data for metricset windows.perfmon: failed retrieving counters: failed to expand counter (query="\Hyper-V Hypervisor Virtual Processor(*)\% Total Run Time"): There is more data to return than will fit in the supplied buffer. Allocate a larger buffer and call the function again.
2020-07-22T14:59:41.188+0300	DEBUG	[processors]	processing/processors.go:187	Publish event: {
  "@timestamp": "2020-07-22T11:59:41.169Z",
  "@metadata": {
    "beat": "metricbeat",
    "type": "_doc",
    "version": "7.8.0"
  },
  "event": {
    "duration": 19486800,
    "dataset": "windows.perfmon",
    "module": "windows"
  },
  "metricset": {
    "name": "perfmon",
    "period": 10000
  },
  "service": {
    "type": "windows"
  },
  "error": {
    "message": "failed retrieving counters: failed to expand counter (query=\"\\Hyper-V Hypervisor Virtual Processor(*)\\% Total Run Time\"): There is more data to return than will fit in the supplied buffer. Allocate a larger buffer and call the function again."
  },
  "ecs": {
    "version": "1.5.0"
  },
  "host": {
    "name": "SRV04988"
  },
  "agent": {
    "version": "7.8.0",
    "hostname": "SRV04988",
    "ephemeral_id": "1d35c5a0-a865-424d-9ba7-29c3ef2c8935",
    "id": "395cb419-8567-4164-aa27-8bb66b81c0b1",
    "name": "SRV04988",
    "type": "metricbeat"
  }
}

The same problem is described here: https://discuss.elastic.co/t/metricbeat-iis-module-there-is-more-data-to-return-than-will-fit-in-the-supplied-buffer/240442/3

In my case, too, the problem occurs only with the 64-bit Metricbeat; the 32-bit version works fine.
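(Editorial background: the error text is the Windows message for the PDH_MORE_DATA status, whose documented remedy is to ask the API for the required buffer size first, then allocate and call again. Below is a minimal Go sketch of that grow-and-retry idiom. `fakeExpand`, `errMoreData`, and `expandWithRetry` are illustrative names, not Metricbeat's actual code, and the real Windows call is simulated so the sketch runs anywhere.)

```go
package main

import (
	"errors"
	"fmt"
)

// errMoreData simulates the PDH_MORE_DATA status: the supplied buffer was
// too small, and Required carries the size the call actually needs.
type errMoreData struct{ Required int }

func (e errMoreData) Error() string {
	return "there is more data to return than will fit in the supplied buffer"
}

// fakeExpand stands in for a PdhExpandWildCardPathW-style call. If buf is
// too small it reports the required size via errMoreData; otherwise it
// copies the expanded, NUL-separated paths into buf.
func fakeExpand(query string, buf []byte) (int, error) {
	expanded := query + ":0\x00" + query + ":1\x00" // pretend two instances matched
	if len(buf) < len(expanded) {
		return 0, errMoreData{Required: len(expanded)}
	}
	copy(buf, expanded)
	return len(expanded), nil
}

// expandWithRetry applies the documented idiom: probe with an empty buffer,
// allocate the size the error reports, and call again.
func expandWithRetry(query string) (string, error) {
	_, err := fakeExpand(query, nil)
	var more errMoreData
	if !errors.As(err, &more) {
		return "", err // the probe should always ask for more space
	}
	buf := make([]byte, more.Required)
	n, err := fakeExpand(query, buf)
	if err != nil {
		return "", err
	}
	return string(buf[:n]), nil
}

func main() {
	out, err := expandWithRetry(`\Hyper-V Hypervisor Virtual Processor(*)\% Total Run Time`)
	fmt.Println(err == nil, len(out) > 0) // true true
}
```

The bug report below suggests the failing path never performed the first sizing probe correctly, so the second call still arrived with too small a buffer.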

elasticmachine (Collaborator) commented

Pinging @elastic/integrations-platforms (Team:Platforms)

narph (Contributor) commented Aug 5, 2020

@aevalov, some improvements regarding these types of messages were implemented in #18361; I am not sure whether they made it into 7.8.0.
Can you try running version 7.8.1 and let us know if you encounter the same behavior?
If you no longer see the error messages in Kibana, please also check the Metricbeat logs and let us know.

lewk2 commented Aug 11, 2020

Hi - I just hit the same problem, then came looking through the issues, and this one sums it up exactly.

I was using 7.8.1 x64 for Windows, trying to harvest the network counters. I stripped my config back to this:

- module: windows
  metricsets: [perfmon]
  period: 1s  
  perfmon.ignore_non_existent_counters: true
  perfmon.group_measurements_by_instance: true
  perfmon.queries:
    - object: "Network Interface"
      instance: ["*"]
      counters:
        - name: "Bytes Sent/sec"
          field: "network.write.bytes"
          format: "long"

It would intermittently manage to update a counter, but running in debug mode showed this in the console:

2020-08-11T13:25:27.285Z        INFO    module/wrapper.go:259   Error fetching data for metricset windows.perfmon: no counters to read
2020-08-11T13:25:28.303Z        INFO    module/wrapper.go:259   Error fetching data for metricset windows.perfmon: no counters to read
2020-08-11T13:25:29.292Z        INFO    module/wrapper.go:259   Error fetching data for metricset windows.perfmon: no counters to read
2020-08-11T13:25:29.782Z        INFO    [monitoring]    log/log.go:145  Non-zero metrics in the last 30s        {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":1140,"time":{"ms":187}},"total":{"ticks":1796,"time":{"ms":297},"value":1796},"user":{"ticks":656,"time":{"ms":110}}},"handles":{"open":445},"info":{"ephemeral_id":"de8fde33-6626-4701-8b81-82b23a553194","uptime":{"ms":211138}},"memstats":{"gc_next":10904864,"memory_alloc":7972488,"memory_total":27846800,"rss":1335296},"runtime":{"goroutines":26}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"acked":29,"batches":19,"total":29}},"pipeline":{"clients":1,"events":{"active":0,"published":29,"total":29},"queue":{"acked":29}}},"metricbeat":{"windows":{"perfmon":{"events":29,"failures":29}}}}}}
2020-08-11T13:25:30.300Z        INFO    module/wrapper.go:259   Error fetching data for metricset windows.perfmon: no counters to read
2020-08-11T13:25:31.284Z        INFO    module/wrapper.go:259   Error fetching data for metricset windows.perfmon: no counters to read
2020-08-11T13:25:32.280Z        INFO    module/wrapper.go:259   Error fetching data for metricset windows.perfmon: no counters to read
2020-08-11T13:25:33.280Z        INFO    module/wrapper.go:259   Error fetching data for metricset windows.perfmon: no counters to read
2020-08-11T13:25:34.289Z        INFO    module/wrapper.go:259   Error fetching data for metricset windows.perfmon: no counters to read

However, switching out the EXE for the 32-bit version brought it back to working, so that's a fix for me for now!

narph (Contributor) commented Aug 17, 2020

@aevalov, @lewk2, it took a while to reproduce this issue, but it started to appear once I began testing with the config provided.
I noticed that the PdhExpandWildCardPathW Win32 API randomly fails to expand the provided query: the call does not initially retrieve the expanded buffer size, so the next call hits the PDH_MORE_DATA error because the size specified on input is still less than the required size.
Since this API is called only on 64-bit builds (it is not supported on 32-bit), you were not able to reproduce the issue on those versions.
We've worked on several improvements there; one of them expands the path using a different Win32 API when PdhExpandWildCardPathW fails. PR opened here: #20630.
