
Missing documentation or integration between resource discovery and scraper #1185

Closed
hd40910 opened this issue Jul 28, 2020 · 17 comments
Labels
question Further information is requested

Comments

@hd40910

hd40910 commented Jul 28, 2020

Hi @tomkerkhove ,

Raised this request as I was not able to find any clear documentation on how to connect the resource discovery and scraper agents.

There are lots of open issues, and many of them seem to be part of the product already. I am hoping the feature is already there :)
It would be great if you could provide some guidelines around it.

As of now I am able to set up both agents properly and they seem to do a very good job.
The last thing I need to do is connect them so that I don't have to specify a metric definition for every resource in the scraper, as that is not scalable and is cumbersome.

@tomkerkhove tomkerkhove added the question Further information is requested label Jul 28, 2020
@tomkerkhove
Owner

Glad to see your enthusiasm for resource discovery!

This will be released as part of v2.0, which is being tracked here. It will include documentation on how to use it, which is being added as part of this PR.

Later on there will be more high-level docs on how they work together.

As of now I am able to set up both agents properly and they seem to do a very good job.
The last thing I need to do is connect them so that I don't have to specify a metric definition for every resource in the scraper, as that is not scalable and is cumbersome.

Can you elaborate a bit more on this please? You mean it's not scalable to define all the resources and want to use the resource discovery?

@tomkerkhove tomkerkhove changed the title Missing documentaion or integration between resourcediscovery and scraper Missing documentation or integration between resource discovery and scraper Jul 28, 2020
@hd40910
Author

hd40910 commented Jul 28, 2020

@tomkerkhove thanks for the immediate response :). Yes, I meant the same. So will it be ready in a couple of days?

@tomkerkhove
Owner

I'm doing my best but can't commit to a hard deadline. The alpha version is already available on Docker Hub, though:

  • tomkerkhove/promitor-agent-scraper:2.0.0-preview-1
  • tomkerkhove/promitor-agent-discovery:0.1.0-preview-1

This should let you give it a try already, based on the docs being added. Let me know what you think!

@hd40910
Author

hd40910 commented Jul 28, 2020

Thank you @tomkerkhove .

After navigating the available docs and the Helm charts I was able to connect both agents.
The health checks for both agents are also green.

Scraper:
[screenshot]

I am also able to get a test resource from Azure Monitor using the resource discovery agent:
[screenshot]

But somehow it's not getting scraped to the /metrics endpoint for utilization.
Can you please help me understand what could be wrong?

metrics-declaration.yaml

azureMetadata:
  subscriptionId: <some_real_value>
  tenantId: <some_real_value>
  resourceGroupName: promitor
metricDefaults:
  aggregation:
    interval: 00:05:00
  scraping:
    schedule: 0 * * ? * *
metrics: []
version: v1

resource-discovery-declaration.yaml

version: v1
azureLandscape:
  tenantId: <some_real_value>
  subscriptions:
  - <some_real_value>
  cloud: Global
resourceDiscoveryGroups:
- name: virtual-machines
  type: VirtualMachine

@tomkerkhove
Owner

metrics-declaration.yaml

azureMetadata:
  subscriptionId: <some_real_value>
  tenantId: <some_real_value>
  resourceGroupName: promitor
metricDefaults:
  aggregation:
    interval: 00:05:00
  scraping:
    schedule: 0 * * ? * *
metrics: []
version: v1

Here lies the issue. You still have to tell Promitor Scraper what metric you are interested in and specify the name of the resource discovery group to use.

See: https://promitor.io/configuration/v2.x/metrics/virtual-machine

@hd40910
Author

hd40910 commented Jul 28, 2020

@tomkerkhove thanks. It started working after adding the below.

azureMetadata:
  subscriptionId: <some_real_value>
  tenantId: <some_real_value>
  resourceGroupName: <some_real_value>
metricDefaults:
  aggregation:
    interval: 00:05:00
  scraping:
    schedule: 0 * * ? * *
metrics:
- azureMetricConfiguration:
    aggregation:
      type: Average
    metricName: Percentage CPU
  description: Average percentage cpu usage on an Azure virtual machine
  name: azure_virtual_machine_percentage_cpu
  resourceType: VirtualMachine
  resources:
  - virtualMachineName: test-vm1
version: v1

Notice that we still need to provide the virtualMachineName, and this is what I was trying to eliminate.
We have thousands of VMs in use under hundreds of resource groups, and adding them all is tedious.
Is there a way to look up all the resource groups and resources under one subscription?

@tomkerkhove
Owner

If you check the link above, you'll see that you can reference the resource discovery group that you have defined for discovery.

In your case it would be:

azureMetadata:
  subscriptionId: <some_real_value>
  tenantId: <some_real_value>
  resourceGroupName: <some_real_value>
metricDefaults:
  aggregation:
    interval: 00:05:00
  scraping:
    schedule: 0 * * ? * *
metrics:
- azureMetricConfiguration:
    aggregation:
      type: Average
    metricName: Percentage CPU
  description: Average percentage cpu usage on an Azure virtual machine
  name: azure_virtual_machine_percentage_cpu
  resourceType: VirtualMachine
  resourceDiscoveryGroups:
  - name: virtual-machines
version: v1

Once you do that, it will pull in all VMs across all RGs in your subscriptions as part of the Azure landscape.
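For reference, the scraper resolves a resource discovery group by calling the discovery agent over HTTP. The exact route and the base URL below are assumptions based on the v2 preview (check your agent version for the actual API); a minimal Python sketch of that lookup:

```python
import json
import urllib.request

# NOTE: the endpoint path below is an assumption based on the v2 preview;
# verify the exact route against your Promitor Resource Discovery version.
def discovery_url(base_url: str, group: str) -> str:
    """Build the assumed discovery endpoint URL for a resource discovery group."""
    return f"{base_url.rstrip('/')}/api/v1/resources/groups/{group}/discover"

def discover_resources(base_url: str, group: str):
    """Fetch the discovered Azure resources for a group (performs a network call)."""
    with urllib.request.urlopen(discovery_url(base_url, group)) as resp:
        return json.load(resp)
```

For example, `discovery_url("http://promitor-discovery:8889", "virtual-machines")` yields `http://promitor-discovery:8889/api/v1/resources/groups/virtual-machines/discover`.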

@hd40910
Author

hd40910 commented Jul 28, 2020

@tomkerkhove Thanks, this is very helpful.

@tomkerkhove
Owner

Let me know how it works; is it OK with you if I close the issue?

@hd40910
Author

hd40910 commented Jul 28, 2020

@tomkerkhove curious about one more feature. Is there a possibility to define all the aggregation types in one go?
Something like below.

azureMetadata:
  subscriptionId: <some_real_value>
  tenantId: <some_real_value>
  resourceGroupName: <some_real_value>
metricDefaults:
  aggregation:
    interval: 00:05:00
  scraping:
    schedule: 0 * * ? * *
metrics:
- azureMetricConfiguration:
    aggregation:
      type:
        - Minimum
        - Average
        - Maximum
    metricName: Percentage CPU
  description: Average percentage cpu usage on an Azure virtual machine
  name: azure_virtual_machine_percentage_cpu
  resourceType: VirtualMachine
  resourceDiscoveryGroups:
  - name: virtual-machines
version: v1

@tomkerkhove
Owner

No, that's not supported for now.

I'm not sure we would go there either, given that a single metric would then represent different things, which can be misleading.
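For completeness: while a list of aggregation types inside a single metric isn't supported, the current schema does allow declaring one metric entry per aggregation type, each with its own Prometheus name (the metric names below are illustrative, not from this thread):

```yaml
metrics:
- name: azure_virtual_machine_percentage_cpu_avg
  description: Average percentage CPU usage on an Azure virtual machine
  resourceType: VirtualMachine
  azureMetricConfiguration:
    metricName: Percentage CPU
    aggregation:
      type: Average
  resourceDiscoveryGroups:
  - name: virtual-machines
- name: azure_virtual_machine_percentage_cpu_max
  description: Maximum percentage CPU usage on an Azure virtual machine
  resourceType: VirtualMachine
  azureMetricConfiguration:
    metricName: Percentage CPU
    aggregation:
      type: Maximum
  resourceDiscoveryGroups:
  - name: virtual-machines
```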

@hd40910
Author

hd40910 commented Jul 28, 2020

@tomkerkhove the filtering and processing of the metrics can be done at the client level, which could overcome this scenario.
Another question is about the retention policy of Promitor. I noticed it's retaining data on the internal file system.
Are there any docs that explain how to control the maximum retention and the storage methods?

@tomkerkhove
Owner

The problem is that the more you want, the faster you will be throttled by Azure.

For every metric and resource you need to make a few calls, and you are limited to 12k, which is not much, but we can't work around it.

If we take it a step further and pull all aggregations, you will hit the limit even faster.

I'm not saying it will never come, it's just not a priority for now. Feel free to open a separate issue for it.
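To see why this adds up quickly, here is a rough, illustrative estimate. It assumes the 12k figure is the Azure Resource Manager read limit of roughly 12,000 calls per hour per subscription, and roughly one metric-read call per resource, per aggregation, per scrape; Promitor's exact per-call accounting may differ.

```python
# Back-of-the-envelope load estimate against the assumed Azure Resource
# Manager read limit (~12,000 read calls per hour per subscription).
AZURE_READ_LIMIT_PER_HOUR = 12_000

def calls_per_hour(resources: int, aggregations: int, scrape_interval_minutes: int) -> int:
    """Assume roughly one metric-read call per resource, per aggregation, per scrape."""
    scrapes_per_hour = 60 // scrape_interval_minutes
    return resources * aggregations * scrapes_per_hour

# 1000 VMs, 1 aggregation, scraped every 5 minutes:
print(calls_per_hour(1000, 1, 5))  # 12000, already at the limit
# Pulling 3 aggregations per metric triples the load:
print(calls_per_hour(1000, 3, 5))  # 36000, well past the threshold
```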

@hd40910
Author

hd40910 commented Jul 28, 2020

Thanks @tomkerkhove .
Any word on the retention policy that you have for Promitor itself? How do we manage its database and its cleanup?

@tomkerkhove
Owner

Not per se; we rely on https://github.com/PrometheusClientNet/Prometheus.Client for this.

In theory, if you restart the container the data will be removed, as we don't persist it.
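In practice this means retention is governed by whatever Prometheus server scrapes Promitor's /metrics endpoint, not by Promitor itself. A sketch using the standard Prometheus 2.x retention flag (paths below are illustrative):

```shell
# Retention lives on the Prometheus server scraping Promitor, not in Promitor.
prometheus \
  --config.file=/etc/prometheus/prometheus.yml \
  --storage.tsdb.path=/prometheus/data \
  --storage.tsdb.retention.time=15d
```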

@tomkerkhove
Owner

Ok for you if this one gets closed?

@hd40910
Author

hd40910 commented Aug 2, 2020

Yes, thank you 🙂.

@hd40910 hd40910 closed this as completed Aug 2, 2020