Question: Disable all collectors? #735

Closed
dR3b opened this issue Nov 13, 2017 · 33 comments

@dR3b

dR3b commented Nov 13, 2017

Is there a way to disable all collectors by default? I only need "--collector.textfile"; all the others are useless to me. Of course, I can disable each of them individually with "--no-collector.<name>", but this is very tedious.

Thanks

@dR3b dR3b changed the title from "Disable all collectors?" to "Question: Disable all collectors?" Nov 13, 2017
@SuperQ
Member

SuperQ commented Nov 13, 2017

No, we decided not to support this. It added a bit more code and operational complexity than we wanted. The number of collectors has been changing rapidly, and so have some of the defaults for these collectors. Adding a disable-all would add to the release-by-release surprises for many users. So we decided to leave that feature out for now.

@dR3b
Author

dR3b commented Nov 13, 2017

> Adding a disable-all would add to the release-by-release surprises for many users.

I don't understand it, sorry. Please don't misunderstand me now, but only developers can make such decisions.

@SuperQ
Member

SuperQ commented Nov 13, 2017

The users, being the people who deploy the node_exporter.

@dR3b
Author

dR3b commented Nov 13, 2017

Why should users be surprised? They wouldn't be affected at all.

I just want an option to disable all collectors by one command and not all of them individually.

@SuperQ
Member

SuperQ commented Nov 13, 2017

Our canonical example of this was when we refactored CPU metrics out of the stat collector into a cpu collector. With --disable-all and --collector.stat, CPU metrics would vanish on upgrade.

Our goal is to make typical use of the exporter as safe as possible with default flags. You are using the node_exporter with an extreme edge case.
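To make the stat/cpu scenario concrete, here is a toy shell model of flag resolution. It is only a sketch and not node_exporter's actual flag handling: the function name `resolve_collectors`, the shortened default lists, and the `--disable-all` flag (the proposal under discussion) are all assumptions for illustration.

```shell
# Toy model, not node_exporter code: resolve which collectors are enabled.
# $1 = space-separated defaults; remaining args = flags, applied in order.
# --disable-all is the flag proposed in this thread (hypothetical).
resolve_collectors() {
  enabled=" $1 "
  shift
  for flag in "$@"; do
    case "$flag" in
      --disable-all)
        enabled=" " ;;
      --no-collector.*)
        name=${flag#--no-collector.}
        enabled=$(printf '%s' "$enabled" | sed "s/ $name / /") ;;
      --collector.*)
        name=${flag#--collector.}
        case "$enabled" in
          *" $name "*) ;;                 # already enabled
          *) enabled="$enabled$name " ;;
        esac ;;
    esac
  done
  echo $enabled   # unquoted on purpose: collapses the padding spaces
}

# Before the refactor (CPU metrics still come from "stat"):
resolve_collectors "stat meminfo" --disable-all --collector.stat
# After the refactor (CPU metrics moved to a new default "cpu" collector):
resolve_collectors "stat cpu meminfo" --disable-all --collector.stat
# Default flags after the refactor:
resolve_collectors "stat cpu meminfo"
```

In this model both allowlist invocations enable only `stat`, so after the refactor nothing starts the `cpu` collector, while the default invocation picks it up automatically.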

@ajaegle

ajaegle commented Nov 13, 2017

I was also looking for this feature to explicitly enable the collectors I want. I don't think an additional flag to disable all default modules first would be a breaking change for existing users. But I see that introducing such a flag would influence the way flags are parsed and handled as this special flag must come first and could conflict with other flags. Anyway for me it would be a great feature as I only want to provide a few of the possible collectors.

@dR3b
Author

dR3b commented Nov 13, 2017

> Our canonical example of this was we refactored CPU metrics out of the stat collector into a cpu collector. With --disable-all, and --collector.stat, CPU metrics would vanish on upgrade.

Thanks

> You are using the node_exporter with an extreme edge case.

Yeah, but I guess I'm not the only one ;)

Would it be an option to just turn off the visual http output?

@SuperQ
Member

SuperQ commented Nov 13, 2017

I understand that there are a few people that want this feature, but it's not something I want to support due to the complications and unintended consequences. This is not an intended use of the exporter, and making that easier is not a good idea for the project.

We added a collection filter http param in master, but it's not released yet. This is also something with unintended consequences, but it's not as bad.
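For reference, the node_exporter README describes this filter as the `collect[]` URL parameter on the `/metrics` endpoint. A minimal sketch of building such a URL, assuming an exporter listening on localhost:9100 (the address and the helper name `metrics_url` are illustrative, not part of the exporter):

```shell
# Build a /metrics URL that asks the exporter to run only the named collectors.
# $1 = base URL (assumed), remaining args = collector names.
metrics_url() {
  base=$1
  shift
  url="$base/metrics"
  sep='?'
  for name in "$@"; do
    url="$url${sep}collect[]=$name"
    sep='&'
  done
  printf '%s\n' "$url"
}

# Usage (needs a running exporter; -g stops curl from treating [] as a glob):
# curl -g "$(metrics_url http://localhost:9100 textfile cpu)"
```

Note that this only filters what a given scrape returns; the collectors still have to be enabled on the exporter itself.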

@dR3b
Author

dR3b commented Nov 13, 2017

> I understand that there are a few people that want this feature, but it's not something I want to support due to the complications and unintended consequences. This is not an intended use of the exporter, and making that easier is not a good idea for the project.

OK

@roman-vynar
Contributor

Same here. We have to maintain a huge list of --collector* and --no-collector* flags because we don't want some of the defaults and need others instead 😕

I think it should be up to users to decide whether to go with --disable-all or not.

@roman-vynar
Contributor

The HTTP param filters do not help either, as node_exporter itself is expected to be run with all the flags enabled so that you can choose what params to scrape from the Prometheus config. That suggests we need --enable-all 😀

@SuperQ
Member

SuperQ commented Nov 14, 2017

Yes, and now with --enable-all, it gets even more complicated. This is why we're sticking with simple flags. For most users, the node_exporter is a simple system metrics tool that does not need this complication.

@SuperQ SuperQ closed this as completed Nov 14, 2017
@pranas

pranas commented Jan 27, 2018

I think you should reopen. As another potential user of this software I would like to explicitly pick what collectors I use. Maybe you should re-evaluate what most of the users want. The way you just closed this issue makes me doubt the way you run this project.

I'm not against having defaults, but there should also be a way to get whitelist behaviour. I think a good interface would be to get rid of the disable flags: if you choose to use enable flags, you would have to explicitly enable every collector you want. Your current approach with a bunch of defaults and a mess of yes/no flags is a nightmare. It's also a headache when upgrading, since I have to check whether you added anything new to the defaults that I don't want.

@c0r3xxx

c0r3xxx commented Mar 22, 2018

I would love this feature too. In some cases on our server we need only 1 or 2 metrics, or the textfile collector.

@discordianfish
Member

@SuperQ Honestly, I don't really remember why we dropped the flag in #640. In general, still feeling this way? I saw you mentioned we should open an issue to see if this is really needed, it looks like it is..

@SuperQ
Member

SuperQ commented Mar 26, 2018

@discordianfish I highly object to the all option, as it will cause other, slightly different unexpected breakage for users. The primary example of this is the refactoring of cpu metrics from stat to cpu collector. Things like this will cause unexpected loss of metrics.

@CrazyNash

I have the same requirement here. Most of the default-enabled collectors are useless to me; it would be great to have an option to disable all of them and then manually enable the few that I want.

@aramalipoor

@SuperQ isn't this stat/cpu example (and probably other examples) a "breaking change" which means users should read UPGRADE.md before using the new version?

So we're enabling all metrics just to prevent breaking changes in case metrics are moved between collectors? Am I missing something? 🤔

@SuperQ
Member

SuperQ commented Jun 20, 2018

@aramalipoor We're not "enabling all metrics".

The aim is to have the defaults be sane for most users, with minimal work required to add or remove specific things for most users. Many users don't read any upgrade docs.

The stat / cpu refactoring change, while technically a "breaking change", becomes non-breaking because we adjust the defaults so the upgrade was a noop from a user perspective.

We need the simplest, least surprising configuration.

I understand the "disable all" is a nice to have feature, but I don't consider it "nice to have" enough to be a requirement that it is supported. The node_exporter is intended to be as simple and as lightweight as it can be. We don't want to end up maintaining an "apache" or "postfix" level of complexity for something that amounts to a cat /proc/* > Prometheus converter.

@dR3b
Author

dR3b commented Jun 20, 2018

We don't want to change the defaults. All we'd like is one option to disable all collectors and then enable only what we need. The "users" would not be affected at all.

@roman-vynar
Contributor

A simple --no-collectors flag would make many users happy and would not break any existing functionality.

Currently, we do:

ExecStart=/opt/node_exporter/node_exporter \
            --web.listen-address=127.0.0.1:9100 \
            --collector.diskstats.ignored-devices="^(dm-|ram|loop|fd|(h|s|v|xv)d[a-z]|nvme\\d+n\\d+p)\\d+$" \
            --collector.filesystem.ignored-mount-points="^/(dev|proc|sys|run|var/lib/(docker|lxcfs|nobody_tmp_secure))($|/)" \
            --collector.filesystem.ignored-fs-types="^(autofs|binfmt_misc|cgroup|configfs|debugfs|devpts|devtmpfs|fuse.*|hugetlbfs|mqueue|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|sysfs|tracefs)$" \
            --collector.netdev.ignored-devices="^(lo|docker[0-9]|veth.+)$" \
            --collector.textfile.directory=/var/lib/node_exporter \
            --collector.conntrack \
            --collector.cpu \
            --collector.diskstats \
            --collector.filefd \
            --collector.filesystem \
            --collector.loadavg \
            --collector.meminfo \
            --collector.netdev \
            --collector.netstat \
            --collector.ntp \
            --collector.sockstat \
            --collector.stat \
            --collector.textfile \
            --collector.time \
            --collector.uname \
            --collector.vmstat \
            --no-collector.arp \
            --no-collector.bcache \
            --no-collector.bonding \
            --no-collector.buddyinfo \
            --no-collector.drbd \
            --no-collector.edac \
            --no-collector.entropy \
            --no-collector.hwmon \
            --no-collector.infiniband \
            --no-collector.interrupts \
            --no-collector.ipvs \
            --no-collector.ksmd \
            --no-collector.logind \
            --no-collector.mdadm \
            --no-collector.meminfo_numa \
            --no-collector.mountstats \
            --no-collector.nfs \
            --no-collector.nfsd \
            --no-collector.qdisc \
            --no-collector.runit \
            --no-collector.supervisord \
            --no-collector.systemd \
            --no-collector.tcpstat \
            --no-collector.timex \
            --no-collector.wifi \
            --no-collector.xfs \
            --no-collector.zfs

😞

@Garagoth

On the other hand, if I have to disable all collectors one by one and (getting back to the stat/cpu example) you add a new collector, my scripts will start to break (well, more metrics get exported, but I do not want them, so from my point of view it breaks).

+1 for --no-collectors and enabling only the wanted ones. I guess people who need this are OK with metrics going missing when some collector is split in two, as they will enable what they need anyway...

@tangr

tangr commented Nov 22, 2018

@SuperQ I need --no-collector-all too. Or could all collectors be disabled by default, so that we enable collectors one by one? I don't need so many unnecessary metrics.

@discordianfish
Member

Just for the record, I'm +1 on adding a --no-collectors but @SuperQ is strictly against it.

So I'd suggest if people strongly want this, they should kick off a discussion on the prometheus mailing list. Otherwise I don't see this making progress.

@SuperQ
Member

SuperQ commented Nov 23, 2018

The primary reason I'm strictly against it is because we have no stability guarantees right now.

We are still making too many breaking changes right now, where the list of collector flags is changing, being refactored, etc.

Once we have a 1.0 release, this feature would be something we can support.

@geekofalltrades

I've also got a use-case where I could use only the textfile collector.

I'm using kube-state-metrics to describe all of the pods, services, etc. I have running in my Kubernetes cluster. Now what I need is an exporter that tells me what's supposed to be running, so that I can compare them and see if anything is missing. For example, did I scale down a deployment for maintenance and then forget to scale it back up?

I thought I might do this using a flat file of static metrics describing what's supposed to be running, and have my CI/CD populate that file. In trying to figure out how to parse and re-emit the Prometheus flat metric format, I ended up deep in the expfmt godocs for many days and didn't make much progress. Then I remembered that node_exporter had this ability, and ended up in the source code here trying to figure it out. Then I wondered whether I could just use node-exporter with only the textfile collector, which led me here.

My proposal would actually be, can the textfile collector be split from node_exporter into its own textfile exporter? There are at least two other users in this thread who seem to only need that collector for certain use-cases (including the OP). Let me know where a better place would be for this discussion, because I'm sure this isn't the best place.

@hoffie
Contributor

hoffie commented Apr 30, 2019

@geekofalltrades The mailing lists are usually a better place than GitHub issues (especially closed ones) when asking for support.
In your case, it sounds like you may benefit from recording rules and may be able to avoid scraping these meta metrics at all. Feel free to ask further questions on the mailing list. :)

@ghost

ghost commented Jul 30, 2019

This is frustrating, @SuperQ. From an end resource usage, stability, predictability and security perspective, manually blacklisting collectors is haphazard when the list of collectors enabled by default may grow with only a practically undetectable announcement.

Allowing an optional and disabled-by-default mode wherein all collectors must be whitelisted is the only sane approach. It matches years of the same approach in firewalls and other similar scenarios.

Please reconsider supporting this before v1.0.0.

Thank you!

@dt-rush
Contributor

dt-rush commented Aug 23, 2019

I have nothing to add but a very heartfelt "SECOND" to @josdotso's clear explanation of why a --no-collectors flag (which does not conflict with any implicit assumptions, having never been implemented) is important.

I don't think it's correct to characterize users wanting to operate on a whitelist basis for their metrics as an "extreme edge case". We are exporting system information from a machine. As it currently stands, any future version could change the name of a collector, introduce new collectors, and these would bypass the makeshift blacklist a user is forced to write if they want some control over which internal system details are shown to the world.

@SuperQ
Member

SuperQ commented Aug 27, 2019

As I said above, this is something we would support as part of a 1.0 stable release. I've been going over the currently open issues, and I think 1.0 is going to be the next release.

@Dmitry1987

+1 for sane configuration.
I am banging my head against a wall over why "--no-collector.<name>" does not exist in the latest release while the docs list it as the flag for disabling collectors 😩. How do I use a config file or something similar to render the list of collectors I want to enable? (I deploy it with Ansible, so a config file for this is a must. Please don't tell me it doesn't exist? :octocat:)

@lukeyeager

For v1+, try: node_exporter --collector.disable-defaults --collector.cpu --collector.meminfo thanks to #1460.
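If the allowlist lives in a file (for example one rendered by a deployment tool), the v1+ command line could be assembled along these lines. This is a minimal sketch: the helper name `allowlist_args`, the one-name-per-line file format with `#` comments, and the file name in the usage line are assumptions, not something node_exporter defines.

```shell
# Turn a list of collector names (one per line on stdin, '#' comments allowed)
# into node_exporter v1+ flags. The list format is an assumption of this sketch.
allowlist_args() {
  printf '%s' '--collector.disable-defaults'
  while IFS= read -r line || [ -n "$line" ]; do
    name=${line%%#*}                                  # drop a trailing comment
    name=$(printf '%s' "$name" | tr -d '[:space:]')   # drop surrounding blanks
    if [ -n "$name" ]; then
      printf ' --collector.%s' "$name"
    fi
  done
  printf '\n'
}

# Usage (collectors.txt is a hypothetical file; the substitution is left
# unquoted on purpose so that each flag becomes its own argument):
# node_exporter $(allowlist_args < collectors.txt)
```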

For older versions, you could do something like this:

node_exporter -h 2>&1 >/dev/null \
    | gawk 'match($0, /^      --collector\.([a-z_]+) /, a){c=a[1]} /enabled/{print "--no-collector." c}' \
    | grep -E -v '\.(cpu|meminfo)$' \
    | xargs node_exporter
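The pipeline above depends on gawk's three-argument `match`. A rough equivalent in plain awk might look like the following sketch. It assumes the same help layout that the gawk pattern targets (a `--collector.<name>` flag at the start of a line, with the word "enabled" on that line or on a wrapped continuation line); the function name `no_collector_flags` is made up for illustration.

```shell
# Read node_exporter help text on stdin and print a --no-collector.<name> flag
# for every collector whose help entry mentions "enabled", except the names
# given as arguments. Plain awk only, no gawk extensions.
no_collector_flags() {
  awk -v keep=" $* " '
    # Any new flag line resets the remembered name; plain collector flags
    # (no dotted sub-option, no "=value") set it again.
    /^ *--/ {
      name = ""
      if ($1 ~ /^--collector\.[a-z_0-9]+$/) {
        name = $1
        sub(/^--collector\./, "", name)
      }
    }
    # "enabled" may sit on the flag line or on a wrapped description line.
    /enabled/ && name != "" {
      if (index(keep, " " name " ") == 0) print "--no-collector." name
      name = ""
    }
  '
}

# Usage, keeping only cpu and meminfo (requires node_exporter on PATH):
# node_exporter -h 2>&1 | no_collector_flags cpu meminfo | xargs node_exporter
```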

@SuperQ
Member

SuperQ commented Sep 11, 2020

Yes, I forgot to mention that 1.0.0 is now released, and supports this.
