Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Statistics and Diagnostics service #3916

Merged
merged 8 commits into from
Sep 2, 2015
Merged

Statistics and Diagnostics service #3916

merged 8 commits into from
Sep 2, 2015

Conversation

otoolep
Copy link
Contributor

@otoolep otoolep commented Sep 1, 2015

This change introduces the new Statistics and Monitor service, and monitors the Graphite input to start. To summarize there is a new component called monitor which allow any client that implements a simple interface to register with it. The monitor service will then call back to each registered client when statistics or diagnostics are requested via a query.

In addition -- though not yet fully implemented -- if enabled (which it is by default) the statistics will be recorded in an InfluxDB system, allowing the full power of InfluxQL to be employed. More details in the associated README file.

Remaining:

  • complete diagnostics support.
  • actualling recording data in an InfluxDB system.
  • adding stats and diags to all services, as well as the TSDB package itself.

Example stats:

> show stats
name    Alloc   Frees   HeapAlloc       HeapIdle        HeapInUse       HeapObjects     HeapReleased    HeapSys         Lookups Mallocs NumGC   NumGoroutine    PauseTotalNs    Sys             TotalAlloc
runtime 3541080 8808326 3541080         15507456        5595136         33258           0               21102592        148     8841584 157     44              80489234        26802424        524115072


name            proto   batches_tx      bytes_rx        connections_active      connections_handled     points_rx       points_tx
graphite        tcp     209             5305678         0                       1                       208338          208338

> 

@otoolep otoolep changed the title New stats diags Statistics and Monitor service Sep 1, 2015
@otoolep
Copy link
Contributor Author

otoolep commented Sep 1, 2015

@otoolep otoolep changed the title Statistics and Monitor service Statistics and Diagnostics service Sep 1, 2015
@@ -38,6 +38,7 @@ type Config struct {
BindAddress string `toml:"bind-address"`
Database string `toml:"database"`
Enabled bool `toml:"enabled"`
WriteSkip bool `toml:"write-skip"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this config control/mean?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This allows me to have the Graphite input perform all processing except actually writing to the database. It is tedious to constantly recompile the code, just to check input performance, in isolation. So the idea is a developer can set this flag, and try various other configs (mostly around batching and Goroutines). There have been some complaints about network-level performance and memory usage of Graphite, and I want to be able to test just the Graphite code.

Think of like the way we sometimes processing within the system, but drop the data before writing. It's most convenient if these performance-testing-related switches are configurable. They would not be documented in the sample config however.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems odd to have this in the config though since the use case for this is mainly for developers. Won't running influxd config print it out as the sample config? Usually things like this are enabled using an env var in other systems. e.g. SKIP_WRITES=1 influxd

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, hadn't though of the sample config. OK, let me nuke it, and deal with it another way (perhaps the environment variable).

@otoolep
Copy link
Contributor Author

otoolep commented Sep 1, 2015

Thanks for the valuable feedback @jwilder -- I have responded to some and will update the code.

@otoolep otoolep force-pushed the new_stats_diags branch 3 times, most recently from 5b61e07 to 4e09484 Compare September 1, 2015 22:01
@jwilder
Copy link
Contributor

jwilder commented Sep 1, 2015

LGTM 👍 I think the write-skip shouldn't be in the config but up to you.

@otoolep
Copy link
Contributor Author

otoolep commented Sep 1, 2015

Removed write-skip from config.

@otoolep
Copy link
Contributor Author

otoolep commented Sep 2, 2015

Got the +1 from @jwilder, so will merge now.

otoolep added a commit that referenced this pull request Sep 2, 2015
Statistics and Diagnostics service
@otoolep otoolep merged commit 14c04eb into master Sep 2, 2015
@otoolep otoolep deleted the new_stats_diags branch September 2, 2015 01:30
@otoolep otoolep added this to the 0.9.4 milestone Sep 22, 2015
@otoolep otoolep self-assigned this Sep 22, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants