Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds handler for returning a profile archive #7862

Merged
merged 4 commits into from
May 17, 2017
Merged

Adds handler for returning a profile archive #7862

merged 4 commits into from
May 17, 2017

Conversation

e-dard
Copy link
Contributor

@e-dard e-dard commented Jan 22, 2017

Required for all non-trivial PRs
  • Rebased/mergable
  • Tests pass
  • CHANGELOG.md updated

Either @gunnaraasen or @rossmcdonald sowed the seed of this idea, so thanks to whichever one of you it was. 👍

Currently, when debugging issues with InfluxDB, we often ask for the
following profiles and queries:

curl -o block.txt "http://localhost:8086/debug/pprof/block?debug=1"
curl -o goroutine.txt "http://localhost:8086/debug/pprof/goroutine?debug=1"
curl -o heap.txt "http://localhost:8086/debug/pprof/heap?debug=1"
curl -o cpu.txt "http://localhost:8086/debug/pprof/profile

influx -exectute 'SHOW SHARDS' > shards.txt
influx -exectute 'SHOW STATS' > stats.txt
influx -exectute 'SHOW DIAGNOSTICS' > diagnostics.txt

Collecting these can be bothersome for users, or even difficult if they're unfamiliar with cURL (or it's not on their system).

This commit adds a new endpoint: /debug/pprof/all which will return a single compressed archive of all of the above profiles and queries. The CPU profile is optional, and not returned by default. To include a CPU profile, the URL to request should be: /debug/pprof/all?cpu=true. It's also possible to vary the length of the CPU profile by adding a seconds=x parameter, where x defaults to 30, if absent.

The new command for gathering profiles from users should now be:

curl -o profiles.tar.gz "http://localhost:8086/debug/pprof/all"

Or, if we need to see a CPU profile:

curl -o profiles.tar.gz "http://localhost:8086/debug/pprof/all?cpu=true"

It's important to remember that a CPU profile is a blocking operation, and by default it will take 30 seconds for the response (archive) to be returned to the user.

Finally, if the user is unfamiliar with cURL, they will now be able to visit http://localhost:8086/debug/pprof/all in a web browser, and the archive will be downloaded to their machine.

The archive contents look like:

⇒  tree ~/Downloads/profiles
/Users/edd/Downloads/profiles
├── block.txt
├── cpu.txt
├── diagnostics.txt
├── goroutine.txt
├── heap.txt
├── shards.txt
└── stats.txt

0 directories, 7 files

@e-dard e-dard added this to the 1.3.0 milestone Jan 22, 2017
@e-dard e-dard requested a review from jwilder January 22, 2017 00:42
Copy link
Contributor

@mark-rushakoff mark-rushakoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple small questions but other than that LGTM.

// Return the gzipped archive.
w.Header().Set("Content-Disposition", "attachment; filename=profiles.tar.gz")
w.Header().Set("Content-Type", "application/gzip")
fmt.Fprint(w, resp.String())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: This looks like it would allocate a new string for the entire response. Would this be better as io.Copy(w, resp)?

//
func archiveProfiles(w http.ResponseWriter, r *http.Request) {
var all = []*prof{
{Name: "goroutine", Debug: 1},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goroutine profile accepts debug=2, but I think we're always more interested in debug=1 output, since it combines goroutines with identical stack traces. Maybe add a comment noting that we deliberately aren't supporting debug=2?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

debug=2 is not a valid format for use with the pprof tool, so it's not as useful in my opinion. I'll add something to make it explicit though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://golang.org/pkg/runtime/pprof/

The predefined profiles may assign meaning to other debug values; for example, when printing the "goroutine" profile, debug=2 means to print the goroutine stacks in the same form that a Go program uses when dying due to an unrecovered panic.

$ curl -s http://localhost:8086/debug/pprof/goroutine?debug=0 | head
goroutine profile: total 55
5 @ 0x300ba 0x3f4c9 0x3e2bc 0x1de964 0x5ffb1
5 @ 0x300ba 0x3f4c9 0x3e2bc 0x477d16 0x4b1241 0x5ffb1
5 @ 0x300ba 0x3f4c9 0x3e2bc 0x477f72 0x4b1073 0x5ffb1
5 @ 0x300ba 0x3f4c9 0x3e2bc 0x477f72 0x4b1113 0x5ffb1
5 @ 0x300ba 0x3f4c9 0x3e2bc 0x477f72 0x4b11b3 0x5ffb1
5 @ 0x300ba 0x3f4c9 0x3e2bc 0x4780f5 0x4b0fd1 0x5ffb1
5 @ 0x300ba 0x3f4c9 0x3e2bc 0x4a3da7 0x5ffb1
1 @ 0x1205b 0x43387 0x844d2 0x5ffb1
1 @ 0x23710f 0x236f10 0x233c31 0x425876 0x425bf5 0x19fd56 0x360fdd 0x35d987 0x5ffb1

$  curl -s http://localhost:8086/debug/pprof/goroutine?debug=1 | head
goroutine profile: total 55
5 @ 0x300ba 0x3f4c9 0x3e2bc 0x1de964 0x5ffb1
#	0x1de963	github.com/influxdata/influxdb/tsdb.(*Shard).monitor+0x3b3	/Users/mr/go/src/github.com/influxdata/influxdb/tsdb/shard.go:918

5 @ 0x300ba 0x3f4c9 0x3e2bc 0x477d16 0x4b1241 0x5ffb1
#	0x477d15	github.com/influxdata/influxdb/tsdb/engine/tsm1.(*Engine).compactCache+0x475			/Users/mr/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/engine.go:963
#	0x4b1240	github.com/influxdata/influxdb/tsdb/engine/tsm1.(*Engine).enableSnapshotCompactions.func1+0x60	/Users/mr/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/engine.go:266

5 @ 0x300ba 0x3f4c9 0x3e2bc 0x477f72 0x4b1073 0x5ffb1
#	0x477f71	github.com/influxdata/influxdb/tsdb/engine/tsm1.(*Engine).compactTSMLevel+0x161			/Users/mr/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/engine.go:1003

$ curl -s http://localhost:8086/debug/pprof/goroutine?debug=2 | head
goroutine 102 [running]:
runtime/pprof.writeGoroutineStacks(0xa46a20, 0xc420f7f040, 0xa68940, 0x30)
	/usr/local/Cellar/go/1.7.3/libexec/src/runtime/pprof/pprof.go:585 +0x79
runtime/pprof.writeGoroutine(0xa46a20, 0xc420f7f040, 0x2, 0x0, 0xc420132000)
	/usr/local/Cellar/go/1.7.3/libexec/src/runtime/pprof/pprof.go:574 +0x44
runtime/pprof.(*Profile).WriteTo(0xa6a360, 0xa46a20, 0xc420f7f040, 0x2, 0xc420f7f040, 0xc4207af9c0)
	/usr/local/Cellar/go/1.7.3/libexec/src/runtime/pprof/pprof.go:298 +0x341
net/http/pprof.handler.ServeHTTP(0xc420b163d1, 0x9, 0xa4f0a0, 0xc420f7f040, 0xc420f81950)
	/usr/local/Cellar/go/1.7.3/libexec/src/net/http/pprof/pprof.go:209 +0x1a6
net/http/pprof.Index(0xa4f0a0, 0xc420f7f040, 0xc420f81950)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mark-rushakoff I think you misunderstood what I meant. debug=2 is not compatible with go tool pprof, though it is of course valid with the runtime/pprof package handler, and will output something useful (to a human). It's just you can't feed that profile into the pprof tool.

edd@Alameda:~|
⇒  go tool pprof $GOPATH/bin/ossinfluxd-v1.2.1 http://localhost:8086/debug/pprof/goroutine\?debug\=1
Fetching profile from http://localhost:8086/debug/pprof/goroutine?debug=1
Saved profile in /Users/edd/pprof/pprof.ossinfluxd-v1.2.1.localhost:8086.goroutine.001.pb.gz
Entering interactive mode (type "help" for commands)
(pprof) quit
edd@Alameda:~|
⇒  go tool pprof $GOPATH/bin/ossinfluxd-v1.2.1 http://localhost:8086/debug/pprof/goroutine\?debug\=2
Fetching profile from http://localhost:8086/debug/pprof/goroutine?debug=2
parsing profile: unrecognized profile format
edd@Alameda:~|
⇒  go tool pprof $GOPATH/bin/ossinfluxd-v1.2.1 http://localhost:8086/debug/pprof/goroutine\?debug\=0
Fetching profile from http://localhost:8086/debug/pprof/goroutine?debug=0
Saved profile in /Users/edd/pprof/pprof.ossinfluxd-v1.2.1.localhost:8086.goroutine.002.pb.gz
Entering interactive mode (type "help" for commands)
(pprof) quit
edd@Alameda:~|
⇒

Copy link
Contributor

@jwilder jwilder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably update https://github.com/influxdata/influxdb/blob/master/.github/ISSUE_TEMPLATE.md to call this handler instead since it's fewer steps.

Also, what do you think about incorporating these steps as well:

influx -execute "show shards" > shards.txt
influx -execute "show stats" > stats.txt
influx -execute "show diagnostics" > diagnostics.txt

@e-dard
Copy link
Contributor Author

e-dard commented Jan 24, 2017

@jwilder I did consider adding other information but it didn't feel like the right place for it, since it's under the /debug/pprof endpoint. Plus executing queries could be a lot more expensive than gathering otherwise helpful profiles.

Maybe we should add a further endpoint, which will execute the queries and also call out to the /debug/pprof endpoint, then wrap everything up into an archive? We can direct users to run either one dependent on their circumstances?

@jwilder
Copy link
Contributor

jwilder commented Jan 24, 2017

@e-dard Those three queries are returned directly from the meta cache.

@e-dard
Copy link
Contributor Author

e-dard commented Jan 25, 2017

@jwilder cool. I'll add them.

@e-dard
Copy link
Contributor Author

e-dard commented Apr 4, 2017

@jwilder I was going to fix this up the other day but then I realised we need to be able to handle meta queries where the server has authentication enabled. We can chat about that offline.

@e-dard e-dard force-pushed the er-debug-all branch 4 times, most recently from c978222 to d692e47 Compare May 12, 2017 12:52
@e-dard
Copy link
Contributor Author

e-dard commented May 12, 2017

@jwilder @mark-rushakoff see d692e47. I've added support for SHOW SHARDS, SHOW STATS and SHOW DIAGNOSTICS.

Copy link
Contributor

@mark-rushakoff mark-rushakoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few lines are unnecessarily casting a string to a byte slice to append to another byte slice.

CHANGELOG.md Outdated
- [#8366](https://github.com/influxdata/influxdb/pull/8366): Add TSI support tooling.
- [#8350](https://github.com/influxdata/influxdb/pull/8350): Track HTTP client requests for /write and /query with /debug/requests.
- [#7862](https://github.com/influxdata/influxdb/pull/7861): Add new profile endpoint for gathering all debug profiles single in archive.
- [#7862](https://github.com/influxdata/influxdb/pull/7861): Add new profile endpoint for gathering all debug profiles and querues in single archive.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/querues/queries

var out []byte
// Write the columns
for _, col := range row.Columns {
out = append(out, []byte(col+"\t")...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be out = append(out, col...); out = append(out, '\t')?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

col is a string not a []byte. It looks better as far as I can tell when we have a tab between each column in the output (including between each column header).

for _, col := range row.Columns {
out = append(out, []byte(col+"\t")...)
}
out = append(out, []byte("\n")...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

out = append(out, '\n')?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I always forget this...

for _, v := range val {
out = append(out, []byte(fmt.Sprintf("%v\t", v))...)
}
out = append(out, []byte("\n")...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

out = append(out, '\n')?

}

// joinUint64 returns a comma-delimited string of uint64 numbers.
func joinUint64(a []uint64) string {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

strconv.FormatUint will create a new string and then write it to the buffer. I would expect this method to be more efficient as

func joinUint64(a []uint64) string {
  var buf []byte // Could take a guess at initial size here.
  for i, x := range a {
    if i != 0 {
      buf = append(buf, ',')
    }
    buf = strconv.AppendUint(buf, x, 10)
  }
  return string(buf)
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pilfered this from https://github.com/influxdata/influxdb/blob/master/coordinator/statement_executor.go#L1182; I didn't look at the complexity at all since this is not a hot path. I don't mind changing it for that implementation.

e-dard added 4 commits May 15, 2017 14:11
Currently, when debugging issues with InfluxDB we often ask for the
following profiles:

  curl -o block.txt "http://localhost:8086/debug/pprof/block?debug=1"
  curl -o goroutine.txt
"http://localhost:8086/debug/pprof/goroutine?debug=1"
  curl -o heap.txt "http://localhost:8086/debug/pprof/heap?debug=1"
  curl -o cpu.txt "http://localhost:8086/debug/pprof/profile

This can be bothersome for users, or even difficult if they're
unfamiliar with cURL (or it's not on their system).

This commit adds a new endpoint: /debug/pprof/all which will return a
single compressed archive of all of the above profiles. The CPU profile
is optional, and not returned by default. To include a CPU profile the
URL to request should be: /debug/pprof/all?cpu=true. It's also possible
to vary the length of the CPU profile by adding a `seconds=x` parameter,
where x defaults to 30, if absent.

The new command for gathering profiles from users should now be:

  curl -o profiles.tar.gz "http://localhost:8086/debug/pprof/all"

Or, if we need to see a CPU profile:

  curl -o profiles.tar.gz
"http://localhost:8086/debug/pprof/all?cpu=true"

It's important to remember that a CPU profile is a blocking operation
and by default it will take 30 seconds for the response to be returned
to the user.

Finally, if the user is unfamiliar with cURL, they will now be able to
visit http://localhost:8086/debug/pprof/all in a web browser, and the
archive will be downloaded to their machine.
@e-dard e-dard merged commit a5fed3d into master May 17, 2017
@e-dard e-dard deleted the er-debug-all branch September 15, 2017 11:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants