Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make a dump command #1909

Merged
merged 17 commits into from
Mar 19, 2015
Merged
Show file tree
Hide file tree
Changes from 6 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,9 @@
### Bugfixes
- [#1942](https://github.com/influxdb/influxdb/pull/1942): Sort wildcard names.

### Features
- [#1909](https://github.com/influxdb/influxdb/pull/1909): Implement a dump command.

## v0.9.0-rc11 [2015-03-13]

### Bugfixes
Expand Down
21 changes: 21 additions & 0 deletions client/influxdb.go
Original file line number Diff line number Diff line change
Expand Up @@ -145,6 +145,27 @@ func (c *Client) Ping() (time.Duration, string, error) {
return time.Since(now), version, nil
}

func (c *Client) Dump(db string) (*http.Response, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should return a io.ReadCloser so that anyone else consuming our library wouldn't have to add the http package if they wanted to declare a variable to receive the response.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, but then others consuming our library will need to include the io package, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potentially, yes, but it only exposes what you want to expose. Returning the entire http response is way to much to return. Also, it's very idiomatic in Go to pass around readers/writers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for returning io.ReadCloser. It also doesn't tie the Client.Dump() interface to HTTP. Maybe we'll add a named pipes transport in the future. ;)

u := c.url
u.Path = "dump"
values := u.Query()
values.Set("db", db)
values.Set("user", c.username)
values.Set("password", c.password)
u.RawQuery = values.Encode()

req, err := http.NewRequest("GET", u.String(), nil)
if err != nil {
return nil, err
}
req.Header.Set("User-Agent", c.userAgent)
resp, err := c.httpClient.Do(req)
if err != nil {
return nil, err
}
return resp, nil
}

// Structs

// Result represents a resultset returned from a single statement.
Expand Down
32 changes: 31 additions & 1 deletion cmd/influx/main.go
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
package main

import (
"bufio"
"encoding/csv"
"encoding/json"
"flag"
Expand Down Expand Up @@ -41,6 +42,7 @@ type CommandLine struct {
Version string
Pretty bool // controls pretty print for json
Format string // controls the output format. Valid values are json, csv, or column
Dump bool
}

func main() {
Expand All @@ -53,8 +55,14 @@ func main() {
fs.StringVar(&c.Password, "password", c.Password, `password to connect to the server. Leaving blank will prompt for password (--password="")`)
fs.StringVar(&c.Database, "database", c.Database, "database to connect to the server.")
fs.StringVar(&c.Format, "output", default_format, "format specifies the format of the server responses: json, csv, or column")
fs.BoolVar(&c.Dump, "dump", false, "dump the contents of the given database to stdout")
fs.Parse(os.Args[1:])

if c.Dump {
c.connect("")
c.dump()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can just return at this point so you don't have to check later if you are dumping.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also might want to add this after prompt for password so if they want to do this interactively they could. Not 100% sure it matters that much, but it's an easy enough use case to support.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dump does exit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood, but if you exited here, it would be more apparent, as well as stop additional "am I dumping" checks that follow.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok to call line.close even though it was deferred above?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to call it at all. As long as we just return and don't do an os.Exit or anything like that, the defer will pick up as soon as we go out of scope, which is what we want.

}

var promptForPassword bool
// determine if they set the password flag but provided no value
for _, v := range os.Args {
Expand Down Expand Up @@ -205,7 +213,9 @@ func (c *CommandLine) connect(cmd string) {
fmt.Printf("Failed to connect to %s\n", c.Client.Addr())
} else {
c.Version = v
fmt.Printf("Connected to %s version %s\n", c.Client.Addr(), c.Version)
if !c.Dump {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need this check any more do we?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, we do.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, I see it now. Sometimes it's hard to spot the reasoning in these PR's as they only show parts of the code. I had to look at the entire file to understand the flow again. My bad.

fmt.Printf("Connected to %s version %s\n", c.Client.Addr(), c.Version)
}
}
}

Expand Down Expand Up @@ -249,6 +259,26 @@ func (c *CommandLine) SetFormat(cmd string) {
}
}

func (c *CommandLine) dump() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to change this to executeDump so we don't have c.Dump and c.dump?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it looks nice the way it is.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok with the dump() naming but the Dump flag doesn't indicate in its name that it's a boolean. I would suggest either naming the flag DumpEnabled or ShouldDump. Also, naming two things on a struct the same name is generally confusing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None of the flag names indicate a type, although the help mesg does it somewhat implicitly.

I wish I had never agreed to creating a special flag for this at all.

I'll use '--dump-the-database'

fmt.Printf("db is %s\n", c.Database)
response, err := c.Client.Dump(c.Database)
defer response.Body.Close()
if err != nil {
fmt.Fprintf(os.Stderr, "Failed to dump database %s from %s\n", c.Database, c.Client.Addr())
os.Exit(1)
} else {
scanner := bufio.NewScanner(response.Body)
for scanner.Scan() {
fmt.Println(scanner.Text())
}
if err := scanner.Err(); err != nil {
fmt.Fprintln(os.Stderr, "Failed to dump database %s", err)
os.Exit(1)
}
}
os.Exit(0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we return early above we can omit this line.

}

func (c *CommandLine) executeQuery(query string) {
results, err := c.Client.Query(client.Query{Command: query, Database: c.Database})
if err != nil {
Expand Down
125 changes: 125 additions & 0 deletions httpd/handler.go
Original file line number Diff line number Diff line change
Expand Up @@ -113,6 +113,10 @@ func NewHandler(s *influxdb.Server, requireAuthentication bool, version string)
"index", // Index.
"GET", "/", true, true, h.serveIndex,
},
route{
"dump", // export all points in the given db.
"GET", "/dump", true, true, h.serveDump,
},
)

for _, r := range h.routes {
Expand Down Expand Up @@ -175,6 +179,127 @@ func (h *Handler) serveQuery(w http.ResponseWriter, r *http.Request, user *influ
httpResults(w, results, pretty)
}

type Point struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can still use the client.Point and client.BatchPoints for this. That being said, I'm a little torn on if we should do it. I think this comes back to me still wanting to separate some of these models to their own package for better re-use. This comment is more food for thought at this point than an action item.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how BatchPoints would work here. All points being dumped don't share tags or timestamps. Also, I want to send small chunks off to write as they are read.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

omitempty will stop any fields that don't have data, so they would be the same size. Again, just a talking point, but wanted to point out that omitempty is pretty amazing. You might want to put it on any fields in your struct that could be empty as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

omitempty appears not to work for struct fields of type time.Time. I can't reuse client.BatchPoints here. I can reuse client.Point here if you really want me to. I'm not sure it's a good idea though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

time.Time is a special case, and they are actually fixing this in a future release of Go. One trick is you can make it a pointer to time, but I've never been a big fan of that personally. Like I said, this was more food for thought than an action item. If it's working for now, I would leave it and we can refactor at a future date.

Name string `json:"name"`
Timestamp time.Time `json:"timestamp"`
Tags map[string]string `json:"tags"`
Fields map[string]interface{} `json:"fields"`
}

type Batch struct {
Database string `json:"database"`
RetentionPolicy string `json:"retentionPolicy"`
Points []Point `json:"points"`
}

func interfaceToString(v interface{}) string {
switch t := v.(type) {
case nil:
return ""
case bool:
return fmt.Sprintf("%v", v)
case int, int8, int16, int32, int64, uint, uint8, uint16, uint32, uint64, uintptr:
return fmt.Sprintf("%d", t)
case float32, float64:
return fmt.Sprintf("%v", t)
default:
return fmt.Sprintf("%v", t)
}
}

// Return all the measurements from the given DB
func (h *Handler) showMeasurements(db string, user *influxdb.User) (error, []string) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you should return the error last ([]string, error)

var measurements []string
queryString := "show measurements"
p := influxql.NewParser(strings.NewReader(queryString))
query, err := p.ParseQuery()
if err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you could avoid all of the above logic by this:

results := h.Server.ExecuteQuery(&influxql.ShowMeasurementsStatement{}, db, user)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, no, that doesn't work. Maybe you know a one-line trick to turn a ShowMeasurementsStatement type into a Query type?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, sorry, it has to be more like &influxql.Query{[]influxql.Statement{influxql.ShowMeasurementsStatement{}}}

If that doesn't work let me know and I'll mock it out quick tomorrow morning.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unwinding everything in parsing a query we find that it has to be:

&influxql.Query{[]influxql.Statement{&influxql.ShowMeasurementsStatement{}}}

I don't think that's clearer than the extant code, nevertheless I will do it this way.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me it was originally about being more performant and less about readability, but after seeing it the readability is probably more important. Our parser is pretty performant, especially for that type of query. I originally thought we could pass in that simple type, but having to wrap it that way is pretty ugly, I agree.

return err, measurements
}
results := h.server.ExecuteQuery(query, db, user)
if results.Err != nil {
return results.Err, measurements
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as per above, flip the return order, error is idiomatically always last.


}
for _, measurementResult := range results.Results {
for _, measurementRow := range measurementResult.Series {
for _, measurementTuple := range (*measurementRow).Values {
for _, measurementCell := range measurementTuple {
measurements = append(measurements, interfaceToString(measurementCell))
}
}
}
}
return nil, measurements
}

// serveDump returns all points in the given database as a plaintext list of JSON structs.
// To get all points:
// Find all measurements (show measurements).
// For each measurement do select * from <measurement> group by *
func (h *Handler) serveDump(w http.ResponseWriter, r *http.Request, user *influxdb.User) {
q := r.URL.Query()
db := q.Get("db")
delim := []byte("\n")
pretty := q.Get("pretty") == "true"
err, measurements := h.showMeasurements(db, user)
if err != nil {
httpError(w, "error with dump: "+err.Error(), pretty, http.StatusBadRequest)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be a http.StatusInternalServerError as something went wrong on our end, not theirs.

return
}

// Fetch all the points for each measurement.
// From the 'select' query below, we get:
//
// columns:[col1, col2, col3, ...]
// - and -
// values:[[val1, val2, val3, ...], [val1, val2, val3, ...], [val1, val2, val3, ...]...]
//
// We need to turn that into multiple rows like so...
// fields:{col1 : values[0][0], col2 : values[0][1], col3 : values[0][2]}
// fields:{col1 : values[1][0], col2 : values[1][1], col3 : values[1][2]}
// fields:{col1 : values[2][0], col2 : values[2][1], col3 : values[2][2]}
//
for _, measurement := range measurements {
queryString := fmt.Sprintf("select * from %s group by *", measurement)
p := influxql.NewParser(strings.NewReader(queryString))
query, err := p.ParseQuery()
if err != nil {
httpError(w, "error with dump: "+err.Error(), pretty, http.StatusBadRequest)
return
}
//
results := h.server.ExecuteQuery(query, db, user)
for _, result := range results.Results {
for _, row := range result.Series {
points := make([]Point, 1)
var point Point
point.Name = row.Name
point.Tags = row.Tags
point.Fields = make(map[string]interface{})
for _, tuple := range row.Values {
for subscript, cell := range tuple {
if row.Columns[subscript] == "time" {
point.Timestamp, _ = time.Parse(time.RFC3339, interfaceToString(cell))
continue
}
point.Fields[row.Columns[subscript]] = cell
}
points[0] = point
batch := &Batch{
Database: db,
RetentionPolicy: "default",
Points: points,
}
buf, _ := json.Marshal(&batch)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't ignore the error here. do something with it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we consistently ignore the error in handler.go

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. Looks like I created a bad role model :-). This has bitten us in the past. Update yours to not ignore it, and I'll log an issue to update the rest of the handlers to not ignore them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created issue: #1944

w.Write(buf)
w.Write(delim)
}
}
}
}
}

// serveWrite receives incoming series data and writes it to the database.
func (h *Handler) serveWrite(w http.ResponseWriter, r *http.Request, user *influxdb.User) {
var bp client.BatchPoints
Expand Down