-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Importer for 0.8.9 data via the CLI #3502
Conversation
@corylanou how are errors reported? STDOUT? |
@beckettsean no, |
but of course... :dunce_hat: should be an emoji, I need one. |
884b269
to
fe05a53
Compare
// Query is used to send a command to the server. Both Command and Database are required. | ||
type Query struct { | ||
Command string | ||
Database string | ||
} | ||
|
||
// ParseConnectionString will parse a string to create a valid connection URL | ||
func ParseConnectionString(path string, ssl bool) (url.URL, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could the path
just require http://...
or https://...
and then no need for ssl bool
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could, but then you have to parse and validate, and if they don't give you either, you aren't sure what they wanted (you could default to http, but that might not be right). Making them explicitly choose makes it their responsibility to get right. Otherwise, if they don't provide either, it's now on us to try to make the right decision without enough information.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking that http
or https
would be required but no strong feeling here.
Looks mostly good. The one other thing I think would be good to add is to output any of the lines that fail to parse into a new file. That way the developer can write their own custom script to handle just those data points. And if there are write timeouts that just can't be overcome, to output all the timed out data to a different file. |
|
||
// Process the scanner | ||
v8.processDDL(scanner) | ||
v8.processDML(scanner) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the DDL
and DML
need to be processed in order? Right now it sends it to goroutines that are running concurrently and it seems like they could be run out-of-order.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I might of over engineered that a little. I might refactor that to straight method calls instead after seeing the final product.
@pauldix none of the lines can fail to parse as they were written out by the same line protocol code we use to ingest. I could check for it, but imo that would be the equivalent of a corrupt input file at that point. Do you want them to specify a file for "failed writes" that we write to? Otherwise I can write failed writes to |
@corylanou oh right, never mind what I said about the parse thing. For failed writes I think let them specify a file to output to? That way they can just try again and point it at that file. |
} else { | ||
c.Line.Close() | ||
os.Exit(0) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can drop the else
since the if
block will exit.
@benbjohnson ok, I think I have your comments addressed. Please take another look when you get a chance. Thanks! |
@pauldix this now writes all failed lines to |
+1, just give @beckettsean the invocation example that outputs the failed writes to another file. |
Importer for 0.8.9 data via the CLI
@corylanou is there any way to throttle or back off on the import? Most users will be running the import against a live 0.9.3 server. |
This PR allows you to import an exported file from
0.8.9
(#3477).Caveats
For the export/import to work, all requisites have to be met. For export, all series names in
0.8
should be in the following format:for example:
or any number of tags
Additionally, the fields need to have a consistent type (all float64, int64, etc) for every write in
0.8
. Otherwise they have the potential to fail writes in the import. See below for more information.Running the import command
To import via the cli, you can specify the following command:
If the file is not compressed you can issue it without the
-compressed
flag:To redirect failed import lines to another file, run this command:
influx -import -path=metrics-default.gz -compressed > failures
It will import using the line protocol in batches of 5,000 lines per batch.
Understanding the results of the import
The batch will give some basic stats when finished:
Most inserts fail due to the following types of error:
This is due to the fact that in
0.8
a field could get created and saved as int or float types for independent writes. In0.9
the field has to have a consistent type.