Add parse_influxdb function #976

Closed
jorgehermo9 opened this issue Aug 5, 2024 · 2 comments · Fixed by #982
Labels
vrl: stdlib Changes to the standard library

Comments

@jorgehermo9
Contributor

jorgehermo9 commented Aug 5, 2024

With the merge of vectordotdev/vector#19637, we thought of using that decoder instead of a custom Lua one (which takes a lot of CPU). However, we have some malformed line protocol messages, and using the decoding.codec=influxdb option in sources does not allow us to route the malformed messages somewhere else, because that data is dropped.

The alternative we see is to use a remap transform with a parse_influxdb function and drop_on_abort=true, so we can route the malformed data (<component_id>.dropped) to our custom Lua decoder, which is less strict.
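
A minimal sketch of that setup, assuming a parse_influxdb function that returns an error on malformed input. The component names (influx_in, parse_lp, lua_fallback) are hypothetical, and reroute_dropped=true is an assumption not mentioned above: as far as I know, the remap transform only sends dropped events to <component_id>.dropped when that option is enabled.

```toml
# Hypothetical component names; "influx_in" stands for any source that
# delivers raw line protocol text in .message.
[transforms.parse_lp]
type = "remap"
inputs = ["influx_in"]
drop_on_abort = true      # drop the event when the program aborts
reroute_dropped = true    # assumption: sends dropped events to parse_lp.dropped
source = '''
# Try to parse; on failure, abort so the original event is dropped and rerouted.
parsed, err = parse_influxdb(.message)
if err != null {
  abort
}
. = parsed
'''

# The less strict Lua decoder would then consume the rerouted events:
# [transforms.lua_fallback]
# type = "lua"
# inputs = ["parse_lp.dropped"]
```

Well-formed messages continue through parse_lp, and only the ones that fail parsing reach the more expensive Lua decoder.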

In order to handle these failing line protocol messages, we think that a parse_influxdb VRL function is needed. The implementation should be very similar to https://github.com/vectordotdev/vector/blob/210ff0925d391213556f07bf6ce621967f0368ca/lib/codecs/src/decoding/format/influxdb.rs#L97

Question: The source decoder option is decoding.codec=influxdb and not decoding.codec=line_protocol. Should we call this function parse_influxdb in order to be consistent with the Vector option? Or should we change the Vector config spec to use decoding.codec=line_protocol?

@jszwedko
Member

jszwedko commented Aug 6, 2024

Agreed, most of our codecs have analogues in VRL for cases where people want more control. We've also previously discussed having sources be able to route events that fail codec parsing to another output, which I think would also help here, but that is a bigger change and I think we'd want this VRL function still anyway.

I think we should call this parse_influxdb to match the codec name.

The existing parse_* functions can be used as an example if you or anyone else wants to take a shot at this.

jszwedko added the "vrl: stdlib" (Changes to the standard library) label Aug 6, 2024
@jorgehermo9
Contributor Author

jorgehermo9 commented Aug 6, 2024

We've also previously discussed having sources be able to route events that fail codec parsing to another output,

Yes! That is what I was initially looking for, and I think it would be a very useful feature, as we wouldn't have to use this additional remap transform.

The existing parse_* functions can be used as an example if you or anyone else wants to take a shot at this.

We are very interested in this feature, so I could work on it myself soon. I took a look at the other parse_* functions, and it does not seem too complicated to wire the influxdb_line_protocol crate into one.

Thanks!!

jorgehermo9 changed the title from "Add parse_line_protocol function" to "Add parse_influxdb function" Aug 6, 2024