I would like to parse these into fields so that I can index and search them in OpenSearch.
Describe the solution you'd like
A processor for parsing ion documents, parse_ion, similar to parse_json and csv.
The implementation would likely be very similar to parse_json, and perhaps under the hood they can share most of their logic, just supplying different ObjectMapper implementations for each as well as any language specific configurations.
Describe alternatives you've considered (Optional)
It's possible to preprocess simple well-formatted ion documents converting them to json in order to prepare them for parse_json using regular expressions (substitute_string), but this is hacky, probably slow, and very prone to bugs.
I have also considered creating a new intermediary service that converts the ion to json before submitting to data-prepper, but this adds additional complexity and just defeats the purpose of data-prepper in general.
Additional context
I'm willing to submit a PR for this, but would like to get feedback on the idea and approach first.
@emmachase, thank you for this suggestion. I would suggest that we make this a new processor. This has the advantage of letting the configurations diverge if necessary. Perhaps we'd add certain configurations for looser parsing of one format or the other. It would also be clearer for users, who wouldn't look for ION processing in a JSON processor.
parse_ion:
  source: /ion-string
  destination: /data
And thank you for your interest in submitting a PR. We'd be happy to help get it merged in.
I think this could be easily accomplished by refactoring the ParseJsonProcessor class so that most of the logic moves into a common class. I'd be fine starting with the ParseIonProcessor in the same Gradle project (parse-json-processor) to keep it simple. Maybe we'd split it eventually to avoid unnecessary dependencies, but as it stands, all dependencies must deploy with Data Prepper.
I would suggest also having a different class for the configuration - ParseIonProcessorConfig. We recently did something similar in our kafka-plugins project where we decoupled the configurations for the Kafka buffer and source.
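As a rough illustration of the refactoring idea, here is a minimal sketch of a common base class that owns the source/destination handling, with the format-specific parsing left to subclasses. Everything here is hypothetical: AbstractParseProcessor, ToyParseProcessor, and the plain Map used as an event are stand-ins, not Data Prepper's actual classes or APIs. The toy "key=value" parser only exists so the sketch runs without extra dependencies; a real ParseJsonProcessor or ParseIonProcessor would presumably delegate to its own ObjectMapper in that method.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical common base class: reads a string from the source key,
// parses it, and writes the parsed map to the destination key.
// The event is modeled as a plain Map (an assumption, not Data Prepper's Event type).
abstract class AbstractParseProcessor {
    private final String source;
    private final String destination;

    AbstractParseProcessor(String source, String destination) {
        this.source = source;
        this.destination = destination;
    }

    // Format-specific step. A JSON or Ion subclass would supply its own
    // parser here (for example, a format-specific ObjectMapper).
    protected abstract Map<String, Object> parse(String raw) throws Exception;

    // Shared logic that both processors could reuse unchanged.
    Map<String, Object> process(Map<String, Object> event) {
        Object raw = event.get(source);
        if (!(raw instanceof String)) {
            return event; // nothing to parse
        }
        try {
            event.put(destination, parse((String) raw));
        } catch (Exception e) {
            // On a parse failure this sketch leaves the event unchanged.
        }
        return event;
    }
}

// Toy subclass standing in for a real format: parses "a=1,b=2" into a map.
class ToyParseProcessor extends AbstractParseProcessor {
    ToyParseProcessor(String source, String destination) {
        super(source, destination);
    }

    @Override
    protected Map<String, Object> parse(String raw) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (String pair : raw.split(",")) {
            String[] parts = pair.split("=", 2);
            if (parts.length != 2) {
                throw new IllegalArgumentException("Malformed pair: " + pair);
            }
            result.put(parts[0], parts[1]);
        }
        return result;
    }
}
```

With this shape, each processor can still take its own configuration class and pass only the source and destination to the shared base, so the two configurations stay decoupled.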
Is your feature request related to a problem? Please describe.
For my use-case I have nested ion documents in my input. For example: