Introduced BigDecimalConverter #4557
Thanks for the PR @san81. Please add a test file for every src Java file that you created or modified.

import org.opensearch.dataprepper.typeconverter.DoubleConverter;
import org.opensearch.dataprepper.typeconverter.BooleanConverter;
import org.opensearch.dataprepper.typeconverter.LongConverter;
import org.opensearch.dataprepper.typeconverter.*;
Will need to change this back to not using * imports. You can make that the default behavior if you are using IntelliJ.
Reverted this change now 👍 Also configured my IntelliJ to not do this going forward.
…entry_type processor that currently exists. Optionally, users can also specify required scaling needed on the converted Signed-off-by: Santhosh Gandhe <gandheaz@amazon.com>
Force-pushed from 9e571b9 to 9a6a439
…per the review comment Signed-off-by: Santhosh Gandhe <1909520+san81@users.noreply.github.com>
Signed-off-by: Santhosh Gandhe <1909520+san81@users.noreply.github.com>
Signed-off-by: Santhosh Gandhe <1909520+san81@users.noreply.github.com>
…ientific-notation-fix
@@ -53,7 +58,7 @@ static Record<Event> buildRecordWithEvent(final Map<String, Object> data) {
     }

     @BeforeEach
-    private void setup() {
+    public void setup() {
Why does this need to be public?
A method annotated with @BeforeEach shouldn't be private. More details here: https://junit.org/junit5/docs/5.0.2/api/org/junit/jupiter/api/BeforeEach.html
You can make this package-private and just remove public.
removed
…assing the scale while converting the instance only when the instance is BigDecimalConverter Signed-off-by: Santhosh Gandhe <1909520+san81@users.noreply.github.com>
Signed-off-by: Santhosh Gandhe <1909520+san81@users.noreply.github.com>
Signed-off-by: Santhosh Gandhe <1909520+san81@users.noreply.github.com>
private static Stream<Arguments> decimalToBigDecimalValueProvider() {
    return Stream.of(
        Arguments.of(new BigDecimal("0.0"), 0, 1),
Can we add another test case that covers the conversion described in the issue here (#3840), where we convert 1.70206220242E+12 to 1702062202420?
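The conversion the reviewer asks for can be reproduced with plain BigDecimal — a standalone sketch, not the PR's code (the class and method names here are illustrative):

```java
import java.math.BigDecimal;

public class ScientificNotationDemo {
    // Render a number parsed from scientific notation without an exponent.
    static String toPlain(String input) {
        return new BigDecimal(input).toPlainString();
    }

    public static void main(String[] args) {
        // The epoch-millis timestamp from issue #3840, in scientific notation.
        System.out.println(toPlain("1.70206220242E+12")); // 1702062202420
    }
}
```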
Added additional scenario for this case
Signed-off-by: Santhosh Gandhe <1909520+san81@users.noreply.github.com>
Thank you @san81 for this great contribution!
/**
 * Optional scale value used only in the case of BigDecimal converter
 */
@JsonProperty("scale")
Should we expose this? What would be the value of changing the scale to something else? You could remove the field and keep the getScale() method, so that we don't expose the config but stay flexible to use it in the future.
As I listed in the PR description, I thought we could let the user choose the scale they need per column, like:
- convert_entry_type:
    key: "column1"
    type: "bigdecimal"
    scale: 5
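The per-key config above could be modeled roughly like this — a hypothetical sketch; the constructor and field set are assumptions, not the PR's actual ConvertEntryTypeProcessorConfig:

```java
// Hypothetical sketch of the per-key config; the constructor and field set
// are assumptions, not the PR's actual ConvertEntryTypeProcessorConfig.
public class ConvertEntryTypeConfigSketch {
    private final String key;
    private final String type;
    private final int scale;

    public ConvertEntryTypeConfigSketch(final String key, final String type, final int scale) {
        this.key = key;
        this.type = type;
        this.scale = scale;
    }

    public String getKey() { return key; }

    public String getType() { return type; }

    public int getScale() { return scale; }

    public static void main(String[] args) {
        // Mirrors the YAML above: key "column1", type "bigdecimal", scale 5.
        ConvertEntryTypeConfigSketch config =
                new ConvertEntryTypeConfigSketch("column1", "bigdecimal", 5);
        System.out.println(config.getKey() + " -> scale " + config.getScale());
    }
}
```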
 *
 * @since 2.8
 */
BIGDECIMAL("bigdecimal"),
Should we just call this decimal? If we keep the big, we should rename the enum to BIG_DECIMAL. I also think we should have an underscore in the type name to keep consistency: big_decimal. This would match field names in OpenSearch like geo_point, etc.
Kept the word BIG and renamed the references to BIG_DECIMAL. Both the enum and the data type name are now modified.
@@ -67,7 +71,11 @@ public Collection<Record<Event>> doExecute(final Collection<Record<Event>> records
     if (keyVal != null) {
         if (!nullValues.contains(keyVal.toString())) {
             try {
-                recordEvent.put(key, converter.convert(keyVal));
+                if (converter instanceof BigDecimalConverter) {
A better design might be to make an interface that allows for passing additional arguments:

interface ConverterArguments {
    int getScale();
}

You can implement this interface on the ConvertEntryTypeProcessorConfig class. Then, you can simplify this to:

recordEvent.put(key, converter.convert(keyVal, convertEntryTypeProcessorConfig));

You would need to add ConverterArguments as an argument to convert, but most implementations won't use it. (You could even make use of Java's default methods to reduce changes, but that is not necessary.)
No more conditionals. And future changes similar to this will not require conditions here either.
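The suggestion can be sketched end to end as a small runnable example. The interface and class names mirror the thread, but the default-method wiring and the HALF_EVEN call are assumptions about how it might fit together, not the PR's actual code:

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Sketch of the reviewer's suggestion: converters accept an arguments
// object, and only the BigDecimal converter actually reads it.
interface ConverterArguments {
    int getScale();
}

interface TypeConverter<T> {
    T convert(Object source);

    // Default method so existing converters need no changes.
    default T convert(Object source, ConverterArguments arguments) {
        return convert(source);
    }
}

class BigDecimalConverter implements TypeConverter<BigDecimal> {
    @Override
    public BigDecimal convert(Object source) {
        return new BigDecimal(source.toString());
    }

    @Override
    public BigDecimal convert(Object source, ConverterArguments arguments) {
        // Only this converter reads the scale; HALF_EVEN matches the PR description.
        return convert(source).setScale(arguments.getScale(), RoundingMode.HALF_EVEN);
    }
}

public class ConverterArgumentsDemo {
    public static void main(String[] args) {
        ConverterArguments config = () -> 2; // stands in for the processor config
        TypeConverter<BigDecimal> converter = new BigDecimalConverter();
        // No instanceof check at the call site.
        System.out.println(converter.convert("1.005", config)); // 1.00
    }
}
```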
Signed-off-by: Santhosh Gandhe <1909520+san81@users.noreply.github.com>
…o the converter and avoided conditional statement for calling converter methods Signed-off-by: Santhosh Gandhe <1909520+san81@users.noreply.github.com>
Thanks for making this change!
… the code Signed-off-by: Santhosh Gandhe <1909520+san81@users.noreply.github.com>
Thank you for this contribution!
Signed-off-by: Santhosh Gandhe <1909520+san81@users.noreply.github.com>
Signed-off-by: Santhosh Gandhe <1909520+san81@users.noreply.github.com>
Thank you for this contribution!
To your comment about trailing zeros, I think this should be fine. This is a new converter, so it won't impact existing pipelines.
Introduced a BigDecimal converter that users can use as part of the existing convert_entry_type processor. Optionally, users can also specify the scaling required while converting to BigDecimal. If a scale value is given by the user, we apply the HALF_EVEN rounding strategy when the original number has more decimal digits than the given scale value.
Description
Fix for the issue described in #3840.
IMPORTANT Note: This PR disables STRIP_TRAILING_BIGDECIMAL_ZEROES by default. In other words, we won't be calling the bigDecimal.stripTrailingZeros() method while deserializing a BigDecimal value. A value of 1.00 won't be stripped down to 1.0 or 1 while deserializing; it will be deserialized into 1.00. This could impact existing pipelines as well if they depend on the BigDecimal datatype. This PR itself introduces the BigDecimal converter, so the assumption is that no one depends on the BigDecimal type so far. Even if they depend on BigDecimal and are ingesting this non-stripped version of a number into the OpenSearch sink, it won't be an issue. It will only become an issue if they look for a specific number of decimal places in a numeric value or if they do String comparison on the numbers.

After this change is merged, users will have the option to choose a BigDecimal converter with an optional scale (positions after the decimal point) using the HALF_EVEN rounding strategy. Check here for more details on this rounding strategy.

Example to use this new converter:
Note: Usage of the scale attribute is optional. If specified, trailing zeros will be added to match the given scale, as in the examples below.
Example-1:
1.703908412707e11 will be deserialized into 170390841270.70000 with the scale value set to 5. If no scale value is given for this case, it will be deserialized into 170390841270.7
Example-2:
If no scale value is given, a number like 17020622024e1 will be deserialized into 1.7020622024E+11. If the user wants to avoid scientific notation in the deserialized value, they need to provide a scale value that covers all the digits in the decimal places.

scale = 0 is also an acceptable value, which is essentially the same as giving no scale
Giving a negative value for scale is also acceptable
If given, scale should always be a numeric value; otherwise, data-prepper will fail to start
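The scale rules above can be checked with plain BigDecimal calls — a sketch of the described behavior using the JDK directly, not the PR's converter code:

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class ScaleRulesDemo {
    // Apply a scale with the HALF_EVEN rounding the PR description names.
    static String withScale(String input, int scale) {
        return new BigDecimal(input).setScale(scale, RoundingMode.HALF_EVEN).toPlainString();
    }

    public static void main(String[] args) {
        // A positive scale pads with trailing zeros (Example-1).
        System.out.println(withScale("1.703908412707e11", 5)); // 170390841270.70000
        // scale = 0 drops the fraction, rounding halves to the even neighbor.
        System.out.println(withScale("2.5", 0)); // 2
        // A negative scale rounds to the left of the decimal point.
        System.out.println(withScale("1234", -2)); // 1200
        // Without stripTrailingZeros(), 1.00 keeps its scale.
        System.out.println(new BigDecimal("1.00").toPlainString()); // 1.00
    }
}
```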
Issues Resolved
Resolves #3840
The above issue will be resolved once this change is merged, as it gives the user an option to convert the output to BigDecimal format with a specific scale. That should eliminate the scientific-notation values printed in the output stream that caused the failure in the above issue.
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.