forked from raystack/depot
-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Maxcompute instrumentation #55
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
ekawinataa
reviewed
Nov 15, 2024
src/main/java/com/gotocompany/depot/maxcompute/client/insert/NonPartitionedInsertManager.java
Outdated
Show resolved
Hide resolved
ekawinataa
reviewed
Nov 15, 2024
src/main/java/com/gotocompany/depot/maxcompute/client/insert/PartitionedInsertManager.java
Outdated
Show resolved
Hide resolved
rajuGT
pushed a commit
that referenced
this pull request
Dec 6, 2024
* feat: Add Type Info Converter Implementation * feat: Add Test * feat: Complete test for MessageTypeInfoConverterTest * feat: Complete test for TimestampTypeInfoConverter * feat: Complete test for StructTypeInfoConverter.java * feat: Complete test for DurationTypeInfoConverter.java * feat: Complete test for BaseTypeInfoConverterTest.java * feat: get implementation diff from feat branch * test: update ConverterOrchestratorTest * test: Add InsertManagerFactoryTest * chore: remove unused constructor * test: - add DefaultPartitioningStrategyTest - add TimestampPartitioningStrategyTest - add PartitioningStrategyFactoryTest * fix: PartitioningStrategyFactoryTest * fix: remove unused dependency injection * test: add non null injected decorator * test: add test for MaxComputeSchemaCache * chore: remove unused lombok annotations * test: Add MaxComputeSinkTest * test: add test for close * test: fix typeinfo * fix: add error handler in MaxComputeSink * add ProtoUnknownFieldValidationType * test: Add test for ProtoUnknownFieldValidationType * fix: use correct converter * fix: rename method * chore: rename * chore: exclude conflicting deps * chore: fix checkstyle main * chore: fix checkstyle test * chore: fix dependencies issue * chore: remove main * fix: fix test * chore: exclude model package and bump version * feat: add max compute metrics * feat: add compression option * fix: fix ordering on instantiation * feat: Add utils and implement schema update * feat: Add logging when executing SQL * chore: checkstyle main * chore: fix wrong params * feat: add converter for MaxComputeCompressionAlgorithmConverter * chore: fix checkstyle * fix: byte array conversion * test: fix test * fix: remove record reordering, put metadata at the beginning * fix: use timestamp_ntz for proto timestamp and partition * fix: adjust unit test to use timestamp_ntz instead of timestamp * chore: remove unused lombok annotations * chore: exclude maxcompute client and factory from coverage * chore: exclude maxcompute client from coverage check * chore: delete unused proto * test: add test case for log key * chore: remove unused getters * test: add compression option test * chore: remove redundant column field * fix: duration converter cast to message * feat: add partition precondition on MaxComputeClient * feat: add allowSchemaMismatch to false * chore: remove unused config * chore: adjust config name * chore: add sink and config docs * chore: docs layout * chore: docs layout + adjust default value of config * fix: revert mistakenly pushed build gradle change * chore: fix magic constant * fix: set null instead of default value to nonexistent field * chore: fix partitioning strategy * test: complete unit test * feat: add configurable date format for timestamp partition key * chore: change default value and docs * Maxcompute instrumentation (#55) * added instrumentation for MC sink * chore: refactor InsertManager to abstract class * chore: revert spacing change * chore: refactor --------- Co-authored-by: Eka Winata <eka.winata@gopay.co.id> * feat: add zone offset for timestamp * feat: use zone offset config for metadata util * feat: Use ZoneId instead of ZoneOffset * feat: use ali auto partitioning strategy [todo fix ProtoDataColumnRecordDecoratorTest] * test: fix ProtoDataColumnRecordDecoratorTest * chore: remove unused config * fix: set MaxComputeCache when partitioningStrategy is not null * chore: checkstyle * feat: add checking on sql result * fix: metadata util * chore: checkstyle * fix: add validation for enum based config and make it fail fast on initialization * chore: fix exception type * chore: checkstyle * fix: class naming and add validation for enum value * chore: Fix MetadataUtil to accept List of TupleString * chore: switch metadata ordering * - refactor schema cache to factory method - fix schema cache to fetch from metadata server * Fix MaxComputeSchemaCache and its test * chore: if not exists true table creation parameter * - checkstyle SinkConfigUtils - add test for SinkConfigUtils * - add retry utils - separate DDL operation to DdlManager * test: Add test for RetryUtils.java * feat: add global configs as externalized parameter * feat: refactor streaming management to StreamingSessionManager class * feat: externalize partition time unit configuration * fix: refactor TimestampPartitioningStrategy * test: add case when passed object is not Record * chore: remove null checking since message couldn't be null * chore: add synchronization on schema update method * chore: checkstyle main * chore: checkstyle main * chore: update docs * chore: update schema docs * chore: reorder annotation in config * chore: change version to 0.10.0 * chore: bump aliyun version * chore: remove redundant enum converter class * feat: use guava cache for streaming session * chore: use sessionCache.getUnchecked * test: update test * chore: checkstyle * chore: use const instead of literal string * chore: wrap update statement with backtick * test: add IOException use case * chore: refactor RetryUtils to receive exception predicate * chore: remove validateConfig() * fix: make RecordWrapper immutable * chore: optimize variable declaration on MaxComputeOdpsGlobalSettingsConverter * fix: remove unused TableTunnel dependencies from InsertManager.java and its implementation * fix: setup FlushOption on the constructor for InsertManager * fix: rename static factory method of streaming session manager * chore: add javadoc for PayloadConverter * chore: refactor StreamingSessionManager to receive LoadingCache in its constructor * chore: Refactor PrimitiveTypeInfoConverter to use ImmutableMap * chore: Rename ConverterOrchestrator.java to ProtobufConverterOrchestrator.java * chore: refactor name and type info to static var * chore: move MaxComputeSchemaHelper to root MaxCompute package * chore: rename getDdlDeclaration to getDDL * chore: remove builder on MaxComputeSchema * chore: fix indent * refactor: MaxComputeSchemaCache to separate handling between non-null and null descriptor * refactor: use google Sets on PartitioningStrategyFactory * refactor: make TableValidator limit configurable * refactor: use immutable map on MetadataUtil * refactor: remove SinkConfigUtils.java and its test. Transfer the implementation of util method directly in class using it * chore: add static imports for assertions lib * chore: rename PayloadConverter.java to ProtobufPayloadConverter * feat: add validation for timestamp type * chore: update docs and default value for valid min and max timestamp * refactor: wrap payload converter to work with DTO * chore: set default maximum session count to 2 for SINK_MAXCOMPUTE_STREAMING_INSERT_MAXIMUM_SESSION_COUNT * test: fix assertion * feat: add timestamp difference validation * feat: Add NaN and Infinite check float and double * chore: rename upsertTable to createOrUpdateTable * chore: refactor inline lambda to static method * chore: update docs * fix: use server side schema for building metadata * chore: add docs for SINK_CONNECTOR_SCHEMA_PROTO_UNKNOWN_FIELDS_VALIDATION * chore: rename MaxComputeSchemaHelper.java to MaxComputeSchemaBuilder * refactor: Add method to add uniform errors in SinkResponse and apply it to MaxComputeSink * feat: parameterized SINK_MAXCOMPUTE_STREAMING_INSERT_TUNNEL_SLOT_COUNT_PER_SESSION * chore: update generic.md to use more explicit context * chore: update maxcompute.md to show the correct type mapping * chore: rename object to parsedObject in ProtoPayload.java * chore: refactor `convert` method under ProtobufConverterOrchestrator to toMaxComputeTypeInfo and toMaxComputeValue * chore: add javadoc for ProtobufTypeInfoConverter.java * chore: checkstyle * chore: add static import * chore: remove extraneous space --------- Co-authored-by: Vaishnavi190900 <152475034+Vaishnavi190900@users.noreply.github.com>
ekawinataa
added a commit
that referenced
this pull request
Dec 20, 2024
) * feat: Add Type Info Converter Implementation * feat: Add Test * feat: Complete test for MessageTypeInfoConverterTest * feat: Complete test for TimestampTypeInfoConverter * feat: Complete test for StructTypeInfoConverter.java * feat: Complete test for DurationTypeInfoConverter.java * feat: Complete test for BaseTypeInfoConverterTest.java * feat: get implementation diff from feat branch * test: update ConverterOrchestratorTest * test: Add InsertManagerFactoryTest * chore: remove unused constructor * test: - add DefaultPartitioningStrategyTest - add TimestampPartitioningStrategyTest - add PartitioningStrategyFactoryTest * fix: PartitioningStrategyFactoryTest * fix: remove unused dependency injection * test: add non null injected decorator * test: add test for MaxComputeSchemaCache * chore: remove unused lombok annotations * test: Add MaxComputeSinkTest * test: add test for close * test: fix typeinfo * fix: add error handler in MaxComputeSink * add ProtoUnknownFieldValidationType * test: Add test for ProtoUnknownFieldValidationType * fix: use correct converter * fix: rename method * chore: rename * chore: exclude conflicting deps * chore: fix checkstyle main * chore: fix checkstyle test * chore: fix dependencies issue * chore: remove main * fix: fix test * chore: exclude model package and bump version * feat: add max compute metrics * feat: add compression option * fix: fix ordering on instantiation * feat: Add utils and implement schema update * feat: Add logging when executing SQL * chore: checkstyle main * chore: fix wrong params * feat: add converter for MaxComputeCompressionAlgorithmConverter * chore: fix checkstyle * fix: byte array conversion * test: fix test * fix: remove record reordering, put metadata at the beginning * fix: use timestamp_ntz for proto timestamp and partition * fix: adjust unit test to use timestamp_ntz instead of timestamp * chore: remove unused lombok annotations * chore: exclude maxcompute client and factory from coverage * chore: exclude maxcompute client from coverage check * chore: delete unused proto * test: add test case for log key * chore: remove unused getters * test: add compression option test * chore: remove redundant column field * fix: duration converter cast to message * feat: add partition precondition on MaxComputeClient * feat: add allowSchemaMismatch to false * chore: remove unused config * chore: adjust config name * chore: add sink and config docs * chore: docs layout * chore: docs layout + adjust default value of config * fix: revert mistakenly pushed build gradle change * chore: fix magic constant * fix: set null instead of default value to nonexistent field * chore: fix partitioning strategy * test: complete unit test * feat: add configurable date format for timestamp partition key * chore: change default value and docs * Maxcompute instrumentation (#55) * added instrumentation for MC sink * chore: refactor InsertManager to abstract class * chore: revert spacing change * chore: refactor --------- Co-authored-by: Eka Winata <eka.winata@gopay.co.id> * feat: add zone offset for timestamp * feat: use zone offset config for metadata util * feat: Use ZoneId instead of ZoneOffset * feat: use ali auto partitioning strategy [todo fix ProtoDataColumnRecordDecoratorTest] * test: fix ProtoDataColumnRecordDecoratorTest * chore: remove unused config * fix: set MaxComputeCache when partitioningStrategy is not null * chore: checkstyle * feat: add checking on sql result * fix: metadata util * chore: checkstyle * fix: add validation for enum based config and make it fail fast on initialization * chore: fix exception type * chore: checkstyle * fix: class naming and add validation for enum value * chore: Fix MetadataUtil to accept List of TupleString * chore: switch metadata ordering * - refactor schema cache to factory method - fix schema cache to fetch from metadata server * Fix MaxComputeSchemaCache and its test * chore: if not exists true table creation parameter * - checkstyle SinkConfigUtils - add test for SinkConfigUtils * - add retry utils - separate DDL operation to DdlManager * test: Add test for RetryUtils.java * feat: add global configs as externalized parameter * feat: refactor streaming management to StreamingSessionManager class * feat: externalize partition time unit configuration * fix: refactor TimestampPartitioningStrategy * test: add case when passed object is not Record * chore: remove null checking since message couldn't be null * chore: add synchronization on schema update method * chore: checkstyle main * chore: checkstyle main * chore: update docs * chore: update schema docs * chore: reorder annotation in config * chore: change version to 0.10.0 * chore: bump aliyun version * chore: remove redundant enum converter class * feat: use guava cache for streaming session * chore: use sessionCache.getUnchecked * test: update test * chore: checkstyle * chore: use const instead of literal string * chore: wrap update statement with backtick * test: add IOException use case * chore: refactor RetryUtils to receive exception predicate * chore: remove validateConfig() * fix: make RecordWrapper immutable * chore: optimize variable declaration on MaxComputeOdpsGlobalSettingsConverter * fix: remove unused TableTunnel dependencies from InsertManager.java and its implementation * fix: setup FlushOption on the constructor for InsertManager * fix: rename static factory method of streaming session manager * chore: add javadoc for PayloadConverter * chore: refactor StreamingSessionManager to receive LoadingCache in its constructor * chore: Refactor PrimitiveTypeInfoConverter to use ImmutableMap * chore: Rename ConverterOrchestrator.java to ProtobufConverterOrchestrator.java * chore: refactor name and type info to static var * chore: move MaxComputeSchemaHelper to root MaxCompute package * chore: rename getDdlDeclaration to getDDL * chore: remove builder on MaxComputeSchema * chore: fix indent * refactor: MaxComputeSchemaCache to separate handling between non-null and null descriptor * refactor: use google Sets on PartitioningStrategyFactory * refactor: make TableValidator limit configurable * refactor: use immutable map on MetadataUtil * refactor: remove SinkConfigUtils.java and its test. Transfer the implementation of util method directly in class using it * chore: add static imports for assertions lib * chore: rename PayloadConverter.java to ProtobufPayloadConverter * feat: add validation for timestamp type * chore: update docs and default value for valid min and max timestamp * refactor: wrap payload converter to work with DTO * chore: set default maximum session count to 2 for SINK_MAXCOMPUTE_STREAMING_INSERT_MAXIMUM_SESSION_COUNT * test: fix assertion * feat: add timestamp difference validation * feat: Add NaN and Infinite check float and double * chore: rename upsertTable to createOrUpdateTable * chore: refactor inline lambda to static method * chore: update docs * fix: use server side schema for building metadata * chore: add docs for SINK_CONNECTOR_SCHEMA_PROTO_UNKNOWN_FIELDS_VALIDATION * chore: rename MaxComputeSchemaHelper.java to MaxComputeSchemaBuilder * refactor: Add method to add uniform errors in SinkResponse and apply it to MaxComputeSink * feat: parameterized SINK_MAXCOMPUTE_STREAMING_INSERT_TUNNEL_SLOT_COUNT_PER_SESSION * chore: update generic.md to use more explicit context * chore: update maxcompute.md to show the correct type mapping * chore: rename object to parsedObject in ProtoPayload.java * chore: refactor `convert` method under ProtobufConverterOrchestrator to toMaxComputeTypeInfo and toMaxComputeValue * chore: add javadoc for ProtobufTypeInfoConverter.java * chore: checkstyle * chore: add static import * chore: remove extraneous space * refactor: Consolidate TypeInfoConverter and PayloadConverter to single interface * chore: remove consolidated interface and implementation * chore: checkstyle * chore: checkstyle * refactor: Add typeInfo caching * fix: checkstyle * refactor: remove computeIfAbsent * fix: Cache in top level * chore: use cached convertSingularPayload * refactor: remove proxy object reference and bootstrap * refactor: remove proxy MaxComputeSinkConfig to TimestampProtobufMaxComputeConverter * refactor: unknown field validation to not use from config directly * chore: checkstyle * chore: invert the condition to validate ParsedMessage * feat: add instrumentation for unknown field validation latency * feat: add session count metrics * chore: fix checkstyle test + use captureValue instead of captureCount for session manager * feat: add streaming session initialization metrics * feat: add feature flag to instrument metrics * refactor: change instrumentation metric naming for session created * refactor: refactor class where unknown field validation happened to refer to bootstrapped flag field * chore: bump depot version * chore: Update docs for generic config * chore: Use concurrent hashmap * chore: use junit4 assertions * feat: use thread name as part of the session key * docs: Add Javadocs for maxcompute * docs: Add Javadocs for ProtoUnknownFieldValidationType * chore: revert to use partition spec key * refactor: use partition spec as sole key, remove MaxComputeClient dependency from MaxComputeSink, fix checkstyle * feat: add configuration to enable default proto value * fix: add null object directly for default value enabled message field * chore: change odps version to 0.51.0-public * fix: revert default unset field * refactor: abstract base session builder * refactor: simplify interface method structure * refactor: refactor access modifier for const * refactor: checkstyle * refactor: simplify getStreamingSessionManager * test: add static import on mockito methods * refactor: register primitives as predefined set * refactor: move MaxComputeProtobufConverterCache.java to converter package * test: add unit test for MaxComputeProtobufConverterCache * refactor: simplify InsertManager creation * refactor: remove unused interface method and its test * refactor: use get to check the existence of entry * refactor: pass StatsDReporter instead of instrumentation * chore: Add TableTunnel docs on StreamingSessionManager + refactor params * refactor: remove unused constant * refactor: use switch case * refactor: explicitly mention unknown field only checked on the first element --------- Co-authored-by: Vaishnavi190900 <152475034+Vaishnavi190900@users.noreply.github.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Added MaxCompute metrics and instrumentation