-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
🎉 Update destination-s3 to handle the new data types protocol #20088
Conversation
Affected Connector ReportNOTE
|
Connector | Version | Changelog | Publish |
---|
- See "Actionable Items" below for how to resolve warnings and errors.
❌ Destinations (47)
Connector | Version | Changelog | Publish |
---|---|---|---|
destination-aws-datalake |
0.1.1 |
✅ | ✅ |
destination-azure-blob-storage |
0.1.6 |
✅ | ✅ |
destination-bigquery |
1.2.8 |
✅ | ✅ |
destination-bigquery-denormalized |
1.2.8 |
✅ | ✅ |
destination-cassandra |
0.1.4 |
✅ | ✅ |
destination-clickhouse |
0.2.1 |
✅ | ✅ |
destination-clickhouse-strict-encrypt |
0.2.1 |
🔵 (ignored) |
🔵 (ignored) |
destination-csv |
0.2.10 |
✅ | ✅ |
destination-databricks |
0.3.1 |
✅ | ✅ |
destination-dev-null |
0.2.7 |
🔵 (ignored) |
🔵 (ignored) |
destination-doris |
0.1.0 |
✅ | ✅ |
destination-dynamodb |
0.1.7 |
✅ | ✅ |
destination-e2e-test |
0.2.4 |
✅ | ✅ |
destination-elasticsearch |
0.1.6 |
✅ | ✅ |
destination-elasticsearch-strict-encrypt |
0.1.6 |
🔵 (ignored) |
🔵 (ignored) |
destination-gcs |
0.2.12 |
✅ | ✅ |
destination-iceberg |
0.1.0 |
✅ | ✅ |
destination-jdbc |
0.3.14 |
🔵 (ignored) |
🔵 (ignored) |
destination-kafka |
0.1.10 |
✅ | ✅ |
destination-keen |
0.2.4 |
✅ | ✅ |
destination-kinesis |
0.1.5 |
✅ | ✅ |
destination-local-json |
0.2.11 |
✅ | ✅ |
destination-mariadb-columnstore |
0.1.7 |
✅ | ✅ |
destination-mongodb |
0.1.9 |
✅ | ✅ |
destination-mongodb-strict-encrypt |
0.1.9 |
🔵 (ignored) |
🔵 (ignored) |
destination-mqtt |
0.1.3 |
✅ | ✅ |
destination-mssql |
0.1.22 |
✅ | ✅ |
destination-mssql-strict-encrypt |
0.1.22 |
🔵 (ignored) |
🔵 (ignored) |
destination-mysql |
0.1.20 |
✅ | ✅ |
destination-mysql-strict-encrypt |
❌ 0.1.21 (mismatch: 0.1.20 ) |
🔵 (ignored) |
🔵 (ignored) |
destination-oracle |
0.1.19 |
✅ | ✅ |
destination-oracle-strict-encrypt |
0.1.19 |
🔵 (ignored) |
🔵 (ignored) |
destination-postgres |
0.3.26 |
✅ | ✅ |
destination-postgres-strict-encrypt |
0.3.26 |
🔵 (ignored) |
🔵 (ignored) |
destination-pubsub |
0.2.0 |
✅ | ✅ |
destination-pulsar |
0.1.3 |
✅ | ✅ |
destination-r2 |
0.1.0 |
✅ | ✅ |
destination-redis |
0.1.4 |
✅ | ✅ |
destination-redpanda |
0.1.0 |
✅ | ✅ |
destination-redshift |
0.3.51 |
✅ | ✅ |
destination-rockset |
0.1.4 |
✅ | ✅ |
destination-s3 |
0.3.18 |
✅ | ✅ |
destination-s3-glue |
0.1.1 |
✅ | ✅ |
destination-scylla |
0.1.3 |
✅ | ✅ |
destination-snowflake |
0.4.40 |
✅ | ✅ |
destination-tidb |
0.1.0 |
✅ | ✅ |
destination-yugabytedb |
0.1.0 |
✅ | ✅ |
- See "Actionable Items" below for how to resolve warnings and errors.
👀 Other Modules (1)
- base-normalization
Actionable Items
(click to expand)
Category | Status | Actionable Item |
---|---|---|
Version | ❌ mismatch |
The version of the connector is different from its normal variant. Please bump the version of the connector. |
⚠ doc not found |
The connector does not seem to have a documentation file. This can be normal (e.g. basic connector like source-jdbc is not published or documented). Please double-check to make sure that it is not a bug. |
|
Changelog | ⚠ doc not found |
The connector does not seem to have a documentation file. This can be normal (e.g. basic connector like source-jdbc is not published or documented). Please double-check to make sure that it is not a bug. |
❌ changelog missing |
There is no chnagelog for the current version of the connector. If you are the author of the current version, please add a changelog. | |
Publish | ⚠ not in seed |
The connector is not in the seed file (e.g. source_definitions.yaml ), so its publication status cannot be checked. This can be normal (e.g. some connectors are cloud-specific, and only listed in the cloud seed file). Please double-check to make sure that it is not a bug. |
❌ diff seed version |
The connector exists in the seed file, but the latest version is not listed there. This usually means that the latest version is not published. Please use the /publish command to publish the latest version. |
/test connector=connectors/destination-s3
Build PassedTest summary info:
|
/test connector=connectors/destination-aws-datalake
Build FailedTest summary info:
|
/test connector=connectors/destination-azure-blob-storage
Build FailedTest summary info:
|
/test connector=connectors/destination-bigquery
Build PassedTest summary info:
|
/test connector=connectors/destination-azure-blob-storage
Build PassedTest summary info:
|
/test connector=connectors/destination-bigquery-denormalized
Build PassedTest summary info:
|
/test connector=connectors/destination-databricks
Build PassedTest summary info:
|
/test connector=connectors/destination-gcs
Build FailedTest summary info:
|
/test connector=connectors/destination-r2
Build FailedTest summary info:
|
/test connector=connectors/destination-gcs
|
/test connector=connectors/destination-gcs
|
/test connector=connectors/destination-gcs
Build FailedTest summary info:
|
/test connector=connectors/destination-s3
Build PassedTest summary info:
|
/test connector=connectors/destination-bigquery
Build PassedTest summary info:
|
Not sure that I fully understand (probably let's sync about it). Generally speaking the only adaptation that I see for removal of V0 logic is to remove old schema json to avro types https://github.com/airbytehq/airbyte/pull/20088/files#diff-ccb0476def819bb3177cf9458fb6939bc25c42ea516305171e1e11be2a3d1a39R32 (It's marked as deprecated) Plus some places here https://github.com/airbytehq/airbyte/pull/20088/files#diff-d25b772db4efd1799ea8a085d4acc27f7f5393fda2dca2498c2d27391e6148deR235 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚛
/test connector=connectors/destination-s3
Build FailedTest summary info:
|
@edgao as was discussed I run destination s3 integration tests with dataset for V0. Looks good to me: |
/publish connector=connectors/destination-s3
if you have connectors that successfully published but failed definition generation, follow step 4 here |
/publish connector=connectors/destination-s3
if you have connectors that successfully published but failed definition generation, follow step 4 here |
@suhomud - I did a test with some more complex data types and I think there's some (maybe?) slightly wrong behavior with union types. Specifically, if you have a schema with Though... we get the same thing with timestamps, which is maybe(?) preexisting behavior. So maybe we're just OK with this? Test procedure:
Observe that str_int_data and str_timestamp_data both got converted to avro strings. But if you edit catalog.json to swap the order of the |
@edgao great analysis! I think this behaviour was here before this PR and it is related to json-avro-converter lib. So if the first type in union is a string then it checks if it is an instance of string. In our case it is true, therefor nothing prevent converter to put it as a string because it is a valid string We can think about making |
nice, makes sense. Let's just keep this behavior for now. |
What
New version protocol support V1. See https://github.com/airbytehq/airbyte/blob/master/airbyte-protocol/protocol-models/src/main/resources/airbyte_protocol/well_known_types.yaml
How
In order to support new protocol the base-java-s3 module should handle new json schema types:
V0 protocol is still supported but marked as
@Deprecated
:This PR Support era and string numbers for json-aero-converter adding the string numbers and new Date type support. Released version is 1.0.2
Recommended reading order
JsonSchemaType.java
new datatypesJsonToAvroSchemaConverter.java
build avro schema with new protocol.Test related changes:
AdvancedTestDataComparator.java
Updated regex to match new date format. FixedisDateTimeWithTzValue
since it was not working and was not able to match date time with timezone. NewcompareTime
method. Added workoroundTEST_DATASET_IGNORE_LIST
those values are used as string value in the dataset i.e. inexchange_rate_messages.txt
and cannot pass some tests (probably need to be re-visit)S3AvroParquetTestDataComparator.java
new S3 specific test data comparator (the old one renamed toS3BaseAvroParquetTestDataComparator
)ProtocolVersion.V1
v1 test dataset🚨 User Impact 🚨
The only noticable thing is when Avro schema has
[int, string]
types it will try to produce number from the string so "123" become 123 in result Avro object.Pre-merge Checklist
Expand the relevant checklist and delete the others.
New Connector
Community member or Airbyter
Airbyte_secret
./gradlew :airbyte-integrations:connectors:<name>:integrationTest
.README.md
bootstrap.md
. See description and examplesdocs/integrations/<source or destination>/<name>.md
including changelog. See changelog exampledocs/integrations/README.md
airbyte-integrations/builds.md
Airbyter
If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.
/test connector=connectors/<name>
command is passing/publish
command described hereUpdating a connector
Community member or Airbyter
Airbyte_secret
./gradlew :airbyte-integrations:connectors:<name>:integrationTest
.README.md
bootstrap.md
. See description and examplesdocs/integrations/<source or destination>/<name>.md
including changelog. See changelog exampleAirbyter
If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.
/test connector=connectors/<name>
command is passing/publish
command described hereConnector Generator
-scaffold
in their name) have been updated with the latest scaffold by running./gradlew :airbyte-integrations:connector-templates:generator:testScaffoldTemplates
then checking in your changesTests
Unit
Put your unit tests output here.
Integration
Put your integration tests output here.
Acceptance
Put your acceptance tests output here.