Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow creating tables with avro.schema.literal #14426

Merged
merged 1 commit into from
Oct 24, 2022

Conversation

weijiii
Copy link
Member

@weijiii weijiii commented Oct 2, 2022

Description

This PR aims to address the following

  • Add support for creating Avro-backed Hive table with avro.schema.literal.
  • Add product tests to cover following cases
    1. Creating table by Hive with avro.schema.literal and querying from Trino (TestAvroSchemaLiteral)
    2. Creating table by Trino with avro.schema.literal and querying from Trino (TestAvroSchemaLiteral)
    3. Schema evolution on partitioned/non-partitioned tables with avro.schema.literal (Captured in BaseTestAvroSchemaEvolution)

Update also needed in https://github.com/trinodb/trino/blob/master/docs/src/main/sphinx/connector/hive.rst

Non-technical explanation

With this change, Trino users will be able to create Hive table using avro.schema.literal to define the schema.

Release notes

( ) This is not user-visible or docs only and no release notes are required.
(x) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text:

@cla-bot
Copy link

cla-bot bot commented Oct 2, 2022

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: weijiii.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email email@example.com
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

@weijiii
Copy link
Member Author

weijiii commented Oct 2, 2022

Signed CLA sent via email.

@findepi
Copy link
Member

findepi commented Oct 3, 2022

Support avro schema literal

I thought we support avro schema literals for years. cc @anusudarsan
what does this PR do?

@weijiii
Copy link
Member Author

weijiii commented Oct 3, 2022

Related: #5101
cc: @phd3

@findepi
Copy link
Member

findepi commented Oct 3, 2022

@weijiii can you suggest a more descriptive PR title?

@weijiii
Copy link
Member Author

weijiii commented Oct 4, 2022

@weijiii can you suggest a more descriptive PR title?

@findepi
I should break this patch down to separate PRs.

The first commit is to make MetastoreUtil::getHiveSchema to also check the table-level storage descriptor for avro tables like Hive does.

The second commit is to allow Trino to use avro.schema.literal for avro tables.
The third commit is to use field schemas when the table/partition's SerDes does not use metastore for schema

I'd like to keep this PR for adding support for avro.schema.literal so I changed the title to such. I will also see if I can separate the logic of the latter 2 commits.

@weijiii weijiii changed the title Support avro schema literal Allow creating tables with avro.schema.literal Oct 4, 2022
@weijiii
Copy link
Member Author

weijiii commented Oct 4, 2022

The first commit is to make MetastoreUtil::getHiveSchema to also check the table-level storage descriptor for avro tables like Hive does.

#14463

@weijiii weijiii force-pushed the SupportAvroSchemaLiteral branch from 5a29c93 to 547db5a Compare October 4, 2022 20:45
@cla-bot cla-bot bot added the cla-signed label Oct 4, 2022
@weijiii weijiii force-pushed the SupportAvroSchemaLiteral branch 3 times, most recently from 82008a7 to 0d4405e Compare October 11, 2022 00:45
@weijiii
Copy link
Member Author

weijiii commented Oct 11, 2022

Previously failed test case TestHivePartitionSchemaEvolution#testOrc with config config-hdp3 is successful as I tested locally. I believe it is not related to this patch as the failure stems from a Hive execution.

@weijiii
Copy link
Member Author

weijiii commented Oct 11, 2022

Failed test:

TestKuduConnectoKerberosSmokeTest > kerberosAuthTicketExpiryTest [groups: profile_specific_tests, kudu]

java.sql.SQLException: Query failed (#20221011_024428_00002_v7zfj): Couldn't find a valid master in (kudu-master:7051). Exceptions received: [org.apache.kudu.client.NonRecoverableException: server requires authentication, but client Kerberos credentials (TGT) have expired. Authentication tokens were not used because no token is available]

Test detail:

2022-10-11T02:50:31.3721551Z | a7ef095e33264d12b34dc25a14bdb3cd | suite-7-non-generic | multinode-kerberos-kudu | config-default |      {} |  FAILED |   7.06m | Tests exited with code 1 |

Re-run succeeds locally with:

testing/bin/ptl test run --environment multinode-kerberos-kudu --config=config-default -- -t TestKuduConnectoKerberosSmokeTest.kerberosAuthTicketExpiryTest

I think it is unrelated to this patch. The error message indicates credential expiration, possibly due to test running overtime?

@weijiii weijiii requested review from posulliv and findepi October 12, 2022 19:17
Copy link
Member

@electrum electrum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@phd3 do you want to take a look?

@findepi findepi force-pushed the SupportAvroSchemaLiteral branch from dac71b7 to 0ba244b Compare October 24, 2022 07:25
@findepi
Copy link
Member

findepi commented Oct 24, 2022

(just squashed, the build was green https://github.com/trinodb/trino/actions/runs/3293963286)

@findepi findepi merged commit c6901d5 into trinodb:master Oct 24, 2022
@github-actions github-actions bot added this to the 401 milestone Oct 24, 2022
@findepi findepi mentioned this pull request Oct 24, 2022
@phd3
Copy link
Member

phd3 commented Oct 24, 2022

thanks @weijiii @findepi

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

4 participants