Fix Hive-to-Iceberg table migration errors caused by special characters. #25106

Open
wants to merge 3 commits into
base: master

@twowind twowind commented Feb 21, 2025

Description

This PR fixes a bug where iceberg.system.migrate fails when a Hive table's partition key contains special characters (e.g., @). The failure happens because the migration process incorrectly applies Iceberg’s partition parsing rules to Hive tables, leading to an Invalid partition field declaration error.

Error Example:

When running:

CREATE TABLE hive.tpch.test_migrate_partitioned_table
WITH (partitioned_by = ARRAY['special@col'])
AS SELECT 1 AS id, 'special1' AS "special@col";

CALL iceberg.system.migrate('tpch', 'test_migrate_partitioned_table');

If the Hive table has a partition key with special characters, the migration fails with:

io.trino.spi.TrinoException: Unable to parse partitioning value: Invalid partition field declaration: special@col
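
To illustrate the failure mode described above, here is a minimal, self-contained sketch. It does not reproduce Trino's actual parser: the class `PartitionFieldSketch`, both regular expressions, and the helper `quoteIfNeeded` are hypothetical and chosen only for illustration. The assumption is that a partition field declaration is accepted either as a plain identifier or as a double-quoted identifier, so a raw Hive column name such as special@col is rejected unless it is quoted first.

```java
import java.util.regex.Pattern;

// Hypothetical sketch, not Trino code: shows why a raw Hive partition column
// name can fail an identifier-based partition field grammar.
final class PartitionFieldSketch {
    // Assumed grammar: an unquoted name is letters, digits, and underscores.
    private static final Pattern UNQUOTED = Pattern.compile("[a-zA-Z_][a-zA-Z0-9_]*");
    // Assumed grammar: a quoted name is any text in double quotes, with "" as an escaped quote.
    private static final Pattern QUOTED = Pattern.compile("\"(?:\"\"|[^\"])+\"");

    private PartitionFieldSketch() {}

    // Returns true if the declaration matches the assumed grammar.
    static boolean isValidFieldDeclaration(String declaration) {
        return UNQUOTED.matcher(declaration).matches()
                || QUOTED.matcher(declaration).matches();
    }

    // Wraps a Hive column name in double quotes when it is not a plain identifier.
    static String quoteIfNeeded(String hiveColumnName) {
        if (UNQUOTED.matcher(hiveColumnName).matches()) {
            return hiveColumnName;
        }
        return "\"" + hiveColumnName.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        for (String column : new String[] {"part_col", "special@col"}) {
            String quoted = quoteIfNeeded(column);
            System.out.println(column + " raw valid: " + isValidFieldDeclaration(column)
                    + ", quoted form: " + quoted
                    + ", quoted valid: " + isValidFieldDeclaration(quoted));
        }
    }
}
```

Under these assumptions, passing the Hive column name through a quoting step before it reaches the partition field parser is one way to avoid the error; whether this PR takes that approach is not stated in the description.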

cla-bot bot commented Feb 21, 2025

Thank you for your pull request and welcome to the Trino community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. Continue to work with us on the review and improvements in this PR, and submit the signed CLA to cla@trino.io. Photos, scans, or digitally-signed PDF files are all suitable. Processing may take a few days. The CLA needs to be on file before we merge your changes. For more information, see https://github.com/trinodb/cla

@github-actions github-actions bot added the iceberg Iceberg connector label Feb 21, 2025
@ebyhr ebyhr self-requested a review February 24, 2025 22:42

@@ -294,20 +294,22 @@ public void testMigratePartitionedTable()
String hiveTableName = "hive.tpch." + tableName;
String icebergTableName = "iceberg.tpch." + tableName;

assertUpdate("CREATE TABLE " + hiveTableName + " WITH (partitioned_by = ARRAY['part_col']) AS SELECT 1 id, 'part1' part_col", 1);
assertUpdate("CREATE TABLE " + hiveTableName + " " +
Member

Could you extract this into a separate test case, e.g. testMigratePartitionedTableWithSpecialCharacter?
Also, could you add a similar test to TestIcebergAddFilesProcedure?

Author

I’ve implemented the suggested changes. Please review again.

2. add testAddFilesSpecialCharPartitionColumnDefinitions

Member

ebyhr commented Feb 26, 2025

@twowind Please ping me once your CLA is registered.

Labels
iceberg Iceberg connector