feat: Added support to write iceberg tables #5989
Conversation
properties.put(CatalogProperties.CATALOG_IMPL, catalog.getClass().getName());
properties.put(CatalogProperties.URI, catalogURI);
properties.put(CatalogProperties.WAREHOUSE_LOCATION, warehouseLocation);

// Following is needed to write new manifest files when writing new data.
// Not setting this will result in using ResolvingFileIO.
properties.put(CatalogProperties.FILE_IO_IMPL, S3FileIO.class.getName());
Why is it a problem to use ResolvingFileIO? You will need to provide HadoopConf info.
So from what I understood, ResolvingFileIO would add an additional step of resolving which file IO to use. And based on the file name, I thought we can be sure here that it's in S3. That's why I thought of using S3FileIO. Does that sound reasonable?
I understand your point; we clearly know that this should resolve to S3FileIO. In all other scenarios, though, we've trusted the Iceberg API to resolve correctly, and I'd be happier to stick with that. I don't feel strongly about this, however.
In this context, I'm happy to have S3FileIO specified. In general, though, I think we are leaning away from providing these "pre-configured" entrypoints for the user and prefer that they go through the generic catalog creation, in which case I would argue that we might want to deprecate IcebergToolsS3.
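For context, a minimal sketch of what the generic catalog-creation path could look like; the catalog implementation, `catalogUri`, and `warehouseLocation` values are placeholders, and `CatalogUtil.buildIcebergCatalog` is Iceberg's generic factory. Everything is driven by the properties map, and `FILE_IO_IMPL` is just one more optional property, with ResolvingFileIO as the fallback when it is omitted:

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.iceberg.CatalogProperties;
import org.apache.iceberg.CatalogUtil;
import org.apache.iceberg.aws.s3.S3FileIO;
import org.apache.iceberg.catalog.Catalog;

final class GenericCatalogSketch {
    // Sketch only: build a catalog purely from a properties map.
    static Catalog buildCatalog(final String catalogUri, final String warehouseLocation) {
        final Map<String, String> properties = new HashMap<>();
        properties.put(CatalogProperties.CATALOG_IMPL, "org.apache.iceberg.rest.RESTCatalog"); // example impl
        properties.put(CatalogProperties.URI, catalogUri);
        properties.put(CatalogProperties.WAREHOUSE_LOCATION, warehouseLocation);
        // Optional: pin the FileIO. Omitting this line leaves Iceberg's ResolvingFileIO in charge.
        properties.put(CatalogProperties.FILE_IO_IMPL, S3FileIO.class.getName());
        return CatalogUtil.buildIcebergCatalog("my-catalog", properties, new Configuration());
    }
}
```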
 * @param dhTable The deephaven table to append
 * @param instructions The instructions for customizations while writing, or null to use default instructions
 */
public void append(
We aren't currently providing a way to add partitioned data to an Iceberg table, but we should create a ticket for this functionality.
Will check with Ryan/Devin whether I should add this as part of this PR itself, or start a separate ticket/PR.
If we decide to do it here, I would need a bit more clarity on the API.
The main difference from non-partitioned writing is providing the set of partition values to which a particular data file will belong. Iceberg provides a few ways to specify this information for a new data file (reference):

- `withPartition`: accepts an `org.apache.iceberg.StructLike` instance on which it can call `get` to access the different partition values.
- `withPartitionPath`: accepts a `String newPartitionPath`, which it splits based on the partition spec and the `=` and `/` characters.
- `withPartitionValues`: accepts a `List<String> partitionValues`.

We would need to decide what to accept from the user and finalize the API; see the sketch below for the `withPartitionPath` flavor.
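To make the options concrete, here is a minimal sketch of `withPartitionPath` against Iceberg's `DataFiles` builder; the spec, path, size, and row count are all placeholders:

```java
import org.apache.iceberg.DataFile;
import org.apache.iceberg.DataFiles;
import org.apache.iceberg.FileFormat;
import org.apache.iceberg.PartitionSpec;

final class PartitionPathSketch {
    // Register a parquet file under the partition "year=2021/month=01" using withPartitionPath.
    static DataFile dataFileFor(final PartitionSpec spec, final String path, final long sizeBytes, final long rows) {
        return DataFiles.builder(spec)
                .withPath(path) // e.g. ".../data/year=2021/month=01/part-00000.parquet"
                .withFormat(FileFormat.PARQUET)
                .withPartitionPath("year=2021/month=01") // split on '=' and '/' against the partition spec
                .withFileSizeInBytes(sizeBytes)
                .withRecordCount(rows)
                .build();
    }
}
```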
Do we support reading partitioned Iceberg?
Yes, we do. cc: @lbooker42
private static void verifyAppendCompatibility(
        final Schema icebergSchema,
        final TableDefinition tableDefinition) {
    // Check that all columns in the table definition are part of the Iceberg schema
Is it an error to write a table with extra columns not in the Iceberg schema? Or should we only write the matching columns?
I have added an Iceberg instruction for verifying compatibility, in case the user wants to verify that the data being appended/overwritten is compatible with the original table.
- For appending, we check that all required columns are present in the data with compatible types, and no extra columns outside of the schema are present.
- For overwriting, we check if the schema is identical.
If the user wants to override these checks, they can disable them through Iceberg instructions.
I can add more details in the comments for the new Iceberg instruction.
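For illustration only (not the PR's actual implementation), the append-side check described above could look roughly like this; `TableDefinition.getColumnNames()` is assumed, and type-compatibility checking is omitted:

```java
import java.util.HashSet;
import java.util.Set;

import org.apache.iceberg.Schema;
import org.apache.iceberg.types.Types;

import io.deephaven.engine.table.TableDefinition;

final class AppendCompatibilitySketch {
    // Required Iceberg columns must be present, and no columns outside the schema are allowed.
    static void verify(final Schema icebergSchema, final TableDefinition tableDefinition) {
        final Set<String> dhColumns = new HashSet<>(tableDefinition.getColumnNames());
        for (final Types.NestedField field : icebergSchema.columns()) {
            if (field.isRequired() && !dhColumns.contains(field.name())) {
                throw new IllegalArgumentException("Missing required column: " + field.name());
            }
            dhColumns.remove(field.name());
        }
        if (!dhColumns.isEmpty()) {
            throw new IllegalArgumentException("Columns not present in the Iceberg schema: " + dhColumns);
        }
    }
}
```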
It's the "and no extra columns outside of the schema are present" test that concerns me. We will be requiring a user to dropColumns() to meet this compatibility metric when I'm not sure it's important at all. "Compatible" is pretty broad in definition (IMO); it should not mean "identical".
I agree that this can be restrictive. That is why I added an optional Iceberg instruction so that the user can disable validation if they are sure about what they are adding.
Let me keep this thread open to see what everyone thinks.
From the perspective of DH controlling the writing, I think we can be opinionated, and prefer to give the user less control than we might otherwise want / need to. I don't think it makes sense to allow the user to specify they want to write out Deephaven columns to the parquet file that aren't mapped to the Iceberg table. By default, it may be appropriate to always use the latest Schema at the time it is being written to, but I think we need to allow the user to pick the Schema they want to use for writing. IMO, the physical parquet columns we write should be a (non-strict) subset of that Schema's columns. If there is a map between a DH column and a Schema column, we write it to parquet; otherwise, we exclude it. This also means that every column we write out in this way has an Iceberg field_id we can map into parquet's field_id.
> IMO, the physical parquet columns we write should be a (non-strict) subset of that Schema's columns.

I have something similar right now, along with an extra check that all the required columns from the schema are present in the tables being appended.
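To make the "non-strict subset" suggestion concrete, a rough sketch of selecting only the Deephaven columns that map into the chosen Schema and carrying each one's Iceberg field id forward; the rename-map shape and `TableDefinition.getColumnNames()` are assumptions here:

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.iceberg.Schema;
import org.apache.iceberg.types.Types;

import io.deephaven.engine.table.TableDefinition;

final class ColumnSelectionSketch {
    // Write the intersection of DH columns and the chosen Schema, keeping each column's
    // Iceberg field id so it can be recorded as the parquet field_id.
    static Map<String, Integer> fieldIdsToWrite(
            final Schema schema,
            final TableDefinition tableDefinition,
            final Map<String, String> icebergToDhRenames) { // Iceberg name -> DH name (assumed shape)
        final Map<String, Integer> fieldIdByDhColumn = new HashMap<>();
        for (final Types.NestedField field : schema.columns()) {
            final String dhName = icebergToDhRenames.getOrDefault(field.name(), field.name());
            if (tableDefinition.getColumnNames().contains(dhName)) {
                fieldIdByDhColumn.put(dhName, field.fieldId());
            }
            // DH columns that do not map into the Schema are simply not written.
        }
        return fieldIdByDhColumn;
    }
}
```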
@@ -33,95 +36,18 @@
 */
public abstract class ParquetInstructions implements ColumnToCodecMappings {

    private static volatile String defaultCompressionCodecName = CompressionCodecName.SNAPPY.toString();
Removing unnecessary configuration parameters.
@@ -433,6 +382,14 @@ public boolean useDictionary() {
    public void useDictionary(final boolean useDictionary) {
        this.useDictionary = useDictionary;
    }

    public OptionalInt getFieldId() {
The field-ID-related logic may change when #6156 gets merged.
/**
 * A {@link Map map} of rename instructions from Iceberg to Deephaven column names to use when reading the Iceberg
 * data files.
 */
public abstract Map<String, String> columnRenames();
/**
 * A one-to-one {@link Map map} from Deephaven to Iceberg column names to use when writing deephaven tables to
 * Iceberg tables.
 */
public abstract Map<String, String> dhToIcebergColumnRenames();
Ditto, this is also tied w/ a specific Schema. From a configuration point, I see the ease-of-use for using strings, but I wonder if we should have Map<NestedField, String>, or Map<Integer, String> + Schema (the user can still use strings, but we materialize into this). I also suggest we model it with the Iceberg data as the key, since it's the target system that is dictating uniqueness in this case (otherwise, we need to add a check that there are no duplicate values). Technically, we could support writing a single DH column to multiple Iceberg columns, although I don't see a reason to offer that out of the gate without a clear use case.
I like this idea, although I would prefer to do this change as part of a separate PR covering both the reading and writing sides together. For now, I have added a check that there are no duplicate values.
I can also link these comments in the issue #6124.
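A small sketch of the "materialize into field ids" idea, assuming the user still supplies strings keyed on the Iceberg side:

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.iceberg.Schema;
import org.apache.iceberg.types.Types;

final class RenameMaterializationSketch {
    // Resolve user-supplied (Iceberg name -> DH name) renames into a field-id-keyed map.
    static Map<Integer, String> materialize(final Schema schema, final Map<String, String> icebergToDhRenames) {
        final Map<Integer, String> byFieldId = new HashMap<>();
        for (final Map.Entry<String, String> entry : icebergToDhRenames.entrySet()) {
            final Types.NestedField field = schema.findField(entry.getKey());
            if (field == null) {
                throw new IllegalArgumentException("Unknown Iceberg column: " + entry.getKey());
            }
            byFieldId.put(field.fieldId(), entry.getValue());
        }
        return byFieldId;
    }
}
```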
 * The inverse map of {@link #dhToIcebergColumnRenames()}.
 */
@Value.Lazy
public Map<String, String> icebergToDhColumnRenames() {
I see you do have the inverse, but I think this should be the source of the data, and not a view of it. I think we should also consider if we even want to provide as a public helper, or if it should be only for internal use.
Same response as #5989 (comment)
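For reference, the uniqueness check plus inverse view can be as small as this sketch (names are illustrative, not the PR's code):

```java
import java.util.HashMap;
import java.util.Map;

final class RenameInversionSketch {
    // Derive the Iceberg -> DH view while enforcing that the forward map is one-to-one.
    static Map<String, String> invert(final Map<String, String> dhToIceberg) {
        final Map<String, String> inverse = new HashMap<>(dhToIceberg.size());
        dhToIceberg.forEach((dhName, icebergName) -> {
            if (inverse.putIfAbsent(icebergName, dhName) != null) {
                throw new IllegalArgumentException(
                        "Multiple Deephaven columns map to the same Iceberg column: " + icebergName);
            }
        });
        return inverse;
    }
}
```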
 * A one-to-one {@link Map map} from Deephaven to Iceberg column names to use when writing deephaven tables to
 * Iceberg tables.
 */
// TODO Please suggest better name for this method, on the read side its just called columnRenames
Pending TODO
Partial review of IcebergCatalogAdapter. Looking pretty good so far!
public void overwrite(
        @NotNull final TableIdentifier tableIdentifier,
        @NotNull final Table[] dhTables,
        @Nullable final IcebergWriteInstructions instructions) {
    writeImpl(tableIdentifier, dhTables, instructions, true, true);
}
So, basically, you're suggesting that the API revolve around immutable parameter structs with builders? It's probably marginally more annoying to users in the console, but it lets us reduce overload spam.
We could also just get out of the business of taking POJO table identifiers (or Strings; standardize on one or the other), cutting overloads by half. (Shivam points out that this will be automatically addressed by interposing the TableAdapter layer.)
Table args can just be a varargs list at the end of the method, cutting overloads again by half.
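A hypothetical shape (not the PR's current API) of how the single-identifier plus varargs form could read; `writeImpl` and `IcebergWriteInstructions` come from this PR, and `TableIdentifier.parse` is Iceberg's string parser:

```java
// One string identifier, an immutable instructions struct, and trailing varargs tables.
public void overwrite(
        @NotNull final String tableIdentifier,
        @Nullable final IcebergWriteInstructions instructions,
        @NotNull final Table... dhTables) {
    writeImpl(TableIdentifier.parse(tableIdentifier), dhTables, instructions, true, true);
}
```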
newNamespaceCreated = createNamespaceIfNotExists(tableIdentifier.namespace());
newSpecAndSchema = createSpecAndSchema(useDefinition, writeInstructions);
icebergTable = createNewIcebergTable(tableIdentifier, newSpecAndSchema, writeInstructions);
Can this stuff be done as a transaction? Possibly with removing/adding the data files, as well?
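For reference, a sketch (not the PR's implementation) of how Iceberg's Transaction API could stage table creation and the first append so they commit atomically; the schema, spec, and data-file arguments are assumed to be whatever createSpecAndSchema and the parquet-writing step produce, and the Iceberg imports are omitted for brevity:

```java
private static void createTableAndAppendAtomically(
        final Catalog catalog,
        final TableIdentifier tableIdentifier,
        final Schema schema,
        final PartitionSpec partitionSpec,
        final List<DataFile> dataFiles) {
    final Transaction txn = catalog.newCreateTableTransaction(tableIdentifier, schema, partitionSpec);
    final AppendFiles append = txn.newAppend();
    dataFiles.forEach(append::appendFile);
    append.commit();          // stages the append inside the transaction
    txn.commitTransaction();  // table and first snapshot become visible together
}
```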
// Write the data to parquet files
int count = 0;
for (final Table dhTable : dhTables) {
    final String filename = String.format(
I thought names came from the catalog. Table locations or whatever.
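If we do want the table to dictate locations, a sketch along these lines uses the table's LocationProvider instead of hand-built paths (the filename pattern itself is illustrative):

```java
// Fully-qualified names used to keep the fragment self-contained.
final org.apache.iceberg.io.LocationProvider locations = icebergTable.locationProvider();
final String filename = String.format("%s-%05d.parquet", java.util.UUID.randomUUID(), count);
final String dataLocation = locations.newDataLocation(filename);
```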
/**
 * Commit the changes to the Iceberg table by creating snapshots.
 */
private static void commit(
Placeholder for @rcaudy
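As a reference point for that discussion, the minimal commit path is a single append operation per snapshot; an overwrite flavor would start from `icebergTable.newOverwrite()` instead, and `dataFiles` is assumed to have been collected while writing the parquet files:

```java
final org.apache.iceberg.AppendFiles append = icebergTable.newAppend();
for (final org.apache.iceberg.DataFile dataFile : dataFiles) {
    append.appendFile(dataFile);
}
append.commit(); // one new snapshot containing all of the newly written files
```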
maximum_dictionary_size: Optional[int] = None,
target_page_size: Optional[int] = None,
verify_schema: Optional[bool] = None,
dh_to_iceberg_column_renames: Optional[Dict[str, str]] = None,
name is very long, especially if a user is specifying it. Any reason it can't just be column_renames?
you should also look through the rest of the API to see if column_renames or col_renames would be most consistent. I would guess col_renames.
if compression_codec_name is not None:
    builder.compressionCodecName(compression_codec_name)

if maximum_dictionary_keys is not None:
    builder.maximumDictionaryKeys(maximum_dictionary_keys)

if maximum_dictionary_size is not None:
    builder.maximumDictionarySize(maximum_dictionary_size)

if target_page_size is not None:
    builder.targetPageSize(target_page_size)

if verify_schema is not None:
    builder.verifySchema(verify_schema)

if dh_to_iceberg_column_renames is not None:
    for dh_name, iceberg_name in dh_to_iceberg_column_renames.items():
        builder.putDhToIcebergColumnRenames(dh_name, iceberg_name)

if table_definition is not None:
    builder.tableDefinition(TableDefinition(table_definition).j_table_definition)

if data_instructions is not None:
    builder.dataInstructions(data_instructions.j_object)
I suspect all of these cases can have is not None removed. Confirm with @jmao-denver on what he wants to see.
        tables: List[Table],
        partition_paths: Optional[List[str]] = None,
        instructions: Optional[IcebergParquetWriteInstructions] = None):
    # TODO Review javadoc in this file once again
todo
        table_identifier: str,
        tables: List[Table],
        partition_paths: Optional[List[str]] = None,
        instructions: Optional[IcebergParquetWriteInstructions] = None):
missing a return type hint
        instructions: Optional[IcebergParquetWriteInstructions] = None):
    # TODO Review javadoc in this file once again
    """
    Append the provided Deephaven table as a new partition to the existing Iceberg table in a single snapshot. This
this says "table" and "partition", but the input is a list of tables. Does that mean multiple tables go to one partition or multiple partitions? etc.
        tables: List[Table],
        partition_paths: Optional[List[str]] = None,
see other comments
        table_identifier: str,
        tables: List[Table],
        partition_paths: Optional[List[str]] = None,
        instructions: Optional[IcebergParquetWriteInstructions] = None):
missing a return type hint
    of data files that were written. Users can use this list to create a transaction/snapshot if needed.

    Args:
        table_identifier (str): the identifier string for iceberg table to write to.
grammar
        tables (List[Table]): the tables to write.
        partition_paths (Optional[List[str]]): the partitioning path at which data would be written, for example,
            "year=2021/month=01". If omitted, we will try to write data to the table without partitioning.
see other comments
        partition_paths (Optional[List[str]]): the partitioning path at which data would be written, for example,
            "year=2021/month=01". If omitted, we will try to write data to the table without partitioning.
        instructions (Optional[IcebergParquetWriteInstructions]): the instructions for customizations while writing.
    """
All above cases that are missing the return type hint are also missing docs on the return value
Closes: #6125
Should be merged after #6156, #6268
Also moves existing Iceberg tests from JUnit 4 to JUnit 5.