What went wrong?
After a few tests with types and SQL statements for #211, we found that INSERT INTO was not behaving as expected: it either fails to write the data or does not find the expected columns of the schema.
After some analysis of the way Delta Lake handles INSERT INTO, I've found an explanation: the command does not load the schema of the existing table. It tries to write the data as it arrives, and since the incoming relation carries no column names, Spark generates them automatically: col1, col2, col3.
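This is plain Spark behavior, independent of Delta: an inline VALUES relation has no user-supplied column names, so the analyzer invents them. A minimal sketch showing where the names in the error below come from:

```scala
import org.apache.spark.sql.SparkSession

object ValuesNamingDemo extends App {
  val spark = SparkSession.builder().master("local[*]").getOrCreate()

  // The inline table carries no column names, so Spark auto-generates
  // col1, col2, ... for its output attributes:
  spark.sql("SELECT * FROM VALUES (1, 'Alice', 9.5D)").printSchema()
  // root
  //  |-- col1: integer (nullable = false)
  //  |-- col2: string (nullable = false)
  //  |-- col3: double (nullable = false)

  spark.stop()
}
```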
According to a code comment in the Delta Lake project:
```scala
/**
 * With Delta, we ACCEPT_ANY_SCHEMA, meaning that Spark doesn't automatically adjust the schema
 * of INSERT INTO. Here we check if we need to perform any schema adjustment for INSERT INTO by
 * name queries. We also check that any columns not in the list of user-specified columns must
 * have a default expression.
 */
```
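For context, ACCEPT_ANY_SCHEMA is a DataSource V2 TableCapability: a table that reports it tells Spark's analyzer to skip its own output-column resolution for INSERT INTO, so the connector has to reconcile the incoming data with the table schema itself. A minimal sketch of how a table advertises the capability (the trait name is hypothetical):

```scala
import java.util
import org.apache.spark.sql.connector.catalog.{SupportsWrite, TableCapability}

// Hypothetical mix-in: a table with these capabilities receives the
// incoming data as-is, so schema reconciliation becomes the connector's
// job (which is what DeltaAnalysis does for Delta).
trait AcceptsAnySchema extends SupportsWrite {
  override def capabilities(): util.Set[TableCapability] =
    util.EnumSet.of(
      TableCapability.BATCH_WRITE,
      TableCapability.ACCEPT_ANY_SCHEMA)
}
```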
A solution would be to delegate to the existing code in DeltaAnalysis rather than duplicating its behavior. The methods that reconstruct and check the schema are complex, so I encourage us not to develop the same solution ourselves; but if there is no easy way to delegate, reimplementing it could be a fallback.
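A minimal sketch of what delegating could look like, assuming the Delta version on the classpath exposes org.apache.spark.sql.delta.DeltaAnalysis as a Rule[LogicalPlan] built from a single SparkSession (the extension class name is hypothetical):

```scala
import org.apache.spark.sql.SparkSessionExtensions
import org.apache.spark.sql.delta.DeltaAnalysis

// Hypothetical session extension for our connector: inject Delta's own
// resolution rule so INSERT INTO (by position and by name) gets the same
// schema adjustment Delta performs, instead of reimplementing it here.
class ConnectorSessionExtension extends (SparkSessionExtensions => Unit) {
  override def apply(extensions: SparkSessionExtensions): Unit = {
    extensions.injectResolutionRule(session => new DeltaAnalysis(session))
  }
}
```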
How to reproduce?
1. Code that triggered the bug, or steps to reproduce:
The test throws the following error (a hypothetical reconstruction of the triggering statement is sketched after this list):
c1 does not exist. Available: col1, col2, col3
2. Branch and commit id:
main at commit f9c7ab0
3. Spark version:
3.2.1
4. Hadoop version:
3.4.0
5. How are you running Spark?
Running Spark on a local machine.
6. Stack trace:
Described in 1.
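For reference, a hypothetical reconstruction of the statement shape that triggers the error in 1. (the table name, column types, and format identifier are assumptions; only the error message comes from the actual test):

```scala
import org.apache.spark.sql.SparkSession

object InsertIntoRepro extends App {
  val spark = SparkSession.builder().master("local[*]").getOrCreate()

  // "fmt" stands in for this project's data source short name.
  spark.sql("CREATE TABLE demo (c1 INT, c2 STRING, c3 DOUBLE) USING fmt")

  // The VALUES relation arrives named col1/col2/col3; without the schema
  // adjustment DeltaAnalysis performs, resolving c1 against it fails:
  spark.sql("INSERT INTO demo VALUES (1, 'a', 1.0D)")
  // => AnalysisException: c1 does not exist. Available: col1, col2, col3
}
```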