[SPARK-24543][SQL] Support any type as DDL string for from_json's schema #21550
Yea, I like this way. Did it solve your case too?
It doesn't solve our case fully but at least it unblocks us.
Test build #91759 has finished for PR 21550 at commit
retest this please
Test build #91771 has finished for PR 21550 at commit
@@ -110,6 +111,8 @@ abstract class DataType extends AbstractDataType {
 @InterfaceStability.Stable
 object DataType {

+  def fromDDL(ddl: String): DataType = CatalystSqlParser.parseDataType(ddl)
I think it's reasonable for `DataType.fromDDL` to also support table style schema like `a int, b long`. How about we put the try catch here and other places just need to call `DataType.fromDDL`?
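The suggestion above could be sketched roughly as follows. This is only an illustration of the fallback idea, not necessarily the code that was merged: `DdlFallbackSketch` and `parseDdl` are hypothetical names, and the sketch assumes `spark-catalyst` is on the classpath. It first tries to parse the string as a single data type and, if that fails, falls back to parsing it as a table-style column list.

```scala
import scala.util.control.NonFatal

import org.apache.spark.sql.catalyst.parser.CatalystSqlParser
import org.apache.spark.sql.types._

object DdlFallbackSketch {
  // Hypothetical helper (not part of Spark): try the string as a single
  // data type first, then fall back to a table-style schema such as
  // "a int, b long".
  def parseDdl(ddl: String): DataType =
    try {
      CatalystSqlParser.parseDataType(ddl)
    } catch {
      case NonFatal(_) => CatalystSqlParser.parseTableSchema(ddl)
    }

  def main(args: Array[String]): Unit = {
    // A plain data type string is handled by parseDataType.
    assert(parseDdl("map<string, int>") == MapType(StringType, IntegerType))

    // A column list is not a valid single data type, so the fallback
    // to parseTableSchema produces a StructType.
    val expected = new StructType().add("a", IntegerType).add("b", LongType)
    assert(parseDdl("a int, b long") == expected)
  }
}
```

With the fallback in one place, callers such as `from_json` would only need a single entry point for both kinds of schema strings.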
Test build #91832 has finished for PR 21550 at commit
retest this please
Test build #91844 has finished for PR 21550 at commit
thanks, merging to master!
@@ -354,8 +354,8 @@ class JsonFunctionsSuite extends QueryTest with SharedSQLContext {

   test("SPARK-24027: from_json - map<string, map<string, int>>") {
     val in = Seq("""{"a": {"b": 1}}""").toDS()
-    val schema = MapType(StringType, MapType(StringType, IntegerType))
-    val out = in.select(from_json($"value", schema))
+    val schema = "map<string, map<string, int>>"
A general suggestion. Create a new test case for these changes, instead of modifying the existing ones.
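One way to follow this suggestion is a separate check that leaves the `DataType`-based test untouched and covers the DDL-string form on its own. The sketch below is a standalone program rather than a `JsonFunctionsSuite` test case: `FromJsonDdlCheck` is a hypothetical name, the local `SparkSession` setup is an assumption, and it needs a Spark build that already contains this change.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.from_json
import org.apache.spark.sql.types.{IntegerType, MapType, StringType}

object FromJsonDdlCheck {
  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local session stands in for SharedSQLContext.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("from_json-ddl-check")
      .getOrCreate()
    import spark.implicits._

    try {
      val in = Seq("""{"a": {"b": 1}}""").toDS()

      // Existing style: schema passed as a DataType object.
      val typed = in.select(
        from_json($"value", MapType(StringType, MapType(StringType, IntegerType))))

      // New style: the same schema passed as a DDL string.
      val ddl = in.select(
        from_json($"value", "map<string, map<string, int>>", Map.empty[String, String]))

      // Both forms should parse the JSON to the same nested map.
      val parsed = ddl.head().getMap[String, scala.collection.Map[String, Int]](0)
      assert(parsed == Map("a" -> Map("b" -> 1)))
      assert(typed.collect().toSeq == ddl.collect().toSeq)
    } finally {
      spark.stop()
    }
  }
}
```

Inside the real suite the same body would presumably sit in its own `test(...)` block and use `checkAnswer` instead of bare assertions.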
## What changes were proposed in this pull request?

In the PR, I propose to support any DataType represented as a DDL string for the from_json function. After the changes, it will be possible to specify `MapType` in SQL like:

```sql
select from_json('{"a":1, "b":2}', 'map<string, int>')
```

and in Scala (similar in other languages):

```scala
val in = Seq("""{"a": {"b": 1}}""").toDS()
val schema = "map<string, map<string, int>>"
val out = in.select(from_json($"value", schema, Map.empty[String, String]))
```

## How was this patch tested?

Added a couple of SQL tests and modified existing tests for Python and Scala. The former tests were modified because it is not important for them in which format the schema for `from_json` is provided.

Author: Maxim Gekk <maxim.gekk@databricks.com>

Closes apache#21550 from MaxGekk/from_json-ddl-schema.