[SPARK-28812][SQL][DOC] Document SHOW PARTITIONS in SQL Reference #26635

dilipbiswal · 2019-11-22T07:08:26Z

What changes were proposed in this pull request?

Document SHOW PARTITIONS statement in SQL Reference Guide.

Why are the changes needed?

Currently, Spark lacks documentation on the supported SQL constructs, causing
confusion among users who sometimes have to look at the code to understand the
usage. This PR aims to address that issue.

Does this PR introduce any user-facing change?

Yes.

Before
After

SparkQA · 2019-11-22T07:20:30Z

Test build #114279 has finished for PR 26635 at commit dab073a.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

dilipbiswal · 2019-11-22T07:35:41Z

cc @srowen

huaxingao · 2019-11-22T16:12:47Z

docs/sql-ref-syntax-aux-show-partitions.md

+  <dt><code><em>partition_spec</em></code></dt>
+  <dd>
+    An optional parameter that specifies a comma separated list of key and value pairs
+    for partitions. When specified, the partitions that matches the partition spec


Nit: match?
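
As a rough illustration of what the quoted parameter description is getting at, the sketch below filters Hive-style partition strings (the `key=value/key=value` form) by a partial partition spec. The function names and the sample partitions are hypothetical and exist only for this illustration; this is plain Python, not Spark's implementation.

```python
from typing import Dict, Iterable, List, Mapping


def parse_partition(partition: str) -> Dict[str, str]:
    """Split 'state=CA/city=Fremont' into {'state': 'CA', 'city': 'Fremont'}."""
    return dict(part.split("=", 1) for part in partition.split("/"))


def matching_partitions(
    partitions: Iterable[str], partition_spec: Mapping[str, object]
) -> List[str]:
    """Keep the partitions whose values agree with every key in partition_spec."""
    wanted = {key: str(value) for key, value in partition_spec.items()}
    return [
        partition
        for partition in partitions
        if all(
            parse_partition(partition).get(key) == value
            for key, value in wanted.items()
        )
    ]


# Hypothetical partitions of a table partitioned by (state, city).
partitions = [
    "state=AZ/city=Peoria",
    "state=CA/city=Fremont",
    "state=CA/city=San Jose",
]

# A partial spec names only some partition columns.
print(matching_partitions(partitions, {"state": "CA"}))
# An empty spec places no constraint, so every partition is kept.
print(matching_partitions(partitions, {}))
```

A spec that names only `state` keeps every partition under that state, which is why the wording "partitions that match" (plural) fits the description.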

huaxingao · 2019-11-22T16:12:59Z

docs/sql-ref-syntax-aux-show-partitions.md

+  </dd>
+</dl>
+
+### Example


Nit: Examples?

SparkQA · 2019-11-22T19:22:31Z

Test build #114309 has finished for PR 26635 at commit 7e094c3.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

dongjoon-hyun · 2019-11-24T00:21:22Z

docs/sql-ref-syntax-aux-show-partitions.md

+
+### Parameters
+<dl>
+  <dt><code><em>table_identifier</em></code></dt>


Shall we use table_name instead of table_identifier because it's used more?

$ git grep 'table_name'
sql-ref-syntax-aux-analyze-table.md: <dt><code>table_name</code></dt>
sql-ref-syntax-aux-cache-cache-table.md: <dt><code>table_name</code></dt>
sql-ref-syntax-aux-cache-uncache-table.md: <dt><code>table_name</code></dt>
sql-ref-syntax-ddl-alter-table.md: <dt><code>table_name</code></dt>
sql-ref-syntax-ddl-alter-table.md: <dt><code>table_name</code></dt>
sql-ref-syntax-ddl-drop-table.md: <dt><code>table_name</code></dt>
sql-ref-syntax-ddl-repair-table.md: <dt><code>table_name</code></dt>
sql-ref-syntax-ddl-truncate-table.md: <dt><code>table_name</code></dt>
sql-ref-syntax-dml-insert-into.md: <dt><code>table_name</code></dt>
sql-ref-syntax-dml-insert-overwrite-table.md: <dt><code>table_name</code></dt>
sql-ref-syntax-dml-load.md: <dt><code>table_name</code></dt>

$ git grep 'table_identifier'
sql-ref-syntax-aux-describe-table.md: <dt><code>table_identifier</code></dt>
sql-ref-syntax-aux-show-tblproperties.md: <dt><code>table_identifier</code></dt>

Also, we used table_name at line 39 of this PR.
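
To make the "used more" comparison concrete, here is a small sketch that tallies the parameter names in the `git grep` output quoted in this comment. The lines are copied from that output; the function name `tally_parameter_names` is made up for this illustration.

```python
import re
from collections import Counter

# Matched lines copied from the `git grep` output quoted in the comment.
GREP_OUTPUT = """\
sql-ref-syntax-aux-analyze-table.md: <dt><code>table_name</code></dt>
sql-ref-syntax-aux-cache-cache-table.md: <dt><code>table_name</code></dt>
sql-ref-syntax-aux-cache-uncache-table.md: <dt><code>table_name</code></dt>
sql-ref-syntax-ddl-alter-table.md: <dt><code>table_name</code></dt>
sql-ref-syntax-ddl-alter-table.md: <dt><code>table_name</code></dt>
sql-ref-syntax-ddl-drop-table.md: <dt><code>table_name</code></dt>
sql-ref-syntax-ddl-repair-table.md: <dt><code>table_name</code></dt>
sql-ref-syntax-ddl-truncate-table.md: <dt><code>table_name</code></dt>
sql-ref-syntax-dml-insert-into.md: <dt><code>table_name</code></dt>
sql-ref-syntax-dml-insert-overwrite-table.md: <dt><code>table_name</code></dt>
sql-ref-syntax-dml-load.md: <dt><code>table_name</code></dt>
sql-ref-syntax-aux-describe-table.md: <dt><code>table_identifier</code></dt>
sql-ref-syntax-aux-show-tblproperties.md: <dt><code>table_identifier</code></dt>
"""


def tally_parameter_names(grep_output: str) -> Counter:
    """Count how many matched lines use each <dt><code>NAME</code></dt> term."""
    pattern = re.compile(r"<dt><code>(\w+)</code></dt>")
    counts: Counter = Counter()
    for line in grep_output.splitlines():
        match = pattern.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


counts = tally_parameter_names(GREP_OUTPUT)
for name, hits in counts.most_common():
    print(f"{name}: {hits}")
```

On the quoted output this gives 11 hits for `table_name` against 2 for `table_identifier`.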

@dongjoon-hyun For consistency, we can. But personally, I would like to tell users that this command accepts a table identifier, i.e., that the table name can be qualified.

For example, see the following link:
https://docs.databricks.com/spark/latest/spark-sql/language-manual/cache-table.html

The syntax there mentions that the table can be qualified. However, in our doc it's not so clear. That's the reason I chose to define "table_identifier" as a parameter with a sub-syntax that explains it further, so there is no ambiguity. Another option is to put [db_name].table_name in the main syntax block itself. My intention in using a parameter in the main syntax block was to keep the core syntax easy to read.

SHOW PARTITIONS [db_name].table_name [ PARTITION ( partition_col_name = partition_col_val [ , ... ] ) ]

Please let me know what you think, and I will change it.
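
For readers who want to see how the pieces of that syntax line fit together, here is a minimal sketch that renders a statement from an optional database name, a table name, and an optional partition spec. The helper names (`show_partitions_sql`, `render_literal`) and the sample table are hypothetical, and the literal quoting is deliberately simplistic; it is an illustration of the syntax, not part of Spark.

```python
from typing import Mapping, Optional


def render_literal(value: object) -> str:
    """Render a partition value: numbers as-is, everything else single-quoted."""
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return str(value)
    text = str(value)
    # Assumption: escaping is out of scope here, so awkward values are rejected.
    if "'" in text or "\\" in text:
        raise ValueError("this sketch does not handle quotes or backslashes")
    return f"'{text}'"


def show_partitions_sql(
    table_name: str,
    db_name: Optional[str] = None,
    partition_spec: Optional[Mapping[str, object]] = None,
) -> str:
    """Build a SHOW PARTITIONS statement with an optional qualifier and spec."""
    qualified = f"{db_name}.{table_name}" if db_name else table_name
    sql = f"SHOW PARTITIONS {qualified}"
    if partition_spec:
        pairs = ", ".join(
            f"{column} = {render_literal(value)}"
            for column, value in partition_spec.items()
        )
        sql += f" PARTITION ({pairs})"
    return sql


# Hypothetical table and database names, used only for illustration.
print(show_partitions_sql("customer"))
print(show_partitions_sql("customer", db_name="salesdb"))
print(show_partitions_sql("customer", "salesdb", {"state": "CA", "city": "Fremont"}))
```

The same function covers both the unqualified and the qualified form, which is the distinction the `table_identifier` parameter was meant to make explicit.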

If you think so, you are trying to do two things in this single PR.

You had better follow the existing table_name in this PR first. Then you can create another PR to adjust all of them across the docs. In general, we haven't yet agreed on whether your suggestion is good or bad. In the worst case, your suggestion could be rejected, or we may choose a third option.

@dongjoon-hyun Sure. I have made the change for this PR.

SparkQA · 2019-11-24T02:58:30Z

Test build #114332 has finished for PR 26635 at commit 6c95a38.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

dongjoon-hyun

+1, LGTM. Merged to master. Thank you, @dilipbiswal.
I generated the doc locally and verified.

dilipbiswal · 2019-11-24T04:11:07Z

@dongjoon-hyun Thank you very much.

May I please ask you to give some thought to how we should standardize the way we specify table identifiers and partition specifications in the SQL reference doc?

For table identifiers, I see a few variations used in our docs.

1. [db_name].table_name in the main syntax block.
2. table_name in the main syntax block and a parameter table_name. In this case, we don't seem to specify that it is a table identifier or give the syntax for specifying it.
3. table_identifier in the main syntax block and then a sub-syntax in the parameter definition.
4. tableIdentifier in the main syntax block and then a sub-syntax in the parameter definition.

We can pick one of the 4 above or pick a new way, as you mentioned.

For the partitioning spec, we have used the following variations.

1. [ PARTITION ( partition_col_name = partition_col_val [ , ... ] ) ] in the main syntax block, with the same defined as a parameter as well.
2. [PARTITION(partition_spec)] in the main syntax block, with the same defined as a parameter. However, we have not specified the syntax of the partitioning spec.
3. PARTITION partition_spec in the main syntax block, with "partition_spec" defined as a parameter. In this case also, we have not specified the syntax of the partitioning spec.
4. [PARTITION partition_spec] in the main syntax block and PARTITION ( partition_spec :[ partition_column = partition_col_value, partition_column = partition_col_value, ...] ) in the parameter.
5. partition_spec in the main syntax block, and in the parameter section a sub-syntax that defines the syntax of partitioning.

Again, we can pick one of the 5 above or pick another option.

cc @srowen @huaxingao

srowen · 2019-11-24T14:05:56Z

@dilipbiswal Standardization would be welcome. I know the new docs were written by 4-5 different people and I'm sure they're not entirely consistent. I don't have a strong opinion on the option, but consistency is best. Go with whatever is most prevalent in the docs already? Either way, any consistency is an improvement.

dilipbiswal · 2019-11-30T02:38:42Z

@srowen I had an offline discussion with @huaxingao and we have agreed on a format. She has graciously agreed to fix the inconsistencies in the parameter. Thank you @huaxingao.

[SPARK-28812][SQL][DOC] Document SHOW PARTITIONS in SQL Reference

dab073a

konjac reviewed Nov 22, 2019

huaxingao reviewed Nov 22, 2019

dongjoon-hyun added DOCUMENTATION SQL labels Nov 22, 2019

Code review

7e094c3

dongjoon-hyun reviewed Nov 24, 2019

code review

6c95a38

dongjoon-hyun approved these changes Nov 24, 2019

dongjoon-hyun closed this in 564826d Nov 24, 2019
