[SPARK-2963] [SQL] There no documentation about building to use HiveServer and CLI for SparkSQL #1885

Closed
sarutak wants to merge 4 commits from sarutak/SPARK-2963

Conversation

sarutak
Member

@sarutak sarutak commented Aug 11, 2014

No description provided.

@sarutak sarutak changed the title [SPARK-2963] [SPARK-2963] There no documentation about building to use HiveServer and CLI for SparkSQL Aug 11, 2014
@sarutak sarutak changed the title [SPARK-2963] There no documentation about building to use HiveServer and CLI for SparkSQL [SPARK-2963] [SQL] There no documentation about building to use HiveServer and CLI for SparkSQL Aug 11, 2014
@SparkQA

SparkQA commented Aug 11, 2014

QA tests have started for PR 1885. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18301/consoleFull

@tianyi
Contributor

tianyi commented Aug 11, 2014

I think the build details have already been added in https://github.com/apache/spark/blob/master/docs/sql-programming-guide.md

@sarutak
Member Author

sarutak commented Aug 11, 2014

Hi @tianyi.
I know that document, but it is outdated. The -Phive option alone is no longer enough to use the Thrift server. In the master branch, that code moved into a hive-thriftserver project, so we should use the -Phive-thriftserver option.

In addition, that document is aimed at programmers. The people who build Spark are not always programmers; they are often system administrators. So how to build should be described in the README, just like the instructions for building with YARN.

@SparkQA

SparkQA commented Aug 11, 2014

QA results for PR 1885:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18301/consoleFull

@tianyi
Contributor

tianyi commented Aug 11, 2014

@sarutak The latest sql-programming-guide.md already includes the "-Phive-thriftserver" option.

@sarutak
Member Author

sarutak commented Aug 11, 2014

Oh, master has been updated.
But, as I mentioned, it's not friendly for builders.

@liancheng
Contributor

@sarutak Just checked the SQL programming guide in master, as @tianyi said, Thrift server and CLI related contents were both added. But we didn't mention sbt -Phive-thriftserver assembly is needed to enable Spark SQL CLI. Please go ahead if you'd like to complete that part, thanks! And what do you mean when you say "not friendly for builders"?

@sarutak
Member Author

sarutak commented Aug 11, 2014

@liancheng Thanks for your reply!
By "not friendly" I mean that the note saying "-Phive-thriftserver is needed" is not in a good place.
Currently, it appears in the Programmer's guide, but the person who builds Spark is not always a programmer. Sometimes that person may be a system administrator.
I don't think a non-programmer reads the Programmer's guide before the main README.

Actually, how to build with YARN is described in the main README.

@liancheng
Contributor

Ah OK, fair enough. I agree with you.

@liancheng
Contributor

Then I guess we should add -Phive and -Phive-thriftserver related instructions to both our main README file and building-with-maven.md.
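
For reference, here is a minimal sketch of what such build instructions might boil down to. The `-Phive` and `-Phive-thriftserver` profile names and the sbt `assembly` target come from this thread; the `-DskipTests clean package` Maven goals are an assumption, and `spark_build_cmd` is a hypothetical helper that only prints a command and never starts a build.

```shell
# Hypothetical helper: prints a build command, does not run anything.
# Usage: spark_build_cmd <maven|sbt> [thriftserver]
spark_build_cmd() {
  tool="$1"
  # -Phive enables Hive support in Spark SQL.
  profiles="-Phive"
  # The Thrift JDBC server and the CLI need the extra profile discussed above.
  if [ "$2" = "thriftserver" ]; then
    profiles="$profiles -Phive-thriftserver"
  fi
  case "$tool" in
    maven) echo "mvn $profiles -DskipTests clean package" ;;  # goals are an assumption
    sbt)   echo "sbt $profiles assembly" ;;
    *)     echo "unknown build tool: $tool" >&2; return 1 ;;
  esac
}

spark_build_cmd maven thriftserver
spark_build_cmd sbt thriftserver
```

Printing the command instead of running it keeps the sketch safe to paste; the exact flags for a given Spark version should be checked against that version's build documentation.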

@sarutak
Member Author

sarutak commented Aug 11, 2014

Ah, you're right.
Now I've modified building-with-maven.md. Thanks!

@SparkQA

SparkQA commented Aug 11, 2014

QA tests have started for PR 1885. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18311/consoleFull

@@ -115,6 +115,15 @@ If your project is built with Maven, add this to your POM file's `<dependencies>
</dependency>


## A Note About HiveServer and CLI for SparkSQL
Contributor

It should be "Hive Thrift server", which is compatible with HiveServer2, rather than HiveServer. And usually we use "Spark SQL" instead of "SparkSQL".

Member Author

In sql-programming-guide.md, it's called "Thrift JDBC server".
Should I use that name?
The latest revision of this PR uses "Thrift JDBC server".

@liancheng
Contributor

Since I'm not a native speaker either, we'd better wait for a native speaker to have a look. Otherwise LGTM, thanks!

@SparkQA

SparkQA commented Aug 11, 2014

QA results for PR 1885:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18311/consoleFull

@SparkQA

SparkQA commented Aug 11, 2014

QA tests have started for PR 1885. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18314/consoleFull

@SparkQA

SparkQA commented Aug 11, 2014

QA results for PR 1885:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18314/consoleFull

@marmbrus
Contributor

Thanks for adding this! One suggestion: instead of just adding documentation about the Thrift server, we should probably make these general sections about Spark SQL's Hive compatibility and cover both -Phive and -Phive-thriftserver. It doesn't look to me like that is covered in these documents, correct?

@sarutak
Member Author

sarutak commented Aug 12, 2014

Thanks @marmbrus!
The main issue I raised in this ticket is that how to build for the CLI / Thrift JDBC server is not documented in the proper place.

As you said, there is currently no document covering Spark SQL's Hive compatibility.
We should improve Spark SQL's documentation step by step.

First of all, we should add build instructions to the existing documents that describe building, so that every user can use the CLI / Thrift JDBC server. If users don't know how to build for the CLI / Thrift JDBC server, they cannot use them...

As the next step, we should write comprehensive documentation for Spark SQL, not only a programmer's guide.

@marmbrus
Contributor

Thanks, merged to master and 1.1

@asfgit asfgit closed this in 869f06c Aug 13, 2014
asfgit pushed a commit that referenced this pull request Aug 13, 2014
…erver and CLI for SparkSQL

Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>

Closes #1885 from sarutak/SPARK-2963 and squashes the following commits:

ed53329 [Kousuke Saruta] Modified description and notaton of proper noun
07c59fc [Kousuke Saruta] Added a description about how to build to use HiveServer and CLI for SparkSQL to building-with-maven.md
6e6645a [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-2963
c88fa93 [Kousuke Saruta] Added a description about building to use HiveServer and CLI for SparkSQL

(cherry picked from commit 869f06c)
Signed-off-by: Michael Armbrust <michael@databricks.com>
asfgit pushed a commit that referenced this pull request Aug 23, 2014
…g CLI and Thrift JDBC server is absent in proper document -

The most important points I made in #1885 are as follows.

* The people who build Spark are not always programmers.
* If the person building Spark is not a programmer, he/she won't read the programmer's guide before building.

So, how to build for using the CLI and JDBC server should not be only in the programmer's guide.

Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>

Closes #2080 from sarutak/SPARK-2963 and squashes the following commits:

ee07c76 [Kousuke Saruta] Modified regression of the description about building for using Thrift JDBC server and CLI
ed53329 [Kousuke Saruta] Modified description and notaton of proper noun
07c59fc [Kousuke Saruta] Added a description about how to build to use HiveServer and CLI for SparkSQL to building-with-maven.md
6e6645a [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-2963
c88fa93 [Kousuke Saruta] Added a description about building to use HiveServer and CLI for SparkSQL
xiliu82 pushed a commit to xiliu82/spark that referenced this pull request Sep 4, 2014
…erver and CLI for SparkSQL

Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>

Closes apache#1885 from sarutak/SPARK-2963 and squashes the following commits:

ed53329 [Kousuke Saruta] Modified description and notaton of proper noun
07c59fc [Kousuke Saruta] Added a description about how to build to use HiveServer and CLI for SparkSQL to building-with-maven.md
6e6645a [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-2963
c88fa93 [Kousuke Saruta] Added a description about building to use HiveServer and CLI for SparkSQL
xiliu82 pushed a commit to xiliu82/spark that referenced this pull request Sep 4, 2014
…g CLI and Thrift JDBC server is absent in proper document -

The most important points I made in apache#1885 are as follows.

* The people who build Spark are not always programmers.
* If the person building Spark is not a programmer, he/she won't read the programmer's guide before building.

So, how to build for using the CLI and JDBC server should not be only in the programmer's guide.

Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>

Closes apache#2080 from sarutak/SPARK-2963 and squashes the following commits:

ee07c76 [Kousuke Saruta] Modified regression of the description about building for using Thrift JDBC server and CLI
ed53329 [Kousuke Saruta] Modified description and notaton of proper noun
07c59fc [Kousuke Saruta] Added a description about how to build to use HiveServer and CLI for SparkSQL to building-with-maven.md
6e6645a [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-2963
c88fa93 [Kousuke Saruta] Added a description about building to use HiveServer and CLI for SparkSQL