sbt assembly and environment variables #671

Closed
wants to merge 3 commits into from
17 changes: 13 additions & 4 deletions docs/index.md
@@ -15,7 +15,7 @@ Spark runs on both Windows and UNIX-like systems (e.g. Linux, Mac OS). All you n

# Building

-Spark uses [Simple Build Tool](http://www.scala-sbt.org), which is bundled with it. To compile the code, go into the top-level Spark directory and run
+Spark uses [sbt](http://www.scala-sbt.org), which is bundled with it. To compile the code, go into the top-level Spark directory and run

sbt/sbt assembly

@@ -58,14 +58,23 @@ Hadoop, you must build Spark against the same version that your cluster uses.
By default, Spark links to Hadoop 1.0.4. You can change this by setting the
`SPARK_HADOOP_VERSION` variable when compiling:

-    SPARK_HADOOP_VERSION=2.2.0 sbt/sbt assembly
+    SPARK_HADOOP_VERSION=2.4.0 sbt/sbt assembly

In addition, if you wish to run Spark on [YARN](running-on-yarn.html), set
`SPARK_YARN` to `true`:

-    SPARK_HADOOP_VERSION=2.0.5-alpha SPARK_YARN=true sbt/sbt assembly
+    SPARK_HADOOP_VERSION=2.4.0 SPARK_YARN=true sbt/sbt assembly

-Note that on Windows, you need to set the environment variables on separate lines, e.g., `set SPARK_HADOOP_VERSION=1.2.1`.
+You may also want to set `SPARK_HIVE` to `true` to build the Spark Hive module:
+
+    SPARK_HIVE=true sbt/sbt assembly
+
+Combine the environment variables `SPARK_HADOOP_VERSION`, `SPARK_YARN`, and `SPARK_HIVE` as needed to match
+the assembly you want to build.
+
+Note that on Windows, you need to set the environment variables on separate lines, e.g.:
+
+    set SPARK_HADOOP_VERSION=2.4.0
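
To illustrate how the three variables in this change can be combined on a UNIX-like shell, here is a minimal sketch. The `build_cmd` helper is hypothetical and not part of Spark or this pull request; it only prints the command line it would run, so you can check the combination before invoking `sbt/sbt assembly` yourself.

```shell
#!/bin/sh
# Hypothetical helper (not part of Spark): composes the assembly command
# from the variables described in the docs change above.
# Usage: build_cmd <hadoop-version> <yarn: true|false> <hive: true|false>
build_cmd() {
  hadoop_version="$1"
  yarn="$2"
  hive="$3"

  # SPARK_HADOOP_VERSION is always passed explicitly in this sketch.
  cmd="SPARK_HADOOP_VERSION=$hadoop_version"

  # Append the optional switches only when they are requested.
  if [ "$yarn" = "true" ]; then
    cmd="$cmd SPARK_YARN=true"
  fi
  if [ "$hive" = "true" ]; then
    cmd="$cmd SPARK_HIVE=true"
  fi

  # Print the command instead of running it, so nothing is built here.
  echo "$cmd sbt/sbt assembly"
}

build_cmd 2.4.0 true true
```

Printing rather than executing keeps the sketch safe to run anywhere; the printed line can be pasted into a shell at the top-level Spark directory.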

# Where to Go from Here
