SPARK-1183. Don't use "worker" to mean executor
Author: Sandy Ryza <sandy@cloudera.com>

Closes apache#120 from sryza/sandy-spark-1183 and squashes the following commits:

5066a4a [Sandy Ryza] Remove "worker" in a couple comments
0bd1e46 [Sandy Ryza] Remove --am-class from usage
bfc8fe0 [Sandy Ryza] Remove am-class from doc and fix yarn-alpha
607539f [Sandy Ryza] Address review comments
74d087a [Sandy Ryza] SPARK-1183. Don't use "worker" to mean executor
sryza authored and pwendell committed Mar 13, 2014
1 parent e4e8d8f commit 6983732
Showing 21 changed files with 312 additions and 294 deletions.
docs/cluster-overview.md (2 changes: 1 addition & 1 deletion)
@@ -13,7 +13,7 @@ object in your main program (called the _driver program_).
Specifically, to run on a cluster, the SparkContext can connect to several types of _cluster managers_
(either Spark's own standalone cluster manager or Mesos/YARN), which allocate resources across
applications. Once connected, Spark acquires *executors* on nodes in the cluster, which are
- worker processes that run computations and store data for your application.
+ processes that run computations and store data for your application.
Next, it sends your application code (defined by JAR or Python files passed to SparkContext) to
the executors. Finally, SparkContext sends *tasks* for the executors to run.

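As a concrete illustration of the paragraph above (not part of this commit), a minimal driver program acquires executors simply by constructing a SparkContext; the master URL and application name below are hypothetical.

import org.apache.spark.{SparkConf, SparkContext}

// Minimal driver program: connecting to a cluster manager acquires executors,
// the processes that run computations and store data for this application.
object ClusterOverviewExample {
  def main(args: Array[String]) {
    val conf = new SparkConf()
      .setMaster("spark://master:7077")        // hypothetical standalone master URL
      .setAppName("cluster-overview-example")
    val sc = new SparkContext(conf)

    // Tasks are sent to the executors for execution.
    val count = sc.parallelize(1 to 1000).map(_ * 2).count()
    println("Counted " + count + " elements")

    sc.stop()
  }
}
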
docs/graphx-programming-guide.md (2 changes: 1 addition & 1 deletion)
@@ -135,7 +135,7 @@ Like RDDs, property graphs are immutable, distributed, and fault-tolerant. Chan
structure of the graph are accomplished by producing a new graph with the desired changes. Note
that substantial parts of the original graph (i.e., unaffected structure, attributes, and indicies)
are reused in the new graph reducing the cost of this inherently functional data-structure. The
- graph is partitioned across the workers using a range of vertex-partitioning heuristics. As with
+ graph is partitioned across the executors using a range of vertex-partitioning heuristics. As with
RDDs, each partition of the graph can be recreated on a different machine in the event of a failure.
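
A small sketch, not part of this diff, of building a property graph and choosing one of the vertex-partitioning heuristics mentioned above; the vertex and edge data are made up.

import org.apache.spark.SparkContext
import org.apache.spark.graphx.{Edge, Graph, PartitionStrategy}

// Build a tiny property graph and repartition it; each partition lives on an
// executor and can be rebuilt on another machine if that executor fails.
def buildGraph(sc: SparkContext): Graph[String, String] = {
  val vertices = sc.parallelize(Seq((1L, "alice"), (2L, "bob"), (3L, "carol")))
  val edges = sc.parallelize(Seq(Edge(1L, 2L, "follows"), Edge(2L, 3L, "follows")))
  Graph(vertices, edges, "unknown")
    .partitionBy(PartitionStrategy.RandomVertexCut)
}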

Logically the property graph corresponds to a pair of typed collections (RDDs) encoding the
docs/job-scheduling.md (4 changes: 2 additions & 2 deletions)
@@ -39,8 +39,8 @@ Resource allocation can be configured as follows, based on the cluster type:
* **Mesos:** To use static partitioning on Mesos, set the `spark.mesos.coarse` configuration property to `true`,
and optionally set `spark.cores.max` to limit each application's resource share as in the standalone mode.
You should also set `spark.executor.memory` to control the executor memory.
- * **YARN:** The `--num-workers` option to the Spark YARN client controls how many workers it will allocate
- on the cluster, while `--worker-memory` and `--worker-cores` control the resources per worker.
+ * **YARN:** The `--num-executors` option to the Spark YARN client controls how many executors it will allocate
+ on the cluster, while `--executor-memory` and `--executor-cores` control the resources per executor.
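
A hedged sketch (not from this commit) of setting the static-partitioning limits above programmatically; the master URL and values are arbitrary.

import org.apache.spark.{SparkConf, SparkContext}

// Static partitioning: cap this application's share of the cluster up front.
val conf = new SparkConf()
  .setMaster("mesos://master:5050")      // hypothetical Mesos master URL
  .setAppName("static-partitioning-example")
  .set("spark.mesos.coarse", "true")     // coarse-grained Mesos mode
  .set("spark.cores.max", "24")          // arbitrary cap on total cores
  .set("spark.executor.memory", "4g")    // memory per executor
val sc = new SparkContext(conf)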

A second option available on Mesos is _dynamic sharing_ of CPU cores. In this mode, each Spark application
still has a fixed and independent memory allocation (set by `spark.executor.memory`), but when the
docs/mllib-classification-regression.md (4 changes: 2 additions & 2 deletions)
@@ -77,8 +77,8 @@ between the two goals of small loss and small model complexity.

**Distributed Datasets.**
For all currently implemented optimization methods for classification, the data must be
- distributed between the worker machines *by examples*. Every machine holds a consecutive block of
- the `$n$` example/label pairs `$(\x_i,y_i)$`.
+ distributed between processes on the worker machines *by examples*. Machines hold consecutive
+ blocks of the `$n$` example/label pairs `$(\x_i,y_i)$`.
In other words, the input distributed dataset
([RDD](scala-programming-guide.html#resilient-distributed-datasets-rdds)) must be the set of
vectors `$\x_i\in\R^d$`.
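
For illustration only (not part of this diff), a sketch of loading such an example-partitioned dataset, assuming a hypothetical CSV file with the label in the first column.

import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

// The training set is an RDD of (label, feature vector) examples. Spark
// distributes it *by examples*, so each executor holds whole (x_i, y_i) pairs.
def loadExamples(sc: SparkContext, path: String): RDD[(Double, Array[Double])] =
  sc.textFile(path).map { line =>
    val parts = line.split(',')
    (parts.head.toDouble, parts.tail.map(_.toDouble))
  }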
docs/python-programming-guide.md (6 changes: 3 additions & 3 deletions)
@@ -43,9 +43,9 @@ def is_error(line):
errors = logData.filter(is_error)
{% endhighlight %}

- PySpark will automatically ship these functions to workers, along with any objects that they reference.
- Instances of classes will be serialized and shipped to workers by PySpark, but classes themselves cannot be automatically distributed to workers.
- The [Standalone Use](#standalone-use) section describes how to ship code dependencies to workers.
+ PySpark will automatically ship these functions to executors, along with any objects that they reference.
+ Instances of classes will be serialized and shipped to executors by PySpark, but classes themselves cannot be automatically distributed to executors.
+ The [Standalone Use](#standalone-use) section describes how to ship code dependencies to executors.

In addition, PySpark fully supports interactive use---simply run `./bin/pyspark` to launch an interactive shell.

docs/running-on-yarn.md (29 changes: 14 additions & 15 deletions)
@@ -41,7 +41,7 @@ System Properties:
* `spark.yarn.submit.file.replication`, the HDFS replication level for the files uploaded into HDFS for the application. These include things like the spark jar, the app jar, and any distributed cache files/archives.
* `spark.yarn.preserve.staging.files`, set to true to preserve the staged files(spark jar, app jar, distributed cache files) at the end of the job rather then delete them.
* `spark.yarn.scheduler.heartbeat.interval-ms`, the interval in ms in which the Spark application master heartbeats into the YARN ResourceManager. Default is 5 seconds.
- * `spark.yarn.max.worker.failures`, the maximum number of executor failures before failing the application. Default is the number of executors requested times 2 with minimum of 3.
+ * `spark.yarn.max.executor.failures`, the maximum number of executor failures before failing the application. Default is the number of executors requested times 2 with minimum of 3.
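
An illustrative, non-authoritative way to set the renamed property alongside the others listed above; the values chosen here are arbitrary.

import org.apache.spark.SparkConf

// Driver-side configuration of the YARN-related properties described above.
val conf = new SparkConf()
  .setAppName("yarn-config-example")
  .set("spark.yarn.max.executor.failures", "6")               // renamed in this commit
  .set("spark.yarn.scheduler.heartbeat.interval-ms", "5000")  // AM heartbeat interval
  .set("spark.yarn.preserve.staging.files", "true")           // keep staged jars after the job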

# Launching Spark on YARN

@@ -60,11 +60,10 @@ The command to launch the Spark application on the cluster is as follows:
--jar <YOUR_APP_JAR_FILE> \
--class <APP_MAIN_CLASS> \
--args <APP_MAIN_ARGUMENTS> \
- --num-workers <NUMBER_OF_EXECUTORS> \
- --master-class <ApplicationMaster_CLASS>
- --master-memory <MEMORY_FOR_MASTER> \
- --worker-memory <MEMORY_PER_EXECUTOR> \
- --worker-cores <CORES_PER_EXECUTOR> \
+ --num-executors <NUMBER_OF_EXECUTOR_PROCESSES> \
+ --driver-memory <MEMORY_FOR_ApplicationMaster> \
+ --executor-memory <MEMORY_PER_EXECUTOR> \
+ --executor-cores <CORES_PER_EXECUTOR> \
--name <application_name> \
--queue <queue_name> \
--addJars <any_local_files_used_in_SparkContext.addJar> \
@@ -85,10 +84,10 @@ For example:
--jar examples/target/scala-{{site.SCALA_BINARY_VERSION}}/spark-examples-assembly-{{site.SPARK_VERSION}}.jar \
--class org.apache.spark.examples.SparkPi \
--args yarn-cluster \
- --num-workers 3 \
- --master-memory 4g \
- --worker-memory 2g \
- --worker-cores 1
+ --num-executors 3 \
+ --driver-memory 4g \
+ --executor-memory 2g \
+ --executor-cores 1

The above starts a YARN client program which starts the default Application Master. Then SparkPi will be run as a child thread of Application Master. The client will periodically poll the Application Master for status updates and display them in the console. The client will exit once your application has finished running. Refer to the "Viewing Logs" section below for how to see driver and executor logs.

@@ -100,12 +99,12 @@ With yarn-client mode, the application will be launched locally, just like runni

Configuration in yarn-client mode:

- In order to tune worker cores/number/memory etc., you need to export environment variables or add them to the spark configuration file (./conf/spark_env.sh). The following are the list of options.
+ In order to tune executor cores/number/memory etc., you need to export environment variables or add them to the spark configuration file (./conf/spark_env.sh). The following are the list of options.

- * `SPARK_WORKER_INSTANCES`, Number of executors to start (Default: 2)
- * `SPARK_WORKER_CORES`, Number of cores per executor (Default: 1).
- * `SPARK_WORKER_MEMORY`, Memory per executor (e.g. 1000M, 2G) (Default: 1G)
- * `SPARK_MASTER_MEMORY`, Memory for Master (e.g. 1000M, 2G) (Default: 512 Mb)
+ * `SPARK_EXECUTOR_INSTANCES`, Number of executors to start (Default: 2)
+ * `SPARK_EXECUTOR_CORES`, Number of cores per executor (Default: 1).
+ * `SPARK_EXECUTOR_MEMORY`, Memory per executor (e.g. 1000M, 2G) (Default: 1G)
+ * `SPARK_DRIVER_MEMORY`, Memory for driver (e.g. 1000M, 2G) (Default: 512 Mb)
* `SPARK_YARN_APP_NAME`, The name of your application (Default: Spark)
* `SPARK_YARN_QUEUE`, The YARN queue to use for allocation requests (Default: 'default')
* `SPARK_YARN_DIST_FILES`, Comma separated list of files to be distributed with the job.
@@ -61,9 +61,9 @@ class ApplicationMaster(args: ApplicationMasterArguments, conf: Configuration,
YarnConfiguration.DEFAULT_RM_AM_MAX_RETRIES)
private var isLastAMRetry: Boolean = true

- // Default to numWorkers * 2, with minimum of 3
- private val maxNumWorkerFailures = sparkConf.getInt("spark.yarn.max.worker.failures",
- math.max(args.numWorkers * 2, 3))
+ // Default to numExecutors * 2, with minimum of 3
+ private val maxNumExecutorFailures = sparkConf.getInt("spark.yarn.max.executor.failures",
+ sparkConf.getInt("spark.yarn.max.worker.failures", math.max(args.numExecutors * 2, 3)))

private var registered = false

@@ -96,7 +96,7 @@ class ApplicationMaster(args: ApplicationMasterArguments, conf: Configuration,

// Call this to force generation of secret so it gets populated into the
// hadoop UGI. This has to happen before the startUserClass which does a
- // doAs in order for the credentials to be passed on to the worker containers.
+ // doAs in order for the credentials to be passed on to the executor containers.
val securityMgr = new SecurityManager(sparkConf)

// Start the user's JAR
@@ -115,7 +115,7 @@ class ApplicationMaster(args: ApplicationMasterArguments, conf: Configuration,
}

// Allocate all containers
- allocateWorkers()
+ allocateExecutors()

// Wait for the user class to Finish
userThread.join()
@@ -215,7 +215,7 @@ class ApplicationMaster(args: ApplicationMasterArguments, conf: Configuration,
t
}

- // this need to happen before allocateWorkers
+ // this need to happen before allocateExecutors
private def waitForSparkContextInitialized() {
logInfo("Waiting for spark context initialization")
try {
@@ -260,21 +260,21 @@ class ApplicationMaster(args: ApplicationMasterArguments, conf: Configuration,
}
}

- private def allocateWorkers() {
+ private def allocateExecutors() {
try {
logInfo("Allocating " + args.numWorkers + " workers.")
logInfo("Allocating " + args.numExecutors + " executors.")
// Wait until all containers have finished
// TODO: This is a bit ugly. Can we make it nicer?
// TODO: Handle container failure

// Exists the loop if the user thread exits.
- while (yarnAllocator.getNumWorkersRunning < args.numWorkers && userThread.isAlive) {
- if (yarnAllocator.getNumWorkersFailed >= maxNumWorkerFailures) {
+ while (yarnAllocator.getNumExecutorsRunning < args.numExecutors && userThread.isAlive) {
+ if (yarnAllocator.getNumExecutorsFailed >= maxNumExecutorFailures) {
finishApplicationMaster(FinalApplicationStatus.FAILED,
"max number of worker failures reached")
"max number of executor failures reached")
}
yarnAllocator.allocateContainers(
- math.max(args.numWorkers - yarnAllocator.getNumWorkersRunning, 0))
+ math.max(args.numExecutors - yarnAllocator.getNumExecutorsRunning, 0))
ApplicationMaster.incrementAllocatorLoop(1)
Thread.sleep(100)
}
@@ -283,7 +283,7 @@ class ApplicationMaster(args: ApplicationMasterArguments, conf: Configuration,
// so that the loop in ApplicationMaster#sparkContextInitialized() breaks.
ApplicationMaster.incrementAllocatorLoop(ApplicationMaster.ALLOCATOR_LOOP_WAIT_COUNT)
}
logInfo("All workers have launched.")
logInfo("All executors have launched.")

// Launch a progress reporter thread, else the app will get killed after expiration
// (def: 10mins) timeout.
@@ -309,15 +309,15 @@ class ApplicationMaster(args: ApplicationMasterArguments, conf: Configuration,
val t = new Thread {
override def run() {
while (userThread.isAlive) {
- if (yarnAllocator.getNumWorkersFailed >= maxNumWorkerFailures) {
+ if (yarnAllocator.getNumExecutorsFailed >= maxNumExecutorFailures) {
finishApplicationMaster(FinalApplicationStatus.FAILED,
"max number of worker failures reached")
"max number of executor failures reached")
}
- val missingWorkerCount = args.numWorkers - yarnAllocator.getNumWorkersRunning
- if (missingWorkerCount > 0) {
+ val missingExecutorCount = args.numExecutors - yarnAllocator.getNumExecutorsRunning
+ if (missingExecutorCount > 0) {
logInfo("Allocating %d containers to make up for (potentially) lost containers".
- format(missingWorkerCount))
+ format(missingExecutorCount))
- yarnAllocator.allocateContainers(missingWorkerCount)
+ yarnAllocator.allocateContainers(missingExecutorCount)
}
else sendProgress()
Thread.sleep(sleepTime)
@@ -34,7 +34,7 @@ import org.apache.spark.util.{Utils, AkkaUtils}
import org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend
import org.apache.spark.scheduler.SplitInfo

- class WorkerLauncher(args: ApplicationMasterArguments, conf: Configuration, sparkConf: SparkConf)
+ class ExecutorLauncher(args: ApplicationMasterArguments, conf: Configuration, sparkConf: SparkConf)
extends Logging {

def this(args: ApplicationMasterArguments, sparkConf: SparkConf) = this(args, new Configuration(), sparkConf)
@@ -89,7 +89,7 @@ class WorkerLauncher(args: ApplicationMasterArguments, conf: Configuration, spar
val minimumMemory = appMasterResponse.getMinimumResourceCapability().getMemory()

if (minimumMemory > 0) {
- val mem = args.workerMemory + YarnAllocationHandler.MEMORY_OVERHEAD
+ val mem = args.executorMemory + YarnAllocationHandler.MEMORY_OVERHEAD
val numCore = (mem / minimumMemory) + (if (0 != (mem % minimumMemory)) 1 else 0)

if (numCore > 0) {
@@ -102,7 +102,7 @@ class WorkerLauncher(args: ApplicationMasterArguments, conf: Configuration, spar
waitForSparkMaster()

// Allocate all containers
- allocateWorkers()
+ allocateExecutors()

// Launch a progress reporter thread, else app will get killed after expiration (def: 10mins) timeout
// ensure that progress is sent before YarnConfiguration.RM_AM_EXPIRY_INTERVAL_MS elapse.
@@ -199,7 +199,7 @@ class WorkerLauncher(args: ApplicationMasterArguments, conf: Configuration, spar
}


- private def allocateWorkers() {
+ private def allocateExecutors() {

// Fixme: should get preferredNodeLocationData from SparkContext, just fake a empty one for now.
val preferredNodeLocationData: scala.collection.Map[String, scala.collection.Set[SplitInfo]] =
@@ -208,16 +208,16 @@ class WorkerLauncher(args: ApplicationMasterArguments, conf: Configuration, spar
yarnAllocator = YarnAllocationHandler.newAllocator(yarnConf, resourceManager, appAttemptId,
args, preferredNodeLocationData, sparkConf)

logInfo("Allocating " + args.numWorkers + " workers.")
logInfo("Allocating " + args.numExecutors + " executors.")
// Wait until all containers have finished
// TODO: This is a bit ugly. Can we make it nicer?
// TODO: Handle container failure
- while ((yarnAllocator.getNumWorkersRunning < args.numWorkers) && (!driverClosed)) {
- yarnAllocator.allocateContainers(math.max(args.numWorkers - yarnAllocator.getNumWorkersRunning, 0))
+ while ((yarnAllocator.getNumExecutorsRunning < args.numExecutors) && (!driverClosed)) {
+ yarnAllocator.allocateContainers(math.max(args.numExecutors - yarnAllocator.getNumExecutorsRunning, 0))
Thread.sleep(100)
}

logInfo("All workers have launched.")
logInfo("All executors have launched.")

}

@@ -228,10 +228,10 @@ class WorkerLauncher(args: ApplicationMasterArguments, conf: Configuration, spar
val t = new Thread {
override def run() {
while (!driverClosed) {
- val missingWorkerCount = args.numWorkers - yarnAllocator.getNumWorkersRunning
- if (missingWorkerCount > 0) {
- logInfo("Allocating " + missingWorkerCount + " containers to make up for (potentially ?) lost containers")
- yarnAllocator.allocateContainers(missingWorkerCount)
+ val missingExecutorCount = args.numExecutors - yarnAllocator.getNumExecutorsRunning
+ if (missingExecutorCount > 0) {
+ logInfo("Allocating " + missingExecutorCount + " containers to make up for (potentially ?) lost containers")
+ yarnAllocator.allocateContainers(missingExecutorCount)
}
else sendProgress()
Thread.sleep(sleepTime)
@@ -264,9 +264,9 @@ class WorkerLauncher(args: ApplicationMasterArguments, conf: Configuration, spar
}


- object WorkerLauncher {
+ object ExecutorLauncher {
def main(argStrings: Array[String]) {
val args = new ApplicationMasterArguments(argStrings)
- new WorkerLauncher(args).run()
+ new ExecutorLauncher(args).run()
}
}
@@ -38,24 +38,24 @@ import org.apache.hadoop.yarn.util.{Apps, ConverterUtils, Records, ProtoUtils}
import org.apache.spark.{SparkConf, Logging}


- class WorkerRunnable(
+ class ExecutorRunnable(
container: Container,
conf: Configuration,
spConf: SparkConf,
masterAddress: String,
slaveId: String,
hostname: String,
- workerMemory: Int,
- workerCores: Int)
- extends Runnable with WorkerRunnableUtil with Logging {
+ executorMemory: Int,
+ executorCores: Int)
+ extends Runnable with ExecutorRunnableUtil with Logging {

var rpc: YarnRPC = YarnRPC.create(conf)
var cm: ContainerManager = _
val sparkConf = spConf
val yarnConf: YarnConfiguration = new YarnConfiguration(conf)

def run = {
logInfo("Starting Worker Container")
logInfo("Starting Executor Container")
cm = connectToCM
startContainer
}
@@ -81,8 +81,8 @@ class WorkerRunnable(
credentials.writeTokenStorageToStream(dob)
ctx.setContainerTokens(ByteBuffer.wrap(dob.getData()))

- val commands = prepareCommand(masterAddress, slaveId, hostname, workerMemory, workerCores)
- logInfo("Setting up worker with commands: " + commands)
+ val commands = prepareCommand(masterAddress, slaveId, hostname, executorMemory, executorCores)
+ logInfo("Setting up executor with commands: " + commands)
ctx.setCommands(commands)

// Send the start request to the ContainerManager
