Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

default task number misleading in several places #766

Closed
wants to merge 2 commits into from

Conversation

CrazyJvm
Copy link
Contributor

private[streaming] def defaultPartitioner(numPartitions: Int = self.ssc.sc.defaultParallelism){
new HashPartitioner(numPartitions)
}

it represents that the default task number in Spark Streaming relies on the variable defaultParallelism in SparkContext, which is decided by the config property spark.default.parallelism

the property "spark.default.parallelism" refers to #389

<code>
  private[streaming] def defaultPartitioner(numPartitions: Int = self.ssc.sc.defaultParallelism){
    new HashPartitioner(numPartitions)
  }
</code>

it represents that the default task number in Spark Streaming relies on the variable defaultParallelism in SparkContext, which is decided by the config property spark.default.parallelism

the property "spark.default.parallelism" refers to apache#389
@AmplabJenkins
Copy link

Merged build triggered.

@AmplabJenkins
Copy link

Merged build started.

@AmplabJenkins
Copy link

Merged build finished. All automated tests passed.

@AmplabJenkins
Copy link

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14971/

this uses Spark's default number of parallel tasks (2 for local machine, 8 for a cluster) to
do the grouping. You can pass an optional <code>numTasks</code> argument to set a different
number of tasks.</td>
this uses Spark's default number of parallel tasks (local mode is 2, while cluster mode is
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about "2 for local mode, and in cluster mode the number is determined by ..."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's good i think : )

@AmplabJenkins
Copy link

Merged build triggered.

@AmplabJenkins
Copy link

Merged build started.

@rxin
Copy link
Contributor

rxin commented May 15, 2014

Thanks. I've merged this.

asfgit pushed a commit that referenced this pull request May 15, 2014
  private[streaming] def defaultPartitioner(numPartitions: Int = self.ssc.sc.defaultParallelism){
    new HashPartitioner(numPartitions)
  }

it represents that the default task number in Spark Streaming relies on the variable defaultParallelism in SparkContext, which is decided by the config property spark.default.parallelism

the property "spark.default.parallelism" refers to #389

Author: Chen Chao <crazyjvm@gmail.com>

Closes #766 from CrazyJvm/patch-7 and squashes the following commits:

0b7efba [Chen Chao] Update streaming-programming-guide.md
cc5b66c [Chen Chao] default task number misleading in several places

(cherry picked from commit 2f63995)
Signed-off-by: Reynold Xin <rxin@apache.org>
@asfgit asfgit closed this in 2f63995 May 15, 2014
@AmplabJenkins
Copy link

Merged build finished. All automated tests passed.

@AmplabJenkins
Copy link

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15001/

pdeyhim pushed a commit to pdeyhim/spark-1 that referenced this pull request Jun 25, 2014
  private[streaming] def defaultPartitioner(numPartitions: Int = self.ssc.sc.defaultParallelism){
    new HashPartitioner(numPartitions)
  }

it represents that the default task number in Spark Streaming relies on the variable defaultParallelism in SparkContext, which is decided by the config property spark.default.parallelism

the property "spark.default.parallelism" refers to apache#389

Author: Chen Chao <crazyjvm@gmail.com>

Closes apache#766 from CrazyJvm/patch-7 and squashes the following commits:

0b7efba [Chen Chao] Update streaming-programming-guide.md
cc5b66c [Chen Chao] default task number misleading in several places
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants