Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-35825][INFRA][FOLLOWUP] Increase it in build/mvn script #33180

Closed
wants to merge 1 commit into from
Closed

[SPARK-35825][INFRA][FOLLOWUP] Increase it in build/mvn script #33180

wants to merge 1 commit into from

Conversation

dongjoon-hyun
Copy link
Member

@dongjoon-hyun dongjoon-hyun commented Jul 2, 2021

What changes were proposed in this pull request?

This is a follow up of #32961.

This PR additionally sets the stack size in build/mvn.

Why are the changes needed?

We are still hitting StackOverflowError in Jenkins.

[INFO] compiling 166 Scala sources and 19 Java sources to /home/jenkins/workspace/spark-master-test-maven-hadoop-3.2/sql/catalyst/target/scala-2.12/classes ...
[ERROR] ## Exception when compiling 480 sources to /home/jenkins/workspace/spark-master-test-maven-hadoop-3.2/sql/catalyst/target/scala-2.12/classes
java.lang.StackOverflowError

This PR increases the JVM of mvn instead of the plugin.

$ MAVEN_OPTS="-XX:+PrintFlagsFinal" build/mvn clean | grep 'intx ThreadStackSize'
     intx ThreadStackSize                           = 2048                                {pd product}

$ MAVEN_OPTS="-Xss128m -XX:+PrintFlagsFinal" build/mvn clean | grep 'intx ThreadStackSize'
     intx ThreadStackSize                          := 131072                              {pd product}

Does this PR introduce any user-facing change?

No.

How was this patch tested?

N/A

@github-actions github-actions bot added the BUILD label Jul 2, 2021
@dongjoon-hyun
Copy link
Member Author

How do you think about this, @gengliangwang , @HyukjinKwon , and @LuciferYang .

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-35825][INFRA][FOLLOWUP] Increase it in build/mvn script [SPARK-35825][INFRA][FOLLOWUP][test-maven] Increase it in build/mvn script Jul 2, 2021
@dongjoon-hyun
Copy link
Member Author

Retest this please

@SparkQA
Copy link

SparkQA commented Jul 2, 2021

Kubernetes integration test unable to build dist.

exiting with code: 1
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45061/

@SparkQA
Copy link

SparkQA commented Jul 2, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45064/

@SparkQA
Copy link

SparkQA commented Jul 2, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45064/

@LuciferYang
Copy link
Contributor

This problem often occurs in my local compilation recently, let me add this arg to verify the compilation problem.

On the other hand, is -Xmx2g too small for -Xss128m ?

@dongjoon-hyun
Copy link
Member Author

Thank you, @LuciferYang ! Yes, it would be great if you can test this locally.
I triggered two Jenkins jobs but I'm not sure because of the running timeout.

@dongjoon-hyun
Copy link
Member Author

dongjoon-hyun commented Jul 2, 2021

Increasing to 3g is our last resort. This PR intentionally aims to adjust within the current memory limit (2g). If there is no way to stabilize our Jenkins, we should go with higher memory.

On the other hand, is -Xmx2g too small for -Xss128m ?

BTW, this is a follow-up. And, original PR is the following. This PR is consistent with the original PR in terms of the values.

@SparkQA
Copy link

SparkQA commented Jul 2, 2021

Test build #140548 has finished for PR 33180 at commit 4c4fcec.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gengliangwang
Copy link
Member

LGTM. @LuciferYang what is the local test result?

@SparkQA
Copy link

SparkQA commented Jul 2, 2021

Test build #140552 has finished for PR 33180 at commit 4c4fcec.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member Author

The above Maven run actually passed the compilation on catalyst and sql which we see the StackOverflowError frequently. Only fails with the following. It seems to be a flaky test.

- driver side SQL metrics *** FAILED ***
  Map(573099 -> "total (min, med, max (stageId: taskId))
  0 ms (0 ms, 0 ms, 0 ms (stage 4.0: task 8))", 573101 -> "2", 573100 -> "1") did not contain key 573169 (SQLAppStatusListenerSuite.scala:590)

And, there is another Maven run is still running.

@dongjoon-hyun
Copy link
Member Author

Since this is a flaky compilation issue, the above two Maven runs might be insufficient for verification. However, I believe this patch is no harm for the build and only provides the consistency for Maven.

@dongjoon-hyun
Copy link
Member Author

I'll merge this. Please let us know your result when you have some time, @LuciferYang ~

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-35825][INFRA][FOLLOWUP][test-maven] Increase it in build/mvn script [SPARK-35825][INFRA][FOLLOWUP] Increase it in build/mvn script Jul 2, 2021
@dongjoon-hyun dongjoon-hyun deleted the SPARK-35825 branch July 2, 2021 05:25
@LuciferYang
Copy link
Contributor

@dongjoon-hyun @gengliangwang It seems to work, I have compile and test catalyst and related modules for many times in my compilation environment, no StackOverflowError was thrown at the moment.

@dongjoon-hyun
Copy link
Member Author

Oh! Thank you for sharing that, @LuciferYang . Ya, I saw this on catalyst mostly.

@LuciferYang
Copy link
Contributor

Yes, the catalyst module often has this problem

@gengliangwang
Copy link
Member

@dongjoon-hyun @LuciferYang Awesome, hopefully the issue is resolved this time.

@dongjoon-hyun
Copy link
Member Author

Yep, I will keep monitoring. BTW, I saw branch-3.2. Thank you, @gengliangwang .

@SparkQA
Copy link

SparkQA commented Jul 2, 2021

Test build #140551 has finished for PR 33180 at commit 4c4fcec.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants