Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

doc: Add memory tuning section to user guide #684

Merged
merged 3 commits into from
Jul 20, 2024

Conversation

viirya
Copy link
Member

@viirya viirya commented Jul 18, 2024

Which issue does this PR close?

Closes #595

Rationale for this change

What changes are included in this PR?

How are these changes tested?

@viirya viirya requested a review from andygrove July 18, 2024 23:05
viirya and others added 2 commits July 19, 2024 12:20
Co-authored-by: Andy Grove <agrove@apache.org>
Copy link
Member

@andygrove andygrove left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very helpful. Thanks @viirya

@viirya viirya merged commit 1f77953 into apache:main Jul 20, 2024
@viirya
Copy link
Member Author

viirya commented Jul 20, 2024

Merged. Thanks @andygrove @comphead

@viirya viirya deleted the add_memory_info branch July 20, 2024 03:11
@orthoxerox
Copy link

Am I correct in assuming I don't have to increase spark.executor.memoryOverhead or spark.memory.offHeap.size when using Comet, only spark.comet.memoryOverhead? How does this interact with YARN memory management?

@comphead
Copy link
Contributor

spark.comet.memoryOverhead configures the memory pool size for native execution. it is not the same as spark.executor.memoryOverhead or spark.memory.offHeap.size.

For YARN memory management, the YARN container respects the spark.executor.memoryOverhead and the Comet override this value in CometDriverPlugin

sc.conf.set(EXECUTOR_MEMORY_OVERHEAD.key, s"${execMemOverhead + cometMemOverhead}M")

@viirya
Copy link
Member Author

viirya commented Aug 13, 2024

If Comet CometPlugin is configured to be used, it will update Spark executor overhead based on the config value spark.comet.memoryOverhead etc. But I don't test it on YARN, not sure Spark plugins comes in before YARN's resource configuration. You probably can test it.

If YARN resource manager doesn't respect Spark plugins, you can manually set up executor overhead based on spark.comet.memoryOverhead.

@orthoxerox
Copy link

@comphead Hm, interesting, I tried launching the container with and without specifying spark.comet.memoryOverhead and YARN reported the same amount of memory allocated to my application. I'll check again tomorrow with debug enabled, hopefully I'll see this: https://github.com/apache/datafusion-comet/blob/acb78280e495079312a1067b80e91671852389a5/spark/src/main/scala/org/apache/spark/Plugins.scala#L63C7-L63C14

@comphead
Copy link
Contributor

@comphead Hm, interesting, I tried launching the container with and without specifying spark.comet.memoryOverhead and YARN reported the same amount of memory allocated to my application. I'll check again tomorrow with debug enabled, hopefully I'll see this: https://github.com/apache/datafusion-comet/blob/acb78280e495079312a1067b80e91671852389a5/spark/src/main/scala/org/apache/spark/Plugins.scala#L63C7-L63C14

@orthoxerox please try it with applying a separate plugin

--conf spark.plugins=org.apache.spark.CometPlugin

himadripal pushed a commit to himadripal/datafusion-comet that referenced this pull request Sep 7, 2024
* doc: Add memory tuning section to user guide

* Update docs/source/user-guide/tuning.md

Co-authored-by: Andy Grove <agrove@apache.org>

* More

---------

Co-authored-by: Andy Grove <agrove@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Clarify memory requirements for Spark OSS with Comet vs Spark OSS
4 participants