
Configuring cluster to increase the max number of files that can be open (needed by Spark during large shuffles) #148

Open
jbherman opened this issue Sep 19, 2016 · 2 comments


@jbherman

I saw you encountered this problem during big shuffles. Here is how I fixed it; hope it helps.
Note: this is on Amazon Linux.

1. Copy the OS's sysctl.conf locally.
2. Append `fs.file-max = 100000` to sysctl.conf.
3. `copy-file bigCluster /Users/jason/projects-misc/sysctl.conf /home/ec2-user/sysctl.conf`
4. `run-command bigCluster "sudo cp /home/ec2-user/sysctl.conf /etc/sysctl.conf"`
5. `run-command bigCluster "sudo shutdown -r now"`
6. Create a limits.conf with the following entries:

        cat <<EOF > limits.conf
        * soft nproc 65535
        * hard nproc 65535
        * soft nofile 65535
        * hard nofile 65535
        EOF

7. `copy-file bigCluster /Users/jason/projects-misc/limits.conf /home/ec2-user/limits.conf`
8. `run-command bigCluster "sudo cp /home/ec2-user/limits.conf /etc/security/limits.conf"`
9. `run-command bigCluster "sudo shutdown -r now"`
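If it's useful, here is the same procedure consolidated into a single script. This is only a minimal sketch: it assumes the `copy-file` and `run-command` calls above are Flintrock subcommands invoked as `flintrock copy-file` / `flintrock run-command`, that the cluster is named `bigCluster`, that `./sysctl.conf` is a copy of the cluster's `/etc/sysctl.conf` you already pulled down (step 1), and it collapses the two reboots into one at the end. Adjust paths and names to your setup.

```bash
#!/usr/bin/env bash
# Sketch of the steps above; assumes Flintrock's copy-file and run-command
# subcommands and a cluster named "bigCluster". Local paths are placeholders.
set -euo pipefail

CLUSTER=bigCluster

# Steps 1-2: ./sysctl.conf is assumed to be a copy of the cluster's
# /etc/sysctl.conf; append a higher system-wide open-file limit to it.
echo "fs.file-max = 100000" >> ./sysctl.conf

# Step 6: build a limits.conf raising the per-user soft/hard limits.
cat <<EOF > ./limits.conf
* soft nproc 65535
* hard nproc 65535
* soft nofile 65535
* hard nofile 65535
EOF

# Steps 3-4 and 7-8: push both files to every node and install them.
flintrock copy-file "$CLUSTER" ./sysctl.conf /home/ec2-user/sysctl.conf
flintrock run-command "$CLUSTER" "sudo cp /home/ec2-user/sysctl.conf /etc/sysctl.conf"
flintrock copy-file "$CLUSTER" ./limits.conf /home/ec2-user/limits.conf
flintrock run-command "$CLUSTER" "sudo cp /home/ec2-user/limits.conf /etc/security/limits.conf"

# Steps 5/9: a single reboot so both changes take effect.
flintrock run-command "$CLUSTER" "sudo shutdown -r now"
```

For context, `fs.file-max` raises the kernel-wide cap on open file handles, while the limits.conf entries raise the per-user soft and hard limits that the Spark processes actually inherit; both are needed for large shuffles.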

Thanks for your great work.
-Jason

@nchammas (Owner)

Hi Jason, and thank you for the kind words!

Is this in reference to this comment?

@jbherman (Author) commented Sep 26, 2016

Yes, it's in reference to that comment.

@nchammas changed the title from "need to increase the max number of files that can be open" to "Configuring cluster to increase the max number of files that can be open (needed by Spark during large shuffles)" on Sep 26, 2016