Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Passes --public-ip-address to hailctl dataproc start #14653

Merged
merged 4 commits into from
Aug 19, 2024

Conversation

iris-garden
Copy link
Contributor

@iris-garden iris-garden commented Aug 5, 2024

Closes #14652.

See populationgenomics#346. Thanks for the contribution @illusional!

Gives Dataproc clusters started via hailctl dataproc start internet access by default, since we need it to install some of our dependencies, per the error message in the linked issue.

illusional and others added 2 commits August 5, 2024 11:33
* Add support for public-ip-address in dataproc

* Use the same code style as previous code

---------

Co-authored-by: Michael Franklin <illusional@users.noreply.github.com>
Copy link
Member

@ehigham ehigham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for picking this change. Looks good.

Copy link
Member

@ehigham ehigham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no changes since last review

Copy link
Collaborator

@cjllanwarne cjllanwarne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small question about whether this needs to be an option, since it seems like a strict necessity. But happy to move over to 👍 if the answer is "yes"

Comment on lines 196 to 197
Opt(
help='Allow nodes to have a public IP address, and hence make requests on the public internet (default is internal-only from dataproc 2.2).'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From your description:

since we need internet access to install some of our dependencies

Is there any use case where someone would want to disable public IP addresses and not have the dependencies set up? If not, should this just be forced to be true (instead of being an option)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, looks like we get the list of dependencies from hail/python/hailtop/hailctl/deploy.yaml, which is generated when we build the wheel, so there is not a way for the user to avoid using the packages we specify. Revised to just make this part of the command string instead of an option!

@illusional
Copy link
Contributor

FWIW, this change worked fixed our dataproc initialisation :)

@iris-garden iris-garden changed the title Adds --public-ip-address arg to hailctl dataproc start and enables it by default Passes --public-ip-address to hailctl dataproc start Aug 6, 2024
@hail-ci-robot hail-ci-robot merged commit 856d388 into hail-is:main Aug 19, 2024
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

hailctl dataproc start requires --public-ip-address to work
5 participants