Start outline of documentation updates for running on GCP #2
Conversation
```rst
~~~~~~~~~~~~~~~~~~~~~

TODO: Add any GCP-specific installation instructions
TODO: Need to install and run docker?
```
FYI, yes, you need to install Docker. Installation automatically starts the service; you shouldn't need to explicitly start/run anything.
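For anyone setting up a workstation, here's a quick hypothetical check that Docker is installed and the daemon is running. It uses the Docker SDK for Python, which is not a buildstockbatch dependency; this is just an illustration.

```python
# Hypothetical check, assuming the Docker SDK for Python is installed
# (pip install docker). Not part of buildstockbatch itself.
import docker

try:
    client = docker.from_env()  # connect to the local Docker daemon
    client.ping()               # raises if the daemon isn't reachable
    print("Docker daemon is up")
except docker.errors.DockerException as exc:
    print(f"Docker is not available: {exc}")
```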
Okay, that's what I thought - that's also true for the AWS version, right? Even though that's not in the instructions?
Correct.
```rst
Google Cloud Platform
~~~~~~~~~~~~~~~~~~~~~

TODO: Add any GCP-specific installation instructions
```
FYI, on the client side (the workstation where you run the script), there's no GCP software to install. You do need to set up the `GOOGLE_APPLICATION_CREDENTIALS` key. There is setup to do on the GCP side (like creating a repository).
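As a quick illustration (a hedged sketch, not project code): once `GOOGLE_APPLICATION_CREDENTIALS` points at a service-account key file, the Google client libraries pick it up automatically via Application Default Credentials.

```python
# Hedged sketch, assuming google-auth is installed; the key path is
# illustrative. Typically you'd set the variable in your shell instead:
#   export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account-key.json
import os

import google.auth

os.environ.setdefault("GOOGLE_APPLICATION_CREDENTIALS", "/path/to/service-account-key.json")

credentials, project_id = google.auth.default()
print(f"Authenticated; default project: {project_id}")
```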
Yeah, I think the basic "you need a GCP project and the appropriate credentials" stuff can go here, similar to the AWS section above. (This reminds me that we'll also need to make sure to update `setup.py` to include any Python packages we're using.)
I'm hoping all the GCP-side setup can be done automatically, via Terraform.
> This reminds me that we'll also need to make sure to update `setup.py` to include any Python packages we're using.
Yeah, earlier revisions of my work added `google-api-python-client` and `google-cloud-storage` to `install_requires`, but I ended up not needing those, since interaction with GCP's Artifact Registry went through the Docker client (using standard Docker registry APIs with Artifact Registry).
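For reference, a hypothetical sketch of what that `setup.py` change would have looked like (package names from the comment above; everything else is illustrative).

```python
# Hypothetical setup.py excerpt; per the comment above, these packages
# ultimately were NOT needed. Other fields and dependencies are elided.
from setuptools import setup

setup(
    name="buildstockbatch",
    install_requires=[
        "google-api-python-client",
        "google-cloud-storage",
        # ... existing dependencies ...
    ],
)
```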
```yaml
gcp_project: myorg_project
region: us-central1
artifact_registry: buildstockbatch
gcs:
  bucket: mybucket
  prefix: national01_run01
use_spot: true
batch_array_size: 10000
```
To reflect the current state of the schema:

- `gcp_project` → `project`
- no `gcs`
- no `use_spot`
Those changes are in the PR I'm about to send! Sorry they're out of order - I pulled these changes out of that branch.
(I'm using `gcp_project` to distinguish it from the BSB "project" that's being run.)
```rst
When the simulation and postprocessing is all complete, run ``buildstock_gcp
--clean your_project_file.yml``. This will clean up all the GCP resources that
```
This is somewhat inaccurate (for lack of a better word) for the version we're developing, since doing a run now also includes cleanup (i.e., you don't usually need to run cleanup separately). So this should probably start with something more like: "Running a simulation automatically cleans up GCP resources when it completes. If you need to clean up manually (e.g., because the run did not complete) or cancel a run, run `buildstock_gcp --clean your_project_file.yml` ..."
I'm not sure that will stay true for all resources (for example, nothing is deleting the Docker image you're uploading), but I'll add a TODO for now to make sure it's accurate when we actually implement this option.
Ah, also, to clarify, my code for the GCP Batch job just kicks off the job, then exits. So the current code

```python
batch.run_batch()
batch.process_results()
batch.clean()
```

won't actually work correctly, since cleanup needs to wait until the job actually finishes.

And since these jobs may take hours to run, I think that's the right model - the user shouldn't have to leave a script running locally for that long.
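For illustration, a hedged sketch of "wait for the job, then clean up", assuming the `google-cloud-batch` client library; the function and job names are illustrative, not the actual buildstockbatch code.

```python
# Hedged sketch, assuming google-cloud-batch is installed.
# Names are illustrative; this is not the buildstockbatch implementation.
import time

from google.cloud import batch_v1


def wait_for_batch_job(job_name: str, poll_seconds: int = 60):
    """Poll the GCP Batch API until the job reaches a terminal state."""
    client = batch_v1.BatchServiceClient()
    terminal_states = {
        batch_v1.JobStatus.State.SUCCEEDED,
        batch_v1.JobStatus.State.FAILED,
    }
    while True:
        job = client.get_job(name=job_name)
        if job.status.state in terminal_states:
            return job.status.state
        time.sleep(poll_seconds)


# e.g. wait_for_batch_job("projects/myorg_project/locations/us-central1/jobs/bsb-run01")
# Only after this returns would it be safe to run cleanup.
```

The trade-off is exactly what's described above: either the local script blocks for hours polling like this, or cleanup has to be triggered separately once the job finishes.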
```rst
See :ref:`gcp-config` for details.

List existing jobs
```
This feature is coming soon in another PR.
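In the meantime, a hedged sketch of what listing jobs could look like against the Batch API (again assuming `google-cloud-batch`; the project and region are the illustrative values from the config snippet above, not the upcoming PR's implementation).

```python
# Hedged sketch, assuming google-cloud-batch is installed.
# The parent path is illustrative.
from google.cloud import batch_v1

client = batch_v1.BatchServiceClient()
parent = "projects/myorg_project/locations/us-central1"

for job in client.list_jobs(parent=parent):
    print(job.name, job.status.state.name)
```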
Create section headers and TODOs for adding docs, so we can fill them in as we work on the implementation.