Start outline of documentation updates for running on GCP #2
@@ -188,3 +188,10 @@ The installation instructions are the same as the :ref:`local-install`
installation. You will need to use an AWS account with appropriate permissions.
The first time you run ``buildstock_aws`` it may take several minutes,
especially over a slower internet connection as it is downloading and building a docker image.

Google Cloud Platform
~~~~~~~~~~~~~~~~~~~~~

TODO: Add any GCP-specific installation instructions
TODO: Need to install and run docker?
Comment: FYI, yes, you need to install Docker. Installation automatically starts the service; you shouldn't need to explicitly start/run anything.

Reply: Okay, that's what I thought - that's also true for the AWS version, right? Even though that's not in the instructions?

Reply: correct
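Since the thread above establishes that Docker must be installed before running these tools, a minimal illustrative check (not part of buildstockbatch) for the ``docker`` CLI might look like this:

```python
import shutil

# Look for the docker CLI on PATH; buildstock_gcp needs it to build
# and push the Docker image for the run.
if shutil.which("docker"):
    print("docker CLI found")
else:
    print("docker CLI not found; install Docker first")
```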
@@ -110,3 +110,55 @@ When the simulation and postprocessing is all complete, run ``buildstock_aws
--clean your_project_file.yml``. This will clean up all the AWS resources that
were created on your behalf to run the simulations. Your results will still be
on S3 and queryable in Athena.

Google Cloud Platform
~~~~~~~~~~~~~~~~~~~~~

Running a batch on GCP is done by calling the ``buildstock_gcp`` command line
tool.

.. command-output:: buildstock_gcp --help
   :ellipsis: 0,8

GCP Specific Project configuration
..................................

TODO: add more GCP configuration details here

For the project to run on GCP, you will need to add a section to your config
file, something like this:

.. code-block:: yaml

    gcp:
      job_identifier: national01
      gcp_project: myorg_project
      region: us-central1
      artifact_registry: buildstockbatch
      gcs:
        bucket: mybucket
        prefix: national01_run01
      use_spot: true
      batch_array_size: 10000
      notifications_email: your_email@somewhere.com

Comment (on the ``gcp`` block): To reflect the current state of the schema:

Reply: Those changes are in the PR I'm about to send! Sorry they're out of order - I pulled these changes out of that branch.

See :ref:`gcp-config` for details.
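To make the nesting of this section concrete, here is a small illustrative sketch (not buildstockbatch's actual code) of how the keys from the YAML above resolve, e.g. composing the GCS output location from ``bucket`` and ``prefix``:

```python
# Mirror of the example ``gcp`` config section as a plain dict.
cfg = {
    "gcp": {
        "job_identifier": "national01",
        "gcp_project": "myorg_project",
        "region": "us-central1",
        "artifact_registry": "buildstockbatch",
        "gcs": {"bucket": "mybucket", "prefix": "national01_run01"},
        "use_spot": True,
        "batch_array_size": 10000,
    }
}

gcp = cfg["gcp"]
# Results for this run would land under this GCS location.
output_uri = f"gs://{gcp['gcs']['bucket']}/{gcp['gcs']['prefix']}"
print(output_uri)  # → gs://mybucket/national01_run01
```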

List existing jobs
..................

Comment: This feature is coming soon in another PR.

Run ``buildstock_gcp --list_jobs your_project_file.yml`` to see a list of all existing
jobs matching the project specified. This can show you whether a previously-started job
has completed or is still running.

Cleaning up after yourself
..........................

TODO: Review and update this after implementing cleanup option.

When the simulation and postprocessing is all complete, run ``buildstock_gcp
--clean your_project_file.yml``. This will clean up all the GCP resources that
were created to run the specified project. If the project is still running, it
will be cancelled. Your output files will still be available in GCS.

Comment (on the cleanup paragraph): This is somewhat inaccurate (for lack of a better word) for the version that we're developing, since doing a run now also includes cleanup (i.e., you don't usually need to run cleanup separately). So, this should probably start with something more like: "Running a simulation automatically cleans up GCP resources when it completes. If you need to clean up manually (e.g., because the run did not complete) or cancel a run, run

Reply: I'm not sure that will stay true for all resources (for example, nothing is deleting the docker image you're uploading), but I'll add a TODO for now to make sure it's accurate when we actually implement this option.

Reply: Ah, also, to clarify, my code for the GCP Batch job just kicks off the job, then exits. So the current code won't actually work correctly, since cleanup needs to wait until the job actually finishes. And since these jobs may take hours to run, I think that's the right model - the user shouldn't have to leave a script running locally for that long.

Comment: FYI, on the client side (workstation where you run the script), there's no GCP stuff to "install." You do need to set up the GOOGLE_APPLICATION_CREDENTIALS key. There is setup to do on the GCP side (like creating a repository).

Reply: Yeah, I think the basic "you need a GCP project and the appropriate credentials" stuff can go here, similar to the AWS section above. (This reminds me that we'll also need to make sure to update setup.py to include any python packages we're using.) I'm hoping all the GCP-side setup can be done automatically, via Terraform.

Reply: Yea, earlier revisions of my work added 'google-api-python-client' and 'google-cloud-storage' to 'install_requires', but I ended up not needing those since interaction with GCP's Artifact Registry went through the Docker Client (and use of Docker standard APIs with AR).
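The credentials setup mentioned in this thread can be sketched as follows; the key path is a hypothetical placeholder, and Google client libraries discover the variable as Application Default Credentials:

```python
import os

# Hypothetical path -- substitute the location of your own
# service-account JSON key file.
key_path = os.path.expanduser("~/keys/my-gcp-project.json")
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = key_path

# Google client libraries pick this up automatically at runtime.
print("Credentials file:", os.environ["GOOGLE_APPLICATION_CREDENTIALS"])
```

In practice this variable is usually exported in your shell profile rather than set from Python, so it is visible to every tool in the run.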