SDSC: PKG - expanse/0.17.3/cpu/b - Missing Spark (example application) #42
Using this issue as a first example workflow for deploying a new package into a production instance on Expanse. In this case, we will first deploy and test into a shared instance in my HOME directory on Expanse. To set up the shared instance configuration, you first run the following configuration script ...
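The exact contents of that script aren't reproduced here; the general usage is along these lines (the script name and location are placeholders for the actual shared-instance setup script):

```
# Run the shared-instance configuration script (name/path are placeholders)
./shared-instance-config.sh
```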
Once the configuration script has been run, start a new shell session. You should then have access to the production Spack instance on Expanse, but it is now scoped to perform any installs into your HOME directory.
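A quick way to sanity-check the scoping (the exact configuration keys depend on the shared-instance setup):

```
# The spack command should resolve to the production instance ...
which spack

# ... while the merged configuration should point install_tree at $HOME
spack config get config | grep -A 2 install_tree
```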
Before installing Spark via Spack, let's check if the version we want is available.
The latest version at the time of this writing is 3.4.0: https://archive.apache.org/dist/spark/spark-3.4.0. As such, let's also plan to upgrade our spark Spack package to include the latest downloadable tarball.
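For example, to see which versions the current spark recipe already knows about:

```
# List known (and, where possible, remote) versions of the spark package
spack versions spark

# Show the package summary, including its preferred version
spack info spark
```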
In this case, we'll likely want to install spark into the shared instance in my HOME directory first ...
To upgrade the spark version, we'll need to modify the Spack package recipe for spark. Before submitting that change upstream, we can test it locally in my personal package repo, which is set up as part of the shared Spack instance configuration. Since the default package is currently coming from Spack's builtin package repository, we'll first copy it into the personal repo ...
Download the new release tarball and compute its checksum ...
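Spack can do both in one step; `spack checksum` fetches the tarball, computes the checksum, and prints the corresponding version() directive:

```
# Fetch the 3.4.0 tarball and compute its checksum
spack checksum spark 3.4.0
```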
And then add the new version to the spark package recipe in my personal repo ...
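For example, opening the recipe and pasting in the new directive reported above (the checksum value below is a placeholder for whatever `spack checksum` prints):

```
# Open the spark package.py from the highest-precedence repo in $EDITOR
spack edit spark

# ... and add the new version line reported by `spack checksum`, e.g.
#   version('3.4.0', sha256='<sha256-from-spack-checksum>')
```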
And now let's see if the new spark package from my personal package repo has precedence ...
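Two quick checks, assuming the personal repo was registered via `spack repo add`:

```
# The personal repo should be listed ahead of the builtin repo
spack repo list

# And the spark recipe should now resolve out of the personal repo
spack location --package-dir spark
```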
Everything looks good. Now we need to construct a spec build script and submit it to Slurm ...
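Submission itself is just a normal batch job (the Slurm account/partition details live inside the script):

```
# Submit the spec build script to Slurm and watch the queue
sbatch spark@3.4.0.sh
squeue -u $USER
```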
The following is the spark@3.4.0.sh spec build script submitted to Slurm on Expanse above to install Spark within my local Spack instance in my HOME directory, prior to submitting the script to the sdsc/spack repo as a pull request.
Please note, the only custom changes needed to use the shared Spack instance configuration set up in my HOME directory can be easily removed prior to production deployment, and even if forgotten prior to deployment, they should have no effect.
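The full script isn't reproduced here, but its general shape is sketched below; the Slurm directives, compiler, and spec are placeholders rather than the actual production values:

```
#!/usr/bin/env bash
#SBATCH --job-name=spark@3.4.0
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=16
#SBATCH --time=04:00:00
#SBATCH --output=%x.o%j.%N

# Package spec to build; compiler and variants are placeholders
declare -r SPACK_PACKAGE='spark@3.4.0'
declare -r SPACK_COMPILER='gcc@10.2.0'
declare -r SPACK_SPEC="${SPACK_PACKAGE} % ${SPACK_COMPILER}"

# Concretize first for the record, then build
spack spec --long --namespaces --types "${SPACK_SPEC}"
spack install --jobs "${SLURM_NTASKS_PER_NODE}" "${SPACK_SPEC}"
```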
Getting the module-based access working ...
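Roughly, this amounts to regenerating the Lmod module files for the local install_tree and pointing the module system at them (the module root path below is a placeholder):

```
# Regenerate Lmod module files for everything in the local install_tree
spack module lmod refresh --delete-tree -y

# Add the local module tree to the module search path (placeholder path)
module use "${HOME}/spack/share/spack/lmod/linux-centos8-x86_64/Core"
module avail spark
```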
Spark is now available from my local Spack install_tree ...
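For example (the exact module name depends on the projection used):

```
# Load the new module and confirm the Spark binaries are usable
module load spark/3.4.0
spark-submit --version
```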
Redo with
Let's do this one more time ... forgot
Picking back up where I left off last month ...
... the next step would be to run tests against the locally deployed application before submitting it upstream as a pull request. Unfortunately, here in the case of Spark, the application testing is more complicated and is not working as intended at this time. However, I will continue on and complete the deployment as an initial example for documenting the process, which will then be written up in the sdsc/spack project's documentation.
Before updating your personal fork of the sdsc/spack repository with the spec build script, you'll first need to re-sync it with sdsc/spack as its upstream. Using a command-line approach on Expanse, this workflow is as follows ...
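A sketch of the re-sync, assuming the fork was already cloned and the deployment branch is sdsc-0.17.3 (the upstream URL is inferred from the repo name):

```
cd ~/spack   # placeholder path to your clone of the fork

# One-time setup: add sdsc/spack as the upstream remote
git remote add upstream https://github.com/sdsc/spack.git

# Fast-forward the deployment branch to match upstream and update the fork
git fetch upstream
git checkout sdsc-0.17.3
git merge --ff-only upstream/sdsc-0.17.3
git push origin sdsc-0.17.3
```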
In the future, we will also provide a workflow using the GitHub web interface.
Next, copy both the new spec build script and test build output files to the correct location in your fork of the sdsc/spack repository ...
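For example (source paths are placeholders; the specs subdirectory is assumed from the SPECS.md location referenced below):

```
cd ~/spack   # placeholder path to your clone of the fork

# Copy in the spec build script and its test build output (placeholder names)
cp ~/builds/spark@3.4.0.sh       etc/spack/sdsc/expanse/0.17.3/cpu/b/specs/
cp ~/builds/spark@3.4.0.o1234567 etc/spack/sdsc/expanse/0.17.3/cpu/b/specs/
```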
Once the new spec build script and the test build output file are placed in the correct location, add the build script to the end of the existing package dependency chain for the production deployment.
See https://github.com/mkandes/spack/blob/sdsc-0.17.3/etc/spack/sdsc/expanse/0.17.3/cpu/b/SPECS.md OR ... check the deployed instance itself for the last known set of deployed build scripts to find the appropriate location to link the new build script into the dependency chain ...
We are working on additional tooling to make this discovery process simpler in the future. For now, we aim to make a best effort to keep things organized. For this example, we'll use the SPECS.md documentation on the dependency chains.
When ready, commit the changes to your fork. Please note, however, that in this case this is a direct commit on the deployment branch.
Push the changes back to your fork.
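For example, committing and pushing in one go (the commit message is illustrative only):

```
# Stage and commit the new files directly on the deployment branch
git add etc/spack/sdsc/expanse/0.17.3/cpu/b/specs/spark@3.4.0.*
git commit -m "Add spark@3.4.0 spec build script for expanse/0.17.3/cpu/b"

# Push the commit back to the fork
git push origin sdsc-0.17.3
```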
Create the pull request, review it with other team members, and merge it into the sdsc/spack deployment branch prior to production deployment.
When ready to deploy the package into production, log in to the role account, start an interactive session on the reserved node, pull down the changes from the sdsc/spack repo, and deploy ...
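Roughly, under the role account (the partition, reservation, account, and paths are placeholders):

```
# Request an interactive session on the reserved build node
srun --partition=<partition> --reservation=<reservation> --account=<account> \
     --nodes=1 --ntasks-per-node=16 --time=04:00:00 --pty --wait=0 /bin/bash

# Pull the merged changes into the production clone and kick off the build
cd /path/to/production/spack   # placeholder path to the production instance
git pull origin sdsc-0.17.3
cd etc/spack/sdsc/expanse/0.17.3/cpu/b/specs
./spark@3.4.0.sh   # or sbatch it, depending on how builds are normally run
```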
But don't forget any final changes that may be needed prior to running the build ...
Doh. Or that you also changed the Spack package for the application itself ...
To fix this issue, we'll need to walk through the pull request workflow again by placing the updated Spark package into SDSC's custom Spack package repo ...
Re-sync fork.
Copy custom Spack package to SDSC's Spack package repo in your fork.
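For example (both paths are placeholders; `spack repo list` on the instance will show where SDSC's custom package repo actually lives inside the fork):

```
cd ~/spack   # placeholder path to your clone of the fork

# Copy the updated recipe from the personal test repo into SDSC's repo
mkdir -p var/spack/repos/sdsc/packages/spark
cp ~/spack-packages/personal/packages/spark/package.py \
   var/spack/repos/sdsc/packages/spark/package.py
```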
Create the pull request, review it, then merge it upstream into the sdsc/spack repo.
Deploy custom Spack package to production instance from sdsc/spack repo ...
Try spack build again ...
Check that build succeeded.
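For example:

```
# The new package should now show up in the production instance
spack find --long --variants spark
```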
Note, however, that the module is not immediately visible for some reason ...
It was discovered that both login nodes currently have a system spider cache that is preventing an updated view of the available modules ...
However, this can be ignored with Lmod's --ignore_cache option ...
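For example:

```
# Bypass the stale spider cache when checking for the new module
module --ignore_cache avail spark
module --ignore_cache spider spark
```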
To finalize this production deployment of Spark, we'll commit the final changes to the spec build script back to the sdsc/spack repo, removing the test build output and replacing it with the final build output.
Deployment complete (bd28762). In general, what should follow next, if time permits and tests exist, is testing the newly deployed package in production before closing out the issue.