#157 finalize v2 spark infrastructure
Default all new applications to use the V2 spark-infrastructure charts.
Mark any V1 users with a deprecation warning. Provide migration
instructions for going from V1 -> V2.

Update spark-events to be saved to a more dynamic PVC instead of an S3
bucket.

Signed-off-by: Peter McClonski <mcclonski_peter@bah.com>

#300 Resolve several issues with the V2 charts

Correct the Spark-event PV host node mount point. Update the spark
configmap name to be in line with our documentation (-conf -> -config).

Correct a bug where the Configuration-store PVC was writable instead of
read-only.
peter-mcclonski authored and cwoods-cpointe committed Sep 6, 2024
1 parent 18036ab commit d80975f
Showing 36 changed files with 684 additions and 245 deletions.
8 changes: 5 additions & 3 deletions DRAFT_RELEASE_NOTES.md
@@ -7,6 +7,8 @@ Created a helm chart with the necessary infrastructure for deploying your aiSSEM
_[A short bulleted list of changes that will cause downstream projects to be partially or wholly inoperable without changes. Instructions for those changes should live in the How To Upgrade section]_
Note: instructions for adapting to these changes are outlined in the upgrade instructions below.

- Projects MUST upgrade to the new v2 spark-infrastructure chart in order to retain functionality for data-delivery pipelines.

# Known Issues
There are no known issues with the 1.9.0 release.

@@ -37,6 +39,7 @@ To reduce burden of upgrading aiSSEMBLE, the Baton project is used to automate t
| upgrade-v2-chart-files-aissemble-version-migration | Updates the helm chart dependencies within your project's deployment resources (<YOUR_PROJECT>-deploy/src/main/resources/apps/) to use the latest version of aiSSEMBLE |
| upgrade-v1-chart-files-aissemble-version-migration | Updates the docker image tags within your project's deployment resources (<YOUR_PROJECT>-deploy/src/main/resources/apps/) to use the latest version of aiSSEMBLE |
| ml-flow-dockerfile-migration | Updates the MLFlow Dockerfile to use the bitnami/mlflow image as a base instead of the deprecated boozallen/aissemble-mlflow image |
| update-data-access-thrift-endpoint-migration | For projects using the default data-access thrift endpoint, updates to the new endpoint associated with v2 spark-infrastructure |

To deactivate any of these migrations, add the following configuration to the `baton-maven-plugin` within your root `pom.xml`:

@@ -74,13 +77,12 @@ To start your aiSSEMBLE upgrade, update your project's pom.xml to use the 1.9.0

## Conditional Steps


## Final Steps - Required for All Projects
### Finalizing the Upgrade
1. Run `./mvnw org.technologybrewery.baton:baton-maven-plugin:baton-migrate` to apply the automatic migrations
-1. Run `./mvnw clean install` and resolve any manual actions that are suggested
+2. Run `./mvnw clean install` and resolve any manual actions that are suggested
   - **NOTE:** This will update any aiSSEMBLE dependencies in 'pyproject.toml' files automatically
-1. Repeat the previous step until all manual actions are resolved
+3. Repeat the previous step until all manual actions are resolved

# What's Changed
- MLFlow Chart version upgraded from `0.2.1` to `1.4.22`.
21 changes: 11 additions & 10 deletions docs/modules/ROOT/pages/containers.adoc
@@ -189,16 +189,17 @@ TIP: If your project is under version control, we recommend using a diff tool to
==== Available V2 Profiles (Kubernetes)
[width="100%",options="header"]
|======
|Application | Fermenter Profile | ReadMe
|Airflow | `airflow-deploy-v2` |https://github.com/boozallen/aissemble/blob/{git-tree}/extensions/extensions-helm/aissemble-airflow-chart/README.md[Airflow,role=external,window=_blank]
|Data Access | `data-access-deploy-v2` |https://github.com/boozallen/aissemble/blob/{git-tree}/extensions/extensions-helm/aissemble-data-access-chart/README.md[Data Access,role=external,window=_blank]
|Kafka | `kafka-deploy-v2` |https://github.com/boozallen/aissemble/blob/{git-tree}/extensions/extensions-helm/aissemble-kafka-chart/README.md[Kafka,role=external,window=_blank]
|Metadata | `metadata-deploy-v2` |https://github.com/boozallen/aissemble/blob/{git-tree}/extensions/extensions-helm/aissemble-metadata-chart/README.md[Metadata,role=external,window=_blank]
|MLflow | `mlflow-deploy-v2` |https://github.com/boozallen/aissemble/blob/{git-tree}/extensions/extensions-helm/aissemble-mlflow-chart/README.md[MLFlow,role=external,window=_blank]
|Policy Decision Point | `policy-decision-point-deploy-v2` |https://github.com/boozallen/aissemble/blob/{git-tree}/extensions/extensions-helm/aissemble-policy-decision-point-chart/README.md[Policy Decision Point,role=external,window=_blank]
|S3 Local | `s3local-deploy-v2` |https://github.com/boozallen/aissemble/blob/{git-tree}/extensions/extensions-helm/aissemble-localstack-chart/README.md[S3 Local,role=external,window=_blank]
|Configuration Store | `configuration-store-deploy-v2` |https://github.com/boozallen/aissemble/blob/{git-tree}/extensions/extensions-helm/aissemble-configuration-store-chart/README.md[Configuration Store,role=external,window=_blank]
|Spark Operator | `aissemble-spark-operator-deploy-v2` |https://github.com/boozallen/aissemble/blob/{git-tree}/extensions/extensions-helm/aissemble-spark-operator-chart/README.md[Spark Operator,role=external,window=_blank]
|Spark Infrastructure | `aissemble-spark-infrastructure-deploy-v2` |https://github.com/boozallen/aissemble/blob/{git-tree}/extensions/extensions-helm/extensions-helm-spark-infrastructure/README.md[Spark Infrastructure,role=external,window=_blank]
|======

//todo drop example file replace with link to dockerfile build docs
7 changes: 2 additions & 5 deletions docs/modules/ROOT/pages/guides/guides-spark-job.adoc
@@ -58,10 +58,7 @@ through their own mechanisms, most frequently Helm charts.
|Component of internal Spark Operator functionality.
|`spark-infrastructure`
-|Hosts Spark History and Hive Thrift Server.
-|`hive-metastore-service`
-|Standalone Hive metastore deployment shared by all Spark pipelines and executions.
+|Hosts Spark History, Hive Thrift Server, and Hive metastore Deployment shared by all Spark pipelines and executions.
|`s3-local`
|Default shared storage solution from which Spark executors can read and write.
@@ -137,4 +134,4 @@ To retrieve execution logs, we leverage a final `kubectl` command:
[source]
----
kubectl logs <pipeline-name>-driver
----