Adding Azure databricks workspace service (#1857)
* Azure Databricks TRE workspace service

Co-authored-by: Guy Bertental <guybartal@gmail.com>
Co-authored-by: Tamir Kamara <26870601+tamirkamara@users.noreply.github.com>
Co-authored-by: Ross Smith <ross-p-smith@users.noreply.github.com>
Co-authored-by: Marcus Robinson <marrobi@microsoft.com>
5 people authored Jan 31, 2023
1 parent 7f59c6c commit 35b486b
Showing 29 changed files with 2,645 additions and 8 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -24,6 +24,7 @@
:warning: Any custom rules you have added manually will be **lost** and you'll need to add it back after the upgrade has been completed.
FEATURES:
* Add Azure Databricks as workspace service [#1857](https://github.com/microsoft/AzureTRE/pull/1857)
ENHANCEMENTS:
* Add support for referencing IP Groups from the Core Resource Group in firewall rules created via the pipeline [#3089](https://github.com/microsoft/AzureTRE/pull/3089)
1 change: 1 addition & 0 deletions core/terraform/locals.tf
@@ -30,5 +30,6 @@ locals {
"privatelink.postgres.database.azure.com",
"nexus-${var.tre_id}.${var.location}.cloudapp.azure.com",
"privatelink.mysql.database.azure.com",
"privatelink.azuredatabricks.net"
])
}
2 changes: 1 addition & 1 deletion core/terraform/network/locals.tf
@@ -29,6 +29,6 @@ locals {

private_dns_zone_names = toset([
"privatelink.queue.core.windows.net",
-"privatelink.table.core.windows.net",
+"privatelink.table.core.windows.net"
])
}
2 changes: 1 addition & 1 deletion core/version.txt
@@ -1 +1 @@
-__version__ = "0.7.1"
+__version__ = "0.7.2"
Binary file added docs/assets/databricks_workspace_service.png
17 changes: 17 additions & 0 deletions docs/tre-templates/workspace-services/databricks.md
@@ -0,0 +1,17 @@
# Azure Databricks workspace service bundle

See: [https://azure.microsoft.com/en-us/products/databricks/](https://azure.microsoft.com/en-us/products/databricks/)

This service installs the following resources into an existing virtual network within the workspace:

![Azure Databricks workspace service](../../assets/databricks_workspace_service.png)


## Properties

- `is_exposed_externally` - If `True`, the Azure Databricks workspace is accessible from outside the workspace virtual network. If `False`, use a Guacamole VM and the `internal_connection_uri` to access the Databricks workspace.
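
The behaviour described above can be sketched in Python. This is only an illustration, not part of the bundle: the function name `select_databricks_uri`, the `inside_workspace_vnet` flag and the example URLs are hypothetical, and the sketch assumes that the `connection_uri` output is the address used when the workspace is exposed externally.

```python
def select_databricks_uri(
    is_exposed_externally: bool,
    connection_uri: str,
    internal_connection_uri: str,
    inside_workspace_vnet: bool,
) -> str:
    """Pick the URI a client would use to reach the Databricks workspace.

    Hypothetical helper: the parameter names mirror the bundle's property
    and outputs, but the selection logic is an assumption for illustration.
    """
    if inside_workspace_vnet:
        # For example, a browser running on a Guacamole VM in the workspace.
        return internal_connection_uri
    if is_exposed_externally:
        # Assumed: the public address is only usable when exposure is enabled.
        return connection_uri
    raise ValueError(
        "Workspace is not exposed externally; connect from a VM inside "
        "the workspace virtual network using internal_connection_uri."
    )


if __name__ == "__main__":
    # Example values are placeholders, not real endpoints.
    print(select_databricks_uri(False, "https://external.example",
                                "https://internal.example", True))
```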


## Prerequisites

- [A base workspace bundle installed](../workspaces/base.md)
12 changes: 7 additions & 5 deletions mkdocs.yml
@@ -82,11 +82,11 @@ nav:
- Set up of a Virtual Machine: using-tre/tre-for-research/using-vms.md
- Importing/exporting data with Airlock: using-tre/tre-for-research/importing-exporting-data-airlock.md
- Reviewing Airlock Requests: using-tre/tre-for-research/review-airlock-request.md
# - Workspaces:
# - using-tre/wks/index.md # Documentation describing what a workspace is
# - Using Workspaces: using-tre/wks/using-wks.md # Interacting with workspaces (via the UI)
# - The Workspace Owner: using-tre/wks/wks-owner.md # Workspace Owners. The concept, and tasks
# - FAQ: using-tre/faq.md # FAQ section (to allow easy contribution)
# - Workspaces:
# - using-tre/wks/index.md # Documentation describing what a workspace is
# - Using Workspaces: using-tre/wks/using-wks.md # Interacting with workspaces (via the UI)
# - The Workspace Owner: using-tre/wks/wks-owner.md # Workspace Owners. The concept, and tasks
# - FAQ: using-tre/faq.md # FAQ section (to allow easy contribution)

- Templates and Services: # Docs to highlight and illustrate workspaces, workspace services etc
- Workspaces:
@@ -100,6 +100,7 @@ nav:
- InnerEye: tre-templates/workspace-services/inner-eye.md
- MLFlow: tre-templates/workspace-services/mlflow.md
- Health Services: tre-templates/workspace-services/health_services.md
- Azure Databricks: tre-templates/workspace-services/databricks.md
- Shared Services:
- Gitea (Source Mirror): tre-templates/shared-services/gitea.md
- Nexus (Package Mirror): tre-templates/shared-services/nexus.md
@@ -129,6 +130,7 @@ nav:
- Registering Templates: tre-admins/registering-templates.md
- Install Resources via API:
- Install Base Workspace: tre-admins/setup-instructions/installing-base-workspace.md
# yamllint disable-line rule:line-length
- Install Workspace Service and User Resource: tre-admins/setup-instructions/installing-workspace-service-and-user-resource.md
- Upgrading AzureTRE Version: tre-admins/upgrading-tre.md
- Upgrading Resources Version: tre-admins/upgrading-resources.md
8 changes: 8 additions & 0 deletions templates/workspace_services/databricks/.dockerignore
@@ -0,0 +1,8 @@
# Local .terraform directories
**/.terraform/*

# TF backend files
**/*_backend.tf
Dockerfile.tmpl
terraform/deploy.sh
terraform/destroy.sh
5 changes: 5 additions & 0 deletions templates/workspace_services/databricks/.env.sample
@@ -0,0 +1,5 @@
ID=__CHANGE_ME__
WORKSPACE_ID=__CHANGE_ME__
AZURE_LOCATION=__CHANGE_ME__
HOST_SUBNET_ADDRESS_PREFIX=__CHANGE_ME__
CONTAINER_SUBNET_ADDRESS_PREFIX=__CHANGE_ME__
18 changes: 18 additions & 0 deletions templates/workspace_services/databricks/Dockerfile.tmpl
@@ -0,0 +1,18 @@
# syntax=docker/dockerfile-upstream:1.4.0
FROM debian:bullseye-slim

# PORTER_INIT

RUN rm -f /etc/apt/apt.conf.d/docker-clean; echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' > /etc/apt/apt.conf.d/keep-cache

# Install git - required for https://registry.terraform.io/modules/claranet/regions/azurerm
RUN apt-get update && apt-get install --no-install-recommends -y git \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*

# PORTER_MIXINS

RUN apt-get remove -y git

# Use the BUNDLE_DIR build argument to copy files into the bundle
COPY --link . ${BUNDLE_DIR}/
47 changes: 47 additions & 0 deletions templates/workspace_services/databricks/README.md
@@ -0,0 +1,47 @@
# Contents

## porter.yaml

This is the porter manifest. See <https://porter.sh/author-bundles/> for
details on every field and how to configure your bundle. This is a required
file.

## helpers.sh

This is a bash script where you can place helper functions that you can call
from your porter.yaml file.

## README.md

This explains the files created by `porter create`. It is not used by porter and
can be deleted.

## Dockerfile.tmpl

This is a template Dockerfile for the bundle's invocation image. You can
customize it to use different base images, install tools and copy configuration
files. Porter will use it as a template and append lines to it for the mixin and to set
the CMD appropriately for the CNAB specification. You can delete this file if you don't
need it.

Add the following line to **porter.yaml** to enable the Dockerfile template:

```yaml
dockerfile: Dockerfile.tmpl
```

By default, the Dockerfile template is disabled and Porter automatically copies
all of the files in the current directory into the bundle's invocation image. When
you use a custom Dockerfile template, you must manually copy files into the bundle
using COPY statements in the Dockerfile template.

## .gitignore

This is a default file that we provide to help remind you which files are
generated by Porter, and shouldn't be committed to source control. You can
delete it if you don't need it.

## .dockerignore

This is a default file that controls which files are copied into the bundle's
invocation image by default. You can delete it if you don't need it.
56 changes: 56 additions & 0 deletions templates/workspace_services/databricks/parameters.json
@@ -0,0 +1,56 @@
{
  "schemaType": "ParameterSet",
  "schemaVersion": "1.0.1",
  "namespace": "",
  "name": "tre-service-databricks",
  "parameters": [
    {
      "name": "id",
      "source": {
        "env": "ID"
      }
    },
    {
      "name": "tre_id",
      "source": {
        "env": "TRE_ID"
      }
    },
    {
      "name": "workspace_id",
      "source": {
        "env": "WORKSPACE_ID"
      }
    },
    {
      "name": "address_space",
      "source": {
        "env": "ADDRESS_SPACE"
      }
    },
    {
      "name": "is_exposed_externally",
      "source": {
        "env": "IS_EXPOSED_EXTERNALLY"
      }
    },
    {
      "name": "tfstate_container_name",
      "source": {
        "env": "TERRAFORM_STATE_CONTAINER_NAME"
      }
    },
    {
      "name": "tfstate_resource_group_name",
      "source": {
        "env": "MGMT_RESOURCE_GROUP_NAME"
      }
    },
    {
      "name": "tfstate_storage_account_name",
      "source": {
        "env": "MGMT_STORAGE_ACCOUNT_NAME"
      }
    }
  ]
}
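
The parameter set above maps each bundle parameter to an environment variable. As a rough sketch of that mapping (not Porter's actual implementation), the Python below resolves a trimmed copy of the parameter set against a dictionary of environment values. The function name `resolve_parameters`, the choice to raise on missing variables, and the sample values are all assumptions made for illustration.

```python
import json

# A trimmed copy of the parameter set, kept short for the example.
EXAMPLE_PARAMETER_SET = json.loads("""
{
  "schemaType": "ParameterSet",
  "name": "tre-service-databricks",
  "parameters": [
    {"name": "id", "source": {"env": "ID"}},
    {"name": "workspace_id", "source": {"env": "WORKSPACE_ID"}},
    {"name": "is_exposed_externally", "source": {"env": "IS_EXPOSED_EXTERNALLY"}}
  ]
}
""")


def resolve_parameters(parameter_set: dict, env: dict) -> dict:
    """Map each parameter name to the value of its source environment variable.

    Hypothetical helper: raising on missing variables is a design choice of
    this sketch, not documented Porter behaviour.
    """
    resolved = {}
    missing = []
    for param in parameter_set["parameters"]:
        env_name = param["source"]["env"]
        if env_name in env:
            resolved[param["name"]] = env[env_name]
        else:
            missing.append(env_name)
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(sorted(missing)))
    return resolved


if __name__ == "__main__":
    # Placeholder values; pass dict(os.environ) to read the real environment.
    sample_env = {"ID": "abcd", "WORKSPACE_ID": "ws01", "IS_EXPOSED_EXTERNALLY": "false"}
    print(resolve_parameters(EXAMPLE_PARAMETER_SET, sample_env))
```

Passing the environment as a plain dictionary keeps the sketch easy to test without touching the real process environment.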
176 changes: 176 additions & 0 deletions templates/workspace_services/databricks/porter.yaml
@@ -0,0 +1,176 @@
---
schemaVersion: 1.0.0
name: tre-service-databricks
version: 0.1.71
description: "An Azure TRE service for Azure Databricks."
registry: azuretre
dockerfile: Dockerfile.tmpl

credentials:

  - name: azure_tenant_id
    env: ARM_TENANT_ID
  - name: azure_subscription_id
    env: ARM_SUBSCRIPTION_ID
  - name: azure_client_id
    env: ARM_CLIENT_ID
  - name: azure_client_secret
    env: ARM_CLIENT_SECRET

parameters:
  - name: workspace_id
    type: string
  - name: tre_id
    type: string
  - name: id
    type: string
    description: "Resource ID"
  - name: address_space
    type: string
  - name: is_exposed_externally
    type: boolean
  - name: tfstate_resource_group_name
    type: string
    description: "Resource group containing the Terraform state storage account"
  - name: tfstate_storage_account_name
    type: string
    description: "The name of the Terraform state storage account"
  - name: tfstate_container_name
    env: tfstate_container_name
    type: string
    default: "tfstate"
    description: "The name of the Terraform state storage container"
  - name: arm_use_msi
    env: ARM_USE_MSI
    type: boolean
    default: false

outputs:
  - name: databricks_workspace_name
    type: string
    applyTo:
      - install
      - upgrade
  - name: connection_uri
    type: string
    applyTo:
      - install
      - upgrade
  - name: internal_connection_uri
    type: string
    applyTo:
      - install
      - upgrade
  - name: databricks_storage_account_name
    type: string
    applyTo:
      - install
      - upgrade
  - name: dbfs_blob_storage_domain
    type: string
    applyTo:
      - install
      - upgrade
  - name: metastore_addresses
    type: string
    applyTo:
      - install
      - upgrade
  - name: event_hub_endpoint_addresses
    type: string
    applyTo:
      - install
      - upgrade
  - name: log_blob_storage_domains
    type: string
    applyTo:
      - install
      - upgrade
  - name: artifact_blob_storage_domains
    type: string
    applyTo:
      - install
      - upgrade
  - name: workspace_address_spaces
    type: string
    applyTo:
      - install
      - upgrade
  - name: databricks_address_prefixes
    type: string
    applyTo:
      - install
      - upgrade

mixins:
  - terraform:
      clientVersion: 1.3.6

install:
  - terraform:
      description: "Deploy Databricks Service"
      vars:
        tre_resource_id: ${ bundle.parameters.id }
        tre_id: ${ bundle.parameters.tre_id }
        workspace_id: ${ bundle.parameters.workspace_id }
        address_space: ${ bundle.parameters.address_space }
        is_exposed_externally: ${ bundle.parameters.is_exposed_externally }
      backendConfig:
        resource_group_name: ${ bundle.parameters.tfstate_resource_group_name }
        storage_account_name: ${ bundle.parameters.tfstate_storage_account_name }
        container_name: ${ bundle.parameters.tfstate_container_name }
        key: ${ bundle.name }-${ bundle.parameters.id }
      outputs:
        - name: databricks_workspace_name
        - name: connection_uri
        - name: internal_connection_uri
        - name: databricks_storage_account_name
        - name: dbfs_blob_storage_domain
        - name: metastore_addresses
        - name: event_hub_endpoint_addresses
        - name: log_blob_storage_domains
        - name: artifact_blob_storage_domains
        - name: workspace_address_spaces
        - name: databricks_address_prefixes

upgrade:
  - terraform:
      description: "Upgrade Databricks Service"
      vars:
        tre_resource_id: ${ bundle.parameters.id }
        tre_id: ${ bundle.parameters.tre_id }
        workspace_id: ${ bundle.parameters.workspace_id }
        address_space: ${ bundle.parameters.address_space }
        is_exposed_externally: ${ bundle.parameters.is_exposed_externally }
      backendConfig:
        resource_group_name: ${ bundle.parameters.tfstate_resource_group_name }
        storage_account_name: ${ bundle.parameters.tfstate_storage_account_name }
        container_name: ${ bundle.parameters.tfstate_container_name }
        key: ${ bundle.name }-${ bundle.parameters.id }
      outputs:
        - name: databricks_workspace_name
        - name: connection_uri
        - name: internal_connection_uri
        - name: databricks_storage_account_name
        - name: dbfs_blob_storage_domain
        - name: metastore_addresses
        - name: event_hub_endpoint_addresses
        - name: log_blob_storage_domains
        - name: artifact_blob_storage_domains
        - name: workspace_address_spaces
        - name: databricks_address_prefixes

uninstall:
  - terraform:
      description: "Uninstall Azure Databricks Service"
      vars:
        tre_resource_id: ${ bundle.parameters.id }
        tre_id: ${ bundle.parameters.tre_id }
        workspace_id: ${ bundle.parameters.workspace_id }
        address_space: ${ bundle.parameters.address_space }
        is_exposed_externally: ${ bundle.parameters.is_exposed_externally }
      backendConfig:
        resource_group_name: ${ bundle.parameters.tfstate_resource_group_name }
        storage_account_name: ${ bundle.parameters.tfstate_storage_account_name }
        container_name: ${ bundle.parameters.tfstate_container_name }
        key: ${ bundle.name }-${ bundle.parameters.id }
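
Each `backendConfig` above sets `key` to the bundle name joined to the resource ID with a hyphen, which presumably gives every service instance its own Terraform state blob. The Python sketch below reproduces that composition under this assumption; the function name `terraform_state_key` and the example resource IDs are hypothetical.

```python
def terraform_state_key(bundle_name: str, resource_id: str) -> str:
    """Compose the state key the way the manifest template does.

    Mirrors `${ bundle.name }-${ bundle.parameters.id }`; this is an
    illustration, not code taken from Porter or the Terraform mixin.
    """
    if not bundle_name or not resource_id:
        raise ValueError("bundle_name and resource_id must be non-empty")
    return f"{bundle_name}-{resource_id}"


if __name__ == "__main__":
    # Two hypothetical service instances end up with distinct state keys.
    for rid in ("0a1b", "2c3d"):
        print(terraform_state_key("tre-service-databricks", rid))
```

Because install, upgrade and uninstall all use the same template, the three actions would address the same state for a given resource ID.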