This is a deployment accelerator based on the reference architecture described in the Azure Architecture Center article Analytics end-to-end with Azure Synapse. It automates not only the deployment of the services covered by the reference architecture, but also the configuration and permissions those services need to work together. The deployed architecture provides an end-to-end analytics platform capable of handling the most common use cases for most organizations.
This deployment accelerator is implemented in Azure Bicep, a domain-specific language (DSL) that uses declarative syntax to deploy Azure resources.
Before you hit the deploy button, make sure you review the details about the services deployed.
Note: The "Deploy to Azure" button above will redirect you to the Azure Portal with a reference to the ARM template file generated by the build of the Bicep code. Refer to the Bicep files for the true source of the code for this accelerator.
You can also use the Azure CLI to deploy the services:
For a full deployment of all workloads with public endpoints, use the command below:
az deployment group create --resource-group resource-group-name --template-file ./AzureAnalyticsE2E.bicep --parameters synapseSqlAdminPassword=use-complex-password-here
For a full deployment of all workloads with vNet-integrated endpoints, use the command below:
az deployment group create --resource-group resource-group-name --template-file ./AzureAnalyticsE2E.bicep --parameters networkIsolationMode=vNet synapseSqlAdminPassword=use-complex-password-here
You can have more control over the deployment by providing values for the optional template parameters in the form:
az deployment group create --resource-group resource-group-name --template-file ./AzureAnalyticsE2E.bicep --parameters synapseSqlAdminPassword=use-complex-password-here param1=value1 param2=value2...
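Before running any of the commands above against a real resource group, it can help to preview what the template would create. The following is a minimal sketch, assuming the Azure CLI is installed and signed in. `preview_deployment` is a hypothetical helper name (not part of the accelerator), and its three arguments are placeholders you supply. It relies on `az deployment group what-if`, which reports predicted resource changes without deploying anything.

```shell
# Hypothetical helper: create the resource group and preview the deployment.
# Arguments: <resource-group> <location> <synapse-sql-admin-password>
preview_deployment() {
  rg="$1"
  location="$2"
  password="$3"

  # Create the resource group; this is a no-op if it already exists in that location.
  az group create --name "$rg" --location "$location" || return 1

  # what-if lists the predicted resource changes; nothing is deployed.
  az deployment group what-if \
    --resource-group "$rg" \
    --template-file ./AzureAnalyticsE2E.bicep \
    --parameters synapseSqlAdminPassword="$password"
}
```

For example, `preview_deployment resource-group-name westeurope use-complex-password-here` previews the public-endpoint deployment. The preview covers the template's resources only, so the post-deployment scripts are not exercised by it.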
Important: This deployment accelerator is meant to run without interference from Azure Policies that deny certain configurations, as they might prevent its successful completion. Use a sandbox environment if you need to validate the resulting configuration before you run the deployment against other environments governed by Azure Policies.
Important: This deployment accelerator implements some service features that are still in Public Preview. Please consider those before you plan for a production deployment.
The target subscription for the deployment accelerator must have the following resource providers registered before you run the deployment:
- Microsoft.Synapse
- Microsoft.Purview
- Microsoft.MachineLearningServices
- Microsoft.ContainerRegistry
- Microsoft.Network
- Microsoft.DataShare
- Microsoft.Authorization
- Microsoft.CognitiveServices
- Microsoft.ManagedIdentity
- Microsoft.KeyVault
- Microsoft.Storage
- Microsoft.StreamAnalytics
- Microsoft.Devices
- Microsoft.Insights
- Microsoft.EventHub
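The providers listed above can be checked and registered from the Azure CLI. This is a sketch under the assumption that you are signed in to the target subscription and have permission to register providers. `register_providers` and `PROVIDERS` are hypothetical names introduced here for illustration.

```shell
# Resource providers required by the accelerator (copied from the list above).
PROVIDERS="Microsoft.Synapse Microsoft.Purview Microsoft.MachineLearningServices
Microsoft.ContainerRegistry Microsoft.Network Microsoft.DataShare
Microsoft.Authorization Microsoft.CognitiveServices Microsoft.ManagedIdentity
Microsoft.KeyVault Microsoft.Storage Microsoft.StreamAnalytics
Microsoft.Devices Microsoft.Insights Microsoft.EventHub"

# Hypothetical helper: register each provider that is not yet in the "Registered" state.
register_providers() {
  for ns in $PROVIDERS; do
    # Query the current registration state of the provider namespace.
    state=$(az provider show --namespace "$ns" --query registrationState --output tsv)
    if [ "$state" = "Registered" ]; then
      echo "$ns: already registered"
    else
      echo "$ns: registering (current state: $state)"
      az provider register --namespace "$ns"
    fi
  done
}
```

Registration is asynchronous, so running `register_providers` a second time after a few minutes is a simple way to confirm that every provider reports as registered.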
The deployment accelerator can be deployed in two network isolation modes: default or vNet.
Network Isolation Mode | Description |
---|---|
default | Deploys the selected components to Azure using public endpoints. |
vNet | Deploys the selected components to Azure and the additional services to support private connectivity and restricted inter-service connectivity where possible. This includes provisioning and configuration of virtual networks, managed virtual network deployments for Azure Synapse Analytics, the private endpoints for all services that support Private Link and the supporting Private DNS Zones. |
The scope of this deployment accelerator is illustrated in the diagram below.
Important: All services are deployed in a single resource group and in the same region as the resource group. Before creating the resource group that will host the workloads, check the Azure Products by Region and select a region that has all selected services available. The deployment will fail if any of the services is not available in the chosen region.
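Regional availability can also be checked from the Azure CLI before the resource group is created. The sketch below assumes the Azure CLI is signed in. `is_available_in_region` is a hypothetical helper, and the resource type names you pass to it (such as `workspaces` for `Microsoft.Synapse`) are your own input. Note that provider metadata lists regions by display name, for example "West Europe" rather than "westeurope".

```shell
# Hypothetical helper: succeed if <namespace>/<resource-type> is offered in a region.
# Arguments: <provider-namespace> <resource-type> <region display name>
is_available_in_region() {
  namespace="$1"
  resource_type="$2"
  region="$3"

  # List the locations of the resource type, one per line, and look for an exact match.
  az provider show --namespace "$namespace" \
    --query "resourceTypes[?resourceType=='$resource_type'].locations[]" \
    --output tsv | tr '\t' '\n' | grep -Fxq "$region"
}
```

A call such as `is_available_in_region Microsoft.Synapse workspaces "West Europe" && echo available` covers one service. Repeating it for each workload you plan to deploy gives a quick pre-flight check, though the Azure Products by Region page remains the authoritative source.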
Important: For a fully automated deployment and configuration of Synapse Analytics and Purview, the deployment accelerator uses post-deployment PowerShell scripts to perform data plane operations. These scripts complete the final environment configuration, because not every setting is available through Bicep. Because the scripts execute imperative actions, the template is no longer idempotent and should only be used for initial deployment and configuration. For more details about the scripts, see the deployment accelerator documentation.
By default, every service is provisioned at the lowest pricing tier that meets the initial deployment requirements. If you provide different values for the input parameters, review the pricing information for each service in the tables below.
If explicit names are not provided, every service name is appended with a unique 5-letter suffix to ensure name uniqueness in Azure.
The Azure services used in the architecture above have been divided into workloads (see workload tables below) that can be conditionally deployed based on input parameters. The only mandatory workload is Synapse Analytics represented in the grey box in the diagram above.
Name | Type | Default Pricing Tier | Conditional | Notes |
---|---|---|---|---|
az-resource group name-uami | Managed Identity | N/A | No | Required to run post-deployment scripts. It is deleted by the clean-up post-deployment script. |
azkeyvaultsuffix | Key vault | Standard A | No |
Name | Type | Default Pricing Tier | Conditional | Notes |
---|---|---|---|---|
azsynapsewkssuffix | Synapse workspace | N/A | No | Default workspace deployment doesn't incur costs. |
SparkCluster | Apache Spark pool | Small (3 nodes) | Yes | |
EnterpriseDW | Synapse SQL pool | DW100 | Yes | |
adxpoolsuffix | Data Explorer pool | Extra Small (2 nodes) | Yes | |
azwksdatalakesuffix | Storage account | Standard LRS | No | |
azrawdatalakesuffix | Storage account | Standard GRS | No | |
azcurateddatalakesuffix | Storage account | Standard GRS | No | |
SynapsePostDeploymentScript | Deployment Script | N/A | No | Deployment script resources will be automatically deleted after 24 hours. |
Name | Type | Default Pricing Tier | Conditional | Notes |
---|---|---|---|---|
azpurviewsuffix | Purview account | 1 Capacity Unit | Yes | |
PurviewPostDeploymentScript | Deployment Script | N/A | Yes | Deployment script resources will be automatically deleted after 24 hours. |
Name | Type | Default Pricing Tier | Conditional | Notes |
---|---|---|---|---|
azanomalydetectorsuffix | Anomaly detector | Standard | Yes | |
aztextanalyticssuffix | Language | Standard | Yes | |
azmlwkssuffix | Machine learning workspace | N/A | Yes | Default workspace deployment doesn't incur costs. |
azmlstoragesuffix | Storage account | Standard LRS | Yes | |
azmlcontainerregsuffix | Container registry | Basic or Premium (see notes) | Yes | Premium service tier required for private link support |
azmlappinsightssuffix | Application Insights | On-demand data ingestion charges | Yes |
Name | Type | Default Pricing Tier | Conditional | Notes |
---|---|---|---|---|
azdatasharesuffix | Data Share | On-demand data processing charges | Yes |
Name | Type | Default Pricing Tier | Conditional | Notes |
---|---|---|---|---|
azeventhubnssuffix | Event Hub namespace | Basic | Yes | |
aziothubsuffix | IoT Hub | Free | Yes | |
azstreamjobsuffix | Stream Analytics job | Standard | Yes |
Beyond deploying the services that make up the reference architecture, this template also automates the configuration of connections and permissions between the services so that they work together properly. Every arrow in the diagram above represents a configuration step that has been automated for you, shortening the time it takes to get to insights.
Each connection and permission listed below has been implemented following the technical documentation for the services involved. Check the reference documentation links for more information about them.
These are the service connections explicitly defined in the deployment accelerator template. These connections represent the necessary configuration for the services to be fully integrated and work well together. Note that these connections may result in implicit RBAC permissions set between resources participating in the connection that are not in the permission list below. Check the reference documentation of each service connection for more information.
Beyond the service connections created above, the deployment accelerator template defines Azure RBAC permissions between the services. These are the minimum permissions granted to each service's system-assigned managed identity (MSI) for the integration to function properly. The table lists the Azure RBAC permissions explicitly set by the template; the reason for each permission is described in its reference documentation.
Granted To Service | Granted On Service | Permission Level | Reference Documentation |
---|---|---|---|
azsynapsewkssuffix | azwksdatalakesuffix | Storage Blob Data Contributor | Grant permissions to workspace managed identity |
azpurviewsuffix | azsynapsewkssuffix | Reader | Connect to and manage Azure Synapse Analytics workspaces in Azure Purview |
azsynapsewkssuffix | azrawdatalakesuffix, azcurateddatalakesuffix | Storage Blob Data Contributor | Grant permissions to workspace managed identity |
azsynapsewkssuffix | azmlwkssuffix | Contributor | Create a new Azure Machine Learning linked service in Synapse |
azpurviewsuffix | azrawdatalakesuffix, azcurateddatalakesuffix | Storage Blob Data Reader | Connect to Azure Data Lake Gen2 in Azure Purview |
azdatasharesuffix | azrawdatalakesuffix, azcurateddatalakesuffix | Storage Blob Data Reader | Roles and requirements for Azure Data Share |
azmlwkssuffix | azrawdatalakesuffix, azcurateddatalakesuffix | Storage Blob Data Reader | Connect to storage by using identity-based data access |
azstreamjobsuffix | azrawdatalakesuffix, azcurateddatalakesuffix | Storage Blob Data Contributor | Use Managed Identity to authenticate your Azure Stream Analytics job to Azure Blob Storage |
aziothubsuffix | azrawdatalakesuffix, azcurateddatalakesuffix | Storage Blob Data Contributor | |
azstreamjobsuffix | azeventhubnssuffix | Event Hub Data Owner | Use managed identities to access Event Hub from an Azure Stream Analytics job |
azstreamjobsuffix | aziothubsuffix | IoT Hub Data Receiver | Control access to IoT Hub by using Azure Active Directory |
azpurviewsuffix | Resource Group | Storage Blob Data Reader | Connect to and manage Azure Synapse Analytics workspaces in Azure Purview |
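
After a deployment, the Azure RBAC assignments in the table above can be spot-checked from the Azure CLI. The sketch below does this for the Synapse workspace identity only, and assumes the Azure CLI is signed in with rights to read role assignments. `list_workspace_role_assignments` is a hypothetical helper name, and the workspace name you pass must include the suffix generated for your deployment.

```shell
# Hypothetical helper: list the Azure RBAC role assignments held by a
# Synapse workspace's system-assigned managed identity.
# Arguments: <resource-group> <synapse-workspace-name>
list_workspace_role_assignments() {
  rg="$1"
  workspace="$2"

  # Look up the object ID of the workspace's managed identity.
  principal_id=$(az synapse workspace show \
    --resource-group "$rg" --name "$workspace" \
    --query identity.principalId --output tsv) || return 1

  # Show each role name together with the scope it was granted on.
  az role assignment list --assignee "$principal_id" --all \
    --query "[].{role:roleDefinitionName, scope:scope}" --output table
}
```

The output should include Storage Blob Data Contributor on the data lake storage accounts if the deployment completed as described. The same pattern applies to the other services once you have their identity's principal ID.
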
Granted To Service | Granted On Service | Permission Level | Reference Documentation |
---|---|---|---|
azsynapsewkssuffix | azkeyvaultsuffix | Get and List Secrets | Use Azure Key Vault secrets in pipeline activities |
azpurviewsuffix | azkeyvaultsuffix | Get and List Secrets | Credentials for source authentication in Azure Purview |
azmlwkssuffix | azsynapsewkssuffix | Synapse Apache Spark Administrator | Link Azure Synapse Analytics and Azure Machine Learning workspaces and attach Apache Spark pools |
azsynapsewkssuffix | azpurviewsuffix | Data Curator | Connect a Synapse workspace to an Azure Purview account |
azdatasharesuffix | azpurviewsuffix | Data Curator | How to connect Azure Data Share and Azure Purview |
If you choose the vNet network isolation mode, the following applies:
- The Synapse Workspace will be deployed with a Managed Virtual Network.
- Managed private endpoints for some of the services will be created in the Synapse Workspace managed virtual network.
- Either a new or an existing virtual network will be used to deploy the private endpoints for all services in the architecture that support Private Link.
- Public access will be disabled and firewall rules will be set to restrict connectivity to and from the virtual network and between the services in the architecture.
- Private DNS zones required by the different private link domains can be optionally deployed and linked to the selected virtual network.
The following extra services will be deployed to support the private connectivity configuration:
Component | Name | Type | Optional |
---|---|---|---|
Synapse Analytics | privatelink.azuresynapse.net | Private DNS Zone | Yes |
Synapse Analytics | privatelink.dev.azuresynapse.net | Private DNS Zone | Yes |
Synapse Analytics | privatelink.sql.azuresynapse.net | Private DNS Zone | Yes |
Synapse Analytics | privatelink.dfs.core.windows.net | Private DNS Zone | Yes |
Synapse Analytics | privatelink.vaultcore.azure.net | Private DNS Zone | Yes |
AI | privatelink.api.azureml.ms | Private DNS Zone | Yes |
AI | privatelink.azurecr.io | Private DNS Zone | Yes |
AI | privatelink.file.core.windows.net | Private DNS Zone | Yes |
AI | privatelink.notebooks.azure.net | Private DNS Zone | Yes |
Data Governance | privatelink.queue.core.windows.net | Private DNS Zone | Yes |
Data Governance | privatelink.servicebus.windows.net | Private DNS Zone | Yes |
Data Governance | privatelink.blob.core.windows.net | Private DNS Zone | Yes |
Data Governance | privatelink.purview.azure.com | Private DNS Zone | Yes |
Streaming | privatelink.azure-devices.net | Private DNS Zone | Yes |
Synapse Analytics | azvnetsuffix | Virtual Network | No |
Synapse Analytics | azsynapsehubsuffix | Synapse private link hub | No |
Synapse Analytics | azsynapsewkssuffix-web | Private Endpoint | No |
Synapse Analytics | azsynapsewkssuffix-sqlserverless | Private Endpoint | No |
Synapse Analytics | azsynapsewkssuffix-sql | Private Endpoint | No |
Synapse Analytics | azsynapsewkssuffix-dev | Private Endpoint | No |
Synapse Analytics | azkeyvaultsuffix | Private Endpoint | No |
Synapse Analytics | azwksdatalakesuffix-dfs | Private Endpoint | No |
Synapse Analytics | azrawdatalakesuffix-dfs | Private Endpoint | No |
Synapse Analytics | azcurateddatalakesuffix-dfs | Private Endpoint | No |
Data Governance | azpurviewsuffix-queue | Private Endpoint | No |
Data Governance | azpurviewsuffix-portal | Private Endpoint | No |
Data Governance | azpurviewsuffix-namespace | Private Endpoint | No |
Data Governance | azpurviewsuffix-blob | Private Endpoint | No |
Data Governance | azpurviewsuffix-account | Private Endpoint | No |
AI | aztextanalyticssuffix-account | Private Endpoint | No |
AI | azanomalydetectorsuffix-account | Private Endpoint | No |
AI | azmlwkssuffix-amlworkspace | Private Endpoint | No |
AI | azmlstoragesuffix-file | Private Endpoint | No |
AI | azmlstoragesuffix-blob | Private Endpoint | No |
AI | azmlcontainerregsuffix-registry | Private Endpoint | No |
Streaming | azeventhubnssuffix-namespace | Private Endpoint | No |
Streaming | aziothubsuffix-iothub | Private Endpoint | No |
Beyond the extra services above required to support the network isolation mode, the following network settings are applied to the services:
Workload | Name | Type | Network Settings | Notes | Reference Documentation |
---|---|---|---|---|---|
Platform Services | azkeyvaultsuffix | Key vault | | 'Allow Azure Services' required for access from Azure Purview and Azure ML | Configure Azure Key Vault networking settings |
Synapse Analytics | azsynapsewkssuffix | Synapse workspace | | Managed Virtual Network enabled | Understanding Azure Synapse Private Endpoints |
Synapse Analytics | azwksdatalakesuffix | Storage account | | | Configure Azure Storage firewalls and virtual networks |
Synapse Analytics | azrawdatalakesuffix | Storage account | | 'Allow Azure Services' enabled only when deploying Streaming workloads with Event Hubs | Configure Azure Storage firewalls and virtual networks |
Synapse Analytics | azcurateddatalakesuffix | Storage account | | 'Allow Azure Services' enabled only when deploying Streaming workloads with Event Hubs | Configure Azure Storage firewalls and virtual networks |
Data Governance | azpurviewsuffix | Purview account | | | Connect to your Azure Purview and scan data sources privately and securely |
AI | azanomalydetectorsuffix | Anomaly detector | | | Configure Azure Cognitive Services virtual networks |
AI | aztextanalyticssuffix | Language | | | Configure Azure Cognitive Services virtual networks |
AI | azmlwkssuffix | Machine learning workspace | | | Secure Azure Machine Learning workspace resources using virtual networks (VNets) |
AI | azmlstoragesuffix | Storage account | | | Secure an Azure Machine Learning workspace with virtual networks |
AI | azmlcontainerregsuffix | Container registry | | | Secure an Azure Machine Learning workspace with virtual networks |
Streaming | azeventhubnssuffix | Event Hub namespace | | | Network security for Azure Event Hubs |
Streaming | aziothubsuffix | IoT Hub | | | IoT Hub support for virtual networks with Private Link and Managed Identity |
Streaming | azstreamjobsuffix | Stream Analytics job | | Stream Analytics Jobs don't support vNet integration. For that you should use Stream Analytics Clusters. | |
If you would like to contribute to the solution (log bugs, raise issues, or add code), see the CONTRIBUTING.md file for details.
Details on licensing for the project can be found in the LICENSE file.