
For data ingestion into Azure using an Azure Function App, Service Bus, and Azure SQL Database, a typical architecture takes a serverless, event-driven approach that leverages Azure's scalable services.


# Data Ingestion into Azure Service Bus To SQL

## Event-Driven Microservice Architecture Diagram

## Recorded Sessions

- Session Video - 06/19/2024
- Session Video - 05/08/2024
- Session Video - 04/25/2024
- Session Video - 03/27/2024
- Session Video - 01/18/2024

## Data Ingestion Services in Azure

Here's a comparison table highlighting the key features, strengths, and use cases of Azure Data Factory (ADF), Azure Databricks, Azure Synapse Analytics, and a custom data ingestion service.

| Feature | Azure Data Factory (ADF) | Azure Databricks | Azure Synapse Analytics | Custom Data Ingestion Service |
|---|---|---|---|---|
| Purpose | Data integration, ETL, and data orchestration | Big data analytics, data engineering, and machine learning | Analytics, data warehousing, and big data processing | Data integration, ETL, and data orchestration |
| Key Components | Pipelines, Dataflows, Linked Services, Datasets | Workspaces, Notebooks, Clusters, Jobs | Dedicated SQL Pool, Serverless SQL Pool, Spark Pool, Data Integration | Workflows (JSON), DataFlows, DataSets, Linked Services (API) |
| Data Integration | Extensive connectors for various data sources | Built-in integration with many data sources via notebooks | Integrated data ingestion and transformation | Connectors to Azure Table Storage, Azure SQL |
| Data Processing | ETL/ELT processing using dataflows and pipelines | Advanced data processing and analytics with Apache Spark | Unified analytics, SQL-based data processing, Spark, and pipelines | ETL/ELT processing using orchestration workflows (JSON) and pipelines |
| Programming Languages | JSON for pipeline definitions, Dataflow expressions | Scala, Python, R, SQL | T-SQL, Python, Scala, .NET | T-SQL, .NET |
| Scalability | Automatically scalable | Highly scalable with auto-scaling clusters | Scalable from small to very large workloads | Scalable from small to very large workloads |
| Compute Management | Managed compute for data movement and transformation | Customizable and managed Spark clusters | Managed pools for SQL and Spark, auto-scaling | Managed pools for Azure Functions and SQL, auto-scaling |
| Integration with Azure Services | Deep integration with Azure services like Azure SQL, Blob Storage | Strong integration with Azure storage, SQL, and AI services | Comprehensive integration with the Azure ecosystem | Deep integration with Azure services like Azure SQL, Blob Storage |
| Data Movement | Copy data activity for moving data across sources | Supports data movement through Spark jobs | Integrated data movement and transformation | Copy, move, slice, and transform data across Azure resources |
| Orchestration | Advanced scheduling, event-based triggers, and monitoring | Workflow management via jobs and notebooks | Unified data and pipeline orchestration | Event-based triggers, monitoring, and workflow management |
| Machine Learning | Limited ML capabilities | Advanced ML with integrated MLflow and Spark MLlib | Integrated ML capabilities with Synapse Studio | Integrated ML capabilities via C#/.NET |
| User Interface | GUI for pipeline design and monitoring | Notebooks for interactive development and visualization | Unified web-based interface for SQL, Spark, and pipelines | Web-based interface for SQL |
| Security | Built-in security features, VNET support, managed identity | Secure workspace, role-based access control (RBAC) | Comprehensive security features, RBAC, encryption | Built-in security features, VNET support, managed identity, RBAC |
| Use Cases | ETL/ELT processes, data migration, data integration | Data engineering, big data analytics, machine learning workflows | Advanced analytics, data warehousing, big data processing, and data integration | ETL/ELT processes, data migration, data integration |
| Cost Structure | Pay-as-you-go based on pipeline runs, data movement | Pay-as-you-go based on compute and storage usage | Pay-as-you-go for different pools, storage, and compute | Pay-as-you-go for dedicated pools, storage, and compute |

## ADF Ingestion Sequence Diagram

## Azure Databricks Ingestion Sequence Diagram

## Azure Synapse Analytics Ingestion Sequence Diagram

## Setup Steps

Creating a serverless API in Azure that leverages Service Bus to communicate with a SQL Database involves several steps. Here's a high-level overview of how to set this up:

1. **Set up Azure SQL Database:**
   - Create an Azure SQL Database instance.
   - Set up the necessary tables and schemas you'll need for your application.
2. **Create Azure Service Bus:**
   - Set up an Azure Service Bus namespace.
   - Within the namespace, create a queue or topic (based on your requirements).
3. **Deploy a serverless API using Azure Functions:**
   - Create a new Azure Function App.
   - Develop an HTTP-triggered function that will act as your API endpoint.
   - In this function, when data is received, send a message to the Service Bus queue or topic (see the first sketch after these steps).
4. **Deploy a Service Bus-triggered function:**
   - Create another Azure Function that is triggered by the Service Bus queue or topic.
   - This function reads the message from the Service Bus and processes it. The processing might involve parsing the message and inserting the data into the Azure SQL Database (see the second sketch after these steps).
5. **Deploy a timer-triggered function:**
   - Create another Azure Function that runs on a schedule and picks up files dropped in a container.
   - This function streams in each file, reads it, and places the contents on the Service Bus topic.
6. **Implement error handling:**
   - Ensure that you have error handling in place. If there's a failure in processing the message and inserting it into the database, you might want to log the error or move the message to a dead-letter queue.
7. **Secure your functions:**
   - Ensure that your HTTP-triggered function (API endpoint) is secured, possibly using Azure Active Directory or function keys.
8. **Optimize & monitor:**
   - Monitor the performance of your functions using Azure Monitor and Application Insights.
   - Optimize performance, scalability, and cost by adjusting the function's plan (Consumption Plan, Premium Plan, etc.) and tweaking the configurations.
9. **Deployment:**
   - Deploy your functions to the Azure environment. Use CI/CD pipelines with tools like Azure DevOps or GitHub Actions for automated deployments.

By following these steps, you'll have a serverless API in Azure that uses Service Bus as a mediator to process data and store it in an SQL Database. This architecture ensures decoupling between data ingestion and processing, adding a layer of resilience and scalability to your solution.
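
Below are minimal sketches of steps 3 and 4, assuming the in-process Azure Functions programming model, the `request` topic and `sqlmessage` subscription defined later in this README, and the `ServiceBusConnectionString`/`DatabaseConnection` settings from the Application Settings section. The function names and the `dbo.IngestedMessages` table are illustrative, not taken from this repository.

```csharp
using System.IO;
using System.Threading.Tasks;
using Azure.Messaging.ServiceBus;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;

public static class IngestEndpoint
{
    // Step 3: HTTP-triggered API endpoint that forwards the request body
    // to the Service Bus topic for asynchronous processing.
    [FunctionName("IngestEndpoint")]
    public static async Task<IActionResult> Run(
        [HttpTrigger(AuthorizationLevel.Function, "post", Route = "ingest")] HttpRequest req,
        [ServiceBus("request", Connection = "ServiceBusConnectionString")] IAsyncCollector<ServiceBusMessage> messages)
    {
        string body = await new StreamReader(req.Body).ReadToEndAsync();
        await messages.AddAsync(new ServiceBusMessage(body));
        return new AcceptedResult(); // 202: queued for processing, not yet persisted
    }
}
```

And the matching Service Bus-triggered processor:

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Data.SqlClient;
using Microsoft.Extensions.Logging;

public static class SqlMessageProcessor
{
    // Step 4: reads each message from the "sqlmessage" subscription and inserts
    // it into Azure SQL. An unhandled exception triggers a retry and, after the
    // max delivery count, the message is dead-lettered (step 6).
    [FunctionName("SqlMessageProcessor")]
    public static async Task Run(
        [ServiceBusTrigger("request", "sqlmessage", Connection = "ServiceBusConnectionString")] string message,
        ILogger log)
    {
        using var conn = new SqlConnection(Environment.GetEnvironmentVariable("DatabaseConnection"));
        await conn.OpenAsync();

        // Hypothetical table; replace with your schema. Parameterized to avoid injection.
        using var cmd = new SqlCommand("INSERT INTO dbo.IngestedMessages (Payload) VALUES (@payload)", conn);
        cmd.Parameters.AddWithValue("@payload", message);
        await cmd.ExecuteNonQueryAsync();

        log.LogInformation("Message persisted to SQL.");
    }
}
```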

## Application Settings

| Key | Value | Comment |
|---|---|---|
| AzureWebJobsStorage | [CONNECTION STRING] | Recommendation: store in Azure Key Vault. |
| ConfigurationPath | [CONFIGURATION FOLDER PATH] | Folder is optional. |
| ApiKeyName | [API KEY NAME] | Passed in the header; the file name of the config. |
| AppName | [APPLICATION NAME] | The name of the Function App, used in Log Analytics. |
| StorageAcctName | [STORAGE ACCOUNT NAME] | Example: "AzureWebJobsStorage" |
| ServiceBusConnectionString | [SERVICE BUS CONNECTION STRING] | Example: "ServiceBusConnectionString". Recommended to store in Key Vault. |
| DatabaseConnection | [DATABASE CONNECTION STRING] | Example: "DatabaseConnection". Recommended to store in Key Vault. |
| TimerInterval | [TIMER_INTERVAL] | Example: "0 */1 * * * *" (every 1 minute) |
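
At runtime these settings surface as environment variables, and Azure Functions trigger attributes can reference them with `%...%` placeholders. A minimal sketch of the `TimerInterval` setting in use (the function name is illustrative):

```csharp
using System;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public static class ScheduledPickup
{
    // "%TimerInterval%" resolves to the TimerInterval app setting,
    // e.g. "0 */1 * * * *" = run every minute.
    [FunctionName("ScheduledPickup")]
    public static void Run([TimerTrigger("%TimerInterval%")] TimerInfo timer, ILogger log)
    {
        // Connection strings (including Key Vault references, which App Service
        // resolves transparently) are read the same way at runtime.
        var dbConn = Environment.GetEnvironmentVariable("DatabaseConnection");
        log.LogInformation($"Timer fired at {DateTime.UtcNow:O}");
    }
}
```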

Note: Review the configuration files in the Config folder and create a table to record the information.

## Configuration Files

Note: The configuration files are located in the Function App, in a Config folder.

| FileName | Description |
|---|---|
| 99F77BEF300E4660A63A939ADD0BCF68.json | Upload File: parse CSV file → write batched files to storage |
| 43EFE991E8614CFB9EDECF1B0FDED37A.json | File Parser: parse CSV file; files received from SFTP use this process |
| 43EFE991E8614CFB9EDECF1B0FDED37D.json | Upload File: parse JSON/CSV directly to NoSQL DB |
| 43EFE991E8614CFB9EDECF1B0FDED37C.json | Service Bus trigger for SQL DB |
| 43EFE991E8614CFB9EDECF1B0FDED37F.json | Service Bus trigger for NoSQL DB |
| 43EFE991E8614CFB9EDECF1B0FDED37E.json | Blob trigger: send parsed/sharded file to Service Bus |
| 43EFE991E8614CFB9EDECF1B0FDED37B.json | Search result from NoSQL DB |
| 43EFE991E8614CFB9EDECF1B0FDED37G.json | Search SQL DB; return result set |
| 3FB620B0E0FD4E8F93C9E4D839D00E1E.json | Copy file from SFTP into the pickup folder |
| 3FB620B0E0FD4E8F93C9E4D839D00E1F.json | Create a new record in NoSQL database |
| CC244934898F46789734A9437B6F76CA.json | Encode payload request |
| 6B427917E36A4DA281D57F9A64AD9D55.json | Get reports from DB |

Create the following blob containers, tables, and file share in Azure Storage.

| ContainerName | Description |
|---|---|
| config | Location for the configuration files |
| pickup | Files copied from the SFTP share are dropped in the pickup container |
| processed | Files that have been parsed are dropped in the processed container |

| Table | Description |
|---|---|
| csvbatchfiles | Tracks the CSV parsed files |
| training[YYYYMMDD] | NoSQL data store |

| Share | Description |
|---|---|
| training[YYYYMMDD] | Share location for SFTP to drop files |
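
These resources can be provisioned from the portal, or with a short one-time setup script. A minimal C# sketch, assuming the `AzureWebJobsStorage` connection string from the Application Settings table (the date suffix on the share name is illustrative):

```csharp
using System;
using Azure.Data.Tables;
using Azure.Storage.Blobs;
using Azure.Storage.Files.Shares;

var conn = Environment.GetEnvironmentVariable("AzureWebJobsStorage");

// Blob containers listed above; CreateIfNotExists is idempotent, safe to re-run.
var blobService = new BlobServiceClient(conn);
foreach (var name in new[] { "config", "pickup", "processed" })
    blobService.GetBlobContainerClient(name).CreateIfNotExists();

// Table that tracks parsed CSV batch files.
new TableClient(conn, "csvbatchfiles").CreateIfNotExists();

// File share for SFTP drops, e.g. training20240101 (hypothetical date suffix).
new ShareClient(conn, $"training{DateTime.UtcNow:yyyyMMdd}").CreateIfNotExists();
```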

## Service Bus Topic and Subscriptions

| Name | Description |
|---|---|
| request | Create a topic |
| nosqlmessage | Create a subscription on the request topic |
| sqlmessage | Create a subscription on the request topic |
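
The topic and subscriptions can likewise be created in code. A minimal sketch using the `Azure.Messaging.ServiceBus.Administration` client and the `ServiceBusConnectionString` setting (names match the table above):

```csharp
using System;
using Azure.Messaging.ServiceBus.Administration;

var admin = new ServiceBusAdministrationClient(
    Environment.GetEnvironmentVariable("ServiceBusConnectionString"));

// Create the "request" topic if it does not exist yet.
bool topicExists = await admin.TopicExistsAsync("request");
if (!topicExists)
    await admin.CreateTopicAsync("request");

// Create the two subscriptions that feed the SQL and NoSQL processors.
foreach (var sub in new[] { "nosqlmessage", "sqlmessage" })
{
    bool subExists = await admin.SubscriptionExistsAsync("request", sub);
    if (!subExists)
        await admin.CreateSubscriptionAsync("request", sub);
}
```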

## Create an Azure Container Instance for SFTP

Use the following link to create an Azure Container Instance (ACI) for SFTP:

https://docs.microsoft.com/en-us/samples/azure-samples/sftp-creation-template/sftp-on-azure

## Kusto Queries Used for Application Insights

search "ReceiveMessageFromServieBus"
| summarize count() by bin(timestamp, 1h)
| order by timestamp desc    

customEvents
| where isnotnull(customDimensions.ProcessName)
//| where customDimensions.ProcessName == 'ReceiveMessageFromServieBus'  
| summarize count() by bin(timestamp, 1m),  Key = tostring(customDimensions.ProcessName) 
| order by timestamp desc
| render columnchart
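
The second query groups on a `ProcessName` custom dimension of `customEvents`. A hedged sketch of how such an event could be emitted with the Application Insights SDK; the event name is copied verbatim from the queries above (including its spelling), and wiring `TelemetryClient` this way is illustrative rather than this repository's actual code:

```csharp
using System.Collections.Generic;
using Microsoft.ApplicationInsights;
using Microsoft.ApplicationInsights.Extensibility;

// In a real Function App you would typically inject TelemetryClient via DI.
var telemetry = new TelemetryClient(TelemetryConfiguration.CreateDefault());

// Emits a customEvent carrying the ProcessName dimension the query keys on.
telemetry.TrackEvent(
    "ReceiveMessageFromServieBus",
    new Dictionary<string, string> { ["ProcessName"] = "ReceiveMessageFromServieBus" });
```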

## Products

| Product | Link | Comments |
|---|---|---|
| Azure Getting Started | https://azure.microsoft.com/en-us/free/ | Create a free account + $200 in credit |
| Sample Data Sets | https://www.kaggle.com/datasets | Useful site for pulling sample payloads |
| Azure Storage Explorer | https://azure.microsoft.com/en-us/features/storage-explorer/#features | Useful for viewing and querying data in Azure Table Storage |
| Postman | https://www.postman.com/downloads/ | Postman supports the web or desktop version |
| VS Code | https://visualstudio.microsoft.com/downloads/ | Required extensions: Azure Functions, Azure Account |
| VS Studio Community Edition | https://visualstudio.microsoft.com/downloads/ | Recommended; nice integration with Azure. Software is free. |
| Liquid Template | https://liquidjs.com/tutorials/intro-to-liquid.html | |

## Configuration-Driven Development

"Configuration-driven development" refers to an approach in software development where the behavior and functionality of an application are primarily controlled through configuration files rather than code. It separates application logic from configuration parameters, allowing developers to modify the behavior of the software without extensive code changes. Xenhey.BPM.Core.Net6 offers an orchestration pipeline that uses configuration to drive the execution of business logic, providing a tailored solution for each line of business (LOB). The following are some benefits:

  1. Flexibility: By using configuration files, developers can easily tweak and adjust the application's behavior without modifying the underlying code. It allows for quick customization and adaptation to different scenarios or client requirements.

  2. Maintenance: Separating configuration from code simplifies maintenance processes. Updates and modifications can be made by adjusting the configuration files, reducing the risk of introducing bugs or breaking existing functionality. It also facilitates version control and collaboration, as changes to configuration can be tracked separately from code changes.

  3. Scalability: Configuration-driven development promotes scalability by enabling the application to handle different environments, configurations, or user preferences. It allows for seamless deployment across multiple instances or environments with minimal code changes.

  4. Testing and Validation: Configuration-driven development enhances testing and validation processes. Since configuration changes are isolated from the codebase, it becomes easier to verify the impact of different configurations on the application's behavior. It also facilitates A/B testing or experimentation by quickly switching between different configurations.

  5. Domain-Specific Customization: Configuration-driven development enables domain-specific customization by providing options and settings tailored to specific use cases. This empowers non-technical users or administrators to configure the application according to their specific requirements without the need for coding expertise.

Data is crucial for AI for several reasons:

  1. Training Machine Learning Models: AI systems, particularly those based on machine learning, require large amounts of data to learn and make accurate predictions. The quality and quantity of data directly influence the performance of the AI models.

  2. Improving Accuracy and Performance: High-quality data helps improve the accuracy and performance of AI systems. With more data, AI can recognize patterns and make better decisions, leading to more reliable and effective applications.

  3. Enabling Personalization: AI systems use data to personalize user experiences. For example, recommendation systems for streaming services or e-commerce sites rely on user data to suggest content or products that align with individual preferences.

  4. Continuous Learning and Adaptation: AI systems need data to continuously learn and adapt to changing environments and new information. This ability to update and improve over time is essential for maintaining the relevance and accuracy of AI applications.

  5. Driving Insights and Innovation: Data enables AI to uncover hidden patterns and insights that might not be evident to human analysts. These insights can drive innovation, optimize processes, and create new opportunities across various industries.


## Contact

For questions related to this project, reach out via email: support@xenhey.com


