Snowflake Pipeline Project

This repository contains a Snowflake pipeline that uses Snowpipe, streams, and tasks to ingest JSON files from S3, transform them, and load them into target tables. Setup instructions follow.

Architecture

The pipeline diagram is in Architecture.png in this repository.

Prerequisites

Before running the script, ensure the following prerequisites are met:

AWS Role and Access Key:
- Create a role in AWS with permission to list and read objects in the S3 bucket.
- Download the AWS access key (key ID and secret key) used for the external stage.
S3 Bucket and Event Notification:
- Create an S3 bucket.
- Enable event notification on the S3 bucket.
- Copy the Snowpipe SQS ARN (the notification_channel value shown by SHOW PIPES once the pipe exists) into the S3 bucket event notification settings.

Setup

Database and Schemas:
- Create a Snowflake database named DB1.
- Create a schema named TS1 within the DB1 database.
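
The database and schema steps can be sketched as follows. This is a minimal example, not the repository's script; the IF NOT EXISTS clauses are an addition so that the statements can be re-run safely.

```sql
-- Create the database and schema named in the steps above.
-- IF NOT EXISTS (an addition here) makes the statements safe to re-run.
CREATE DATABASE IF NOT EXISTS DB1;
CREATE SCHEMA IF NOT EXISTS DB1.TS1;

-- Make DB1.TS1 the session default for the remaining setup steps.
USE SCHEMA DB1.TS1;
```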
External Stage:
- Create an external stage named S3_STAGE with your S3 bucket credentials.
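
A possible shape for the external stage is shown below. The bucket URL and both credential values are placeholders to replace with your own; the repository code may configure the stage differently.

```sql
-- Sketch of the external stage. Everything in angle brackets is a
-- placeholder (assumption), not a value taken from the repository.
CREATE OR REPLACE STAGE DB1.TS1.S3_STAGE
  URL = 's3://<your-bucket>/<optional-path>/'
  CREDENTIALS = (
    AWS_KEY_ID     = '<your-aws-access-key-id>'
    AWS_SECRET_KEY = '<your-aws-secret-access-key>'
  );

-- Confirm that Snowflake can see the files in the bucket.
LIST @DB1.TS1.S3_STAGE;
```

If LIST returns an access error, recheck the key pair and the permissions from the Prerequisites section before continuing.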
Initial Table:
- Create a table named PERSON_NESTED within the TS1 schema.
JSON File Format:
- Create a file format named JSON of type JSON and compression AUTO.
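
The landing table and file format might look like this. The single VARIANT column named INPUT_DATA is an assumption made for illustration (a common layout for raw JSON); check the repository code for the actual table definition.

```sql
-- Landing table for raw JSON. The column name INPUT_DATA is hypothetical;
-- a single VARIANT column stores each JSON document as-is.
CREATE OR REPLACE TABLE DB1.TS1.PERSON_NESTED (
  INPUT_DATA VARIANT
);

-- File format named JSON, of type JSON, with automatic compression detection.
CREATE OR REPLACE FILE FORMAT DB1.TS1.JSON
  TYPE = JSON
  COMPRESSION = AUTO;
```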
Snowpipe:
- Create a Snowpipe named PERSON_PIPE with auto_ingest enabled.
- Point the Snowpipe to the PERSON_NESTED table and use the JSON file format.
- Refer to the code available in the repo for the detailed Snowpipe setup.
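
As a rough sketch of the Snowpipe step, assuming the stage, table, and file format exist under the names used above (the repository code remains the reference for the exact definition):

```sql
-- Pipe that loads new files from the stage into the landing table.
-- AUTO_INGEST = TRUE makes Snowflake listen for S3 event notifications.
CREATE OR REPLACE PIPE DB1.TS1.PERSON_PIPE
  AUTO_INGEST = TRUE
AS
  COPY INTO DB1.TS1.PERSON_NESTED
  FROM @DB1.TS1.S3_STAGE
  FILE_FORMAT = (FORMAT_NAME = 'DB1.TS1.JSON');

-- The notification_channel column of the output holds the SQS ARN
-- to paste into the S3 bucket event notification settings.
SHOW PIPES LIKE 'PERSON_PIPE' IN SCHEMA DB1.TS1;
```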
Stream and Target Tables:
- Create a stream named PERSON_NESTED_STREAM on the PERSON_NESTED table.
- Create target tables PERSON_MASTER and PERSON_LOCATION within the TS1 schema.
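
The stream and target tables could be created along these lines. The column lists are hypothetical, since this README does not describe the JSON fields; replace them with the columns defined in the repository code.

```sql
-- Stream that records rows added to the landing table.
CREATE OR REPLACE STREAM DB1.TS1.PERSON_NESTED_STREAM
  ON TABLE DB1.TS1.PERSON_NESTED;

-- Target tables. All column names and types below are assumptions
-- for illustration only.
CREATE OR REPLACE TABLE DB1.TS1.PERSON_MASTER (
  ID   NUMBER,
  NAME STRING
);

CREATE OR REPLACE TABLE DB1.TS1.PERSON_LOCATION (
  ID   NUMBER,
  CITY STRING
);
```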
Procedure and Task:
- Create a stored procedure named PERSON_PROC within the TS1 schema.
- Create a task named PERSON_TASK1 to schedule the procedure when the stream has data.
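
One way the procedure and task might fit together is sketched below. The JSON paths (id, name, city), the INPUT_DATA column, the target columns, the warehouse name COMPUTE_WH, and the one-minute schedule are all assumptions; the actual logic lives in the repository code. A single multi-table INSERT is used so that the stream is read once and both targets are filled in the same statement.

```sql
-- Hypothetical procedure: flattens new rows from the stream into both targets.
CREATE OR REPLACE PROCEDURE DB1.TS1.PERSON_PROC()
RETURNS STRING
LANGUAGE SQL
AS
$$
BEGIN
  -- One statement consumes the stream once and writes to both tables.
  INSERT ALL
    INTO DB1.TS1.PERSON_MASTER   (ID, NAME) VALUES (ID, NAME)
    INTO DB1.TS1.PERSON_LOCATION (ID, CITY) VALUES (ID, CITY)
  SELECT
    INPUT_DATA:id::NUMBER   AS ID,    -- assumed JSON path
    INPUT_DATA:name::STRING AS NAME,  -- assumed JSON path
    INPUT_DATA:city::STRING AS CITY   -- assumed JSON path
  FROM DB1.TS1.PERSON_NESTED_STREAM
  WHERE METADATA$ACTION = 'INSERT';
  RETURN 'PERSON_PROC completed';
END;
$$;

-- Task that calls the procedure only when the stream has unprocessed rows.
-- COMPUTE_WH and the schedule are placeholders.
CREATE OR REPLACE TASK DB1.TS1.PERSON_TASK1
  WAREHOUSE = COMPUTE_WH
  SCHEDULE = '1 MINUTE'
  WHEN SYSTEM$STREAM_HAS_DATA('DB1.TS1.PERSON_NESTED_STREAM')
AS
  CALL DB1.TS1.PERSON_PROC();
```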
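Run Task:
- Resume the task using ALTER TASK DB1.TS1.PERSON_TASK1 RESUME so that it runs on its schedule (tasks are created in a suspended state).
- Suspend the task using ALTER TASK DB1.TS1.PERSON_TASK1 SUSPEND when testing is finished.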
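
Because Snowflake creates tasks in a suspended state, the task has to be resumed before it will run. The statements below are a sketch of how the task could be started, checked, and stopped again; the history query is an optional addition, not a step from the repository.

```sql
-- Start the task so that it runs on its schedule.
ALTER TASK DB1.TS1.PERSON_TASK1 RESUME;

-- Optional: trigger a single run immediately instead of waiting.
EXECUTE TASK DB1.TS1.PERSON_TASK1;

-- Optional: inspect recent runs and their states.
SELECT NAME, STATE, SCHEDULED_TIME, ERROR_MESSAGE
FROM TABLE(DB1.INFORMATION_SCHEMA.TASK_HISTORY(TASK_NAME => 'PERSON_TASK1'))
ORDER BY SCHEDULED_TIME DESC
LIMIT 10;

-- Stop the task when testing is finished.
ALTER TASK DB1.TS1.PERSON_TASK1 SUSPEND;
```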

Test the Pipeline

Check Pipeline Status:
- Execute SELECT SYSTEM$PIPE_STATUS('DB1.TS1.PERSON_PIPE');
Check Contents of Target Tables:
- Execute SELECT * FROM DB1.TS1.PERSON_MASTER;
- Execute SELECT * FROM DB1.TS1.PERSON_LOCATION;
- Execute SELECT * FROM DB1.TS1.PERSON_NESTED_STREAM;
File for Testing:
- Upload person_intl_1.json to the S3 bucket first to test Snowpipe functionality.
- Upload the rest of the files (person_1.json, person_2.json, person_3.json) to the S3 bucket.
- Wait about a minute, then check that the data has been ingested into the target tables.
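
If rows do not appear after the wait, the queries below may help narrow down where the pipeline stopped. They are a troubleshooting sketch rather than part of the repository's script; the one-hour window is an arbitrary choice.

```sql
-- Files Snowpipe has loaded (or failed to load) into the landing table
-- during the last hour.
SELECT FILE_NAME, STATUS, ROW_COUNT, FIRST_ERROR_MESSAGE, LAST_LOAD_TIME
FROM TABLE(DB1.INFORMATION_SCHEMA.COPY_HISTORY(
  TABLE_NAME => 'DB1.TS1.PERSON_NESTED',
  START_TIME => DATEADD(HOUR, -1, CURRENT_TIMESTAMP())
))
ORDER BY LAST_LOAD_TIME DESC;

-- Queue files that were staged before the event notification was set up.
ALTER PIPE DB1.TS1.PERSON_PIPE REFRESH;

-- Row counts in the landing and target tables.
SELECT COUNT(*) FROM DB1.TS1.PERSON_NESTED;
SELECT COUNT(*) FROM DB1.TS1.PERSON_MASTER;
SELECT COUNT(*) FROM DB1.TS1.PERSON_LOCATION;
```

Rows in PERSON_NESTED with empty target tables point at the task or procedure; an empty PERSON_NESTED points at the pipe, the stage, or the S3 event notification.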

Feel free to refer to the code available in the repository for detailed implementation steps.
