This repository has a collection of utilities for Glue Crawlers. These utilities come in the form of AWS CloudFormation templates or AWS CDK applications.
Updated Dec 21, 2021 - Python
ETL data pipeline using AWS services
Terraform configuration that creates several AWS services, uploads data to S3, and starts the Glue Crawler and Glue Job.
Automation framework to catalog AWS data sources using Glue
Creating an audit table for a DynamoDB table using CloudTrail, Kinesis Data Streams, Lambda, S3, Glue, Athena, and CloudFormation
Unveiling job market trends with Scrapy and AWS
Analyzed a multi-category e-commerce store dataset from Kaggle using big data techniques with AWS EC2, AWS S3, PySpark, AWS Glue ETL, AWS Athena, AWS CloudFormation, AWS Lambda, and Power BI.
An end-to-end solution for managing and analyzing YouTube video data from Kaggle, leveraging AWS services and visualized through QuickSight and Tableau
Smart City Realtime Data Engineering Project
Implementing a data pipeline for airline data using AWS services
This project aims to establish a robust data pipeline for tracking and analyzing sales performance using various AWS services. The process involves creating a DynamoDB database, implementing Change Data Capture (CDC), utilizing Kinesis streams, and finally storing the data and querying it with Amazon Athena.
Developed an ETL pipeline for real-time ingestion of stock market data from the stock-market-data-manage.onrender.com API. Engineered the system to store data in Parquet format for optimized query processing and incorporated data quality checks to ensure accuracy prior to visualization.
This project automates an ETL pipeline using AWS Glue, S3, Athena, and Step Functions to transform raw Airbnb data. It cleanses, enriches, and organizes the data into separate raw and transformed databases, enabling efficient querying and analysis via Athena, with automated notifications through SNS.
An end-to-end data engineering project in which five NYC DOT datasets were modified in an ETL process and analyzed for insights.
ETL pipeline using AWS services