aws-glue-crawler

Here are 16 public repositories matching this topic...

aws-samples / aws-glue-crawler-utilities

This repository has a collection of utilities for Glue Crawlers. These utilities come in the form of AWS CloudFormation templates or AWS CDK applications.

aws-glue aws-glue-crawler

Updated Dec 21, 2021
Python

fermat01 / ETL-Data-Pipeline-using-AWS-EMR-Spark-Glue-Athena

ETL data pipeline using AWS services

aws aws-s3 aws-athena aws-emr-clusters aws-glue-crawler

Updated Aug 23, 2024
Python

GabrielDan92 / AWS_Terraform_PySpark-ETL_Job

Terraform configuration that creates several AWS services, uploads data to S3, and starts the Glue Crawler and Glue Job.

aws terraform s3-bucket pyspark glue-job glue-catalog aws-glue-crawler

Updated Feb 10, 2022
Python

aws-samples / automated-datastore-discovery-with-aws-glue

Automation framework to catalog AWS data sources using Glue

aws typescript aws-s3 dynamodb glue python3 data-catalog rds gdpr pii data-governance aws-cdk aws-glue-workflow aws-glue-crawler aws-glue-data-catalog

Updated Jan 29, 2025
Python

subhamay-cloudworks / 0090-deutzia-cft

Creates an audit table for a DynamoDB table using CloudTrail, Kinesis Data Streams, Lambda, S3, Glue, Athena, and CloudFormation

aws-python-lambda aws-iam aws-cloudformation aws-cloudtrail aws-cloudwatch aws-athena aws-cloudwatch-logs aws-kinesis-stream aws-glue-crawler aws-iam-roles aws-iam-policies aws-s3-bucket aws-glue-data-catalog

Updated Jul 6, 2023
Python

SadafAsad / LinkedIn-Jobs-Analysis

Unveiling job market trends with Scrapy and AWS

python aws-s3 scrapy aws-ec2 aws-athena aws-quicksight aws-glue-crawler aws-glue-data-catalog

Updated Apr 5, 2024
Python

Saurabhkhandebharad / BigData-SK

Analysis of a multi-category e-commerce store using big data techniques on a Kaggle dataset, built with AWS EC2, AWS S3, PySpark, AWS Glue ETL, AWS Athena, AWS CloudFormation, AWS Lambda, and Power BI.

aws big-data aws-lambda power-bi pyspark aws-ec2 aws-cloudformation aws-athena kaggle-dataset aws-services end-to-end-pipeline end-to-end-project aws-glue-crawler aws-s3-bucket

Updated Sep 7, 2024
Python

dhvani-k / YouTrend_Insights_Analyzing_YouTube_Video_Landscape

An end-to-end solution for managing and analyzing YouTube video data from Kaggle, built on AWS services and visualized with QuickSight and Tableau

aws marketing youtube aws-lambda aws-s3 youtube-api aws-iam tableau content-strategy aws-athena aws-lambda-python aws-glue quicksight aws-glue-crawler user-insights

Updated Sep 23, 2023
Python

DivineSamOfficial / SmartCityProject

Smart City real-time data engineering project

python aws kafka aws-s3 pyspark spark-streaming aws-ec2 aws-athena aws-redshift aws-glue aws-quicksight aws-glue-crawler aws-glue-data-catalog

Updated May 24, 2024
Python

VvEK-Hiremath / Airlines-Data-Pipeline-Project-AWS

Implements a data pipeline for airline data using AWS services

python aws aws-s3 aws-sns aws-redshift step-functions aws-eventbridge aws-glue-workflow aws-glue-crawler

Updated Oct 15, 2024
Python

KRISHNASAIRAJ / AWS-Driven-Sales-Performance-Outlook

The project aims to establish a robust data pipeline for tracking and analyzing sales performance using various AWS services. The process involves creating a DynamoDB database, implementing Change Data Capture (CDC), streaming the changes through Kinesis, and finally storing the data and querying it with Amazon Athena.

python aws-lambda dynamodb s3-bucket kinesis kinesis-firehose aws-athena glue-catalog aws-glue-crawler eventbridge-pipes

Updated Feb 11, 2024
Python

ShreyasLengade / serverless_etl_pipeline

Developed an ETL pipeline for real-time ingestion of stock market data from the stock-market-data-manage.onrender.com API. Engineered the system to store data in Parquet format for optimized query processing and incorporated data quality checks to ensure accuracy prior to visualization.

aws-lambda aws-s3 data-engineering aws-kinesis aws-glue data-engineering-pipeline aws-glue-crawler aws-grafana aws-glue-data-catalog

Updated Jun 25, 2024
Python

shahidmalik4 / aws-glue-stepfunctions-etl

This project automates an ETL pipeline using AWS Glue, S3, Athena, and Step Functions to transform raw Airbnb data. It cleanses, enriches, and organizes the data into separate raw and transformed databases, enabling efficient querying and analysis via Athena, with automated notifications through SNS.

aws aws-s3 pyspark aws-sns aws-athena aws-step-functions etl-pipeline aws-glue aws-glue-crawler

Updated Nov 22, 2024
Python

imverma / DataEngineering-YouTube-Analysis-Project

An end-to-end solution for managing and analyzing YouTube video data from Kaggle, built on AWS services and visualized with QuickSight and Tableau

aws youtube aws-lambda aws-s3 youtube-api aws-iam tableau aws-athena aws-lambda-python aws-glue quicksight aws-glue-crawler user-insights

Updated Nov 29, 2023
Python

Tyriek-cloud / NYC-Mobility-Survey-Analysis

An end-to-end data engineering project in which five NYC DOT datasets were modified in an ETL process and analyzed for insights.

python aws json aws-s3 data-engineering data-analysis aws-athena etl-pipeline aws-glue aws-quicksight aws-glue-crawler

Updated Sep 3, 2024
Python

desininja / Quality-Movie-Data-Pipeline

ETL pipeline using AWS services

aws etl aws-s3 data-engineering redshift aws-step-function etl-pipeline aws-glue aws-eventbridge aws-glue-crawler

Updated Sep 12, 2024
Python
