hbzhxying/aws-glue-samples

AWS Glue ETL Code Samples

This repository contains samples that demonstrate various aspects of the new AWS Glue service, along with a set of AWS Glue utilities.

You can find the AWS Glue open-source Python libraries in a separate repository at: awslabs/aws-glue-libs.

Content

  • FAQ and How-to

    Helps you get started using the many ETL capabilities of AWS Glue, and answers some of the more common questions people have.

Examples

  • Join and Relationalize Data in S3

    This sample ETL script shows you how to use AWS Glue to load, transform, and rewrite data in Amazon S3 so that it can be queried and analyzed easily and efficiently.

  • Clean and Process

    This sample ETL script shows you how to take advantage of both Spark and AWS Glue features to clean and transform data for efficient analysis.

  • The resolveChoice Method

    This sample explores all four of the ways you can resolve choice types in a dataset using DynamicFrame's resolveChoice method.

  • Converting character encoding

    This sample ETL script shows you how to use an AWS Glue job to convert character encoding.
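
    As a rough illustration of what converting character encoding means, here is a minimal sketch in plain Python. It does not use the AWS Glue or Spark APIs, and it is not the sample script from this repository. The function name `convert_encoding`, the Shift_JIS source encoding, and the sample text are all assumptions chosen for illustration.

```python
def convert_encoding(raw: bytes, source_encoding: str,
                     target_encoding: str = "utf-8") -> bytes:
    """Decode raw bytes from source_encoding and re-encode as target_encoding.

    Hypothetical helper for illustration only; not part of AWS Glue.
    """
    # Step 1: interpret the bytes using the encoding they were written in.
    text = raw.decode(source_encoding)
    # Step 2: write the same text back out in the target encoding.
    return text.encode(target_encoding)


# Assumed input: Japanese text stored as Shift_JIS bytes.
sjis_bytes = "データ".encode("shift_jis")
utf8_bytes = convert_encoding(sjis_bytes, "shift_jis")

# The text is unchanged; only its byte representation differs.
print(utf8_bytes.decode("utf-8"))
```

    In a real Glue job the same idea would typically apply to whole files or DataFrame columns rather than a single byte string, so treat this only as a picture of the decode-then-encode step.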

Utilities

GlueCustomConnectors

AWS Glue provides built-in support for the most commonly used data stores, such as Amazon Redshift, MySQL, and MongoDB. With Glue ETL Custom Connectors, you can subscribe to a third-party connector from AWS Marketplace or build your own connector to connect to data stores that are not natively supported.

  • Development

    Development guide with examples of connectors with simple, intermediate, and advanced functionality. These examples demonstrate how to implement Glue Custom Connectors based on the Spark Data Source or Amazon Athena Federated Query interfaces and plug them into the Glue Spark runtime.

  • Local Validation Tests

    This user guide describes validation tests that you can run locally on your laptop to check that your connector integrates with the Glue Spark runtime.

  • Validation

    This user guide shows how to validate connectors with the Glue Spark runtime in the Glue job system before deploying them for your workloads.

  • Glue Spark Script Examples

    Python script examples that use Spark, Amazon Athena, and JDBC connectors with the Glue Spark runtime.

  • Create and Publish Glue Connector to AWS Marketplace

    If you would like to partner with us or publish your Glue custom connector to AWS Marketplace, please refer to this guide and reach out to us at glue-connectors@amazon.com for further details on your connector.

License Summary

This sample code is made available under the MIT-0 license. See the LICENSE file.

