jayasava/Data-Analysis-with-Apache-Hadoop

Data-Analysis-with-Apache-Hadoop

Banking Data Analysis with Hadoop

1. Creating the database and tables and inserting data using MySQL

Scripts in mysql_scripts.sql

2. Creating Sqoop jobs to load table data into staging tables in HDFS

Sqoop job queries in sqoop_job_queries
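
As a rough illustration of what one saved Sqoop job might look like, the sketch below assembles a `sqoop job --create ... -- import ...` command line. The job name, JDBC URL, user, password file, table, and target directory are all hypothetical; the actual values live in sqoop_job_queries.

```python
import shlex


def build_sqoop_job(job_name, jdbc_url, username, password_file,
                    table, target_dir, mappers=1):
    """Return the argv list that saves a Sqoop import job for one table."""
    return [
        "sqoop", "job", "--create", job_name,
        "--",  # separates the job options from the saved tool and its options
        "import",
        "--connect", jdbc_url,
        "--username", username,
        "--password-file", password_file,
        "--table", table,
        "--target-dir", target_dir,
        "--num-mappers", str(mappers),
    ]


# Hypothetical values; none of these names come from the repository.
argv = build_sqoop_job(
    job_name="load_customers",
    jdbc_url="jdbc:mysql://localhost:3306/bank",
    username="sqoop_user",
    password_file="/user/sqoop/.password",
    table="customers",
    target_dir="/user/hive/staging/customers",
)
print(shlex.join(argv))
```

A job saved this way would then be run with `sqoop job --exec load_customers`.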

3. Loading data from the staging area in HDFS into Hive

  1. Create external tables in Hive as staging tables and load data from HDFS (queries in hdfs_to_hive)
  2. Create Hive tables in the columnar ORC format, which compresses well and speeds up column-level reads (queries in Hive ORC Tables creation)
  3. Add a Hive UDF for encryption and decryption (refer to Adding Hive-UDF for Encryption anf decryption functions)
  4. Insert encrypted data into the ORC tables from the staging tables (queries in Stg_to_ORC Tables)
  5. Truncate the staging tables
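
The table-creation and insert steps above (1, 2, and 4) can be sketched as HiveQL statements generated from one column list. This is only an illustration: the `customers` table, its columns, the HDFS location, and the `encrypt` UDF name are assumptions, and the comma delimiter assumes Sqoop's default text output. The real statements are in the query files named above.

```python
def _column_list(columns):
    """Render [(name, hive_type), ...] as 'name TYPE, name TYPE'."""
    return ", ".join(f"{name} {hive_type}" for name, hive_type in columns)


def stage_table_ddl(table, columns, location):
    """External text table over the files that Sqoop wrote to HDFS."""
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS stg_{table} ({_column_list(columns)})\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        "STORED AS TEXTFILE\n"
        f"LOCATION '{location}'"
    )


def orc_table_ddl(table, columns):
    """Managed table stored as ORC."""
    return f"CREATE TABLE IF NOT EXISTS {table}_orc ({_column_list(columns)}) STORED AS ORC"


def insert_encrypted(table, columns, sensitive, udf="encrypt"):
    """Copy staging rows into the ORC table, wrapping sensitive columns in the UDF."""
    select = ", ".join(
        f"{udf}({name})" if name in sensitive else name for name, _ in columns
    )
    return f"INSERT INTO TABLE {table}_orc SELECT {select} FROM stg_{table}"


# Hypothetical schema; "encrypt" stands in for whatever name the UDF is registered under.
columns = [("customer_id", "INT"), ("account_number", "STRING")]
stage_sql = stage_table_ddl("customers", columns, "/user/hive/staging/customers")
orc_sql = orc_table_ddl("customers", columns)
insert_sql = insert_encrypted("customers", columns, sensitive={"account_number"})

for statement in (stage_sql, orc_sql, insert_sql):
    print(statement + ";\n")
```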

4. Querying users with outstanding loan and credit card balances against given limits

Queries in loan_credit_analysis_queries
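
To show the shape of such a query, here is a small sketch that runs the join and filter against an in-memory SQLite database instead of Hive. The table names, column names, sample rows, and limits are invented for the example, and it assumes "given limits" means balances above a threshold. The repository's own queries are in loan_credit_analysis_queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE loans (customer_id INTEGER, outstanding_balance REAL);
    CREATE TABLE credit_cards (customer_id INTEGER, outstanding_balance REAL);
    INSERT INTO customers VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO loans VALUES (1, 50000), (2, 1000), (3, 75000);
    INSERT INTO credit_cards VALUES (1, 12000), (2, 20000), (3, 500);
""")

# Hypothetical limits: keep users who exceed both thresholds.
LOAN_LIMIT = 40000
CARD_LIMIT = 10000

QUERY = """
    SELECT c.customer_id, c.name,
           l.outstanding_balance AS loan_balance,
           cc.outstanding_balance AS card_balance
    FROM customers c
    JOIN loans l ON l.customer_id = c.customer_id
    JOIN credit_cards cc ON cc.customer_id = c.customer_id
    WHERE l.outstanding_balance > ? AND cc.outstanding_balance > ?
    ORDER BY c.customer_id
"""
rows = conn.execute(QUERY, (LOAN_LIMIT, CARD_LIMIT)).fetchall()
print(rows)
```

Only customer 1 exceeds both limits in this sample data. The same join and `WHERE` clause carry over to HiveQL with the limits written as literals or Hive variables.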

5. Survey Data Analysis

  1. Concatenate all survey files into one (survey_data_format)
  2. Load the survey data from the text file into Hive (survey_text_to_hive)
  3. Count the users who gave a rating below 3 and find the average rating (survey_analysis_queries)
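
The survey steps above can be approximated locally as follows. This is a minimal sketch that assumes each survey file holds one `user_id,rating` pair per line with an integer rating, and one rating per user. The file layout and names are hypothetical, and the repository itself does the analysis in Hive.

```python
import tempfile
from pathlib import Path


def concatenate_surveys(input_dir, output_file):
    """Append every .txt file in input_dir (sorted by name) to output_file."""
    with open(output_file, "w", encoding="utf-8") as out:
        for path in sorted(Path(input_dir).glob("*.txt")):
            text = path.read_text(encoding="utf-8")
            # Make sure the last line of one file does not run into the next.
            out.write(text if text.endswith("\n") else text + "\n")


def analyse(survey_file, threshold=3):
    """Return (number of ratings below threshold, average rating)."""
    ratings = []
    with open(survey_file, encoding="utf-8") as fh:
        for line in fh:
            if line.strip():  # skip blank lines
                _user_id, rating = line.strip().split(",")
                ratings.append(int(rating))
    low = sum(1 for r in ratings if r < threshold)
    return low, sum(ratings) / len(ratings)


# Invented sample data written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    surveys = Path(tmp) / "surveys"
    surveys.mkdir()
    (surveys / "survey_1.txt").write_text("u1,2\nu2,5\n", encoding="utf-8")
    (surveys / "survey_2.txt").write_text("u3,1\nu4,4", encoding="utf-8")
    combined = Path(tmp) / "all_surveys.txt"  # kept outside the input directory
    concatenate_surveys(surveys, combined)
    low_count, average = analyse(combined)

print(low_count, average)
```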
