Banking Data Analysis with Hadoop
Scripts in mysql_scripts.sql
Sqoop job queries in sqoop_job_queries
- Create external tables in Hive as staging tables and load data from HDFS (queries in hdfs_to_hive)
- Create Hive tables in the efficient ORC format (queries in Hive ORC Tables creation)
- Add a Hive UDF for encryption and decryption (refer to Adding Hive-UDF for Encryption and decryption functions)
- Insert encrypted data into the ORC tables from the staging tables (queries in Stg_to_ORC Tables)
- Truncate the staging tables
Queries in loan_credit_analysis_queries
- Concatenate all survey files into one (survey_data_format)
- Load the survey data from the text file into Hive (survey_text_to_hive)
- Count the users who gave a rating below 3 and compute the average rating (survey_analysis_queries)
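
The survey steps above can be illustrated outside Hive with a small Python sketch. It is not the project's implementation (the actual logic lives in survey_data_format, survey_text_to_hive and survey_analysis_queries), and it assumes a hypothetical record layout of one `user_id,rating` pair per line, which the repository's files may not follow. The function names `concatenate_files` and `analyse_ratings` are likewise made up for this example.

```python
from pathlib import Path
from typing import Iterable, Tuple


def concatenate_files(paths: Iterable[Path], combined: Path) -> None:
    """Write the contents of every input file into one combined file."""
    with combined.open("w", encoding="utf-8") as out:
        for path in paths:
            text = path.read_text(encoding="utf-8")
            out.write(text)
            # Keep records from two files from merging when a file
            # does not end with a newline.
            if text and not text.endswith("\n"):
                out.write("\n")


def analyse_ratings(combined: Path, threshold: int = 3) -> Tuple[int, float]:
    """Return (distinct users who rated below threshold, average rating).

    Assumes each non-empty line looks like "user_id,rating" (hypothetical).
    """
    ratings = []
    low_users = set()
    with combined.open(encoding="utf-8") as lines:
        for line in lines:
            line = line.strip()
            if not line:
                continue
            user_id, rating_text = line.split(",")
            rating = int(rating_text)
            ratings.append(rating)
            if rating < threshold:
                low_users.add(user_id)
    # Average over all ratings; 0.0 when the file holds no records.
    average = sum(ratings) / len(ratings) if ratings else 0.0
    return len(low_users), average
```

Counting distinct users rather than rows is a choice made for this sketch; if one user can submit several surveys and each low rating should count, replace the set with a plain counter.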