Learn how to slice and dice data using the next generation big data platform - Apache Spark!

This GitHub repository will include the source code for the course "The Ultimate Apache Spark with Java Course - Hands On".

Note for newcomers

Check out our complete course on the following platforms:

  1. Udemy
  2. Job Ready Programmer (JRP)
  3. JRP Membership (Includes Data Analyst & Software Developer Career Paths)

About the course :

  • Apache Spark is a next-generation batch and stream processing engine. It has been reported to run up to 100 times faster than Hadoop MapReduce on some in-memory workloads, and it makes distributed big data applications much easier to develop.
  • Its demand has skyrocketed in recent years, and having this technology on your resume is truly a game changer.
  • Over 3000 companies are using Spark in production right now and the list is growing very quickly!
  • Some of the big names include: Oracle, Hortonworks, Cisco, Verizon, Visa, Microsoft, Amazon as well as most of the big world banks and financial institutions!
  • You'll develop over 15 practical Spark Java applications that crunch through real-world data, slicing and dicing it in various ways using several data transformation techniques.
  • This course is especially important for people who would like to be hired as a Java developer or data engineer, because Spark is a highly sought-after skill.
  • We'll even go over how to set up a live cluster and configure Spark jobs to run in the cloud.
  • You'll also learn about the practical implications of performance tuning and of scaling out a cluster to work with big data, so there is a lot to learn in this course.
  • This course has a 30-day money-back guarantee. You will have access to all of the code used in this course.

Topics covered in this course :

In this course you'll learn everything you need to know about using Apache Spark in your organization with its latest and greatest Java Datasets API. Below are some of the things you'll learn:

  • How to develop Spark Java applications using Spark SQL DataFrames
  • Understand how the Spark Standalone cluster works behind the scenes
  • How to use various transformations to slice and dice your data in Spark Java
  • How to marshal/unmarshal Java domain objects (POJOs) while working with Spark Datasets
  • Master joins, filters, and aggregations, and ingest data of various sizes and file formats (TXT, CSV, JSON, etc.)
  • Analyze over 18 million real-world comments on Reddit to find the most trending words used
  • Develop programs using Spark Streaming for streaming stock market index files
  • Stream network sockets and messages queued on a Kafka cluster
  • Learn how to develop the most popular machine learning algorithms using Spark MLlib
  • Covers the most popular algorithms: Linear Regression, Logistic Regression and K-Means Clustering
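
To give a feel for the "most trending words" exercise mentioned above, here is a minimal sketch of the underlying idea: split text into words, drop common stop words, group and count, then sort by frequency. It uses only the JDK stream API, not Spark, so it runs without any cluster or dependency. It is not the course's code. The class `TrendingWords`, the method `topWords`, the sample comments, and the stop-word list are all hypothetical names and data chosen for illustration. A Spark job would express the same steps as distributed transformations over a Dataset.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; not part of the course code or the Spark API.
final class TrendingWords {

    // Returns the n most frequent words across all comments, most frequent first.
    // Ties are broken alphabetically so the result is deterministic.
    static List<Map.Entry<String, Long>> topWords(
            List<String> comments, Set<String> stopWords, int n) {
        // Split each comment into lowercase words, drop stop words, count occurrences.
        Map<String, Long> counts = comments.stream()
                .flatMap(c -> Arrays.stream(c.toLowerCase(Locale.ROOT).split("[^a-z']+")))
                .filter(w -> !w.isEmpty() && !stopWords.contains(w))
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

        // Sort by count descending, then by word ascending, and keep the first n.
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder())
                        .thenComparing(Map.Entry.<String, Long>comparingByKey()))
                .limit(n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for real comments.
        List<String> comments = List.of(
                "Spark is fast",
                "I love Spark",
                "Java and Spark work well",
                "Java is fine");
        Set<String> stopWords = Set.of("is", "i", "and");

        System.out.println(topWords(comments, stopWords, 2)); // prints [spark=3, java=2]
    }
}
```

The single-machine version is handy for checking the logic on a small sample before running the same pipeline on a large data set.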

Contact us

About Imtiaz Ahmad

  • Imtiaz Ahmad is an award-winning Udemy Instructor who is highly experienced in big data technologies and enterprise software architectures.
  • Imtiaz has spent a considerable amount of time building financial software on Wall St. and worked with companies like S&P, Goldman Sachs, AOL and JP Morgan along with helping various startups solve mission-critical software problems.
  • In his 13+ years of experience, Imtiaz has also taught software development in programming languages like Java, C++, Python, PL/SQL, Ruby and JavaScript.
  • He’s the founder of Job Ready Programmer — an online programming school that prepares students of all backgrounds to become professional job-ready software developers through real-world programming courses.
  • Learn both data analysis and software development on a single platform: Job Ready Programmer
