Big Data Hadoop and Spark Developer Online Training
(All course fees are in USD)
Course Description
The Big Data Hadoop online training (aligned to the Cloudera CCA175 exam) is designed to give you in-depth knowledge of the Big Data framework using Hadoop and Spark. In this hands-on Hadoop course, you will execute real-life, industry-based projects using the Integrated Lab. This online course prepares you to become a Big Data Developer.
Offered in Partnership with
Simplilearn
Course Delivery
- Online self-paced learning: 10+ hours
- Live online virtual classroom training: 50+ hours
Total online blended learning: 60+ hours
Benefits
- Total blended learning of 60+ hours
- Live online virtual classes by industry experts
- 4 real-life industry projects using Hadoop, Hive, and the Big Data stack
- Online lab included during course validity period
- Training on YARN, MapReduce, Pig, Hive, HBase, and Apache Spark
- Aligned to Cloudera CCA175 certification exam. Details: https://www.cloudera.com/about/training/certification/cdhhdp-certification/cca-spark.html
Skills to be Learned
- Real-time data processing
- Functional programming
- Spark applications
- Parallel processing
- Spark RDD optimization techniques
- Spark SQL
Award upon Successful Completion
Big Data Hadoop and Spark Developer “Certificate of Achievement”
Awarding Organisation
Simplilearn
Learning Path
Besides enhancing your knowledge, this online course is aligned with, and prepares you for, the Cloudera CCA Spark and Hadoop Developer Exam (CCA175). Details can be found here: CCA Spark and Hadoop Developer Certification – Cloudera
Learning Outcomes
This Big Data Hadoop and Spark Developer course will enable you to:
- Learn how to navigate the Hadoop ecosystem and understand how to optimize its use
- Ingest data using Sqoop, Flume, and Kafka
- Implement partitioning, bucketing, and indexing in Hive
- Work with RDDs in Apache Spark
- Process real-time streaming data
- Perform DataFrame operations in Spark using SQL queries
- Implement User-Defined Functions (UDF) and User-Defined Aggregate Functions (UDAF) in Spark
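To illustrate the last outcome, here is a minimal conceptual sketch in plain Python of how a UDF differs from a UDAF. It deliberately does not use the Spark API: the sample rows and the function names `to_upper` and `average_by_key` are hypothetical and chosen only for illustration. A UDF maps one input value to one output value, while a UDAF reduces many rows to one result per group.

```python
from collections import defaultdict

# Hypothetical sample rows: (department, salary). Not from the course material.
rows = [("sales", 100), ("sales", 300), ("ops", 200)]


def to_upper(value: str) -> str:
    """UDF-style function: one input value in, one output value out."""
    return value.upper()


def average_by_key(pairs):
    """UDAF-style function: many input rows in, one result per group out."""
    totals = defaultdict(lambda: [0, 0])  # key -> [running sum, row count]
    for key, value in pairs:
        totals[key][0] += value
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


# Apply the row-level function first, then aggregate per department.
upper_rows = [(to_upper(dept), salary) for dept, salary in rows]
averages = average_by_key(upper_rows)
print(averages)
```

In Spark itself, the same split would typically appear as a row-level function applied to a column versus an aggregation applied after a group-by; the exact API calls depend on the Spark version and language used in the course.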
Assessments
Course-end Assessment
Industry Projects
Project 1 Analyzing Historical Insurance Claims
Use Hadoop features to predict patterns and share actionable insights for a car insurance company.
Project 2 Analyzing Intraday Price Changes
Use Hive features for data engineering and analysis of New York Stock Exchange data.
Project 3 Analyzing Employee Sentiment
Perform sentiment analysis on employee review data gathered from Google, Netflix, and Facebook.
Project 4 Analyzing Product Performance
Perform product and customer segmentation to increase the sales of Amazon.
Course Completion Criteria
- Completion of online self-paced learning
- Attendance of live virtual classes
- A score of at least 75% in course-end assessment
- Successful evaluation in at least one project
Who Should Enrol
- Analytics professionals
- Senior IT professionals
- Testing and mainframe professionals
- Data management professionals
- Business intelligence professionals
- Project managers
- Graduates looking to begin a career in big data analytics
Prerequisites
It is recommended that you have knowledge of:
- Core Java
- SQL
Course Overview
Lesson 01 – Course Introduction
Lesson 02 – Introduction to Big Data and Hadoop
Lesson 03 – Hadoop Architecture, Distributed Storage (HDFS): The Storage Layer
Lesson 04 – Distributed Processing – MapReduce Framework
Lesson 05 – MapReduce Advanced Concepts
Lesson 06 – Apache Hive
Lesson 07 – Apache Pig
Lesson 08 – NoSQL Databases – HBase
Lesson 09 – Data Ingestion into Big Data Systems and ETL
Lesson 10 – YARN Introduction
Lesson 11 – Introduction to Python for Apache Spark
Lesson 12 – Functions
Lesson 13 – Big Data and the Need for Spark
Lesson 14 – Deep Dive into Apache Spark Framework
Lesson 15 – Working with Spark RDDs
Lesson 16 – Spark SQL and DataFrames
Lesson 17 – Machine Learning using Spark ML
Lesson 18 – Stream Processing Frameworks and Spark Streaming
Lesson 19 – Spark Structured Streaming
Lesson 20 – Spark GraphX
Accessible Period of Course
1 year from date of enrolment
Customer Reviews
Kinshuk Srivastava
Data Scientist at Walmart
The course is very informative and interactive and that is the best part of this training.
Shubhangi Meshram
Senior Technical Associate at Tech Mahindra
I am impressed with the overall structure of training, like if we miss class we get the recording, for practice we have CloudLabs, discussion forum for subject clarifications, and the trainer is always there to answer.
Solomon Larbi Opoku
Senior Desktop Support Technician
Content looks comprehensive and meets industry and market demand. The combination of theory and practical training is amazing.
Navin Ranjan
Assistant Consultant
Faculty is very good and explains all the things very clearly. Big data is totally new to me so I am not able to understand a few things but after listening to recordings I get most of the things.
Ludovick Jacob
Manager of Enterprise Database Engineering & Support at USAC
I really like the content of the course and the way trainer relates it with real-life examples.
Puviarasan Sivanantham
Data Engineer at Fanatics, Inc.
Dedication of the trainer towards answering each & every question of the trainees makes us feel great and the online session as real as a classroom session.
Richard Kershner
Software Developer
The trainer was knowledgeable and patient in explaining things. Many things were significantly easier to grasp with a live interactive instructor. I also like that he went out of his way to send additional information and solutions after the class via email.
Aaron Whigham
Business Analyst at CNA Surety
Very knowledgeable trainer, appreciate the time slot as well… Loved everything so far. I am very excited…
Rudolf Schier
Java Software Engineer at DAT Solutions
Great approach for the core understanding of Hadoop. Concepts are repeated from different points of view, responding to audience. At the end of the class you understand it.
Priyanka Garg
Sr. Consultant
Very informative and active sessions. Trainer is easy going and very interactive.
Peter Dao
Senior Technical Analyst at Sutter Health
The content is well designed and the instructor was excellent.
Anil Prakash Singh
Project Manager/Senior Business Analyst @ Tata Consultancy Services
The trainer really went the extra mile to help me work along. Thanks
Dipto Mukherjee
ETL Lead at Syntel
Excellent learning experience. The training was superb! Thanks Simplilearn for arranging such wonderful sessions.
Course Features
- Students: 0
- Max students: 10000
- Duration: 60 hours
- Skill level: All
- Language: English
- Re-take course: 100000