Overview
The Big Data Hadoop developer course is designed to impart in-depth knowledge of Big Data processing using Hadoop, Spark, and Data Science.
LEARNING OUTCOMES
- Understand the various components of the Hadoop ecosystem, such as Hadoop 2.7, YARN, MapReduce, Pig, Hive, Impala, HBase, Sqoop, Flume, and Apache Spark
- Gain knowledge of the Hadoop Distributed File System (HDFS) and YARN as well as their architecture, and learn how to work with them for storage and resource management
- Understand MapReduce and its characteristics, and assimilate advanced MapReduce concepts
- Get an overview of Sqoop and Flume, and describe how to ingest data using them
- Understand different types of file formats, Avro schemas, using Avro with Hive and Sqoop, and schema evolution
- Get to know HBase, its architecture, and its data storage, and work with HBase. You will also understand the differences between HBase and an RDBMS
- Gain a working knowledge of Pig and its components
- Do functional programming in Spark
- Understand resilient distributed datasets (RDDs) in detail
- Implement and build Spark applications
- Understand the common use cases of Spark and the various interactive algorithms
- Learn Spark SQL, and create, transform, and query data frames
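As a taste of the MapReduce outcome above, here is a minimal plain-Python sketch (illustrative only, not course material) of the map, shuffle, and reduce phases that Hadoop runs in a distributed fashion, applied to a word count:

```python
from collections import defaultdict

def map_phase(line):
    # Map: emit a (word, 1) pair for every word in an input line
    return [(word, 1) for word in line.split()]

def shuffle_phase(pairs):
    # Shuffle: group all values by key, as Hadoop does between map and reduce
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: sum the counts for each word
    return {word: sum(counts) for word, counts in groups.items()}

lines = ["big data big ideas", "data pipelines"]
pairs = [pair for line in lines for pair in map_phase(line)]
counts = reduce_phase(shuffle_phase(pairs))
print(counts)  # {'big': 2, 'data': 2, 'ideas': 1, 'pipelines': 1}
```

In a real Hadoop job the same three phases run across many machines, with HDFS providing the storage and YARN managing the resources, exactly as covered in the modules below.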
DURATION: 4-DAY WORKSHOP + POST-WORKSHOP SUPPORT
MODULES
- Introduction
- Introduction to Big data and Hadoop Ecosystem
- HDFS and YARN
- MapReduce and Sqoop
- Basics of Hive and Impala
- Types of Data Formats
- Advanced Hive Concept and Data File Partitioning
- Apache Flume and HBase
- Pig/Tableau & QlikView
- Basics of Apache Spark
- RDDs in Spark
- Implementation of Spark Applications
- Spark Parallel Processing
- Spark RDD Optimization Techniques
- Spark Algorithm
DELIVERABLES
- 2 days of instructor-led classroom training from a certified trainer
- Course materials and practice exercises for the exam
- Course completion certificate