Apache Spark Training Chennai

Module 1

Big Data Landscape

Why Big Data

The 3 Vs of Big Data

Hadoop Ecosystem

Introduction to Apache Spark

Features of Apache Spark

Apache Spark Stack

Introduction to RDDs

RDD Transformations

What is good and bad in MapReduce?

Why use Apache Spark

Module 2


Single node

Include Hadoop

Include Apache Spark

Include Hive

Include Sqoop

Include Hue

Module 3

Deep Dive in HDFS

HDFS Design

Fundamentals of HDFS

Rack Awareness

Read/Write from HDFS

HDFS Federation and High Availability (Hadoop 2.x)

HDFS Command Line Interface

Module 4

Spark Shell Hands On Using HDFS

Spark Shell Introduction

Create file using Hue

Spark Shell extracting file from HDFS

Create RDD from HDFS file

Module 5

Programming with RDD Part-1

Creating new RDD

Transformations on RDD

Lineage Graph

Actions on RDD

RDD Concepts on Persist and Cache

Lazy evaluation of RDD

Module 6

Scala/Spark Functional Programming

Using Function Literals

Anonymous Functions

Define a function which accepts another function

Module 7

RDD Transformation Programming in Depth

Hands on and core concepts of map() transformation

Hands on and core concepts of filter() transformation

Hands on and core concepts of flatMap() transformation

Compare map and flatMap transformation
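
As a rough illustration of the map and flatMap comparison, the sketch below models both transformations on an ordinary Python list. It does not use Spark; the sample lines and the variable names are assumptions made only for this example.

```python
from itertools import chain

# Hypothetical sample input: two lines of text
lines = ["hello spark", "hello hdfs"]

# map-style: exactly one output element per input element,
# so the result is a list of word lists
mapped = [line.split(" ") for line in lines]

# flatMap-style: each input may yield zero or more elements,
# and the results are flattened into a single sequence of words
flat_mapped = list(chain.from_iterable(line.split(" ") for line in lines))

print(mapped)       # nested: one list per line
print(flat_mapped)  # flat: one entry per word
```

The nested result keeps the line boundaries, while the flattened result is what a word count typically starts from.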

Module 8

Apache Spark in Action

Hands on and core concepts of reduce() action

Hands on and core concepts of fold() action

Hands on and core concepts of aggregate() action

Basics of Accumulator

Hands on and core concepts of collect() action

Hands on and core concepts of take() action

Ordered access of RDD
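
The difference between reduce(), fold() and aggregate() can be sketched in plain Python by treating an RDD as a list of partitions. This is a simplified model and not Spark code; the helper names rdd_reduce, rdd_fold and rdd_aggregate and the sample data are hypothetical.

```python
from functools import reduce
from operator import add

# Hypothetical RDD modelled as two partitions
partitions = [[1, 2, 3], [4, 5]]

def rdd_reduce(parts, op):
    # Reduce each non-empty partition, then merge the partial results
    partials = [reduce(op, p) for p in parts if p]
    return reduce(op, partials)

def rdd_fold(parts, zero, op):
    # The zero value seeds every partition and the final merge
    partials = [reduce(op, p, zero) for p in parts]
    return reduce(op, partials, zero)

def rdd_aggregate(parts, zero, seq_op, comb_op):
    # seq_op folds elements into an accumulator of a different type;
    # comb_op merges the per-partition accumulators
    partials = [reduce(seq_op, p, zero) for p in parts]
    return reduce(comb_op, partials, zero)

total = rdd_reduce(partitions, add)
folded = rdd_fold(partitions, 0, add)
# A zero value that is not neutral is counted once per partition
# and once more in the final merge
skewed = rdd_fold(partitions, 1, add)
sum_count = rdd_aggregate(
    partitions,
    (0, 0),
    lambda acc, x: (acc[0] + x, acc[1] + 1),
    lambda a, b: (a[0] + b[0], a[1] + b[1]),
)
mean = sum_count[0] / sum_count[1]
```

In this model, aggregate() is the only one of the three whose result type (a sum and count pair) differs from the element type, which is why it takes two functions.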

Module 9

Apache Spark Execution Model

How Spark executes a program

Concepts of RDD partitioning

RDD data shuffling and performance issues

Module 10

Apache Spark PairRDD

Core concepts of PairRDD

Creation of PairRDD

Aggregation in PairRDD

Understanding aggregation functions in depth

How does reduceByKey() work conceptually?

How does foldByKey() work conceptually?

How does combineByKey() work conceptually?
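
One way to picture combineByKey() conceptually is as two phases: build a combiner per key inside each partition, then merge the combiners across partitions. The plain-Python sketch below models that idea without Spark. The function combine_by_key and the sample data are hypothetical; the three parameter names follow the roles of createCombiner, mergeValue and mergeCombiners.

```python
def combine_by_key(parts, create_combiner, merge_value, merge_combiners):
    # Phase 1: within each partition, build one combiner per key
    per_partition = []
    for part in parts:
        combiners = {}
        for key, value in part:
            if key in combiners:
                combiners[key] = merge_value(combiners[key], value)
            else:
                combiners[key] = create_combiner(value)
        per_partition.append(combiners)

    # Phase 2: merge combiners for the same key across partitions
    merged = {}
    for combiners in per_partition:
        for key, comb in combiners.items():
            if key in merged:
                merged[key] = merge_combiners(merged[key], comb)
            else:
                merged[key] = comb
    return merged

# Hypothetical key/value pairs spread over two partitions
pair_partitions = [[("a", 1), ("b", 4)], [("a", 3), ("a", 5)]]

# Per-key average via (sum, count) combiners
sums_counts = combine_by_key(
    pair_partitions,
    lambda v: (v, 1),
    lambda acc, v: (acc[0] + v, acc[1] + 1),
    lambda x, y: (x[0] + y[0], x[1] + y[1]),
)
averages = {k: s / c for k, (s, c) in sums_counts.items()}
```

In this model, reduceByKey() and foldByKey() can be read as special cases where the combiner has the same type as the values.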

Module 11

Spark PairRDD Hands On Lab





Module 12

Spark PairRDD Joining, Zipping and

reduceByKey versus groupByKey performance issue



Joining (left, right, inner, etc.)

Module 13

Understanding Hadoop SequenceFile

Creating SequenceFile and Processing using Spark

Creating SequenceFile using TSV file

Loading Data in Apache Hive

Processing SequenceFile as an RDD

Module 14

Spark Shared Variables

Shared Variables: Broadcast Variables

Shared Variables: Accumulators

Module 15

Spark Accumulator

Word count and Character Count

Counting Bad records in a file

Module 16

Spark Broadcast Variable

Joining two CSV files, one as a broadcast lookup table

Module 17

Spark API

Broadcast Variable, Filter Functions and Saving File

Module 18

Spark API

Spark Join, GroupBy and Swap function

Module 19

Spark API

Remove Header from CSV file and Map Each column to Row Data

Module 20

Spark SQL


SchemaRDD replaced by DataFrame API

History of SparkSQL

Catalyst Optimizer

Module 21

SparkSQL Hands On Sessions

Hive Configuration

Create Hive table using Spark

Load Data in Hive table using Spark

Create another table using DataFrame

Module 22

Implementing Business Logic using SparkSQL

Loading CSV file

Spark Case classes (To create schema for CSV file)

Convert RDD to DataFrame using DataFrame API to query data

Using SQL query on DataFrame

Module 23

Spark Loading and Saving Your Data


CSV and TSV files

JSON Files

Module 24

Spark Loading and Saving Your Data SQL and NoSQL


HBase (NoSQL)

Module 25

Writing Spark Applications

Spark Applications vs Spark Shell

Creating the SparkContext

Configuring Spark Properties

Building and Running a Spark Application


Module 26

Spark Streaming in Depth Part-1

Spark Streaming Overview

Example: Streaming Word Count

Module 27

Spark Streaming in Depth Part-2

Other Streaming Operations

Sliding Window Operation

Developing Spark Streaming Applications

Module 28

Spark Algorithms Part-1

Iterative Algorithm

Graph Analysis

Machine Learning

Module 29

Case studies
