Big Data Developer

Big Data refers to solutions for storing and processing large data sets. Stay updated in your Big Data career with lifetime access to live classes. The Big Data training course syllabus is designed by working professionals at MNCs. After training at our Big Data training institute in Chennai, you will be able to develop even complex SQL and PL/SQL scripts on your own, which will help you handle real-time scenarios at work. Infrastructure and tools for Big Data storage, scalability, and distributed processing are compared, discussed, and implemented in demo practice sessions. Big Data is the next major step in IT after cloud computing and is expected to be a leading trend in the future.

      About The Trainer

  • Dinesh Kumar S has been working in data analytics for more than 8 years. He has given many deep-dive presentations on Big Data at IIT Madras.

  • Dinesh Kumar S is a Chief Data Scientist, certified as a Cloudera CCA 175 Big Data Hadoop developer, who previously worked with Infosys.

  • He is the only South Indian trainer to hold the CCA 175 Big Data Hadoop and Spark Developer international certification from Cloudera.

  • Dinesh Kumar S specializes in Hadoop projects. He has also done production work with Databricks for Apache Spark, Hive, Pig, Sqoop, Flume, Oozie, and NoSQL platforms.

  • He guides associates in clearing international certifications from both Hortonworks (Level 1) and Cloudera (Level 2). He is an expert in Big Data analytics and data science development.

    Weekday / Weekend classes Available.

    Talk to the Trainer @ +91-9789888424

HADOOP:
1.BIG DATA

2.VS

3.ROLE OF HADOOP IN BIG DATA

4.HADOOP AND ITS ECOSYSTEM

5.OVERVIEW OF OTHER BIG DATA SYSTEMS

6.REQUIREMENTS IN HADOOP

7.USECASES OF HADOOP

HIVE:

1.INTRODUCTION-HIVE VS RDBMS

2.DETAILED INSTALLATION (CONFIGURATION, METASTORE, INTEGRATING WITH HUE) STARTING METASTORE AND HIVESERVER2

3.DATA TYPES (PRIMITIVE, COLLECTION)

4.CREATE TABLES (MANAGED VS EXTERNAL) AND DML OPERATIONS (LOAD, INSERT, EXPORT)

5.MANAGED VS EXTERNAL TABLES

6.QL QUERIES (SELECT, WHERE, GROUP BY, HAVING, SORT BY, ORDER BY)

7.HIVE ACCESS THROUGH HIVE CLIENT

8.BEELINE AND HUE, FILE FORMATS (RC, ORC, PARQUET, SEQUENCE)

9.PARTITIONING

10.PARTITION WITH EXTERNAL TABLE
11.DROPPING PARTITIONS AND CORRESPONDING CONFIGURATION PARAMETERS

12.BUCKETING, PARTITIONING VS BUCKETING

13.VIEWS, DIFFERENT TYPES OF JOINS (INNER, OUTER)

14.MAP SIDE JOIN, BUCKETING JOIN

15.SERDE (CSVSERDE, JSONSERDE)

16.PARALLEL EXECUTION

17.SAMPLING DATA

18.SPECULATIVE EXECUTION

HBASE:

1.INTRODUCTION TO NOSQL

2.CAP THEOREM

3.CLASSIFICATION OF NOSQL

4.HBASE AND RDBMS, HBASE AND HDFS

5.HBASE ARCHITECTURE (READ PATH, WRITE PATH, COMPACTION, SPLITS)

6.INSTALLATION

7.CONFIGURATION

8.ROLE OF ZOOKEEPER

9.HBASE SHELL

10.JAVA BASED APIS (SCAN, GET, OTHER ADVANCED APIS)

11.INTRODUCTION TO FILTERS

12.ROWKEY DESIGN

13.MAPREDUCE INTEGRATION

14.PERFORMANCE TUNING

15.WHAT’S NEW IN HBASE 0.98

16.BACKUP AND DISASTER RECOVERY

17.HANDS ON

MAPREDUCE:
1.THEORY

2.DATA FLOW (MAP – SHUFFLE – REDUCE)

3.MAPRED VS MAPREDUCE APIS

4.PROGRAMMING [MAPPER, REDUCER, COMBINER, PARTITIONER]

5.WRITABLES

6.INPUT FORMAT

7.OUTPUT FORMAT

8.STREAMING API USING PYTHON

9.INHERENT FAILURE HANDLING USING SPECULATIVE EXECUTION

10.MAGIC OF SHUFFLE PHASE

11.FILE FORMATS

12.SEQUENCE FILES

MONGO DB:
1.INTRODUCTION TO MONGODB

2.DOCUMENTS AND COLLECTIONS

3.SIMPLE QUERIES

4.SIMPLE UPDATES AND DELETES

5.MORE COMPLEX TYPES OF QUERIES

6.UPDATES AND ARRAYS

7.INDEXING 1 & 2

8.MONGO RESTFUL API

9.MAPREDUCE

10.MONGO SECURITY

11.MONGO REPLICATION AND SHARDING

12.CONCLUSION

PYTHON:
1.INTRODUCTION

2.GETTING STARTED QUICKLY: FUNCTIONS + MODULES + THE STANDARD LIBRARY

3.WORKING WITH DATA: “GROWING” A LIST AT RUNTIME

4.WORKING WITH STRUCTURED DATA: COMBINING THE BUILT-IN DATA STRUCTURES

HDFS:

1.HDFS CONCEPTS

2.ARCHITECTURE DAEMONS

3.BLOCK CONCEPT

4.DATA (FILE READ, FILE WRITE)

5.FAULT TOLERANCE

6.COHERENCY

7.DATA INTEGRITY

8.HDFS CONCEPTS

9.ROLE OF SECONDARY NAME NODE

10.HIGH AVAILABILITY

11.SHELL COMMANDS

12.JAVA-BASED API, HDFS FEDERATION, PSEUDO-DISTRIBUTED HADOOP CLUSTER INSTALLATION

13.HUE INSTALLATION

DATA WAREHOUSE:
1.INTRODUCTION TO DATA WAREHOUSING

2.DATA WAREHOUSE HARDWARE

3.DESIGNING AND IMPLEMENTING A DATA WAREHOUSE

4.CREATING AN ETL SOLUTION WITH SSIS

5.IMPLEMENTING CONTROL FLOW IN AN SSIS PACKAGE

6.DEBUGGING AND TROUBLESHOOTING SSIS PACKAGES

7.IMPLEMENTING AN INCREMENTAL ETL PROCESS

8.INCORPORATING DATA FROM THE CLOUD INTO A DATA WAREHOUSE
9.ENFORCING DATA QUALITY

10.USING MASTER DATA SERVICES

11.EXTENDING SQL SERVER INTEGRATION SERVICES

12.DEPLOYING AND CONFIGURING SSIS PACKAGES

13.CONSUMING DATA IN A DATA WAREHOUSE

BASIC ON JAVA:
1.JAVA – WHAT, WHERE AND WHY?

2.HISTORY AND FEATURES OF JAVA

3.INTERNALS OF JAVA PROGRAM

4.DIFFERENCE BETWEEN JDK, JRE AND JVM

5.INTERNAL DETAILS OF JVM

6.VARIABLE AND DATA TYPE

7.UNICODE SYSTEM

8.NAMING CONVENTION

BASIC ON LINUX:
1.SYSTEM ADMINISTRATION OVERVIEW

2.BOOTING AND SHUTTING DOWN LINUX

3.MANAGING USERS AND GROUPS

4.LINUX FILE SECURITY

5.WORKING WITH THE LINUX KERNEL

6.SYSTEM BACKUPS

7.BASIC NETWORKING

8.INTRODUCTION TO SYSTEM SECURITY

9.NETWORKED FILE SYSTEMS (NFS)

10.INSTALLATION AND CONFIGURATION

11.MANAGING SOFTWARE AND DEVICES

12.THE LINUX FILE SYSTEM

13.CONTROLLING PROCESSES

14.SHELL SCRIPTING OVERVIEW

15.TROUBLESHOOTING THE SYSTEM

16.LAMP SERVER BASICS, THE SAMBA FILE SHARING FACILITY

PIG:
1.PIG INTRODUCTION

2.DATA TYPES

3.OPERATORS (ARITHMETIC, RELATIONAL, DIAGNOSTIC)

4.UDF (JAVA) FUNCTIONS (EVAL FUNCTIONS AND LOAD/STORE FUNCTIONS)

5.PIG LATIN STATEMENTS

6.MULTI-QUERY EXECUTION, SPECIALIZED JOINS

7.OPTIMIZATION RULES

8.MEMORY MANAGEMENT

9.EXTENSIVE HANDS ON WITH LARGE DATASETS TRYING ALL THE ABOVE DISCUSSED THEORIES IN PRACTICAL SESSION.

SQOOP:
1.SQOOP ARCHITECTURE

2.SQOOP INSTALLATION

3.COMMANDS (IMPORT, HIVE-IMPORT, EVAL, HBASE IMPORT, IMPORT ALL TABLES, EXPORT)

4.CONNECTORS TO EXISTING DBS AND DW

FLUME:
1.WHY FLUME?

2.ARCHITECTURE

3.CONFIGURATION (AGENTS)

4.SOURCES (EXEC-AVRO-NETCAT)

5.CHANNELS (FILE, MEMORY, JDBC, HBASE)

6.SINKS (LOGGER, AVRO, HDFS, HBASE, FILEROLL)

7.CONTEXTUAL ROUTING (INTERCEPTORS, CHANNEL SELECTORS)
8.INTRODUCTION TO OTHER AGGREGATION FRAMEWORKS

OOZIE:

1.OOZIE ARCHITECTURE

2.INSTALLATION

3.WORKFLOW

4.ACTION (MAPREDUCE, HIVE, PIG, SQOOP), INTRODUCTION TO BUNDLE

5.MAIL NOTIFICATIONS

APACHE SPARK:
1.INTRODUCTION TO SPARK

2.SPARK INSTALLATION DEMO

3.OVERVIEW OF SPARK ON A CLUSTER

4.SPARK STANDALONE CLUSTER

5.SPARK RDD

6.TRANSFORMATIONS IN RDD

7.ACTIONS IN RDD

8.PERSISTENCE IN RDD

9.LOADING DATA IN RDD
10.SAVING DATA THROUGH RDD
11.KEY-VALUE PAIR RDD

12.MAP REDUCE AND PAIR RDD OPERATIONS
13.SCALA AND HADOOP INTEGRATION

14.SPARK SQL

15.DATA FRAME CONCEPT

16.SQL CONTEXT WITH EXAMPLE – JSON

YARN:

1.ANATOMY OF A YARN APPLICATION RUN

2.RESOURCE REQUESTS

3.APPLICATION LIFESPAN

4.BUILDING YARN APPLICATIONS

ZOOKEEPER:

1.INSTALLING AND RUNNING ZOOKEEPER

2.AN EXAMPLE

3.GROUP MEMBERSHIP IN ZOOKEEPER

4.CREATING THE GROUP

5.JOINING A GROUP

NOSQL:

1.WHY NOSQL?

2.AGGREGATE DATA MODELS

3.MORE DETAILS ON DATA MODELS

4.DISTRIBUTION MODELS

SCALA:
1.INTRODUCTION TO SCALA

2.CREATING A SCALA DOC

3.CREATING A SCALA PROJECT

4.THE SCALA REPL

5.SCALA DOCUMENTATION
