Friday, 28 January 2022

Big Data Technology - Syllabus & Model Paper for B.Tech

Big Data Technology
Syllabus for B.Tech

COURSE OBJECTIVES:
1. Understand the Big Data Platform and its Use cases
2. Provide an overview of Apache Hadoop
3. Provide HDFS Concepts and Interfacing with HDFS
4. Understand Map Reduce Jobs

COURSE OUTCOMES:
At the end of the course, the student will be able to:
CO1: Understand the concepts of Big Data and the Hadoop ecosystem
CO2: Configure various Hadoop services in a distributed environment
CO3: Analyze unstructured data using MapReduce
CO4: Understand various advanced MapReduce tasks for analyzing data
CO5: Solve various real-time problems using Hadoop and HBase

UNIT-I
Introduction to Big Data: Big Data definition, Characteristics of Big Data (Volume, Variety, Velocity, Veracity, Validity), Importance of Big Data, Data in the Warehouse and Data in Hadoop, Algorithms using MapReduce, Matrix-Vector Multiplication by MapReduce. Introduction to Hadoop: Hadoop definition, understanding distributed systems and Hadoop, Comparing SQL databases and Hadoop.
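As an illustration of the "Matrix-Vector Multiplication by MapReduce" topic, here is a minimal single-machine sketch in plain Python. It only imitates the map, shuffle, and reduce steps and does not use Hadoop. The function names (`map_phase`, `shuffle`, `reduce_phase`, `matrix_vector_multiply`) and the sparse-triple input format are assumptions made for this example, and the vector is assumed to fit in memory.

```python
from collections import defaultdict

def map_phase(matrix_entries, vector):
    """Map: for each matrix entry (i, j, m_ij), emit (i, m_ij * v_j)."""
    for i, j, m_ij in matrix_entries:
        yield i, m_ij * vector[j]

def shuffle(pairs):
    """Shuffle: group all emitted values by their key (the row index)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: sum the partial products of each row."""
    return {row: sum(values) for row, values in groups.items()}

def matrix_vector_multiply(matrix_entries, vector):
    return reduce_phase(shuffle(map_phase(matrix_entries, vector)))

# The 2x2 matrix [[1, 2], [3, 4]] stored as (row, column, value) triples.
entries = [(0, 0, 1), (0, 1, 2), (1, 0, 3), (1, 1, 4)]
vector = [5, 6]
result = matrix_vector_multiply(entries, vector)
print(result)
```

Each output key is a row index and each value is the corresponding component of the product vector.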

UNIT-II
Hadoop Architecture: History of Hadoop, building blocks of Hadoop, NameNode, DataNode, Secondary NameNode, JobTracker and TaskTracker, YARN. Understanding MapReduce, Word count program using traditional and conventional methods. Components of Hadoop: Working with files in HDFS, Reading and writing the Hadoop Distributed File System, The Design of HDFS, HDFS Concepts, The Command-Line Interface, Hadoop commands, Hadoop Filesystem.
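The word count program named in this unit can be sketched as follows. This is a plain-Python imitation of the MapReduce flow, not a Hadoop job: the `mapper`, `reducer`, and `word_count` names are hypothetical, and the grouping step stands in for the shuffle that the framework would normally perform.

```python
from collections import defaultdict

def mapper(line):
    """Map: emit (word, 1) for every whitespace-separated word in a line."""
    for word in line.lower().split():
        yield word, 1

def reducer(word, counts):
    """Reduce: add up all the counts collected for one word."""
    return word, sum(counts)

def word_count(lines):
    # Stand-in for the shuffle step: group mapper output by word.
    groups = defaultdict(list)
    for line in lines:
        for word, one in mapper(line):
            groups[word].append(one)
    return dict(reducer(word, counts) for word, counts in groups.items())

counts = word_count(["big data", "Big Hadoop"])
print(counts)
```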

UNIT-III
MapReduce: Hadoop Ecosystem, Moving Data in and out of Hadoop, Understanding inputs and outputs of MapReduce. Anatomy of a MapReduce program, A Weather Dataset, Analyzing the Data with Unix Tools, Analyzing the Data with Hadoop, Scaling Out, Hadoop Streaming, Hadoop Pipes, Hadoop Archives, Getting the patent data set, constructing the basic template of a MapReduce program.

UNIT-IV
MapReduce Advanced Programming: Advanced MapReduce: Chaining MapReduce jobs, joining data from different sources, creating a Bloom filter, passing job-specific parameters to your tasks, probing for task-specific information, partitioning into multiple output files, inputting from and outputting to a database, keeping all output in sorted order.
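For the "creating a Bloom filter" topic, a minimal sketch of the data structure itself may help. It is written in plain Python rather than as a MapReduce job. The class name `BloomFilter`, the default sizes, and the use of seeded SHA-256 digests as hash functions are all assumptions chosen for readability, not recommendations for production use.

```python
import hashlib

class BloomFilter:
    """Probabilistic set: may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One byte per bit keeps the sketch simple (not space-efficient).
        self.bits = bytearray(num_bits)

    def _positions(self, item):
        # Derive num_hashes positions by prefixing the item with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).hexdigest()
            yield int(digest, 16) % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # True only if every position for this item has been set.
        return all(self.bits[pos] for pos in self._positions(item))

bf = BloomFilter()
bf.add("hadoop")
print(bf.might_contain("hadoop"))
```

An item that was added is always reported as possibly present, while an item that was never added is usually, but not always, reported as absent.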

UNIT-V
Graph Representation in MapReduce: Modeling data and solving problems with graphs, Shortest Path Algorithm, Friends-of-Friends Algorithm, PageRank Algorithm, Bloom Filter, ZooKeeper: how it helps in monitoring a cluster, how HBase uses ZooKeeper, and how to build applications with ZooKeeper.

TEXT BOOKS:
1. Dirk deRoos, Chris Eaton, George Lapis, Paul Zikopoulos, Tom Deutsch, “Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data”, 1st Edition, TMH, 2012.
2. Hadoop: The Definitive Guide by Tom White, 3rd Edition, O’Reilly.

REFERENCE BOOKS:
1. Hadoop in Action by Chuck Lam, MANNING Publ.
2. Hadoop in Practice by Alex Holmes, MANNING Publishers
3. Mining of Massive Datasets, Anand Rajaraman, Jeffrey D. Ullman, Wiley Publications.
4. Boris Lublinsky, Kevin T. Smith, Alexey Yakubovich, “Professional Hadoop Solutions”, Wiley, ISBN: 9788126551071, 2015.
5. Big Data Black Book (Covers Hadoop 2, MapReduce, Hive, YARN, Pig & Data Visualization), Dream Tech Publications.
6. Chris Eaton, Dirk deRoos et al., “Understanding Big Data”, McGraw Hill, 2012.
7. Tom White, “Hadoop: The Definitive Guide”, O’Reilly, 2012.
8. Vignesh Prajapati, “Big Data Analytics with R and Hadoop”, Packt Publishing, 2013.
9. Tom Plunkett, Brian Macdonald et al., “Oracle Big Data Handbook”, Oracle Press, 2014.
10. Jay Liebowitz, “Big Data and Business Analytics”, CRC Press, 2013.
