Friday, 28 January 2022

Big Data Technology - Syllabus & Model Paper for B.Sc

Big Data Technology
Syllabus for B.Sc
Course Objectives:
To understand the fundamental concepts of Big Data and Hadoop.
To understand the Hadoop architecture and algorithms for MapReduce.
To apply the concepts of HDFS in the Hadoop framework.
To understand HiveQL querying, HBase concepts, and ZooKeeper services.

Course Outcomes:
CO1: Understand Big Data analytics and MapReduce techniques.
CO2: Apply Hadoop techniques for data handling.
CO3: Implement HDFS commands in the Hadoop framework.
CO4: Handle different HiveQL queries.
CO5: Apply different ZooKeeper services in HBase.

UNIT I [CO1]
INTRODUCTION TO BIG DATA: Introduction – Distributed File System – Big Data and its Importance, Four V’s in Big Data, Drivers for Big Data, Big Data Analytics, Big Data Applications, Algorithms using MapReduce, Matrix-Vector Multiplication by MapReduce.

UNIT II [CO2]
INTRODUCTION TO HADOOP: Big Data – Apache Hadoop & Hadoop Ecosystem – Moving Data in and out of Hadoop – Understanding Inputs and Outputs of MapReduce – Data Serialization.

UNIT III [CO3]
HADOOP ARCHITECTURE: Hadoop Architecture, Hadoop Storage: HDFS, Common Hadoop Shell Commands, Anatomy of File Write and Read, NameNode, Secondary NameNode, and DataNode, Hadoop MapReduce Paradigm, Map and Reduce Tasks, JobTracker and TaskTrackers – Cluster Setup – SSH & Hadoop Configuration – HDFS Administration – Monitoring & Maintenance.

UNIT IV [CO4]
HIVE AND HIVEQL, HBASE: Hive Architecture and Installation, Comparison with Traditional Databases, HiveQL – Querying Data – Sorting and Aggregating, MapReduce Scripts, Joins & Subqueries.

UNIT V [CO5]
HBase Concepts – Advanced Usage, Schema Design, Advanced Indexing – ZooKeeper: how it helps in monitoring a cluster, how HBase uses ZooKeeper, and how to build applications with ZooKeeper.

Reference Books:

1. Boris Lublinsky, Kevin T. Smith, Alexey Yakubovich, “Professional Hadoop Solutions”, Wiley, ISBN: 9788126551071, 2015.
2. “Big Data Black Book” (Covers Hadoop 2, MapReduce, Hive, YARN, Pig & Data Visualization), Dreamtech Publications.
3. Chris Eaton, Dirk deRoos et al., “Understanding Big Data”, McGraw Hill, 2012.
4. Tom White, “Hadoop: The Definitive Guide”, O’Reilly, 2012.
5. Vignesh Prajapati, “Big Data Analytics with R and Hadoop”, Packt Publishing, 2013.
6. Tom Plunkett, Brian Macdonald et al., “Oracle Big Data Handbook”, Oracle Press, 2014.
7. Jay Liebowitz, “Big Data and Business Analytics”, CRC Press, 2013.


Big Data Technology
Model Paper 
Time: 3 Hrs Max. Marks: 75
Section – A
Answer the following 5 questions.
Each question carries 10 marks. 5 X 10 = 50M
1. (a) Explain MapReduce functions with an example. [CO1]
Or
(b) Explain Matrix-Vector Multiplication by MapReduce with an example. [CO1]
2. (a) Explain the Hadoop ecosystem. [CO2]
Or
(b) How is data moved in and out of Hadoop? [CO2]
3. (a) What is HDFS? Explain it briefly. [CO3]
Or
(b) Discuss the MapReduce paradigm. [CO3]
4. (a) Explain the Hive architecture. [CO4]
Or
(b) What is HiveQL? Explain MapReduce scripts. [CO4]
5. (a) Explain the HBase architecture. [CO5]
Or
(b) What is ZooKeeper? How does it help in monitoring a cluster? [CO5]


Section – B
Answer any five questions.
Each question carries 3 marks. 5 X 3 = 15M
6. Explain the four V’s in Big Data. [CO1]
7. Explain Big Data and its importance. [CO1]
8. Write about moving data in and out of Hadoop. [CO2]
9. Write about Hadoop storage. [CO2]
10. Write some common Hadoop shell commands. [CO3]
11. Explain SSH. [CO3]
12. Discuss MapReduce scripts. [CO4]
13. How does HBase use ZooKeeper? [CO5]

Section – C
Answer all questions. 5 X 2 = 10M
14. What is Big Data? [CO1]
15. What is data serialization? [CO2]
16. What are the JobTracker and TaskTrackers? [CO3]
17. What is HiveQL? [CO4]
18. Define HBase concepts. [CO5]
