BSc (IT)

Managing Big Data

BSc (IT) distance learning course syllabus for Managing Big Data at Sikkim Manipal University Distance Education. Visit www.smude.edu.in to apply now.

BSc (IT) - Managing Big Data

Course Code: BIT6022

Course Title: Managing Big Data (4 Credits)

 

Course Contents

UNDERSTANDING BIG DATA

What is big data – why big data – Data! – data storage and analysis – comparison with other systems – relational database management systems – grid computing – volunteer computing – convergence of key trends – unstructured data – industry examples of big data – web analytics – big data and marketing – fraud and big data – risk and big data – credit risk management – big data and algorithmic trading – big data and healthcare – big data in medicine – advertising and big data – big data technologies – introduction to Hadoop – open source technologies – cloud and big data – mobile business intelligence – crowdsourcing analytics – inter- and trans-firewall analytics

 

NOSQL DATA MANAGEMENT

Introduction to NoSQL – aggregate data models – aggregates – key-value and document data models – relationships – graph databases – schemaless databases – materialized views – distribution models – sharding – version – MapReduce – partitioning and combining – composing MapReduce calculations

 

BASICS OF HADOOP

Data format – analyzing data with Hadoop – scaling out – Hadoop streaming – Hadoop pipes – design of Hadoop distributed file system (HDFS) – HDFS concepts – Java interface – data flow – Hadoop I/O – data integrity – compression – serialization – Avro – file-based data structures

 

MAPREDUCE APPLICATIONS

MapReduce workflows – unit tests with MRUnit – test data and local tests – anatomy of MapReduce job run – classic MapReduce – YARN – failures in classic MapReduce and YARN – job scheduling – shuffle and sort – task execution – MapReduce types – input formats – output formats

 

HADOOP RELATED TOOLS

HBase – data model and implementations – HBase clients – HBase examples – praxis. Cassandra – Cassandra data model – Cassandra examples – Cassandra clients – Hadoop integration. Pig – Grunt – Pig data model – Pig Latin – developing and testing Pig Latin scripts. Hive – data types and file formats – HiveQL data definition – HiveQL data manipulation – HiveQL queries.

 
