
Syllabus | B.Tech-Computer Science & Engineering | Data Mining and Data Warehousing

Data Mining and Data Warehousing
Pre-requisites: DBMS
Learning Schedule (L T P C): 3 1 0 4

 COURSE DESCRIPTION

Data mining is a class of analytical techniques that examine large amounts of data to discover new and valuable information. This course introduces the core concepts of data mining, its techniques, implementation, benefits, and the outcomes that can be expected from this technology. It also identifies the industry sectors that benefit most from data mining.

Data warehousing involves data preprocessing, data integration, and on-line analytical processing (OLAP) tools for the interactive analysis of multidimensional data, all of which facilitate effective data mining. The course introduces data warehousing and data mining techniques and their software tools. Topics include data warehousing, association analysis, classification, clustering, numeric prediction, and selected advanced data mining topics.

COURSE OBJECTIVES

The objective of this course is to:

  1. Introduce data mining principles and techniques.
  2. Introduce data mining as a cutting-edge business intelligence tool.
  3. Develop and apply critical-thinking, problem-solving, and decision-making skills.
  4. Introduce the concepts of data warehousing and the difference between a database and a data warehouse.
  5. Describe and demonstrate basic data mining algorithms, methods, and tools.
  6. Describe the ETL model and the star schema used to design a data warehouse.

COURSE OUTCOMES

At the end of the course, the student will be able to:

  1. Design a data warehouse or data mart that presents the information needed by management and that can be used for managing clients.
  2. Design and implement a quality data warehouse or data mart, and administer its data resources so that it truly meets management's requirements.
  3. Evaluate standards and new technologies to determine their potential impact on the information resources of a large, complex data warehouse or data mart.
  4. Use data mining tools for projects and to build reliable products as per demand.

COURSE CONTENT

Unit I

Overview, Motivation (for Data Mining), Data Mining: Definition & Functionalities, Data Processing, Forms of Data Preprocessing, Data Cleaning: Missing Values, Noisy Data (Binning, Clustering, Regression, Computer and Human Inspection), Inconsistent Data, Data Integration and Transformation. Data Reduction: Data Cube Aggregation, Dimensionality Reduction, Data Compression, Numerosity Reduction, Clustering, Discretization and Concept Hierarchy Generation.
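
As an illustration of the binning technique listed under Noisy Data, the following is a minimal sketch of smoothing by bin means with equal-frequency bins. The function name `smooth_by_bin_means` and the sample price values are assumptions chosen for illustration; they are not part of the prescribed course material.

```python
def smooth_by_bin_means(values, bin_size):
    """Sort the values, split them into bins of `bin_size` items,
    and replace every value with the mean of its bin."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), bin_size):
        bin_values = ordered[start:start + bin_size]
        mean = sum(bin_values) / len(bin_values)
        smoothed.extend([mean] * len(bin_values))
    return smoothed


# Hypothetical sample data (illustrative only).
prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]
print(smooth_by_bin_means(prices, 3))
# -> [9.0, 9.0, 9.0, 22.0, 22.0, 22.0, 29.0, 29.0, 29.0]
```

Each bin of three sorted values is replaced by its mean, which reduces the effect of small random fluctuations in the data.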

Unit II

Concept Description: Definition, Data Generalization, Analytical Characterization, Analysis of Attribute Relevance, Mining Class Comparisons, Statistical Measures in Large Databases. Measuring Central Tendency, Measuring Dispersion of Data, Graph Displays of Basic Statistical Class Description, Mining Association Rules in Large Databases, Association Rule Mining, Mining Single-Dimensional Boolean Association Rules from Transactional Databases: Apriori Algorithm, Mining Multilevel Association Rules from Transaction Databases and Mining Multi-Dimensional Association Rules from Relational Databases.
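
The Apriori algorithm named in this unit can be sketched as follows. This is a compact illustration under stated assumptions, not a reference implementation: the function name `apriori`, the use of an absolute support count as the threshold, and the sample baskets are all hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every itemset that appears
    in at least `min_support` transactions."""
    transactions = [frozenset(t) for t in transactions]

    def count(candidates):
        # Count how many transactions contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        return {c: n for c, n in counts.items() if n >= min_support}

    items = {item for t in transactions for item in t}
    current = count({frozenset([i]) for i in items})
    frequent = dict(current)
    k = 2
    while current:
        prev = set(current)
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev for s in combinations(c, k - 1))}
        current = count(candidates)
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical market-basket data (illustrative only).
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
frequent = apriori(baskets, min_support=3)

# Confidence of the rule diapers -> beer is support(both) / support(diapers).
confidence = frequent[frozenset({"diapers", "beer"})] / frequent[frozenset({"diapers"})]
print(confidence)  # -> 0.75
```

The prune step relies on the Apriori property: an itemset can only be frequent if all of its subsets are frequent, so candidates with an infrequent subset are discarded before counting.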

Unit III

Classification and Prediction: What is Classification & Prediction, Issues regarding Classification and Prediction, Decision Tree, Bayesian Classification, Classification by Back Propagation, Multilayer Feed-Forward Neural Network, Back Propagation Algorithm, Classification Methods: K-Nearest Neighbor Classifiers, Genetic Algorithm. Cluster Analysis: Data Types in Cluster Analysis, Categories of Clustering Methods, Partitioning Methods. Hierarchical Clustering: CURE and Chameleon, Density Based Methods: DBSCAN, OPTICS, Grid Based Methods: STING, CLIQUE, Model Based Methods: Statistical Approach, Neural Network Approach, Outlier Analysis.
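
Of the classification methods listed above, the k-nearest neighbor classifier is the simplest to sketch. The example below assumes numeric features, Euclidean distance, and majority voting; the function name `knn_predict` and the training points are hypothetical.

```python
import math
from collections import Counter


def knn_predict(training, query, k=3):
    """training: list of (feature_tuple, label) pairs.
    Returns the majority label among the k training points
    closest to `query` by Euclidean distance."""
    neighbors = sorted(training, key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical labeled points (illustrative only).
training = [
    ((1.0, 1.0), "low"),
    ((1.5, 2.0), "low"),
    ((2.0, 1.5), "low"),
    ((6.0, 6.5), "high"),
    ((7.0, 7.0), "high"),
    ((6.5, 8.0), "high"),
]
print(knn_predict(training, (2.0, 2.0)))  # -> low
print(knn_predict(training, (6.0, 7.0)))  # -> high
```

No model is built in advance: all the work happens at prediction time, which is why k-nearest neighbor is often described as a lazy learner.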

Unit IV

Data Warehousing: Overview, Definition, Delivery Process, Difference between Database System and Data Warehouse, Multi-Dimensional Data Model, Data Cubes, Stars, Snowflakes, Fact Constellations, Concept Hierarchy, Process Architecture, 3-Tier Architecture, Data Marting.
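
To make the star schema and concept hierarchy topics concrete, the following sketch models one fact table with two dimension tables and rolls the facts up along the dimension hierarchies. All table names, keys, and figures are hypothetical, and plain Python dictionaries stand in for database tables.

```python
from collections import defaultdict

# Dimension tables, keyed by surrogate key (hypothetical data).
product_dim = {
    1: {"name": "pen", "category": "stationery"},
    2: {"name": "notebook", "category": "stationery"},
    3: {"name": "mouse", "category": "electronics"},
}
date_dim = {
    101: {"month": "Jan", "quarter": "Q1"},
    102: {"month": "Feb", "quarter": "Q1"},
    103: {"month": "Apr", "quarter": "Q2"},
}

# Fact table: (product_key, date_key, units_sold).
sales_fact = [(1, 101, 10), (2, 101, 5), (3, 102, 2), (1, 103, 7), (3, 103, 4)]


def roll_up(facts, level_of):
    """Sum units_sold per group, where `level_of` maps the dimension
    keys of a fact row to the chosen level of the concept hierarchy."""
    totals = defaultdict(int)
    for product_key, date_key, units in facts:
        totals[level_of(product_key, date_key)] += units
    return dict(totals)


# Roll up from (product, day) to (category, quarter).
by_category_quarter = roll_up(
    sales_fact,
    lambda p, d: (product_dim[p]["category"], date_dim[d]["quarter"]),
)
print(by_category_quarter[("stationery", "Q1")])  # -> 15

# Roll up further to category only.
by_category = roll_up(sales_fact, lambda p, d: product_dim[p]["category"])
print(by_category)  # -> {'stationery': 22, 'electronics': 6}
```

The fact table holds only keys and measures, while descriptive attributes such as category and quarter live in the dimension tables; moving to a coarser attribute corresponds to the roll-up operation on a data cube.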

Unit V

Aggregation, Historical information, Query Facility, OLAP function and Tools. OLAP Servers, ROLAP, MOLAP, HOLAP, Data Mining interface, Security, Backup and Recovery, Tuning Data Warehouse, Testing Data Warehouse. 

TEXT BOOKS

  1. Margaret H. Dunham, "Data Mining: Introductory and Advanced Topics", Pearson Education.
  2. Sam Anahory, Dennis Murray, "Data Warehousing in the Real World: A Practical Guide for Building Decision Support Systems", Pearson Education.

REFERENCE BOOKS

  1. Jiawei Han, Micheline Kamber, "Data Mining: Concepts and Techniques", Elsevier.
  2. Mallach, "Data Warehousing System", McGraw-Hill.