CSE 5243: Introduction to Data Mining (Au18, Tu/Th 9:35-10:55am, Baker Systems 136)

Instructor: Huan Sun

Teaching Assistants: Fang Zhou (zhou.1250)

Level and credits: U/G, 3

Prerequisites: Introduction to Databases, Introduction to Algorithms, or grad standing or permission of instructor

Office hours and locations (Instructor): Tue 11:00AM-12:15PM, Dreese Labs 699

Office hours and locations (TA): Fang Zhou @ DL190, 3:00PM-4:00PM on Tuesday

Description

Introduction to the knowledge discovery process, key data mining techniques, and efficient, high-performance mining algorithms, with exposure to applications of data mining.

Grading Plan (Note: All deadlines are 11:59PM on the due date. No late submissions!)

  • Participation: 10%
  • Homework: 50%
  • Midterm Exam: 20%
  • Final Exam: 20%
  • No course project

Textbooks

Jiawei Han, Micheline Kamber, and Jian Pei. Data Mining: Concepts and Techniques, 3rd edition, Morgan Kaufmann, 2011

Recommended books for reading:

Academic Integrity Policy

Academic integrity is essential to maintaining an environment that fosters excellence in teaching, research, and other educational and scholarly activities. Thus, The Ohio State University and the Committee on Academic Misconduct (COAM) expect that all students have read and understand the University’s Code of Student Conduct, and that all students will complete all academic and scholarly assignments with fairness and honesty. Students must recognize that failure to follow the rules and guidelines established in the University’s Code of Student Conduct and this syllabus may constitute “Academic Misconduct.” For more info, click here.

Anonymous Feedback/Comments/Suggestions?

Feel free to leave comments and suggestions on how the instructor/grader can better help you learn in this course, such as whether the lectures are clear, the examples are helpful, and questions are answered in a timely manner. Please use this anonymous form. Your input is highly appreciated! ;-)

Course Syllabus and Schedule (To be updated later)

Week | Date | Topic | Assignment Out | Assignment Due | Lecture Notes
1 | 08/21 | NO CLASS | | | Chapter 1 (Han et al.)
1 | 08/23 | Class Outline / Introduction | | | Chapter 1 (Han et al.)
2 | 08/28 | Review of Basic Probability and Statistical Concepts; Data Preprocessing | | | Chapter 2 & Chapter 3 (Han et al.)
2 | 08/30 | Data & Data Preprocessing | Assignment 1 | | Chapter 2 & Chapter 3 (Han et al.)
3 | 09/04 | Classification: Basic Concepts/Methods | | | Chapter 8 (Han et al.)
3 | 09/06 | Classification: Basic Concepts/Methods | | |
4 | 09/11 | Classification: Basic Concepts/Methods | Assignment 2 (programming) | Assignment 1 Due |
4 | 09/13 | Classification: Basic Concepts/Methods | | |
5 | 09/18 | Classification: Advanced Methods | | | Chapter 9 (Han et al.), Sample Midterm
5 | 09/20 | Classification: Advanced Methods & HW#2 Brief Discussion/QA & Clustering: Basic Concepts/Methods | | | Chapter 10 (Han et al.)
6 | 09/25 | Clustering: Basic Concepts/Methods (pdf) (pptx) | | |
6 | 09/27 | Clustering: Basic Concepts/Methods (pdf) (pptx) | | |
7 | 10/02 | Clustering: Basic Concepts/Methods (pdf) | Assignment 3 (programming) | Assignment 2 Due |
7 | 10/04 | Homework Discussion + Lecture Review + Midterm Review | | |
8 | 10/09 | Midterm Exam | | |
8 | 10/11 | Autumn Break | | |
9 | 10/16 | Mining Frequent Patterns and Associations: Basic Concepts | | | Frequent Pattern Mining (Chapter 6, Han et al.)
9 | 10/18 | Mining Frequent Patterns and Associations: Basic Concepts | | |
10 | 10/23 | Mining Frequent Patterns and Associations: Basic Concepts | Assignment 4 | Assignment 3 Due |
10 | 10/25 | Mining Frequent Patterns and Associations: Basic Concepts | | | Sequence Pattern Mining (chapter) (Zaki et al.)
11 | 10/30 | Mining Frequent Patterns and Associations: Advanced Methods | | | Advanced Pattern Mining (Chapter 7, Han et al.)
11 | 11/01 | Mining Frequent Patterns and Associations: Advanced Methods | | |
12 | 11/06 | Mining Frequent Patterns and Associations: Advanced Methods | | | Chapter 3 (Leskovec et al.)
12 | 11/08 | Mining Frequent Patterns and Associations: Advanced Methods & Finding Similar Items: Locality-Sensitive Hashing | | |
13 | 11/13 | Finding Similar Items: Locality-Sensitive Hashing | Assignment 5 (programming) | Assignment 4 Due |
13 | 11/15 | Introduction to Graphs | | | Chapter 4: Graph Data (Zaki et al.)
14 | 11/20 | Introduction to Graphs (& Final Exam Sample Problems) | | |
14 | 11/22 | Thanksgiving Break | | |
15 | 11/27 | Introduction to Information Retrieval | | |
15 | 11/29 | Guest Lecture by Dr. Ping Zhang | | |
16 | 12/04 | Review Session | | Assignment 5 Due |
16 | 12/07 (Friday) | Final Exam, 8AM-9:45AM at Baker Systems 136 | | |
16 | 12/08 | | | |

Course slides are partly adapted from similar courses offered by Prof. Jiawei Han at UIUC, Prof. Srinivasan Parthasarathy at OSU, and Prof. Yizhou Sun at UCLA, and from the books listed above.