CSE 5243: Introduction to Data Mining (Au19, Tu/Th 9:35-10:55am, McPherson Lab 2019)

Instructor: Huan Sun

Teaching Assistants: Jiaqi Xu (xu.1629)

Level and credits: U/G, 3

Prerequisites: Introduction to Databases, Introduction to Algorithms, or grad standing or permission of instructor

Office hours and locations (Instructor): Tue 11:00AM-12:15PM, Dreese Labs 699

Office hours and locations (TA): Jiaqi Xu (xu.1629), Tue 3:00PM-4:00PM, Baker 406

Description

Introduction to the knowledge discovery process, key data mining techniques, and efficient high-performance mining algorithms, with exposure to applications of data mining.

Grading Plan (Note: All deadlines are 11:59PM on the due date. No late submissions!)

  • Participation: 10%
  • Homework: 50%
  • Midterm Exam: 20%
  • Final Exam: 20%
  • No course project

Textbooks

Jiawei Han, Micheline Kamber, and Jian Pei. Data Mining: Concepts and Techniques, 3rd edition, Morgan Kaufmann, 2011

Recommended books for reading:

Academic Integrity Policy

Academic integrity is essential to maintaining an environment that fosters excellence in teaching, research, and other educational and scholarly activities. Thus, The Ohio State University and the Committee on Academic Misconduct (COAM) expect that all students have read and understand the University’s Code of Student Conduct, and that all students will complete all academic and scholarly assignments with fairness and honesty. Students must recognize that failure to follow the rules and guidelines established in the University’s Code of Student Conduct and this syllabus may constitute “Academic Misconduct.” For more info, click here.

Anonymous Feedback/Comments/Suggestions?

Feel free to leave any comments and suggestions about how the instructor/grader can better help you learn in this course, such as whether the lectures are clear, the examples are helpful, and questions are answered in a timely manner. Please use this anonymous form. Your input is highly appreciated! ;-)

Course Syllabus and Schedule (To be updated later)

Week | Date | Topic | Assignment Out | Assignment Due | Lecture Notes
1 | 08/20 | Class Outline | | | Chapter 1 (Han et al.)
1 | 08/22 | Introduction | | | Chapter 1 (Han et al.)
2 | 08/27 | Review of Basic Probability and Statistical Concepts | | | Review of Probability Theory
2 | 08/29 | Data & Data Preprocessing | Assignment 1 | | Chapter 2 & Chapter 3 (Han et al.)
3 | 09/03 | Data Preprocessing and Classification: Basic Concepts/Methods | | | Chapter 8 (Han et al.)
3 | 09/05 | Classification: Basic Concepts/Methods | | |
4 | 09/10 | Classification: Basic Concepts/Methods | Assignment 2 (programming) | Assignment 1 Due |
4 | 09/12 | Classification: Basic Concepts/Methods | | |
5 | 09/17 | Classification: Advanced Methods | | | Chapter 9 (Han et al.), Sample Midterm
5 | 09/19 | Classification: Advanced Methods & HW#2 Brief Discussion/QA & Clustering: Basic Concepts/Methods | | | Chapter 10 (Han et al.)
6 | 09/24 | Clustering: Basic Concepts/Methods (pdf) | | |
6 | 09/26 | Clustering: Basic Concepts/Methods (pdf) | | |
7 | 10/01 | Clustering: Basic Concepts/Methods (pdf) | Assignment 3 (programming) | Assignment 2 Due |
7 | 10/03 | Homework Discussion + Lecture Review + Midterm Review | | |
8 | 10/08 | Midterm Exam | | |
8 | 10/10 | Autumn Break | | |
9 | 10/15 | Mining Frequent Patterns and Associations: Basic Concepts | | | Frequent Pattern Mining (Chapter 6, Han et al.)
9 | 10/17 | Mining Frequent Patterns and Associations: Basic Concepts | | |
10 | 10/22 | Mining Frequent Patterns and Associations: Basic Concepts | Assignment 4 | |
10 | 10/24 | Mining Frequent Patterns and Associations: Basic Concepts | | | Sequence Pattern Mining (chapter) (Zaki et al.)
10 | 10/26 | | | Assignment 3 Due (the NEW and FINAL deadline) |
11 | 10/29 | Mining Frequent Patterns and Associations: Advanced Methods | | | Advanced Pattern Mining (Chapter 7, Han et al.)
11 | 10/31 | Mining Frequent Patterns and Associations: Advanced Methods | | |
12 | 11/05 | Mining Frequent Patterns and Associations: Advanced Methods | | | Chapter 3 (Leskovec et al.)
12 | 11/07 | Mining Frequent Patterns and Associations: Advanced Methods & Finding Similar Items: Locality-Sensitive Hashing | | |
13 | 11/12 | Finding Similar Items: Locality-Sensitive Hashing | Assignment 5 (Programming) | Assignment 4 Due |
13 | 11/14 | Introduction to Graphs | | | Chapter 4: Graph Data (Zaki et al.)
14 | 11/19 | Introduction to Graphs (& Final Exam Sample Problems) | | |
14 | 11/22 | Introduction to Information Retrieval | | |
15 | 11/26 | Guest Lecture Extended Office Hour (9:35AM-12:00PM, DL699) | | |
15 | 11/28 | Thanksgiving Break | | |
16 | 12/03 | Review Session | | Assignment 5 Due |
16 | 12/06 (Friday) | Final Exam | | |

Course slides are partly adapted from similar courses offered by Prof. Jiawei Han at UIUC, Prof. Srinivasan Parthasarathy at OSU, Prof. Yizhou Sun at UCLA, and Prof. Yijun Zhao at Northeastern University, and from the books listed above.