CSE 5243 is offered under the auspices of the
Department of Computer Science and Engineering, The Ohio State University. It
is an elective course and will serve all those interested and enthusiastic
about data mining and data analytics.
Course Description
Knowledge discovery, data mining, data preprocessing, data transformations; clustering,
classification, frequent pattern mining, anomaly detection, graph and network
analysis; applications.
Time/Venue:
Class Number | Location | Time |
35059 | JR0270 | TR 3:55-5:15 PM |
35060 (M) | JR0270 | TR 3:55-5:15 PM |
Course Contents
Topic |
Introduction to the Knowledge Discovery Process and Background |
Elements of Data Preprocessing and Data Transformations |
Data Clustering |
Data Classification |
Frequent Pattern and Association Mining |
Analyzing Graphs and Networks |
Anomaly Detection |
Applications (Bioinformatics, Social Networks) |
Prerequisites
CSE 3241 or 5241, and CSE 2331, 5331, Stat 3301, or ISE 3200.
Coursework in Numerical Methods/Linear Algebra/Statistics; Data Structures; programming fluency required.
Text Books
- Introduction to Data Mining, Pang-Ning Tan, Michael Steinbach, Anuj Karpatne, and Vipin Kumar, 2019.
- Learning Data Mining with Python (Safari), Robert Layton, 2017.
- Jupyter for Data Science, Dan Toomey, 2017.
Reference Text Books
- Data Mining: Concepts and Techniques (Safari), Jiawei Han, Micheline Kamber, and Jian Pei, 2011.
- Data Mining and Analysis: Fundamental Concepts and Algorithms, Mohammed J. Zaki and Wagner Meira, Jr., Online Version.
- Mining of Massive Datasets, Jure Leskovec, Anand Rajaraman, and Jeffrey Ullman, Online Version.
- Machine Learning, Tom Mitchell, 1997.
- Pattern Recognition and Machine Learning, Christopher M. Bishop, 2006.
- Neural Networks and Deep Learning, Michael Nielsen, Online Version.
- Deep Learning, Ian Goodfellow, Yoshua Bengio, and Aaron Courville, 2016.
- An Introduction to Statistical Learning: with Applications in R, Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani, 2014.
Other Reference Material
- Data Analysis with Open Source Tools, Philipp K. Janert, O’Reilly, 2010.
- Think Stats, Allen B. Downey, O’Reilly, 2014.
- Visualization Analysis and Design, Tamara Munzner, CRC Press, 2014.
Data
Dataquest Data Repository List - https://www.dataquest.io/blog/free-datasets-for-projects/
KDnuggets -
1. https://www.kdnuggets.com/datasets/index.html
2. https://www.kdnuggets.com/faq/datasets-for-data-mining.html
DrivenData - https://www.drivendata.org/
data.world - https://data.world/community/open-community/data-partners/
University of Edinburgh Data Sets - http://www.inf.ed.ac.uk/teaching/courses/dme/html/datasets0405.html
Instructor
Raghu Machiraju, Ph.D.
Professor, Departments of Bioinformatics, Computer Science & Engineering, and Pathology.
Principal Data Scientist, Translational Data Analytics Institute.
Grading Assistant
Chaitanya Kulkarni, BS, MS.
Department of Computer Science and Engineering, kulkarni dot 132 at buckeyemail dot osu dot edu.
Office Hours
Instructor: TR 1:00-2:00 PM, DL 779.
Grader: Chaitanya Kulkarni, Mon 12:00-1:00 PM, BE406.
Grade Distribution:
Participation: 5%; Laboratory Assignments: 40%; Quizzes: 10%; Midterm: 20%; Final Project: 25%.
Class Help/Watering Hole:
https://piazza.com/osu/autumn2016/cse5544/home
Week 1 | 1/7 | Rubrics, Case Studies | 1/9 | Introduction | Text Ch. 1 |
Week 2 | 1/14 | Data: Types & Characteristics | 1/16 | Data: Statistics & Math | Text Ch. 2 |
Week 3 | 1/21 | Data: Preprocessing | 1/23 | Data: Visualization; Workflows | Text Ch. 2 |
Week 4 | 1/28 | Classification: Basic | 1/30 | Classification: Basic | Text Ch. 3 |
Week 5 | 2/4 | Classification: Basic | 2/6 | Classification: Advanced | Text Ch. 3-4 |
Week 6 | 2/11 | Classification: Advanced | 2/13 | Classification: Advanced | Text Ch. 4 |
Week 7 | 2/18 | Classification: Advanced | 2/20 | Classification: Advanced | Text Ch. 4 |
Week 8 | 2/25 | Clustering: Basic | 2/27 | Clustering: Basic | Text Ch. 7 |
Week 9 | 3/3 | Clustering: Basic | 3/5 | Midterm | Text Ch. 7 |
Week 10 | 3/10 | Spring Break | 3/12 | Spring Break | Head South |
Week 11 | 3/17 | Clustering: Advanced | 3/19 | Clustering: Advanced | Text Ch. 8 |
Week 12 | 3/24 | Association Mining: Basic | 3/26 | Association Mining: Basic | Text Ch. 5 |
Week 13 | 3/31 | Association Mining: Basic | 4/2 | Association Mining: Basic | 4/5, Lab 4 due |
Week 14 | 4/7 | Association Mining: Advanced | 4/9 | Graphs and Networks | Project Proposal |
Week 15 | 4/14 | Graphs and Networks | 4/16 | Anomaly Detection | Last Week |
Labs:
- Laboratory 1: Due XX,YY: Data Preprocessing
- Laboratory 2: Due XX,YY: Clustering
- Laboratory 3: Due XX,YY: Classification
- Laboratory 4: Due XX,YY: Frequent Pattern Mining
Midterm:
TBA
Final Project
TBA