syllabus
About Me
Name | PAUL OFFEI |
Email | poffei@st.ug.edu.gh |
Office | IT Lab |
Office Hours | Wednesday and Thursday |
Virtual Meetings | Microsoft Teams |
Webpage | ntow.netlify.app |
Course Syllabus
Course Description:
Data mining is the study of efficiently finding structures and patterns in large data sets. We will focus on several aspects of this:
- (1) converting from a messy and noisy raw data set to a structured and abstract one,
- (2) applying scalable and probabilistic algorithms to these well-structured abstract data sets, and
- (3) formally modeling and understanding the error and other consequences of parts (1) and (2), including choice of data representation and trade-offs between accuracy and scalability.
Course Topics
- Supervised Learning: labelled data, task driven
  - regression: predict one or more real values
  - classification: predict one of a finite number of possible outcomes
  - probabilistic supervised learning: predict a distribution over outcomes
- Unsupervised Learning: unlabelled data, data driven, used to develop a model of the data
  - clustering: divide the data into groups by similarity
  - association: identify items or events that frequently occur together or in sequence
  - dimensionality reduction: capture broader dependencies among features in fewer dimensions
- Optimization: fit or choose parameters in all of the models above
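To make the distinction between these topics concrete, here is a minimal sketch in plain Python. It is an illustration only, not course material: the function names `fit_line` and `two_means`, the learning rate, and the toy data are all assumptions chosen for the example. The first function is a supervised regression fitted by optimization (gradient descent on mean squared error); the second is an unsupervised clustering of unlabelled values into two groups.

```python
def fit_line(xs, ys, lr=0.05, steps=2000):
    """Supervised regression: fit y ~ w*x + b to labelled pairs by gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def two_means(values, iters=20):
    """Unsupervised clustering: split unlabelled 1-D values into two groups (k-means, k=2)."""
    c0, c1 = min(values), max(values)  # hypothetical initialisation: the two extremes
    for _ in range(iters):
        # Assign each value to its nearest centre.
        g0 = [v for v in values if abs(v - c0) <= abs(v - c1)]
        g1 = [v for v in values if abs(v - c0) > abs(v - c1)]
        # Move each centre to the mean of its group (skip empty groups).
        if g0:
            c0 = sum(g0) / len(g0)
        if g1:
            c1 = sum(g1) / len(g1)
    return c0, c1


if __name__ == "__main__":
    # Labelled toy data generated from y = 2x + 1.
    w, b = fit_line([0, 1, 2, 3, 4], [1, 3, 5, 7, 9])
    print("slope and intercept:", round(w, 3), round(b, 3))

    # Unlabelled toy data with two visible groups.
    print("cluster centres:", two_means([1.0, 1.2, 0.8, 9.0, 9.5, 8.5]))
```

In the regression case the labels `ys` define the task, while in the clustering case the structure comes from the data alone. In both cases the parameters (`w`, `b`, or the cluster centres) are chosen by an iterative optimization loop.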
Prerequisites
- Introductory courses in programming, mathematics, and statistics
Course Announcements
All course announcements, along with additional materials and tutorials to help you learn the course, will be posted on the blog page of the course website.
Homework late policy
Every assignment in this course is due at exactly the time stated. I will grade late assignments, but marks will be deducted.
Evaluation
The course evaluation will be a weighted score based on class attendance and participation, homework, quizzes, projects, and exams.
Optional Textbooks
- Introduction to Applied Linear Algebra by Lieven Vandenberghe and Stephen P. Boyd
- Introduction to Probability for Data Science by Stanley Chan
- Mining of Massive Datasets by Jure Leskovec, Anand Rajaraman, Jeff Ullman
- Foundations of Data Science by Avrim Blum, John Hopcroft and Ravindran Kannan
- Introduction to Data Mining by Pang-Ning Tan, Michael Steinbach, Anuj Karpatne, Vipin Kumar
- Pattern Recognition and Machine Learning by Christopher M. Bishop
- The Elements of Statistical Learning: Data Mining, Inference, and Prediction by Trevor Hastie, Robert Tibshirani, Jerome Friedman
- Probabilistic Machine Learning: An Introduction by Kevin P. Murphy
- Mathematics for Machine Learning by Marc Peter Deisenroth, A. Aldo Faisal, Cheng Soon Ong
- Mathematical Foundations for Data Analysis by Prof. Jeff M. Phillips
- Deep Learning by Ian Goodfellow, Yoshua Bengio, Aaron Courville