CS 360: Introduction to Data Science & Machine Learning

Spring 2019

This course is designed for senior undergraduate students and covers introductory material on Data Science and Machine Learning. The course schedule lists the topics that will be covered.

Instructor

Clint P. George — clint [at] iitgoa.ac.in — Office: 205 (F5), IT Building

Meetings

Lectures: Monday (2-3pm, CL3) — Tuesday (3-4pm, LH2) — Wednesday (11:30am-12:30pm, T1)
Office hours: Monday (3-4pm) — Tuesday (4:00pm-4:55pm)

Recommended Books

Elements of Statistical Learning
by Hastie, Tibshirani, and Friedman (2017)
Machine Learning
by Mitchell (2009)
Pattern Recognition and Machine Learning (Information Science and Statistics)
by Bishop (2010)
Foundations of Data Science
by Blum, Hopcroft, and Kannan (2018)
R for Data Science: Import, Tidy, Transform, Visualize, and Model Data
by Wickham and Grolemund (2016)

Course Eligibility and Requirements

This is a core course designed for fifth-semester Computer Science and Engineering undergraduate students. Knowledge of computer programming is required.
Course prerequisites: CS 113, MA 106, CS 215, CS 218, MA 214, CS 344, CS 386

Grading Policy

Quiz 1 (10%), Midterm (20%), Quiz 2 (10%), Seminar (10%), Homework and Classroom Pop-Quizzes (10-15%), Final (35-40%)

Academic Honesty

We expect each student to follow the highest standards of integrity and academic honesty. Copying or sharing code in exams, homework, or labs is not allowed; see IIT Goa: Policy for academic malpractices.

Course Schedule

This is a tentative course schedule. It will be updated often. Log on to classroom to see lecture slides, other course materials, and announcements.

# Topic Materials*
1 Course Introduction
2 R programming, data visualization, data transformations using R libraries
3 Exploratory data analysis
4 Expected value, variance, the central limit theorem
7 Linear regression, logistic regression, Perceptron (review)
8 Generative learning algorithms, Gaussian discriminant analysis
9 MLE, MAP (review)
10 Support Vector Machines (SVMs), kernel methods
11 Bias-variance tradeoff and error analysis
12 Learning theory, generalization error, VC dimension
13 Regularization and model selection
14 Experimental evaluation of learning algorithms, cross-validation
15 Multilayer neural networks
16 Backpropagation
18 Mixture models and mixture of Gaussians
19 The expectation-maximization (EM) algorithm
22 Probabilistic topic models