CS 360/530: Foundations of Machine Learning

Spring 2020

This course introduces the foundations of machine learning and statistical learning. See the Schedule of Classes for the topics covered.

Instructor

Clint P. George — clint [at] iitgoa [dot] ac [dot] in — Office: F9, New Academic Block A

Teaching Assistants

Meetings

  1. Elements of Statistical Learning by Hastie, Tibshirani, and Friedman (2017)

  2. Machine Learning by Mitchell (2009)

  3. Pattern Recognition and Machine Learning (Information Science and Statistics) by Bishop (2010)

  4. Foundations of Data Science by Blum, Hopcroft, and Kannan (2018)

  5. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data by Wickham and Grolemund (2016)

Course Eligibility and Requirements

This is a core course designed for undergraduate (CS 360; sixth-semester Computer Science and Engineering) and graduate (CS 530) students. Familiarity with the following is recommended.

Grading Policy (tentative)

Academic Honesty

We expect every student to follow the highest standards of integrity and academic honesty. Copying or sharing code in exams, homework, or lab sessions is not permitted. See the IIT Goa policy on academic malpractice.

Schedule of Classes

Note: This is a tentative course schedule and will be updated often. Log on to Classroom to see lecture slides, videos, additional course materials, and announcements.

S/N Date Topic Resources
1 Jan 07 Course Introduction lecture 01
2 Jan 08 Data science basics: data visualization, data transformations R vs. Python for data science; data visualization; data transformations
3 Jan 13 Understanding the data: expected value, variance, covariance, visualizing distributions St. Petersburg paradox; exploratory data analysis
4 Jan 14 Hypothesis testing video 1; one-vs-two tailed tests; t-test example
5 Jan 15 Hypothesis testing: critical region, p-value, examples one-sample test; normality test; boxplots
6 Jan 20 Supervised learning, linear models, ordinary least-squares cs229 notes 1
7 Jan 21 The least mean square (LMS) algorithm: batch and stochastic gradient approaches; The normal equations for the least squares problem.
8 Jan 22 Polynomial curve fitting and similarities with least squares comic; C. Bishop's Slides 3-13
9 Jan 27 Linear regression: probabilistic view, Maximum Likelihood Estimate
10 Jan 29 Locally weighted linear regression, introduction to logistic regression cs229 notes 1; Section 4
11 Feb 03 Tutorial on Linear regression and homework 3
12 Feb 04 Logistic regression logistic regression tutorial; cs229 notes 1; Section 5
13 Feb 04 k-Nearest Neighbours: discussion and workshop by A. Gupta, S. Kumar
14 Feb 05 Discussion on softmax regression, the perceptron algorithm
15 Feb 10 Discussion on gradient descent and Newton's method Lectures by Gilbert Strang on Gradient Descent; Newton's method
16 Feb 11 Introduction to Neural Networks Example: CIFAR-10; reading: linear classifier
17 Feb 12 Neural networks, the back-propagation algorithm cs231n notes
18 Feb 17 The back-propagation algorithm: aspects of implementation Notes by R. Grosse, U. Toronto
19 Feb 18 Discussion on the backprop algorithm
20 Feb 19 Introduction to the Convolutional Neural Networks cs231n notes
21 Feb 24 Overview of deep autoencoders
22 Mar 16 Generative models
23 Mar 17 Gaussian discriminant analysis Andrew Ng's notes (Section 1); relevant lecture
24 Mar 23 Gaussian discriminant analysis vs logistic regression
25 Mar 24 Bias/variance tradeoff
26 Mar 25 Model selection Andrew Ng's lecture (from 35th minute); more
27 Mar 30 Feature selection: forward/backward, filter-based approaches Andrew Ng's notes
28 Mar 31 Probabilistic inference: MLE, MAP estimates
29 Apr 01 Probabilistic mixtures and introduction to the EM algorithm Andrew Ng's notes
30 Apr 07 The EM algorithm Andrew Ng's notes
31 Apr 08 The EM algorithm (continued)
32 Apr 13 Introduction to text mining and natural language processing
33 Apr 14 Principal Component Analysis
34 Apr 15 Singular Value Decomposition and applications in text data
35 Apr 20 Probabilistic Topic Models
36 Apr 21 Probabilistic Topic Models (continued)
37 Apr 22 [ADD] Support vector machines Prof. Patrick Winston's lecture
38 Apr 27 [ADD] Graduate seminars

Schedule of Homeworks, Quizzes, Exams

Log on to Classroom to see more details.

S/N Date Title Remarks
1 Jan 09 Homework 01: Getting started with R and data science Due date: Jan 14, 9am
2 Jan 14 Homework 02: CLT and Standard Error Due date: Jan 20, 10am
3 Jan 21 Homework 03: Linear regression Due date: Jan 27, 10am
4 Jan 28 Quiz 01 LT3, Admin Block
5 Feb 08 Homework 04: Logistic regression Due date: Feb 18, 10am
6 Mar 31 Homework 05: Neural networks Due date: Apr 08, 9am
7 Mar 27 Course project: CORD-19-research-challenge Due date: Finals week

Other relevant courses and resources

  1. CS229: Machine Learning
  2. Machine learning by Tom Mitchell
  3. COS 324 - Introduction to Machine Learning