Spring 2020
This course gives an introduction to the foundations of machine learning and statistical learning. Please see the Schedule of Classes for the topics covered in this course.
- Keynote by C. Bishop on Model-based machine learning. A must watch!
- The midterm is on Thursday, Feb 27, 2020, from 10am to 12pm.
- Lectures will be conducted online from March 16, 2020 to March 31, 2020. Please check the Classroom for more details.
Outline
Clint P. George — clint [at] iitgoa [dot] ac [dot] in — Office: F9, New Academic Block A
Class meetings (LT3, Admin Block):
Instructor office hours (F-9, New Academic Block):
TA office hours (Computing Center, Admin Block, 1st floor)
Tutorial/Lab hours (Computing Center, Admin Block, 1st floor): 4-4:45pm, Mon
Elements of Statistical Learning by Hastie, Tibshirani, and Friedman (2017)
Machine Learning by Mitchell (2009)
Pattern Recognition and Machine Learning (Information Science and Statistics) by Bishop (2010)
Foundations of Data Science by Blum, Hopcroft, and Kannan (2018)
R for Data Science: Import, Tidy, Transform, Visualize, and Model Data by Wickham and Grolemund (2016)
This is a core course designed for undergraduate (CS 360; sixth-semester Computer Science and Engineering) and graduate (CS 530) students. Familiarity with the following is recommended.
Basic computer programming—we use R/Python.
Probability theory (CS 215, MA 605)
Multivariable calculus and linear algebra (MA 105, MA 106, EE 611)
CS360: Quiz 1 (15%) — Midterm (20%) — Quiz 2 (15%) — Homeworks and Classroom participation (15%) — Final (35%)
CS530: Quiz 1 (10%) — Midterm (20%) — Quiz 2 (10%) — Projects/Term paper (15%) — Homeworks and Classroom participation (15%) — Final (30%)
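As a rough illustration of how the weights above combine, here is a minimal Python sketch. It assumes each component is scored on a 0-100 scale and that the overall score is a plain weighted sum; the page does not state the exact scoring method, and the names `WEIGHTS` and `weighted_total` are hypothetical.

```python
# Component weights (in percent), copied from the grading scheme above.
WEIGHTS = {
    "CS360": {
        "Quiz 1": 15,
        "Midterm": 20,
        "Quiz 2": 15,
        "Homeworks and Classroom participation": 15,
        "Final": 35,
    },
    "CS530": {
        "Quiz 1": 10,
        "Midterm": 20,
        "Quiz 2": 10,
        "Projects/Term paper": 15,
        "Homeworks and Classroom participation": 15,
        "Final": 30,
    },
}


def weighted_total(course, scores):
    """Return the overall percentage for `course`.

    Assumption: `scores` maps every component name to a score on a
    0-100 scale, and components are combined as a simple weighted sum.
    """
    weights = WEIGHTS[course]
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(weights[name] * scores[name] for name in weights) / 100


# Sanity check: the weights of each course add up to 100%.
for course, weights in WEIGHTS.items():
    assert sum(weights.values()) == 100, course

# Made-up scores, for illustration only.
example_scores = {
    "Quiz 1": 80,
    "Midterm": 70,
    "Quiz 2": 90,
    "Homeworks and Classroom participation": 100,
    "Final": 60,
}
print(weighted_total("CS360", example_scores))
```

The scores in `example_scores` are invented; under the stated assumptions, swapping in your own component scores gives an estimate of the overall percentage.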
We expect every student to follow the highest standards of integrity and academic honesty. Copying or sharing code in exams, homeworks, or lab sessions is not permitted. See the IIT Goa policy for academic malpractices.
Note: This is a tentative course schedule and will be updated often. Log on to Classroom to see lecture slides, videos, additional course materials, and announcements.
S/N | Date | Topic | Resources |
---|---|---|---|
1 | Jan 07 | Course Introduction | lecture 01 |
2 | Jan 08 | Data science basics: data visualization, data transformations | R vs. Python for data science; data visualization; data transformations |
3 | Jan 13 | Understanding the data: expected value, variance, covariance, visualizing distributions | St. Petersburg paradox; exploratory data analysis |
4 | Jan 14 | Hypothesis testing | video 1; one-vs-two tailed tests; t-test example |
5 | Jan 15 | Hypothesis testing: critical region, p-value, examples | one-sample test; normality test; boxplots |
6 | Jan 20 | Supervised learning, linear models, ordinary least-squares | cs229 notes 1 |
7 | Jan 21 | The least mean square (LMS) algorithm: batch and stochastic gradient approaches; The normal equations for the least squares problem. | |
8 | Jan 22 | Polynomial curve fitting and similarities with least squares | comic; C. Bishop's Slides 3-13 |
9 | Jan 27 | Linear regression: probabilistic view, Maximum Likelihood Estimate | |
10 | Jan 29 | Locally weighted linear regression, introduction to logistic regression | cs229 notes 1; Section 4 |
11 | Feb 03 | Tutorial on Linear regression and homework 3 | |
12 | Feb 04 | Logistic regression | logistic regression tutorial; cs229 notes 1; Section 5 |
13 | Feb 04 | k-Nearest Neighbours: discussion and workshop by A. Gupta, S. Kumar | |
14 | Feb 05 | Discussion on softmax regression, the perceptron algorithm | |
15 | Feb 10 | Discussion on Gradient Descent, Newton's method | Lectures by Gilbert Strang on Gradient Descent; Newton's method |
16 | Feb 11 | Introduction to Neural Networks | Example: CIFAR-10; reading: linear classifier |
17 | Feb 12 | Neural networks, the back-propagation algorithm | cs231n notes |
18 | Feb 17 | The back-propagation algorithm: aspects of implementation | Notes by R. Grosse, U. Toronto |
19 | Feb 18 | Discussion on the backprop algorithm | |
20 | Feb 19 | Introduction to Convolutional Neural Networks | cs231n notes |
21 | Feb 24 | Overview of deep autoencoders | |
22 | Mar 16 | Generative models | |
23 | Mar 17 | Gaussian discriminant analysis | Andrew Ng's notes (Section 1); relevant lecture |
24 | Mar 23 | Gaussian discriminant analysis vs logistic regression | |
25 | Mar 24 | Bias/variance tradeoff | |
26 | Mar 25 | Model selection | Andrew Ng's lecture (from 35th minute); more |
27 | Mar 30 | Feature selection: forward/backward, filter-based approaches | Andrew Ng's notes |
28 | Mar 31 | Probabilistic inference: MLE, MAP estimates | |
29 | Apr 01 | Probabilistic mixtures and introduction to the EM algorithm | Andrew Ng's notes |
30 | Apr 07 | The EM algorithm | Andrew Ng's notes |
31 | Apr 08 | The EM algorithm (continued) | |
32 | Apr 13 | Introduction to text mining and natural language processing | |
33 | Apr 14 | Principal Component Analysis | |
34 | Apr 15 | Singular Value Decomposition and applications in text data | |
35 | Apr 20 | Probabilistic Topic Models | |
36 | Apr 21 | Probabilistic Topic Models (continued) | |
37 | Apr 22 | [ADD] Support vector machines | Prof. Patrick Winston's lecture |
38 | Apr 27 | [ADD] Graduate seminars | |
Log on to Classroom to see more details.
S/N | Date | Title | Remarks |
---|---|---|---|
1 | Jan 09 | Homework 01: Getting started with R and data science | Due date: Jan 14, 9am |
2 | Jan 14 | Homework 02: CLT and Standard Error | Due date: Jan 20, 10am |
3 | Jan 21 | Homework 03: Linear regression | Due date: Jan 27, 10am |
4 | Jan 28 | Quiz 01 | LT3, Admin Block |
5 | Feb 08 | Homework 04: Logistic regression | Due date: Feb 18, 10am |
6 | Mar 31 | Homework 05: Neural networks | Due date: Apr 08, 9am |
7 | Mar 27 | Course project: CORD-19-research-challenge | Due date: Finals week |