ELEN E4903 Machine Learning

Columbia University, Spring 2018

Instructor: John Paisley
Location: 501 Northwest Corner Building

Time: Wednesday 7pm - 9:30pm
Office hours: Monday 11-12, Mudd 422

TAs:

    Aonan Zhang (az2385@columbia.edu), hrs: Friday 7 - 9pm @ CS TA room, Mudd 122A (1st floor)
    Ghazal Fazelnia, hrs: Wednesday 9:30 - 11:30am @ DSI Space, Mudd (4th floor)
    Yiyang Li (yl3789@columbia.edu), hrs: Wednesday 2 - 4pm @ CS TA room, Mudd 122A (1st floor)
    Kejia Shi, hrs: Friday 9:30 - 11:30am @ CS TA room, Mudd 122A (1st floor)
    Sidharth Prasad (sp3591@columbia.edu), hrs: Monday 5:30 - 7:30pm @ CS TA room, Mudd 122A (1st floor)
    Yiran Shi, hrs: Thursday 4 - 6pm @ CS TA room, Mudd 122A (1st floor)
    Di Lu, hrs: Tuesday 2:40 - 4:40pm @ CS TA room, Mudd 122A (1st floor)

This course provides an introduction to supervised and unsupervised techniques for machine learning. We will cover both probabilistic and non-probabilistic approaches. The focus will be on classification and regression models, clustering methods, matrix factorization, and sequential models. Methods covered in class include linear and logistic regression, support vector machines, boosting, K-means clustering, mixture models, the expectation-maximization algorithm, and hidden Markov models, among others. We will cover algorithmic techniques for optimization, such as gradient and coordinate descent methods, as the need arises. This class is part of the Topics in Electrical & Computer Engineering series.

Prerequisites:   Basic linear algebra and calculus. Introductory-level courses in probability and/or statistics are strongly encouraged. Comfort with a programming language (e.g., Matlab) will be essential for completing the homework assignments. Not open to students who have taken COMS 4721, COMS 4771, STATS 4240, STATS 4400 or IEOR 4525.

Text:   There is no required text for the course. Suggested readings for each class will be given from the textbooks below. These readings are meant to be general pointers and may contain more material than we cover in class.

    T. Hastie, R. Tibshirani and J. Friedman, The Elements of Statistical Learning, Second Edition, Springer. (ESL)
    C. Bishop, Pattern Recognition and Machine Learning, Springer. (PRML)
    H. Daume, A Course in Machine Learning, Draft. (CML)

Grading:   5 homework assignments (50%), midterm exam (25%), final in-class exam (25%). Each homework assignment will have a programming component that will count significantly toward the final homework grade. The final in-class exam will focus on material from the second half of the course.


| Week | Topics covered | Suggested readings | Additional information |
|------|----------------|--------------------|------------------------|
| Week 1 | Introduction, maximum likelihood estimation | ESL Ch. 1-2; PRML Ch. 2.1-2.3 | Homework 1 out (see Courseworks) |
| | linear regression, least squares, geometric view | ESL Ch. 3.1-3.2; PRML Ch. 1.1, 3.1 | Due February 4 by 11:59pm |
| Week 2 | ridge regression, probabilistic views of linear regression | ESL Ch. 3.3-3.4; PRML Ch. 3.1-3.2 | |
| | bias-variance, Bayes rule, maximum a posteriori | ESL Ch. 7.1-7.3, 7.10; PRML Ch. 2.3 | |
| Week 3 | Bayesian linear regression | PRML Ch. 3.3-3.5 | |
| | sparsity, subset selection for linear regression | ESL Ch. 3.3-3.8 | |
| Week 4 | nearest neighbor classification, Bayes classifiers | ESL Ch. 13.3-13.5; CML Ch. 2, 7 | Homework 2 out (see Courseworks) |
| | linear classifiers, perceptron | ESL Ch. 4.5; CML Ch. 3 | Due February 25 by 11:59pm |
| Week 5 | logistic regression, Laplace approximation | ESL Ch. 4.4; PRML Ch. 4.3-4.5 | |
| | kernel methods, Gaussian processes | ESL Ch. 6; PRML Ch. 6; CML Ch. 9 | |
| Week 6 | maximum margin, support vector machines | ESL Ch. 12.1-12.3; PRML Ch. 7.1 | |
| | trees, random forests | ESL Ch. 9.2, 15; CML Ch. 1 | |
| Week 7 | boosting | ESL Ch. 10; CML Ch. 11 | |
| Week 8 | Midterm exam | | Homework 3 out (see Courseworks); due March 18 by 11:59pm |
| Week 9 | No class (Spring break) | | |
| Week 10 | clustering, k-means | ESL Ch. 14.3; PRML Ch. 9.1; CML Ch. 13 | |
| | EM algorithm, missing data | ESL Ch. 8.5; PRML Ch. 9.3-9.4 | |
| Week 11 | mixtures of Gaussians | PRML Ch. 9.2; CML Ch. 14 | Homework 4 out (see Courseworks) |
| | matrix factorization | Review article | Due April 10 by 11:59pm |
| Week 12 | non-negative matrix factorization | ESL Ch. 14.6; Review article | |
| | latent factor models, PCA and variations | ESL Ch. 14.5; PRML Ch. 12.1-12.3 | |
| Week 13 | Markov models | PRML Ch. 13.1 | |
| | hidden Markov models | PRML Ch. 13.2 | |
| Week 14 | continuous state-space models | PRML Ch. 13.3 | |
| | association analysis | ESL Ch. 14.2; Book chapter | |
| Week 15 | Final in-class exam | | |