COMS W4721 Machine Learning for Data Science
Columbia University, Spring 2019
Instructor: John Paisley
Location: 501 Schermerhorn Hall
Time: T/Th 7:40pm-8:55pm
Office hours: Monday 11am-12pm @ 422 Mudd Building
TAs:

Danyang He, dh2914@columbia.edu, Friday 7-9pm @ CS TA room, Mudd 122A (1st floor)
Arjun Srivatsa, ass2186@columbia.edu, Saturday 12-2pm @ CS TA room, Mudd 122A (1st floor)
Luv Aggarwal, la2733@columbia.edu, Tuesday 3-5pm @ CS TA room, Mudd 122A (1st floor)
Sukriti Tiwari, st3177@columbia.edu, Monday 5-7pm @ CS TA room, Mudd 122A (1st floor)
Daniel Jeong, dpj2108@columbia.edu, Thursday 2-4pm @ CS TA room, Mudd 122A (1st floor)
Josh Rutta, jar2317@columbia.edu, Tuesday 10am-12pm @ EE lounge, Mudd 1301 (13th floor)
Ghazal Fazelnia, gf2293@columbia.edu, CVN student office hours via email (no fixed time)

Synopsis:
This course provides an introduction to supervised and unsupervised
techniques for machine learning. We will cover both probabilistic and
non-probabilistic approaches. The focus will be on classification and
regression models, clustering methods, matrix factorization and
sequential models. Methods covered in class include linear and
logistic regression, support vector machines, boosting, K-means
clustering, mixture models, the expectation-maximization algorithm
and hidden Markov models, among others. We will cover algorithmic
techniques for optimization, such as gradient and coordinate descent
methods, as the need arises.
Prerequisites:
Basic linear algebra and calculus, introductory-level courses in
probability and statistics. Comfort with a programming language (e.g.,
Matlab) will be essential for completing the homework assignments. Not
open to students who have taken COMS 4771, STATS 4400 or IEOR 4525.
Text:
There is no required text for the course. Suggested readings for each
class will be given from the textbooks below. These readings are
meant to be general pointers and may contain more material than we
cover in class.
T. Hastie, R. Tibshirani and J. Friedman, The Elements of Statistical Learning, Second Edition, Springer. (ESL)
C. Bishop, Pattern Recognition and Machine Learning, Springer. (PRML)
H. Daume, A Course in Machine Learning, Draft. (CML)
Grading: 4 homework assignments (50%), midterm exam (25%), final in-class
exam (25%). Each homework assignment will have a programming component that
will count significantly toward the final homework grade. The final
in-class exam will focus on material from the second half
of the course (after Spring Break).

| Week | Date | Topics covered | Suggested readings |
|---|---|---|---|
| Week 1 | 1/22/2019 | Introduction, maximum likelihood estimation | ESL Ch. 1-2; PRML Ch. 2.1-2.3 |
| | 1/24/2019 | linear regression, least squares, geometric view | ESL Ch. 3.1-3.2; PRML Ch. 1.1, 3.1 |
| Week 2 | 1/29/2019 | ridge regression, probabilistic views of linear regression | ESL Ch. 3.3-3.4; PRML Ch. 3.1-3.2 |
| | 1/31/2019 | bias-variance, Bayes rule, maximum a posteriori | ESL Ch. 7.1-7.3, 7.10; PRML Ch. 2.3 |
| Week 3 | 2/5/2019 | Bayesian linear regression | PRML Ch. 3.3-3.5 |
| | 2/7/2019 | sparsity, subset selection for linear regression | ESL Ch. 3.3-3.8 |
| Week 4 | 2/12/2019 | nearest neighbor classification, Bayes classifiers | ESL Ch. 13.3-13.5; CML Ch. 2, 7 |
| | 2/14/2019 | linear classifiers, perceptron | ESL Ch. 4.5; CML Ch. 3 |
| Week 5 | 2/19/2019 | logistic regression, Laplace approximation | ESL Ch. 4.4; PRML Ch. 4.3-4.5 |
| | 2/21/2019 | kernel methods, Gaussian processes | ESL Ch. 6; PRML Ch. 6; CML Ch. 9 |
| Week 6 | 2/26/2019 | maximum margin, support vector machines | ESL Ch. 12.1-12.3; PRML Ch. 7.1 |
| | 2/28/2019 | trees, random forests | ESL Ch. 9.2, 15; CML Ch. 1 |
| Week 7 | 3/5/2019 | boosting | ESL Ch. 10; CML Ch. 11 |
| | 3/7/2019 | neural networks | ESL Ch. 11; PRML Ch. 5 |
| Week 8 | 3/12/2019 | Midterm exam (location: IAB 417) | |
| | 3/14/2019 | no class | |
| Week 9 | | Spring Break | |
| Week 10 | 3/26/2019 | no class | |
| | 3/28/2019 | clustering, k-means | ESL Ch. 14.3; PRML Ch. 9.1; CML Ch. 13 |
| Week 11 | 4/2/2019 | EM algorithm, missing data | ESL Ch. 8.5; PRML Ch. 9.3-9.4 |
| | 4/4/2019 | mixtures of Gaussians | PRML Ch. 9.2; CML Ch. 14 |
| Week 12 | 4/9/2019 | matrix factorization | Review article |
| | 4/11/2019 | nonnegative matrix factorization | ESL Ch. 14.6; Review article |
| Week 13 | 4/16/2019 | latent factor models, PCA and variations | ESL Ch. 14.5; PRML Ch. 12.1-12.3 |
| | 4/18/2019 | Markov models | PRML Ch. 13.1 |
| Week 14 | 4/23/2019 | hidden Markov models | PRML Ch. 13.2 |
| | 4/25/2019 | continuous state-space models | PRML Ch. 13.3 |
| Week 15 | 4/30/2019 | association analysis | ESL Ch. 14.2; Book chapter |
| | 5/2/2019 | Final in-class exam (location: IAB 417) | |



