SML -- Scalable Machine Learning Course

SML: Scalable Machine Learning

Practical information

  • Volume: 3 hours per week (3 credits)

  • Time: Tuesday, 4-7pm (3 lectures in one block)

  • Location: 306 SODA

  • Instructor: Alex Smola (available 1-3pm Tuesdays in Evans 418)

  • TA: Dapo Omidiran

  • Grading Policy: Assignments (40%), Project (50%), Midterm project review (10%), Scribe (Bonus 5%)

  • Piazza discussion board


  • 02222012 - Slides are online

  • 02222012 - New assignments are live

  • 02222012 - Videos for SVM (first three sets) are uploaded

  • 02222012 - Videos for Optimization are complete

  • 02052012 - Slides for Streams and Optimization are uploaded

  • 02052012 - Videos now have sound enabled

  • 01252012 - Problem set 1 is uploaded

  • 01252012 - Slides and videos are uploaded

  • 01252012 - Project ideas and datasets are uploaded

  • 01192012 - The graphical models tab has links to tutorial video lectures on the subject (mainly for students who did not get to attend the class by Mike Jordan and Martin Wainwright).

  • 01182012 - The systems slides are available now (follow the systems link)

  • 01182012 - Updated project guidelines


Scalable Machine Learning combines Statistics, Systems, Machine Learning and Data Mining into flexible, often nonparametric, and scalable techniques for analyzing large amounts of data at internet scale. This class aims to teach the methods that will power the next generation of internet applications.

The class will cover systems and processing paradigms, an introduction to statistical analysis, algorithms for data streams, generalized linear methods (logistic models, support vector machines, etc.), large scale convex optimization, kernels, graphical models and inference algorithms such as sampling and variational approximations, and explore/exploit mechanisms. Applications include social recommender systems, real time analytics, spam filtering, topic models, and document analysis.

Prerequisites



  • Basic probability and statistics. Having attended a machine learning class would be a big plus but is not absolutely required. In particular, some knowledge of kernels and graphical models would be useful.

  • Basic linear algebra (matrices, vectors, eigenvalues). Knowing functional analysis would be great but is not required.

  • Ability to write code that goes beyond ‘Hello World’, preferably in something other than Matlab or R.

  • Basic knowledge of optimization. Having attended a convex optimization class would be great.

Page generated 2012-02-22 21:44:22 PST, by jemdoc.
