# Statistical Machine Learning (Summer term 2022)

## Quick links

Ilias

Slides, updated 2022-09-27.

Youtube videos

## Exam information

The final exams are written (Klausur). The confirmed dates are:

- 1.8.2022, 14:30 - 16:30, lecture hall N6, Morgenstelle
- 13.10.2022, 10:00 - 12:00, lecture hall HS 9, Neue Aula

Find anonymized results of the first exam here, and the relationship between points and grades here.

## Course material

**Slides:**

Current version of the slides, updated 2022-09-27: pdf

**Videos:** The videos of the 2020 lecture can be found on YouTube.

**Assignments:**

- Assignment 1, Assignment1.ipynb due April 25th 2022, 12:00.
- Assignment 2, Assignment2.ipynb, test_USPS.csv, train_USPS.csv due May 2nd 2022, 12:00.
- Assignment 3, Assignment3.ipynb due May 9th 2022, 12:00.
- Assignment 4, Assignment4.ipynb, candy-data.csv, due May 16th 2022, 12:00.
- Assignment 5, Assignment5.ipynb due May 23rd 2022, 12:00.
- Assignment 6, due May 30th 2022, 12:00.
- Assignment 7, Assignment7.ipynb, train.csv, test.csv, geneexp.csv, due June 13th 2022, 12:00.
- Assignment 8, Assignment8.ipynb, rfdata_train.npy, rfdata_test.npy, due June 20th 2022, 12:00.
- Assignment 9, Assignment9.ipynb, USPS.csv, due June 27th 2022, 12:00.
- Assignment 10, Assignment10.ipynb, my_exam_questions.tex, clusterdata.mat, sbm_adjacency.npy, due July 4th 2022, 12:00.
- Assignment 11, Assignment11.ipynb, due July 11, 12:00.
- Assignment 12, Assignment12.ipynb, compas.csv, due July 18, 12:00.

## Test exam

To get an impression of what the exam is going to look like, here are some assignments from an exam of one of the previous years: pdf

And here is the collection of exam questions that you have come up with: pdf

## Online feedback form

We want to know what you like / do not like about the lecture! You can tell us anonymously with the following feedback form. The more concrete and constructive the feedback, the higher the likelihood that we can adapt to it.

**Recap material:**

- Mathematics for Machine Learning, my youtube videos
- linear algebra
- probability theory
- Python tutorial: python_tutorial.ipynb and the recorded zoom call.

# Background information about the course

## Registration

You have to register for this course on Ilias. To register for the tutorials, please fill out the poll on Ilias. The firm deadline for registration and the tutorial poll is Wednesday, April 20 (because we then distribute students to tutorials).

## Lecture

**Lectures (by Prof. U. von Luxburg)**

Tuesday 8:15 - 10:00, Hörsaal Mineralogie H102, Lothar-Meyer-Bau, Wilhelmstraße 56

Thursday 8:15 - 10:00, Hörsaal 036, Neuphilologicum (=Brechtbau), Wilhelmstr. 50

Lectures start on April 19. For those who have never used python before, there will be an introduction to python on Wed, April 20, 16-18:00 on zoom, with this link:

## Tutorials

We will have several tutorials per week in which we discuss the assignments. You have to enroll on Ilias to get assigned to a tutorial. Tutorials start in the second week.

Wed 10-12, F122 Hörsaal 2 (Sand6)

Wed 12-14, F122 Hörsaal 2 (Sand6)

Wed 16-18, Hörsaal TTR2 (Maria-von-Linden-Strasse 6)

Thur 16-18, Hörsaal TTR2 (Maria-von-Linden-Strasse 6)

**Teaching assistants and their contacts:**

- Moritz Haas (coordinator)
- Partha Ghosh
- Madhav Iyengar
- Siddharth Ramrakhiani
- Luca Rendsburg

**Information sheet regarding organization:** pdf

## Contents of the lecture

The lecture is primarily intended for Master students in machine learning, computer science or related degrees, but might also be interesting for students of other disciplines like mathematics, physics, linguistics, economics, etc. If in doubt, please simply attend the first lectures and talk to us.

The focus of the lecture is on both algorithmic and theoretic aspects of machine learning. We will cover many of the standard algorithms and learn about the general principles and theoretic results for building good machine learning algorithms. Topics range from well-established results to very recent results.

- Bayesian decision theory, no free lunch theorem.
- Supervised learning problems (regression, classification): Simple baselines (nearest neighbor, random forests); linear methods; regularization; SVMs, non-linear kernel methods and the theory behind them
- Unsupervised learning problems: Dimension reduction from PCA to manifold methods; Clustering from k-means to spectral clustering and spectral graph theory, embedding algorithms from MDS to t-SNE
- Statistical Learning Theory: consistency and generalization bounds

- Machine learning in the context of society: fairness, explainability etc
- Low rank matrix completion, compressed sensing
- Ranking

## Prerequisites

You need to know the basics of probability theory and linear algebra, as taught in the mathematics for computer science lectures in your bachelor degree, or even better as taught in the class Mathematics for Machine Learning. If you cannot remember them so well, I strongly urge you to recap the material.

## Assessment criteria and exams

There will be weekly assignments that you have to solve and hand in: theoretical assignments and programming assignments in Python (an introduction to Python will be given in one of the first tutorials; it will not be possible to use any other programming language). You need to achieve at least 50% of all assignment points to be admitted to the final exam.

Admissions of previous years: if you attended the course Machine learning: algorithms and theory in 2021 with Matthias Hein and were admitted to the exam, you do not need to get a new admission this year (but we strongly encourage you to take part in the tutorials and solve the assignments). Any admission from before 2021 is invalid; you will need to do the assignments again.

The final exams are written (Klausur), the confirmed dates are

- 1.8.2022, 14:30 - 16:30, lecture hall N6, Morgenstelle
- 13.10.2022 10:00 - 12:00, lecture hall HS 9, Neue Aula

You are not allowed to bring any material (books, slides, etc.) except for what we call the controlled cheat sheet: one page (A4, one side only) of handwritten (!) notes, made by yourself.

## Literature

There is no textbook that covers machine learning the way I want. We are going to use individual chapters of various books. Here is a list of some relevant books, some of which also exist as online versions (for the rest, please check out the library):

- Shalev-Shwartz, Ben-David: Understanding Machine Learning. 2014.
- Chris Bishop: Pattern recognition and Machine Learning. 2006.
- Hastie, Tibshirani, Friedman: Elements of statistical learning. 2001. (the statistics point of view on machine learning, written by statisticians)
- Kevin Murphy: Machine Learning, a probabilistic perspective, 2012 (for the probabilistic point of view)
- Schölkopf, Smola: Learning with kernels. 2002 (for support vector machines and kernel algorithms)
- Hastie, Tibshirani, Wainwright: Statistical learning with sparsity. 2015. (Lasso, low rank matrix methods)
- Duda, Hart, Stork: Pattern classification. (a somewhat outdated classic, but some sections still good to read)
- Mohri, Rostamizadeh, Talwalkar: Foundations of Machine Learning. 2012. (rather theoretical point of view)
- Devroye, Györfi, Lugosi: A Probabilistic Theory of Pattern Recognition. 2013. (Covers statistical learning theory, we do not use much of it in this lecture.)

Pre-term FAQ:

- This lecture and all its tutorials will take place in person only. Of course my videos of the 2020 lectures remain online, but some contents might change. To receive the solutions of the assignments, you have to attend the tutorials.
- If you got admitted to the exam last year (Summer 2021), you don't have to gain a new exam admission and can participate in the exam. If your admission is from 2020 or earlier, you need to do the assignments again to get admitted to the exam.
- If you have questions regarding whether you are eligible for this course, please ask me in person during the first week of the term. This is easier than sending emails.