Research Seminar "Machine Learning Theory"

This is the research seminar by Ulrike's group and everybody else who is interested

When and where

Each thursday 13:30 - 14:30 IN OUR NEW BUILDING : Maria-von-Linden-Strasse 6, Seminar room in the third floor. Typically, we also meet for lunch at about 12:45 at the MPI cafeteria.


Most sessions take place in form of a reading group: everybody reads the assigned paper before the meeting. Then we are going to discuss the paper in the meeting. Sometimes we also have talks by guests or members of the group.


Everybody who is interested in machine learning theory. Students, PhD students and researchers alike. We do not mind people dropping in and out depending on whether they find the current session interesting or not.

Upcoming meetings

  • 28.2.2019 no meeting
  • 7.3.2019 no meeting
  • 14.3.2019 Paper discussion (Michael L.) J. Kleinberg, M. Raghavan: Selection Problems in the Presence of Implicit Bias. 2018.
  • 21.3.2019 Paper Discussion (Luca) Grover, Leskovec. node2vec: Scalable Feature Learning for Networks KDD 2016
  • 28.3.2019 (Benjamin?) Fairness Through Awareness Cynthia Dwork, Moritz Hardt, Toniann Pitassi, Omer Reingold, Rich Zemel pdf
  • 4.4.2019 Paper discussion (Leena) Calandriello, Rosasco. Statistical and Computational Trade-Offs in Kernel K-Means NeurIPS 2018
  • 11.4.2019 TML DAY, a full day of internal talks:
    • 9:00-9:45 Talk by Michael Lohaus - Biased access to international networks
    • 9:45-10:30 Talk by Damien Garreau
    • coffee break
    • 11:00-11:45 Talk by Michael Perrot - Foundations of Comparison-Based Hierarchical Clustering
    • 11:45-12:30 Talk by Leena Chennuru Vankadara - MMD based generalized clustering
    • lunch break
    • 14:00-14:45 Talk by Debarghya Ghoshdastidar - Generating samples for graph based learning
    • 14:45-15:30 Talk by Sascha Meyen - Two Pitfalls In Consciousness Research: The Indirect Task Advantage and Confidence Weighted Majority Voting
    • coffee break
    • 16:00-16:45 Talk by Siavash Haghiri - Comparison-based setting in machine learning
  • 18.3.2019 Paper discussion: Andoni A, Naor A, Nikolov A, Razenshteyn I, Waingarten E. Data-dependent hashing via nonlinear spectral gaps STOC 2018

Suggested papers for future meetings

(Please put your name when entering suggestions; it does not mean that you need to present it, but then we can judge where it comes from)

Past meetings

  • 21.2.2019 Talk by Oindrila Kanjilal: Structural reliability estimation using Markov chain splitting and Girsanov's transformation based methods
    Abstract: This talk is in the area of Monte Carlo simulation based methods for structural reliability estimation with special focus on strategies to reduce sampling variance of the estimator for the probability of failure. The talk will focus on two specific variance reduction schemes for reliability assessment, namely, the Markov chain Monte Carlo based particle splitting methods, and the Girsanov transformation based importance sampling methods for dynamical systems. Specifically, three issues will be discussed: (a) strategies to reduce sampling variance in the subset simulation based methods by modifying a few intermediate steps in the existing subset simulation algorithm, (b) development of closed loop Girsanov controls in the study of structural dynamical systems governed by stochastic differential equations, and (c) combining the Markov chain particle splitting methods and the closed loop Girsanov transformation based method to assess reliability of dynamical systems with uncertain parameters.
  • 7.2.2019 Paper discussion: Mikhail Belkin, Daniel Hsu, Partha Mitra: Overfitting or perfect fitting? Risk bounds for classification and regression rules that interpolate. NeurIPS 2018.
  • 24.1.2019 Talk by Marina Meila (University of Washington): Unsupervised Validation for Unsupervised Learning
    Abstract: Scientific research involves finding patterns in data, formulating hypotheses, and validating them with new observations. Machine learning is many times faster than humans at finding patterns, yet the task of validating these as "significant" is still left to the human expert or to further experiment. In this talk I will present a few instances in which unsupervised machine learning tasks can be augmented with data driven validation. In the case of clustering, I will demonstrate a new framework of proving that a clustering is approximately correct, that does not require a user to know anything about the data distribution. This framework has some similarities to PAC bounds in supervised learning; unlike PAC bounds, the bounds for clustering can be calculated exactly and can be of direct practical utility. In the case of non-linear dimension reduction by manifold learning, I will present a way around the following well-known problem. It is widely recognized that the low dimensional embeddings obtained with manifold learning algorithms distort the geometric properties of the original data, like distances and angles. These algorithm dependent distortions make it unsafe to pipeline the output of a manifold learning algorithm into other data analysis algorithms, limiting the use of these techniques in engineering and the sciences. Our contribution is a statistically founded methodology to estimate and then cancel out the distortions introduced by any embedding algorithm, thus effectively preserving the distances in the original data. This method is based on augmenting the output of a manifold learning algorithm with "the pushforward Riemannian metric", i.e. with additional metric information that allows it to reconstruct the original geometry. Joint work with Dominique Perrault-Joncas, James McQueen, Jacob VanderPlas, Zhongyue Zhang, Grace Telford, Yu-chia Chen, Samson Koelle
  • 17.1.2019 Talk by Jesse Anderton (Northeastern University, Boston): Ordinal Embedding: A Geometric Approach
  • 10.1.2019 Talk by Moritz Haas about his master thesis: Ranking based on local comparisons
  • 13.12.2018 Paper discussion: Clustering Redemption--Beyond the Impossibility of Kleinberg's Axioms. Vincent Cohen-Addad, Varun Kanade and Frederik Mallmann-Trenn, NeurIPS 2018.
  • 29.11.2018 Paper discussion: Approximate Nearest Neighbors in Limited Space Piotr Indyk and Tal Wagner. COLT 2018
  • 22.11.2018 Paper discussion: Xue, Kpotufe Achieving the time of 1-NN, but the accuracy of k-NN AISTATS 2018.
  • 15.11.2018 Paper discussion:Delayed Impact of Fair Machine Learning, Lydia Liu, Sarah Dean, Esther Rolf, Max Simchowitz, Moritz Hardt, ICML 2018
  • 8.11.2018 Paper discussion: Fairness Without Demographics in Repeated Loss Minimization, Tatsunori Hashimoto, Megha Srivastava, Hongseok Namkoong, Percy Liang, ICML 2018
  • 25.10.2018 Paper discussion: Wasserstein auto-encoders (Ilya Tolstikhin, Olivier Bousquet, Sylvain Gelly, Bernhard Schoelkopf)
  • 18.10.2018 Master thesis talk by Tosca Lechner
  • 11.10.2018 Paper discussion: Approximate ranking from pairwise comparisons (AISTATS 2018) Reinhard Heckel, Max Simchowitz, Kannan Ramchandran, and Martin J. Wainwright
  • 25.7.2018 Talk by Krikamol Muandet, about learning theory approaches in counterfactual causality
  • 11.7.2018 Talk by Carl Johann Simon-Gabriel on his PhD thesis
  • 4.7.2018 Leena presents her master thesis: Metric Embeddings for Machine Learning, Siavash presents his ICML talk: Comparison-Based Random Forests
  • 27.6.2018 Talk by Tobias Frangen about his master thesis: Consistency of Relative Neighborhood Classification Rules
  • 20.6.2018 Paper discussion: An Analysis of the t-SNE Algorithm for Data Visualization,Sanjeev Arora, Wei Hu and Pravesh K Kothari, COLT 2018
  • 15.6.2018 Talk by Karl Rohe (University of Wisconsin-Madison)
    Abstract: This paper uses the relationship between graph conductance and spectral clustering to study (i) the failures of spectral clustering and (ii) the benefits of regularization. The explanation is simple. Sparse and stochastic graphs create a lot of small trees that are connected to the core of the graph by only one edge. Graph conductance is sensitive to these noisy "dangling sets." Spectral clustering inherits this sensitivity. The second part of the paper starts from a previously proposed form of regularized spectral clustering and shows that it is related to the graph conductance on a "regularized graph." We call the conductance on the regularized graph CoreCut. Based upon previous arguments that relate graph conductance to spectral clustering (e.g. Cheeger inequality), minimizing CoreCut relaxes to regularized spectral clustering. Simple inspection of CoreCut reveals why it is less sensitive to small cuts in the graph. Together, these results show that unbalanced partitions from spec tral clustering can be understood as overfitting to noise in the periphery of a sparse and stochastic graph. Regularization fixes this overfitting. In addition to this statistical benefit, these results also demonstrate how regularization can improve the computational speed of spectral clustering. We provide simulations and data examples to illustrate these results.
  • 13.6.2018 Whole day: Seminar on statistics on graphs and networks
  • 6.6.2018 Talk by Michael Schober about his PhD thesis (PhD student of Philipp Hennig)
  • 30.5.2018 Paper discussion: Arias-Castro Some theory for ordinal embedding, Bernoulli 23(3):1663-1693, 2017.
  • 9.5.2018 Paper discussion: Kazemi, Chen, Dasgupta, Karbasi. Comparison Based Learning from Weak Oracles. AISTATS, 2018.
  • 2.5.2018 Paper discussion: Kremer, Sha, Igel Robust Active Label Correction AISTATS 2018.
  • 25.4.2018 Paper discussion: Ukkonen Crowdsourced correlation clustering with relative distance comparisons. ICDM, 2017.
  • 18.4.2018 Talk by Erwan Scornet (Center for Applied Mathematics, Ecole Polytechnique, Paris) (at the MPI, seminar room N0.002): Consistency and minimax rates of random forests
    Abstract: The recent and ongoing digital world expansion now allows anyone to have access to a tremendous amount of information. However collecting data is not an end in itself and thus techniques must be designed to gain in-depth knowledge from these large data bases. This has led to a growing interest for statistics, as a tool to find patterns in complex data structures, and particularly for turnkey algorithms which do not require specific skills from the user. Such algorithms are quite often designed based on a hunch without any theoretical guarantee. Indeed, the overlay of several simple steps (as in random forests or neural networks) makes the analysis more arduous. Nonetheless, the theory is vital to give assurance on how algorithms operate thus preventing their outputs to be misunderstood. Among the most basic statistical properties is the consistency which states that predictions are asymptotically accurate when the number of observations increases. In this talk, I will present a first result on Breiman’s forests consistency and show how it sheds some lights on its good performance in a sparse regression setting. I will also present new results on minimax rates of Mondrian forests which highlight the benefits of forests compared to individual regression trees.
  • 11.4.2018 Paper discussion: Efficient k-nearest neighbor graph construction for generic similarity measures Dong, Charikar, Li, WWW-2011.
  • 4.4.2018 Paper discussion: Which Distribution Distances are Sublinearly Testable? Daskalakis, Kamath, Wright, SODA 2018
  • 28.3. no meeting, easter vacation
  • 21.3.2018 no meeting, Conference on computational archeology , program
  • 14.3.2018 TML Day: lots of internal talks
    9:00 - 9:45 Siavash Haghiri (Comparison-Based Framework for Psychophysics)
    9:45 - 10:30 Damien Garreau (Comparison-Based Random Forests)
    11:00 - 11:45 Michael Perrot (Boosting Triplets for Classification)
    11:45 - 12:30 Michael Lohaus (Bayesian Optimization for Distance Estimation)
    13:30 - 14:30 Debarghya Ghoshdastidar (Minimax Rates in Graph Testing and Feature Clustering)
    14:30 - 15:15 Georgios Arvanitidis
    15:45 - 16:30 Diego Fioravanti (A principled approach to adversarial examples)
  • 28.2.2018 Paper discussion: Crowdsourced Clustering: Querying Edges vs Triangle , NIPS 2016
  • 21.2.2018 Paper discussio: Learning Low-Dimensional Metrics , Blake Mason, Lalit Jain, Robert Nowak, NIPS 2017
  • 14.2.2018 Talk by Maren Mahsereci about her PhD thesis (PhD student in Philipp Hennigs group)
  • 7.2.2018 Paper discussion: On clustering network-valued data Soumendu Sundar Mukherjee, Purnamrita Sarkar, Lizhen Lin, NIPS 2017
  • 17.1.2018 Paper discussion: Yizhen Wang, Somesh Jha, Kamalika Chaudhuri, Analyzing the Robustness of Nearest Neighbors to Adversarial Examples , 2017
  • 10.1.2018 Paper discussion: Formal Guarantees on the Robustness of a Classifier against Adversarial ManipulationHein et al, NIPS 2017
  • 20.12.2017 Paper discussion: Supervised Word Movers Distance Gao Huang, Chuan Guo, Matt J. Kusner, Yu Sun, Fei Sha, Kilian Q. Weinberger, NIPS 2016
  • 13.12.2017 Paper discussion: S. Dasgupta. A cost function for similarity-based hierarchical clustering. STOC, 2016.
  • 29.11.2017 Paper discussion: Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering. NIPS 2016. link
  • 22.11.2017 Paper discussion: k*-Nearest Neighbors: From Global to Local. NIPS 2016
  • 15.11. 2017 Paper discussion: Active Learning from Imperfect Labelers Songbai Yan, Kamalika Chaudhuri, Tara Javidi, NIPS 2016
  • 8.11.2017 Paper discussion: Square Hellinger Subadditivity for Bayesian Networks and its Applications to Identity Testing. COLT 2017.
  • 25.10.2017 Paper discussion: Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings. Tolga Bolukbasi, Kai-Wei Chang, James Y. Zou, Venkatesh Saligrama, Adam T. Kalai . NIPS 2016. link
  • 4.10.2017 Paper discussion: The Perturbed Variation Haarel, Mannor, NIPS 2015
  • Wed, 26.7.2017: Paper discussion: Kondor, Pan: Multiscale graph Laplacian kernel. NIPS 2016
  • 18.+19.7. Mini-conference ML in Science, Tuebingen
  • Wed, 12.7.2017: Paper discussion: Clustering with Same-Cluster Queries.Hasan Ashtiani et al. NIPS, 2016
  • Wed, 5.7.2017: Paper discussion: Understanding deep learning requires rethinking generalization Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals
  • Thur, 29.6. Machine Learning Summer School
  • Wed, 21.6. Machine Learning Summer School
  • Tue, 13.6. 14:00 Talk by Cheng Tang on subspace clustering
  • 12.6. 15:00 Talk by Nicolas Garcia (at the MPI, seminar room fourth floor): The variational formulation of the Bayesian update and its connection to the choice of metric tensor in Riemannian MCMC.
    Abstract: A pressing question in Bayesian statistics and machine learning is to introduce a unified theoretical framework that brings together some of the many statistical models and algorithmic methodologies employed by practitioners. In this talk I will argue in favor of the variational formulation of the Bayesian update as it provides both an overarching structure and a powerful tool for the analysis of many such models and algorithms. When coupled with the theory of gradient flows on the space of probability measures, the variational formulation of the Bayesian update suggests a natural path connecting prior and posterior distributions. This path however, is only natural after the choice of a metric in the base space where the prior and posterior distributions are defined. I will then show how the variational framework suggests clear optimality criteria for the choice of metric in the Riemannian MCMC methodology introduced by Girolami and Calderhead. This is joint work with Daniel Sanz-Alonso.
  • 1.6.2017: Paper discussion: False Discovery Rate Control and Statistical Quality Assessment of Annotators in Crowdsourced Ranking ICML 2016
  • 24.5. 2017 Paper discussion: Fast Distributed k-Center Clustering with Outliers on Massive Data
  • 10.5.2017: Paper discussion: Representational Similarity Learning with Application to Brain Networks ICML 2016
  • 4.5.2017: Paper discussion: Scalable Semi-Supervised Aggregation of Classifiers. NIPS 2015
  • 26.4.2017: Paper discussion: Sign rank versus VC dimension COLT 2016
  • 20.4.2017: Paper discussion: Starting Small - Learning with Adaptive Sample Sizes ICML 2016
  • 13.4.2017 Paper discussion: Compressive Spectral Clustering ICML 2016
  • 5.4.2017 Presentations by our group members
    • 10:00 - 10:30 Talk by Debarghya Ghoshdastidar: Two-Sample Tests for Large Random Graphs
    • 10:30 - 11:00 Talk by Matthaeus Kleindessner: Machine learning in a setting of ordinal distance information --- kernel functions as an alternative to the embedding approach
    • 11:30 - 12:00 Talk by Lennard Schulz: A comparison-based setup in psychophysics: comparing subsampling strategies
    • 12:00 - 12:30 Talk by Siavash Haghiri: Comparison Based Nearest Neighbor Search
    • 13:15 - 13:45 Talk by Cheng Tang: Understanding the empirical success of clustering heuristics
  • 30.3.2017 Paper discussion: Clustering Signed Networks with the Geometric Mean of Laplacians Pedro Mercado, Francesco Tudisco, Matthias Hein. NIPS 2016 (Debarghya)
  • 23.3.2017 Paper discussion: Hardt, Price, Srebro: Equality of Opportunity in Supervised Learning. NIPS 2016.
  • 16.3.2017 Paper discussion: Memory, Communication, and Statistical Queries COLT 2016
  • 23.2.2017 Talk by Damien Garreau, ENS Paris
  • 16.+17.2. 2017 Seminar on crowdsourcing takes place.
  • 9.2.2017 Talk by Michael Perrot, Université Saint-Étienne: Learning Metrics with a Controlled Behaviour

    Abstract: The goal in Machine Learning is to acquire new knowledge from data. To achieve this many algorithms make use of a notion of distance or similarity between examples. A very representative example is the nearest neighbour classifier which is based on the idea that two similar examples should share the same label: it thus critically depends on the notion of metric considered. Depending on the task at hand these metrics should have different properties but manually choosing an adapted comparison function can be tedious and difficult. The idea behind Metric Learning is to automatically tailor such metrics to the problem at hand. One of the main limitation of standard methods is that the control over the behaviour of the learned metrics is often limited. In this talk I will present two approaches specifically designed to overcome this problem. In the first one we consider a general framework able to take into account a reference metric acting as a guide for the learned metric. We are then interested in theoretically studying the interest of using such side information. In the second approach we propose to control the underlying transformation of the learned metric. Specifically we use some recent advances in the field of Optimal Transport to force it to follow a particular geometrical transformation.

  • 26.1.2017 Paper discussion: Recommendations as Treatments: Debiasing Learning and Evaluation ICML 2016
  • 19.1.2017 Paper discussion: Data-driven Rank Breaking for Efficient Rank Aggregation JMLR 2016
  • 12.1.2017 Paper discussion: Greedy Column Subset Selection: New Bounds and Distributed Algorithms ICML 2016
  • 24.11.2016 Paper discussion: Learning Combinatorial Functions from Pairwise Comparisons COLT 2016
  • 17.11.2016 Paper discussion: Provably Manipulation-Resistant Reputation Systems COLT 2016
  • 10.11.2016 Paper discussion: Active Ranking from Pairwise Comparisons and when Parametric Assumptions Don’t Help arXiv
  • 3.11.2016 Paper discussion: When Can We Rank Well from Comparisons of $O(n\log n)$ Non-Actively Chosen Pairs? COLT 2016
  • 28.7.2016 Paper discussion: Finite Sample Prediction and Recovery Bounds for Ordinal Embedding, Lalit Jain, Kevin Jamieson, Robert Nowak, arxiv 2016
  • 21.7.2016 Paper discussion: Fast and Accurate Inference of Plackett–Luce Models Lucas Maystre, Matthias Grossglauser. NIPS 2015
  • 30.6.2016 Paper discussion: Nihar Shah et al: Estimation from Pairwise Comparisons: Sharp Minimax Bounds with Topology Dependence
  • 23.6.2016 Talk by Debarghya
  • 14.7.2016 Paper discussion: Precision-Recall-Gain Curves: PR Analysis Done Right Peter Flach, Meelis Kull. NIPS 2015
  • 16.6.2016 Paper discussion: Parallel Correlation Clustering on Big Graphs Xinghao Pan, Dimitris Papailiopoulos, Samet Oymak, Benjamin Recht, Kannan Ramchandran, Michael I. Jordan NIPS 2015
  • 9.6.2016 Paper discussion: Optimal Testing for Properties of Distributions Jayadev Acharya, Constantinos Daskalakis, Gautam C. Kamath. NIPS 2015
  • 2.6.2016 Paper discussion: Competitive Distribution Estimation: Why is Good-Turing Good Alon Orlitsky, Ananda Theertha Suresh. NIPS 2015.
  • 12.5.2016 Paper discussion: Fast two sample testing with analytic representations of probabiilty measures. Chwialkowski, Ramdas, Sejdinovic, Gretton, NIPS
  • 28.4.2016 Paper discussion: A Nearly-Linear Time Framework for Graph-Structured Sparsity. ICML 2015 pdf
  • 19.4.2016 Paper discussion: Distributed Estimation of Generalized Matrix Rank: Efficient Algorithms and Lower Bounds Yuchen Zhang, Martin Wainwright, Michael Jordan. ICML 2015
  • 12.4.2016 Paper discussion: Multiview Triplet Embedding: Learning Attributes in Multiple Maps. ICML 2015. pdf
  • 7.4.2016 Learning preferences from ordinal data. Oh, Thekumparampil, Xu, NIPS 2015
  • 3.3.2016 Paper discussion: Less is More: Nyström Computational Regularization Alessandro Rudi, Raffaello Camoriano, Lorenzo Rosasco, NIPS 2015
  • 10.3.2016 Paper discussion: Shah, Wainwright: Simple, Robust and Optimal Ranking from Pairwise Comparison. 2015 link
  • 25.2.2016 Paper discussion: Statistical Model Criticism using Kernel Two Sample Tests James R. Lloyd, Zoubin Ghahramani, NIPS 2015.
  • 18.2.2016 Paper discussion: Approval Voting and Incentives in Crowdsourcing, Nihar Shah, Dengyong Zhou, Yuval Peres, ICML 2015. pdf
  • Paper discussion: Muhammad Bilal Zafar, Isabel Valera Martinez, Manuel Gomez Rodriguez, and Krishna Gummadi Fairness Constraints: A Mechanism for Fair Classification . icml workshop on fairness and accountabilty in ml 2015
  • 28.1.2016 Paper discussion: Y. Jiao and J.-P. Vert, "The Kendall and Mallows Kernels for Permutations", ICML 2015

In Hamburg (2012-2015)

  • 4.6.2015 Paper discussion: Florent Krzakala, Cristopher Moore, Elchanan Mosseld, Joe Neemand, Allan Sly, Lenka Zdeborová, and Pan Zhanga: Spectral redemption in clustering sparse networks, PNAS 2013 pdf
  • 28.5.2015 Paper discussion: Wauthier, Jojic, Jordan: Active spectral clustering via iterative uncertainty reduction. KDD, 2012. link
  • 21.5.2015 Paper discussion: Jun Li , Juan A. Cuesta-Albertos, Regina Y. Liu, DD-Classifier: Nonparametric Classification Procedure Based on DD-Plot, JASA 2015. link
  • 27.4.2015: Talk by Ruth Urner, Max Planck Institute for Intelligent Systems, Tuebingen.
  • 23.4. 2015 Paper discussion: Learning Mixtures of Ranking Models (ranjal Awasthi, Avrim Blum, Or Sheffet, Aravindan Vijayaraghavan, NIPS 2014)
  • 16.4. 2015 Paper discussion: Almost no label to cry (Giorgio Patrini, Richard Nock, Tiberio Caetano, Paul Rivera, NIPS 2014)
  • 9.4. 2015 Paper discussion: Discrete Graph Hashing (Wei Liu, Cun Mu, Sanjiv Kumar, Shih-Fu Chang, NIPS 2014)
  • 26.2.2015 We discuss again the paper "Ranking and combining multiple predictors without labeled data" pdf. Focus is on understanding how the key lemma 1 can be true and why it makes sense. Everybody prepare seriously, please ...
  • 5.3. 2015 Machine learning brainstorming day!
    9-10 Tobias Lang: Active learning with user feedback
    10 - 11 Morteza Alamgir: Centrality based graph kernels
    11-12: Mehdi Sajjadi: peer grading algorithms, our data, our current insights
    12 - 13 lunch
    13 - 14 Sven Kurras: Clustering in an online game with adversarial players
    14 - 15 Rita Morisi: Spectral clustering applied to the Consensus problem
    Unfortunately, the talk by Matthaeus has to be skipped (was: Estimating median and modes and clusters from ordinal crowd data)
  • 19.2. Two Bachelor-thesis defense talks
    Jonas Häring: Comparing graphs with small doubling dimension to expander graphs
    Alexis Engelke: Streaming algorithms for graph partitioning
  • 12.2.2015 Paper discussion: Heikinheimo, Ukkonen. The crowd-median algorithm. First AAAI Conference on Human Computation and Crowdsourcing, 2013. pdf.
  • 5.2.2015 Talk by Rita Morisi (Institute of Advanced Studies Lucca, Italy): Graph based techniques in machine learning and control
  • 29.1. 2015 Paper discussion: Parisi, Nadler et al, PNAS 2014: Ranking and combining multiple predictors without labeled data pdf
  • 22.1. 2015 Talk by Thomas Buehler, Uni Saarbruecken: Titel: A flexible framework for solving constrained ratio problems in machine learning
  • 15.1.2015 Paper discussion: Distributed Balanced Clustering via Mapping Coresets, NIPS 2014 pdf
  • 8.1. 2015 Paper discussion: Scalable Simple Random Sampling and Stratified Sampling, ICML 2013 pdf
  • 18.12.2014 Paper discussion: Sinkhorn Distances: Lightspeed Computation of Optimal Transport. M. Cuturi, NIPS 2013 pdf
  • 11.12.2014 Paper discussion: Dimensionality Reduction with Subspace Structure Preservation, NIPS 2014 pdf
  • 4.12. 2014 Paper discussion: Graph clustering via a discrete uncoupling process. Stijn van Dongen. SIAM J. MATRIX ANAL. APPL. 2008 pdf
  • 27.11.2014 We discuss the following two papers, just high level:
    Fennel: Streaming graph partitioning for massive scale graphs. 2014 pdf
    Streaming graph partitioning for large distributed graphs. I Stanton, G Kliot, KDD 2012. pdf
  • 13.11.2014 Paper discussion: Network-based statistic: identifying differences in brain networks A Zalesky, A Fornito, ET Bullmore - Neuroimage, 2010. pdf
  • 20.11.2014 Talk by Maximilian Christ, University of Dortmund: SNP DNA analysis with logic regression.
  • 30.10.2014 Master defense talk by Longshan Sun: Algorithms for peer assessment.
  • 23.10.2014 Paper discussion: On the convergence of maximum variance unfolding Ery Arias-Castro, Bruno Pelletier. JMLR, 2013 pdf
  • 16.10.2014 Paper discussion: Ohad Shamir: Fundamental Limits of Online and Distributed Algorithms for Statistical Learning and Estimation. Arxiv 2014. link (we read the main paper, don't dive into the proofs if you don't have time).
  • 2.10.2014 Talk by Tobias Lang, Zalando, Berlin (at 11:00!): From Planning and Exploration in Stochastic Relational Worlds to Recommender Systems in the E-Commerce World
  • 24.8.2014 Sundus Israr's Master defense test talk
  • 10.7. 2014 Morteza Alamgir's thesis defense test talk.
  • 3. 6. 2014 A whole day of talks!
    9:00 - 9:45 Matthaeus Kleindessner: Uniqueness of Ordinal Embedding
    9:45 - 10:30 Yoshikazu Terada: Local ordinal embedding
    10:30 - 11:30 Ulrike von Luxburg: Density estimation from unweighted kNN graphs
    12:30 - 13:15 Sven Kurras: The f-Adjusted Graph Laplacian: a Diagonal Modification with a Geometric Interpretation
    13:15 - 13:45 Morteza Alamgir: Density-preserving quantization with application to graph downsampling
  • 15.5.2014 Talk by Daniel Schmidtke about his master thesis (from 15:00-15:30). Then we discuss the paper by Dejan Slepcev et al, Continuum limit of total variation on point clouds. Preprint, 2014
  • 22.5.2014 Paper discussion: Breaking the Small Cluster Barrier of Graph Clustering, Nir Ailon, Yudong Chen, Huan Xu, ICML 2013 pdf
  • 17.4.2014 Paper discussion: Senelle, Garcia-Diez, Mantrach, Shimbo, Saerens, Fouss: The sum over forests density index: identifying dense regions in a graph. Preprint, 2013, not online yet, here is a local copy (with our usual login and password used for teaching). pdf
  • 24.4. 2014: talk by Sharon Bruckner (FU Berlin) about "Random-walk based methods for clustering"
  • 10.4.2014 Paper discussion: A Local Algorithm for Finding Well-Connected Clusters, Zeyuan Allen Zhu, Silvio Lattanzi, Vahab Mirrokni, ICML 2013 pdf
  • 3.4.2014 Paper discussion: Two papers on peer grading: Piech, Koller at al. Tuned Models of Peer Assessment in MOOCs pdf Shah, Wainwright et al A Case for Ordinal Peer-evaluation in MOOCs pdf
  • 23.1. 2014 Dominik Herrmann is going to talk about his PhD thesis. He used machine learning methods to investigate the computer security issues.
  • 9.1.2014 Tutorial by Sven Kurras on the mean shift algorithm
  • 21.11. 2013 Paper discussion: Estimating Unknown Sparsity in Compressed Sensing, Miles Lopes, ICML 2013 pdf
  • 14.11. 2013 Paper discussion: Efficient Ranking from Pairwise Comparisons, Fabian Wauthier, Michael Jordan, Nebojsa Jojic, ICML 2013 pdf
  • 7.11.2013 Paper discussion: Scalable Optimization of Neighbor Embedding for Visualization, Zhirong Yang, Jaakko Peltonen, Samuel Kaski, ICML 2013 pdf
  • 31.10.2013 Maximum Variance Correction with Application to A* Search, Wenlin Chen, Kilian Weinberger, Yixin Chen, ICML 2013 pdf
  • 25.10. 2013: Talk by Gina Gruenhage (TU Berlin). New data visualizations using cMDS: Embedding high dimensional data in a space of curves.
  • 18.10.2013 Paper discussion: Robust Structural Metric Learning, Daryl Lim, Gert Lanckriet, Brian McFee, ICML 2013 pdf
  • 11.10. 2013 Paper discussion: Vanishing Component Analysis, Roi Livni, David Lehavi, Sagi Schein, Hila Nachliely, Shai Shalev-Shwartz, Amir Globerson, best paper award at ICML 2013 pdf
  • 26.9.2013 (16:45) Talk by Julian Busch: Randomized Algorithms for Balanced Graph Cuts (defense of his Bachelor Thesis).
  • 15.7.2013 Talk by Cheng Soon Ong, Nicta Melbourne
  • 10.7.2013 Paper discussion: Local equivalences of distances between clusterings - A geometric perspective. Learning Journal, 2011 pdf
  • 3.7. 2013 Paper discussion: On the Hardness of Domain Adaptataion (And the Utility of Unlabeled Target Samples). ALT 2012 pdf
  • 19.6. 2013 Paper discussion: Statistical Consistency of Ranking Methods in A Rank-Differentiable Probability Space. NIPS 2012 pdf
  • 11.6.2013 Talk by Antoine Channarond
  • 4.6.2013 Talk by Yoshikazu Terada
  • 8.5.2013: Paper discussion: Sparse Algorithms are not Stable: A No-free-lunch Theorem. PAMI 2012. link
  • 24.4. 2013: Paper discussion: Convergence and Energy Landscape for Cheeger Cut Clustering, NIPS 2012 link
  • 6.3. 2013 Paper discussion: Clustering Sparse Graphs, NIPS 2012 link
  • 3.4.2013 Paper discussion: Semi-supervised Eigenvectors for Locally-biased Learning, NIPS 2012 link
  • 10.4.2013 Paper discussion: Learning with Partially Absorbing Random Walks, NIPS 2012 link
  • 27.2.2013: Paper discussion: Clustering by Nonnegative Matrix Factorization Using Graph Random Walks. NIPS 2012 link
  • 28.11.2012: Paper discussion: Maria A. Riolo, Mark Newman: First-principles multiway spectral partitioning of graphs. Arxiv, 2012
  • 14.11.2012: Paper discussion: J. Lee, S. Gharan, L. Trevisan. Multi-way spectral partitioning and higher-order Cheeger inequalities. STOC 2012.
  • 7.11.2012: Paper discussion: S. Dasgupta. Consistency of nearest neighbor classification under selective sampling. Twenty-Fifth Conference on Learning Theory (COLT) 2012.
  • 31.10.2012: Paper discussion: A. Barabasi et al. Controllability of complex networks. Nature 2011.