Moritz Haas


University of Tübingen
Department of Computer Science
Maria von Linden Str. 6
72076 Tübingen

Room: 30-5/A15
Phone: +49 (0)7071 29-70848
E-mail: mo.haas(at)

In May 2021, I started my PhD under the joint supervision of Ulrike von Luxburg in the Theory of Machine Learning (TML) group at the University of Tübingen's Department of Computer Science and Bedartha Goswami in the Machine Learning in Climate Science group. I am a scholar of the International Max Planck Research School for Intelligent Systems (IMPRS-IS), a graduate school for PhD students from both the universities and the Max Planck Institutes in Tübingen and Stuttgart.

For my master's thesis, I analysed Wasserstein GANs from a statistical perspective. (pdf) Interestingly, our excess risk bound for unconditional WGANs captures a key advantage of generative models: since we can generate as many samples as we want, the generalization error is limited only by the critic network and the dataset size. If we generate enough samples (and assume a global optimizer is found), the generator network may be arbitrarily large, although the proofs rely on classical capacity arguments rather than the more recent double descent arguments.

At the beginning of my PhD, we explored empirical distortions in climate networks originating from limited amounts of noisy data (Journal of Climate). We also found that common resampling procedures used to quantify significant behaviour in climate networks do not adequately capture intrinsic network variance. While we propose a new resampling framework, the question of how to reliably quantify intrinsic network variance from complex climatic time series remains a matter of ongoing work.
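To make the setting concrete, here is a minimal sketch of a thresholded correlation network with a naive time-step bootstrap of its edge density. This is only a generic illustration of the kind of resampling under discussion, not the framework proposed in the paper; all function names and parameters are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def correlation_network(data, threshold=0.5):
    """Build an adjacency matrix by thresholding the absolute pairwise
    Pearson correlations between time series (rows = grid points)."""
    corr = np.corrcoef(data)
    adj = (np.abs(corr) >= threshold).astype(int)
    np.fill_diagonal(adj, 0)  # no self-loops
    return adj

def bootstrap_edge_density(data, n_boot=100, threshold=0.5):
    """Resample time steps with replacement and recompute the network's
    edge density, giving a rough picture of its sampling variability.
    Note: this plain i.i.d. bootstrap ignores temporal autocorrelation,
    which is one reason such naive procedures can misjudge the
    intrinsic variance of climate networks."""
    n_series, n_time = data.shape
    densities = []
    for _ in range(n_boot):
        idx = rng.integers(0, n_time, n_time)  # resampled time indices
        adj = correlation_network(data[:, idx], threshold)
        densities.append(adj.sum() / (n_series * (n_series - 1)))
    return np.array(densities)

# toy example: 10 independent noisy series of length 500
data = rng.standard_normal((10, 500))
densities = bootstrap_edge_density(data, n_boot=50)
```

The spread of `densities` is what a naive bootstrap would report as network variability; the point of the paper is that this can differ substantially from the true intrinsic variance.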

More recently, we explored when both kernel and neural network models that overfit noisy data can generalize nearly optimally. Previous literature had suggested that kernel methods can only exhibit such "benign overfitting" if the input dimension grows with the number of data points. Together with David Holzmüller and Ingo Steinwart, we show that, while overfitting leads to inconsistency with common estimators, adequately designed spiky-smooth estimators can achieve benign overfitting in arbitrary fixed dimension. For neural networks with NTK parametrization, it suffices to add tiny fluctuations to the activation function. It remains to be studied whether a similar adaptation of the activation function, or some other inductive bias towards spiky-smooth functions, can also lead to benign overfitting with feature-learning neural architectures and complex datasets. (arXiv)
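The "tiny fluctuations" idea can be sketched as a base activation plus a small high-frequency perturbation. The particular form below (ReLU plus a scaled sine) and the parameter names `eps` and `omega` are illustrative assumptions, not the exact construction from the paper.

```python
import numpy as np

def spiky_smooth_activation(x, eps=0.01, omega=100.0):
    """Illustrative 'spiky-smooth' activation: a smooth base function
    (here ReLU) plus a tiny high-frequency sine fluctuation.
    eps controls the amplitude of the spikes, omega their frequency;
    both names are hypothetical, chosen for this sketch."""
    return np.maximum(x, 0.0) + eps * np.sin(omega * x)

# the perturbed activation stays within eps of plain ReLU everywhere
x = np.linspace(-1.0, 1.0, 1001)
y = spiky_smooth_activation(x)
```

Because the perturbation is uniformly tiny, the network's smooth behaviour is essentially preserved, while the high-frequency component supplies the spiky part needed to interpolate the noise harmlessly.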