How to study ML at the Department of Statistics
Starting from October 2021, we offer new bachelor’s and master’s degrees. From now on, it is possible to obtain a master’s degree in “Statistics and Data Science” with a specialization in ML. Formal instructions on how to apply for these programs can be found here.
The bachelor’s degree “Statistics and Data Science”
Here, students become familiar with all the basics needed to study ML in depth later on.
The curriculum consists of:
- Several maths courses: analysis, linear algebra, and numerics.
- A variety of courses in statistics: descriptive statistics, inference, statistical regression models, multivariate analysis, probability theory, and others.
- Several programming courses that teach how to implement and apply the methods (we have a strong focus on R, but there are also courses on other languages such as Python).
- The mandatory course “Introduction to Machine Learning” introduces the basic concepts and methods of (supervised) ML and prepares students for more advanced courses. Here we cover linear methods, trees and forests, ML evaluation and simple tuning techniques. All methods are explored through mathematical and programming exercises.
Applicable modules (PO 2021): P 12
- “Introduction to Python” is a bachelor’s course focussed on conveying basic skills for working with Python. This includes setting up a suitable environment, getting to know the basic data types and becoming comfortable with the paradigm of object-oriented programming. Further, gaining practical skills for data analysis and for several subfields of machine learning is a desired outcome of this course.
Applicable modules (PO 2021): WP 5
The Master’s degree “Statistics and Data Science with specialization in ML”
In general, students can choose among five different specializations, including one in ML. Among others, we offer the following lectures here:
- “Supervised Learning” (aka “Predictive Modelling”) directly builds on the “Introduction to ML” from the bachelor’s degree and covers the mathematical foundations of ML in much more depth, including risk minimization, information theory and regularization. It also introduces more advanced methods like boosting, SVMs and Gaussian processes.
Applicable modules (PO 2021): P 2
- “Optimization” builds on the bachelor’s course on numerics and offers a structured overview of many different sub-branches of optimization, although we mainly discuss continuous problems and almost no combinatorial or discrete optimization. Among the covered topics are stochastic gradient descent, second-order methods, constrained optimization, Lagrange multipliers and duality, subgradients and non-smooth techniques, evolutionary algorithms, multi-objective optimization and Bayesian optimization.
Applicable modules (PO 2021): WP 1
- “Deep Learning” introduces the concepts of training neural networks, including backpropagation and general optimization challenges such as ill-conditioning, local minima and saddle points. It further teaches regularization techniques for neural networks such as dropout and augmentation, and specific types of neural networks including convolutional neural networks with details about convolution operations as well as recurrent neural networks.
Applicable modules (PO 2021): WP 7
- “Advanced Deep Learning” builds directly on “Deep Learning” and discusses advanced techniques such as autoencoders and their variants, generative adversarial networks, graph neural networks, semi-supervised learning, adversarial examples, interpretability and uncertainty quantification methods as well as other recent topics such as normalizing flows.
Applicable modules (PO 2021): WP 9 (Applied Machine Learning), WP 32 (Current Research in Machine Learning), WP 34 (Selected Topics of Machine Learning)
- “Deep Learning for Natural Language Processing” introduces deep learning concepts that are specifically relevant for NLP applications. First, general ML concepts are revisited, followed by an introduction to word vector representations. Subsequently, all of the common neural architectures are discussed, with a special focus on RNNs, attention and the Transformer. The second half of the lecture covers transfer learning, BERT and other state-of-the-art architectures as well as proper evaluation and benchmarking. Further, advanced topics like prompting, zero-/few-shot learning and multilinguality are discussed.
Applicable modules (PO 2021): WP 9 (Applied Machine Learning), WP 32 (Current Research in Machine Learning), WP 34 (Selected Topics of Machine Learning)
- “Advanced Machine Learning” builds on “Supervised Learning” and introduces advanced concepts of machine learning related to supervised learning such as performance estimation techniques, calibration, tuning, feature engineering, and practical toolkits, but also introduces unsupervised learning techniques and related concepts.
Applicable modules (PO 2021): WP 8
- “Automated Machine Learning” covers modern algorithms for hyperparameter optimization, ML pipeline configuration and neural architecture search.
Applicable modules (PO 2021): WP 33
- “Applied Machine Learning” teaches students which pitfalls have to be considered when applying machine learning to real-world problems and gives them practical experience that will be useful for both academic and industrial careers.
Applicable modules (PO 2021): WP 9
- “Interpretable Machine Learning” (planned) builds on the “Introduction to ML” from the bachelor’s degree and focuses on model-agnostic interpretation techniques that produce different types of explanations and can help to better understand the global (i.e., expected overall) and the local (e.g., observation-wise) behavior of ML models. Among the covered topics are methods for visualizing feature effects, quantifying feature importance, and feature interactions.
Applicable modules (PO 2021): tbd
Further educational components
In addition to lectures, these modules are included in the Bachelor’s and Master’s degrees:
- [B] [M] Two thesis projects, which should focus on ML for this specialization.
- [M] The module “Consulting” offers the students the opportunity to work on an applied project with partners from industry or applied sciences. Alternatively, students can choose to participate in the “Innovationslabor”. While “Consulting” focuses on data analysis, the “Innovationslabor” focuses more on implementation and prototyping for data science projects.
- [B] [M] In our seminars, our professors and junior researchers cover modern topics of interest in ML. We regularly offer seminars on explainability, uncertainty quantification, learning with few labels, advanced deep learning, AutoML, fairness, causality, and many more.
- [B] A minor subject. Computer science might be a good choice as preparation for the master’s specialization in ML.
Our digital strategy for modern education
Education should be open to as many people as possible and pose as few barriers as possible. Therefore, we have started to open our educational offerings to everyone, step by step. On the one hand, we publish the educational material on public websites such as https://slds-lmu.github.io/i2ml/. On the other hand, the sources of this material are public as well, e.g., https://github.com/slds-lmu/lecture_i2ml. This way, everyone interested can either access the material and learn from it or help improve it by contributing pull requests to our git repositories. We are also actively working with other universities to collaboratively develop our courses and share material, which has resulted, for example, in a course on AutoML (https://ki-campus.org/courses/automl-luh2021). Furthermore, we have published our view on developing these open-source educational resources, in which we discuss challenges and point toward possible solutions; see https://arxiv.org/abs/2107.14330.
Throughout the 4 semesters, a special focus is on programming: implementing new methods of machine learning is a crucial aspect of the daily work of a machine learning researcher or data scientist. This is reflected in all our courses, and we encourage students to build on the R skills acquired in the bachelor’s degree, to widen their scope, and to learn at least one additional programming language, e.g., Python.