Seminar in Artificial Intelligence - Theoretical Aspects of Machine Learning

General information

Format

This seminar simulates a machine learning conference, where students take on the roles of authors and reviewers. It consists of multiple phases.

1. Proposal phase

Attend the mandatory first meeting either in person or remotely (details on TUWEL).

Option 1: our suggestions

You select two papers/works (i.e., two bullet points) from one of the topic areas below. You will work with the material mentioned in the overview and the project-specific resources.

Option 2: your own idea + one of our suggestions

You choose your own topic to work on. This can be an existing machine learning paper/work or your own creative idea in the context of machine learning. We strongly encourage you to start from existing papers from the following venues: NeurIPS, ICML, ICLR, COLT, AISTATS, UAI, JMLR, MLJ. Importantly, your idea has to be specific and well worked out. Nevertheless, choose one of our suggestions as well.

Independent of the option you choose, make sure you understand the fundamentals of your topic and try to answer the following questions:

Select your topics and submit a short description of each, together with your answers to the questions (~3 sentences should be sufficient), in TUWEL.

We can only accept your own proposal if you can answer the mentioned questions and have a well-worked-out topic.

2. Bidding and assignment phase

You will also act as reviewers and bid on the topics of your peers that you want to review. Based on the bids, we (in our role as chairs of the conference) will select one of each student's proposals as the actual topic you will work on for the rest of the semester. You do not need to work on the other topic anymore. Additionally, we will assign two different topics from other students to you, which you will review later in the semester.

3. Working phase

Now the actual work starts. Gain a deep understanding of your topic, write a first draft of your report, and give a 5-minute presentation. We recommend going beyond the given material.

4. Reviewing phase

You will again act as a reviewer for the conference by writing two reviews, one for each draft report assigned to you.

5. Writing phase

Based on the reviews from your peers (and our feedback) you will further work on your topic.

6. Submission phase

Give a final presentation and submit your report.

General resources (freely available books and lecture notes)

Topics (Tentative)

You should have access to the literature and papers through Google Scholar, DBLP, the provided links, or the TU library.

Neurosymbolic AI / Logic & ML

Overview:

Papers and topics:

Submodularity in machine learning

Motivation: Submodularity is a property of set functions analogous to convexity for real-valued functions. It makes it possible to build strong machine learning algorithms for sub-tasks such as sketching, coresets, data distillation, and data subset selection. Moreover, it is useful for clustering, active learning, and semi-supervised learning.
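Concretely, the defining diminishing-returns property can be checked on a toy coverage function (an illustrative example, not taken from the seminar material): for A ⊆ B, adding an element to A helps at least as much as adding it to B.

```python
# Toy coverage function f(S) = number of elements covered by the sets
# indexed by S. Coverage functions are a classic example of submodularity.
sets = {0: {1, 2, 3}, 1: {3, 4}, 2: {4, 5, 6}}

def coverage(S):
    """Number of distinct elements covered by the sets chosen in S."""
    return len(set().union(*[sets[i] for i in S]))

# Diminishing returns: for A <= B, the marginal gain of adding x to A
# is at least the marginal gain of adding x to B.
A, B, x = {0}, {0, 1}, 2
gain_A = coverage(A | {x}) - coverage(A)  # 6 - 3 = 3
gain_B = coverage(B | {x}) - coverage(B)  # 6 - 4 = 2
assert gain_A >= gain_B
```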

Overview:

Papers and topics:

Kernel methods

Motivation: Kernels generalise linear classifiers to linear functions in a (potentially infinite-dimensional) feature space. They are the foundation of various popular machine learning algorithms like the kernel SVM and kernel PCA.
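As a small illustration (an assumed example, not part of the official material), the "kernel trick" means an algorithm only ever accesses the data through the Gram matrix; for a valid kernel such as the RBF kernel, this matrix is symmetric positive semidefinite:

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """RBF (Gaussian) kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
K = rbf_kernel(X, X)  # Gram matrix: all that, e.g., a kernel SVM needs

# A valid kernel yields a symmetric positive semidefinite Gram matrix.
assert np.allclose(K, K.T)
assert np.all(np.linalg.eigvalsh(K) >= -1e-10)
```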

Overview:

Papers and topics:

Semi-supervised learning

Motivation: Semi-supervised learning uses labelled and unlabelled data to train classifiers with fewer labels. This is useful in applications where unlabelled data is abundant, yet labels are scarce, such as node classification in social networks, drug discovery, and autonomous driving.
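As a sketch of the idea (a toy algorithm in the spirit of label propagation, assumed for illustration and not taken from the seminar material): unlabelled nodes on a graph iteratively adopt the average label of their neighbours, while labelled nodes stay clamped.

```python
import numpy as np

# Path graph 0 - 1 - 2 - 3; only the two endpoints are labelled.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], float)
y = np.array([1.0, 0.0, 0.0, -1.0])            # node 0: +1, node 3: -1
labelled = np.array([True, False, False, True])

f = y.copy()
for _ in range(100):
    f = A @ f / A.sum(axis=1)   # each node averages its neighbours' labels
    f[labelled] = y[labelled]   # clamp the known labels

# Nodes closer to the +1 seed end up positive, closer to -1 negative.
assert f[1] > 0 > f[2]
```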

Overview:

Papers and topics:

Active learning

Motivation: In active learning, the learning algorithm is allowed to select the data points it wants to see labelled, for example, where it is most uncertain. The goal is to reduce the labelling effort. This is useful in applications where unlabelled data is abundant, yet labels are scarce, such as node classification in social networks, drug discovery, and autonomous driving.
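A minimal sketch of one common strategy, uncertainty sampling (the classifier and its predicted probabilities are hypothetical stand-ins, not from the material): query the unlabelled point whose predicted class distribution has the highest entropy.

```python
import numpy as np

def most_uncertain(probs):
    """probs: (n_points, n_classes) predicted class probabilities.
    Returns the index of the point to query: the one with maximal entropy."""
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    return int(np.argmax(entropy))

# Predictions of some (hypothetical) classifier on three unlabelled points.
probs = np.array([[0.95, 0.05],   # confident
                  [0.50, 0.50],   # maximally uncertain -> query this one
                  [0.80, 0.20]])
assert most_uncertain(probs) == 1
```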

Overview:

Papers and topics:

Online learning

Motivation: While standard supervised learning assumes access to a static, fixed dataset, in practice the data often arrives in a stream. This is the subject of online learning (meant in the streaming/incremental sense, not the internet sense). Here, we often drop standard sampling assumptions and instead study worst-case behaviour (regret).
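A rough sketch, under the assumption of one-dimensional quadratic losses f_t(w) = (w - z_t)^2, of online gradient descent and the regret it is judged by (for these losses, the best fixed w in hindsight is the mean of the stream):

```python
import numpy as np

rng = np.random.default_rng(0)
zs = rng.uniform(-1.0, 1.0, size=100)  # the stream, revealed one round at a time

w, cum_loss = 0.0, 0.0
for t, z in enumerate(zs, start=1):
    cum_loss += (w - z) ** 2           # suffer the loss first ...
    w -= (1.0 / np.sqrt(t)) * (w - z)  # ... then take a gradient step, eta_t ~ 1/sqrt(t)

best_fixed = zs.mean()                              # comparator chosen in hindsight
regret = cum_loss - ((best_fixed - zs) ** 2).sum()  # the quantity online learning bounds
# For convex losses, suitable step sizes make this regret sublinear in the horizon.
```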

Overview:

Papers and topics:

Modern aspects of learning theory

Motivation: Learning theory studies computational and algorithmic aspects of machine learning algorithms to prove guarantees such as sample complexity bounds. This is important for understanding and devising novel learning algorithms. In recent years, many long-standing open questions in learning theory have been answered.

Overview:

Papers and topics:

Trustworthy ML

Motivation: Machine learning systems are ubiquitous, and it is necessary to make sure they behave as intended. In particular, trustworthiness can be achieved by means of privacy-preserving, robust, and explainable algorithms.

Overview:

Papers and topics:

Optimization (and Generalization) in Neural Networks

Overview:

Papers and topics:

Equivariant neural networks

Motivation: Many data structures have an innate structure that our neural networks should respect. For example, the output of a graph neural network should not change if we permute the vertices (permutation equivariance/invariance).
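As a toy check (an assumed example, not from the material): a sum-pooled readout is permutation invariant, and a simple message-passing layer is permutation equivariant, i.e., permuting the graph permutes its output in the same way.

```python
import numpy as np

X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])      # node features
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], float)  # adjacency (path graph)
W = np.array([[0.5, -0.2], [0.1, 0.3]])                 # layer weights

def readout(H):
    return H.sum(axis=0)        # sum pooling: invariant to node order

def gnn_layer(A, H, W):
    return np.tanh(A @ H @ W)   # aggregate neighbours, then transform

perm = [2, 0, 1]
P = np.eye(3)[perm]             # permutation matrix for the same reordering

# Invariance of the readout and equivariance of the layer:
assert np.allclose(readout(X), readout(X[perm]))
assert np.allclose(P @ gnn_layer(A, X, W), gnn_layer(P @ A @ P.T, P @ X, W))
```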

Overview:

Papers and topics:

Graph Neural Networks (GNNs)

Motivation: Graphs are a very general structure and can be applied to many areas: molecules and drug development, geographical maps, the spread of diseases. They can be used to model physical systems and solve partial differential equations. Even images and text can be seen as special cases of graphs. Thus it makes sense to develop neural networks that can work with graphs. GNNs have strong connections to many classical computer science topics (algorithmics, logic, ...) while also making use of neural networks. This means that work on GNNs can be very theoretical, applied, or anything in between.

Overview:

Papers and projects:

ML for SAR image processing

Motivation: Synthetic Aperture Radar (SAR) is an active microwave imaging system that provides high-resolution images day and night under all weather conditions. It has been widely used in many practical applications, such as environmental monitoring, crop monitoring, and disaster detection. Using best-suited machine learning algorithms to derive useful information from these data is essential.

Overview:

Papers and topics: