Bachelor Seminar Wissenschaftliches Arbeiten

General information

TISS: (link)
contact: Sagar Malhotra (email)
meeting link: (zoom)
everything important will be announced in TUWEL/TISS.

Format

This seminar simulates a machine learning conference, where the students take on the role of authors and reviewers. It consists of multiple phases.

1. Proposal phase

Attend the mandatory first meeting either in person or remotely (details on TUWEL).

Option 1: our suggestions

You select two topics/papers (i.e., two bullet points) from one of the topics below. You will work with the material mentioned in the overview and the topic-specific resources.

Option 2: your own idea + one of our suggestions

You choose your own topic to work on. This can be an existing machine learning paper or work, or your own creative idea in the context of machine learning. We strongly encourage you to start from existing papers from the following venues: NeurIPS, ICML, ICLR, COLT, AISTATS, UAI, JMLR, MLJ. Importantly, your idea has to be specific and well worked out. Nevertheless, choose one of our suggestions as well.

Regardless of the option you choose, understand the fundamentals of your topic and try to answer the following questions:

What is the problem?
Why is it an interesting problem?
How do you plan to approach the problem? / How have the authors of your topic approached the problem?

Select topics and write a short description of them together with the answers to the questions (~3 sentences should be sufficient) in TUWEL.

We can only accept your own proposal if you can answer the questions above and have a well-worked-out topic.

2. Bidding and assignment phase

You and your fellow students will act as reviewers and bid on the topics of your peers that you want to review. Based on the bids, we (in the role of conference chairs) will select one of each student’s proposals as the actual project you will work on for the rest of the semester. You no longer need to work on the other project. Additionally, we will assign two projects from other students to you, which you will have to review later in the semester.

3. Working phase

Now the actual work starts. Gain a deep understanding of your topic, write a first draft of your report, and give a 5-minute presentation. Feel free to go beyond the given material.

You will schedule two meetings with your supervisor to discuss your progress, but do not hesitate to contact him/her if you have any questions.

4. Reviewing phase

You will again act as a reviewer for the conference by writing two reviews, one for each draft report assigned to you.

5. Writing phase

Based on the reviews from your peers (and our feedback) you will further work on your topic.

6. Submission phase

Give a final presentation and submit your report.

General resources (freely available books)

Understanding machine learning: from theory to algorithms. Shai Shalev-Shwartz and Shai Ben-David (pdf)
Foundations of machine learning. Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar (pdf)
Foundations of data science. Avrim Blum, John Hopcroft, and Ravindran Kannan (pdf)
Mathematics for machine learning. Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong (pdf)
Mining of massive datasets. Jure Leskovec, Anand Rajaraman, and Jeffrey D. Ullman (pdf)
Reinforcement learning: an introduction. Richard Sutton and Andrew Barto (pdf)
Deep learning and neural networks. Ian Goodfellow, Yoshua Bengio, and Aaron Courville (pdf)

Topics (tentative)

You should have access to the literature and papers through Google scholar, DBLP, the provided links, or the TU library.

Learning Logically Definable Concepts

Motivation: The ability to learn logically definable concepts from labelled data is a theoretical model of machine learning that is explainable by design and integrates ideas from both logic (especially finite model theory) and PAC learning.

Overview:

Shai Shalev-Shwartz and Shai Ben-David. Understanding Machine Learning. Chapters 2 and 3
Shai Ben-David. Lectures (youtube-link). Lectures 1, 2, and 3

Papers and topics:

Martin Grohe and Martin Ritzert. Learning First-Order definable concepts over structures of small degree. 2017
Grohe et al. Learning MSO-definable hypotheses on strings. 2017
van Bergerem et al. On Parameterized Complexity of learning Monadic Second-Order Formulas. 2023

AI and Neuroscience

Motivation: We aim to understand computation in the brain. Our research either uses recent deep learning technology to analyze brain recordings (AI for neuroscience) or derives computational principles from neurons to inspire future generations of AI algorithms (brain-inspired computing).

Expected outcomes:

AI for neuroscience 1: Training a deep neural network to extract neuron activation times from electrode array recordings
AI for neuroscience 2: Compression of electrode array recordings via deep signal tokenization
Brain inspired computing 1: Theoretical derivation of simplified brain-like computing models
Brain inspired computing 2: Performance of LLM with a specific bio-inspired feature

Papers:

Bellec, G., Salaj, D., Subramoney, A., Legenstein, R., & Maass, W. (2018). Long short-term memory and learning-to-learn in networks of spiking neurons. Advances in neural information processing systems, 31.
Bellec, G., Scherr, F., Subramoney, A., Hajek, E., Salaj, D., Legenstein, R., & Maass, W. (2020). A solution to the learning dilemma for recurrent networks of spiking neurons. Nature communications, 11(1), 3625.
Sourmpis, C., Petersen, C. C., Gerstner, W., & Bellec, G. (2024). Biologically informed cortical models predict optogenetic perturbations. eLife, 2025.

Beyond Worst-Case Analysis of Data Science Algorithms

Motivation:

Data science algorithms are successfully utilized in many different areas on a daily basis. Typically, these algorithms solve problems that are NP-hard and often even hard to approximate. Understanding why these algorithms work so well in practice is an important question in the area of beyond worst-case analysis.

Overview:

The book “Beyond the Worst-Case Analysis of Algorithms” by Tim Roughgarden provides a good starting point for a literature search. We are particularly interested in the results related to Chapters 6, 7, 20, 28 and 30 of this book. You are encouraged to also look at other chapters of this book and at papers related to these chapters.

Supervisor: Prof. Dr. Stefan Neumann

Broad Applications Beyond Healthcare

While our primary focus is on healthcare, reinforcement learning (RL) has widespread applications in diverse domains such as finance, robotics, gaming, and autonomous systems. The adaptability of RL algorithms makes them suitable for addressing complex decision-making challenges in different fields.

Convergence Proofs - Probably Approximately Correct (PAC)

PAC provides a mathematical foundation for ensuring that learned policies are close to optimal, instilling confidence in the reliability of RL algorithms. With PAC, RL agents can make decisions in critical healthcare scenarios with a high degree of certainty, mitigating risks associated with uncertain outcomes.

Disentangled Representations

Motivation: Computing a disentangled representation is a very desirable property for modern deep learning architectures. Having access to individual, disentangled factors is expected to provide significant improvements for generalisation, interpretability and explainability.

Overview:

What is a good representation? (Bengio, et al., “Representation Learning: A Review and New Perspectives”, 2013)
Two common architectures used for disentanglement:
Variational Auto-Encoders (Kingma & Welling, “Auto-Encoding Variational Bayes”, 2013, and “An Introduction to Variational Autoencoders”, 2019)
Generative Adversarial Networks (Goodfellow, et al., “Generative Adversarial Nets”, 2014)

Papers and topics:

survey on useful metrics (Carbonneau, et al., “Measuring Disentanglement: A Review of Metrics”, 2022; Eastwood & Williams, “A Framework for the Quantitative Evaluation of Disentangled Representations”, 2018; and Do & Tran, “Theory and Evaluation Metrics for Learning Disentangled Representations”, 2019)
fairness (Creager, et al., “Flexibly Fair Representation Learning by Disentanglement”, 2019)
contrastive learning (Cao, et al., “An Empirical Study on Disentanglement of Negative-free Contrastive Learning”, 2022)
recommender systems (Ma, et al., “Learning Disentangled Representations for Recommendation”, 2019)
weakly-supervised (Locatello, et al., “Weakly-Supervised Disentanglement Without Compromises”, 2020)
semi-supervised (Nie, et al., “Semi-Supervised StyleGAN for Disentanglement Learning”, 2020)

Equivariant neural networks

Motivation: Many data structures have an innate structure that our neural networks should respect. For example, the output of a graph neural network should not change if we permute the vertices (permutation equivariance/invariance).
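To make the permutation example concrete, here is a minimal sketch in plain Python. It is not taken from any of the papers below: the sum-aggregation layer, the toy path graph, and the function names (message_passing_layer, readout, permute) are illustrative assumptions. It checks that one round of sum-based message passing is permutation equivariant at the vertex level and that a sum readout is permutation invariant at the graph level.

```python
from itertools import permutations

def message_passing_layer(adj, feats):
    """One round of sum aggregation: each vertex adds its neighbours' features to its own."""
    n = len(adj)
    return [feats[v] + sum(feats[u] for u in range(n) if adj[v][u]) for v in range(n)]

def readout(adj, feats):
    """Graph-level output: sum of the vertex features after one layer."""
    return sum(message_passing_layer(adj, feats))

def permute(adj, feats, perm):
    """Relabel the vertices so that new vertex i is old vertex perm[i]."""
    n = len(adj)
    new_adj = [[adj[perm[i]][perm[j]] for j in range(n)] for i in range(n)]
    new_feats = [feats[perm[i]] for i in range(n)]
    return new_adj, new_feats

# Toy example (assumption): path graph 0 - 1 - 2 with integer vertex features.
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
feats = [1, 2, 4]

hidden = message_passing_layer(adj, feats)
baseline = readout(adj, feats)

# Invariance: the graph-level output is the same for every vertex ordering.
invariant = all(
    readout(*permute(adj, feats, p)) == baseline
    for p in permutations(range(3))
)

# Equivariance: the vertex-level output is permuted in the same way as the input.
equivariant = all(
    message_passing_layer(*permute(adj, feats, p)) == [hidden[p[i]] for i in range(3)]
    for p in permutations(range(3))
)

print(baseline, invariant, equivariant)
```

Integer features are used so that the equality checks are exact; with floating-point features one would compare up to a small tolerance instead.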

Overview:

chapter 8 “equivariant neural networks” of “Deep learning for molecules and materials” by Andrew D. White, 2021. (pdf).
introduction to equivariance: Taco Cohen and Risi Kondor - NeurIPS 2020 Tutorial (first half) (slideslive-link)

Papers and topics:

neural network that can learn on sets (Zaheer, et al. “Deep sets.” NeurIPS 2017)
learning equivariance from data (Zhou, et al. “Meta-learning symmetries by reparameterization.” ICLR 2021)

Foundations of Model-Based Reaction Prediction

Motivation: Machine learning has become a cornerstone of automated synthesis planning in organic chemistry. A key step in this process is reaction prediction. Over the past 15 years, numerous models and architectures (template-based, GNNs, transformers, …) have been developed to address this complex task. Yet, reaction prediction can refer to a range of distinct modeling tasks—such as predicting products, assessing feasibility, or classifying mechanisms. Comparing existing models is often non-trivial, as their assumptions and objectives vary and the boundaries between prediction tasks are often unclear. For instance, a model trained for product prediction may also be applied to assess reaction feasibility. This motivates the exploration of a formal framework that clarifies the relationships between different modeling approaches.

Papers:

Marwin H. S. Segler, Mike Preuss, and Mark P. Waller. 2018. Planning chemical syntheses with deep neural networks and symbolic AI. Nature 555, 7698 (March 2018)
Shuan Chen and Yousung Jung. 2022. A generalized-template-based graph neural network for accurate organic reactivity prediction. Nature Machine Intelligence 4, 9 (September 2022)
Zhengkai Tu and Connor W. Coley. 2022. Permutation Invariant Graph-to-Sequence Model for Template-Free Retrosynthesis and Reaction Prediction. Journal of Chemical Information and Modeling 62, 15 (August 2022)

Generalisation

Motivation: The ability of a model to adapt and perform well on new data is crucial. A model that generalises performs well not only on the training set but also on unseen data. Understanding and characterising why and how deep learning can generalise well is still an open question.

Overview:

notes on generalisation (Prof. Roger Grosse) (link)
generalisation and overfitting (youtube-link)

Papers and topics:

Brilliantov, Souza, Garg. Compositional PAC-Bayes: Generalization of GNNs with persistence and beyond. NeurIPS, 2024.
Rauchwerger et al. Generalization, Expressivity, and Universality of Graph Neural Networks on Attributed Graphs. ICLR, 2025.
Behboodi, Cesa, Cohen. A PAC-Bayesian Generalization Bound for Equivariant Networks. NeurIPS, 2022.

GNNs

Motivation: Graphs are a very general structure and can be applied to many areas: molecules and developing medicine, geographical maps, spread of diseases. They can be used to model physical systems and solve partial differential equations. Even images and text can be seen as a special case of graphs. Thus it makes sense to develop neural networks that can work with graphs. GNNs have strong connections to many classical computer science topics (algorithmics, logic, …) while also making use of neural networks. This means that work on GNNs can be very theoretical, applied or anything in between.

Overview:

Veličković, Everything is connected: Graph neural networks, Current Opinion in Structural Biology, 2023
Sanchez-Lengeling et al., A Gentle Introduction to Graph Neural Networks, distill.pub 2021
Veličković, Intro to graph neural networks (ML Tech Talks), YouTube, 2021

Papers:

Note: For very long papers we do not expect you to read the entire appendix.

Baranwal et al., Optimality of Message-Passing Architectures for Sparse Graphs, NeurIPS, 2023
Zhou et al., Distance-Restricted Folklore Weisfeiler-Leman GNNs with Provable Cycle Counting Power, NeurIPS, 2023
Zhang et al., A Complete Expressiveness Hierarchy for Subgraph GNNs via Subgraph Weisfeiler-Lehman Tests, ICML, 2023
Zhang et al., Rethinking the Expressive Power of GNNs via Graph Biconnectivity, ICLR, 2023
Lim et al., Sign and Basis Invariant Networks for Spectral Graph Representation Learning, ICLR, 2023
Joshi et al., On the Expressive Power of Geometric Graph Neural Networks, ICML, 2023
Huang et al., You Can Have Better Graph Neural Networks by Not Training Weights at All: Finding Untrained GNNs Tickets, LoG, 2022

Recent Developments in Large Language Models

Motivation: Large language models such as ChatGPT are attracting huge research interest. Some companies are releasing more or less free models, and open-source initiatives have sprung up.

Overview: In this seminar paper, an overview of the latest large language models that are available in various forms is given. This includes, in particular, an investigation of their performance and an explanation of how performance can be evaluated objectively at all.

Advisor: Prof. Clemens Heitzinger

The Theory of Opinion Formation in Social Networks

Motivation:

Online social networks have become ubiquitous parts of modern societies, but recently they have been blamed for causing disagreement and polarization. Developing a theoretical understanding of these phenomena is still an active research question.

Papers:

Nikita Bhalla, Adam Lechowicz, Cameron Musco: Local Edge Dynamics and Opinion Polarization. WSDM 2023: 6-14
Uthsav Chitra, Christopher Musco: Analyzing the Impact of Filter Bubbles on Social Network Polarization. WSDM 2020: 115-123
Cameron Musco, Christopher Musco, Charalampos E. Tsourakakis: Minimizing Polarization and Disagreement in Social Networks. WWW 2018: 369-378
Antonis Matakos, Evimaria Terzi, Panayiotis Tsaparas: Measuring and moderating opinion polarization in social networks. Data Min. Knowl. Discov. 31(5): 1480-1505 (2017)
Xi Chen, Jefrey Lijffijt, Tijl De Bie: Quantifying and Minimizing Risk of Conflict in Social Networks. KDD 2018: 1197-1205
David Bindel, Jon M. Kleinberg, Sigal Oren: How bad is forming your own opinion? Games Econ. Behav. 92: 248-265 (2015)
Mayee F. Chen, Miklós Z. Rácz: An Adversarial Model of Network Disruption: Maximizing Disagreement and Polarization in Social Networks. IEEE Trans. Netw. Sci. Eng. 9(2): 728-739 (2022)
Jason Gaitonde, Jon M. Kleinberg, Éva Tardos: Adversarial Perturbations of Opinion Dynamics in Networks. EC 2020: 471-472
Sijing Tu, Stefan Neumann: A Viral Marketing-Based Model For Opinion Dynamics in Online Social Networks. WWW 2022: 1570-1578

Supervisor: Prof. Dr. Stefan Neumann

ML at the interface to quantum physics

Motivation: Exploration of new perspectives opened by ML methods for complex quantum systems, and/or improving machine learning methods by viewing them as physical systems of interacting elements.

Overview: Statistical field theory for neural networks (Lecture notes) by Moritz Helias and David Dahmen, https://arxiv.org/abs/1901.10416

Related Works:

Renormalization Group Flow as Optimal Transport by Jordan Cotler and Semon Rezchikov, https://arxiv.org/abs/2312.16038
Physics-informed neural network for solving functional renormalization group on lattice by Takeru Yokota, https://arxiv.org/abs/2304.00599

Advisor: Prof. Sabine Andergassen

Policy Evaluation in Healthcare

Policy evaluation is a critical process that assesses the effectiveness of decision-making policies in healthcare. In dynamic healthcare environments, RL algorithms continuously assess and adjust policies based on real-time patient data, ensuring adaptability to evolving medical scenarios.

Regulatory AI

Motivation: Regulatory AI explores how technical systems can support accountability, transparency, and compliance in machine learning. The goal is to design models and data pipelines that can be audited, constrained, or certified: turning legal and ethical requirements into computable properties. This connects ML research on robustness, interpretability, and verification with questions of governance and oversight.

Overview:

Montreal AI Ethics Institute “Series on the regulatory landscape of AI”. (montrealethics.ai)
NeurIPS 2024 RegML Workshop “Regulatable ML: Towards Bridging the Gaps between Machine Learning Research and Regulations”. (neurips.cc)
DeepMind blog: “Exploring institutions for global AI governance”, 2023. (deepmind.google)

Papers and topics:

data governance frameworks for frontier AI models (Hausenloy, J., McClements, M. & Thakur, P. “Towards Data Governance of Frontier AI Models.” NeurIPS RegML 2024)
algorithmic auditing and accountability frameworks (Raji, I. D., Smart, A., White, R., Mitchell, M., Gebru, T. et al. “Closing the AI Accountability Gap: Defining an End-to-End Framework for Internal Algorithmic Auditing.” FAccT 2020)
data governance concerns for generative AI systems (Aaronson, S. A. “Data Disquiet: Concerns about the Governance of Data for Generative AI.” CIGI 2024)

Sepsis treatment by RL

In healthcare, making decisions is challenging due to complex and high-stakes scenarios. RL, as a dynamic decision-making framework, is uniquely positioned to handle the intricacies of healthcare scenarios by not only predicting outcomes but also adapting treatment strategies to evolving patient conditions.

Recent Developments in Transformers

Motivation: Transformers have revolutionized natural-language processing, and research has exploded since 2017.

Overview: In this seminar paper, the functioning of transformers is explained and an overview of the latest developments, regarding large language models, time-series prediction, etc., is given.

Advisor: Prof. Clemens Heitzinger