In this course, you will experience the role of a typical machine learning researcher. You will:
There are three types of projects:
Working on the project consists of the following steps:
Choose a project topic or come up with your own, original idea. Then, write a short problem statement, including your research question, motivation, related work, methodology, and expected results.
Work on your project and present your work to colleagues and our group. Additionally, you will have discussion meetings with your supervisor if needed.
Design a poster and write a short final report to present the results of your work. Your poster will be exhibited to your colleagues and our group during a poster session (snacks included, if Covid allows it!) at the end of the semester.
We are happy to supervise machine learning related projects that are connected to our research interests. Examples are:
Type: Theory and experiments
Context: Are probabilities from neuro-symbolic (NeSy) models calibrated? The focus is on the trade-off between probability calibration and constraint satisfaction in Semantic Loss and BEARS. NeSy methods enforce rules or encourage knowledge-aware uncertainty. This often improves logical consistency or reduces overconfidence, but may skew predicted probabilities away from observed frequencies. The project clarifies when these methods help or hurt calibration and how to measure the trade-off.
Suggested approach:
Related work:
Advisor: Sagar Malhotra
Type: Reading and report-writing project on the constructive proof of Chernoff bounds and its implications in probability theory.
Recommended Skills:
Task: Chernoff bounds are crucial in probability theory because they provide exponentially tight bounds on the tail probabilities of sums of independent random variables. This allows us to quantify the likelihood that the sum deviates significantly from its expected value. They are widely used in areas like randomized algorithms, machine learning, and complexity theory, where controlling the probability of large deviations is essential for guaranteeing performance and reliability.
This project focuses on gaining a deep understanding of a recent constructive proof of Chernoff bounds, as presented by Impagliazzo and Kabanets. The proof is a combinatorial approach that allows not only for bounding the probability of large deviations but also for identifying the subsets (witnesses) responsible for these deviations. Students will review the proof in detail, compare it to the standard (non-constructive) proof, and explore its potential applications in areas such as distribution testing and complexity theory.
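For reference, one standard form of the bound, for independent random variables $X_1, \dots, X_n$ taking values in $\{0,1\}$ with $X = \sum_i X_i$ and $\mu = \mathbb{E}[X]$, is
\[
\Pr\big[X \ge (1+\delta)\mu\big] \;\le\; \left(\frac{e^{\delta}}{(1+\delta)^{1+\delta}}\right)^{\mu} \;\le\; \exp\!\left(-\frac{\delta^{2}\mu}{3}\right) \qquad \text{for } 0 < \delta \le 1 .
\]
The classical proof derives bounds of this form via the moment-generating function and Markov's inequality; the constructive proof by Impagliazzo and Kabanets instead reaches them, roughly, through a counting argument over fixed-size subsets of the variables, which is what makes the witnesses explicit.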
Objectives:
Suggested Approach:
References:
Advisor: Sagar Malhotra
Context: Bias in ML models is an important area of ML research that tries to identify and reduce implicit biases such as gender or racial bias. While there is a plethora of different algorithms and methods to debias deep neural networks (e.g., debiasing Recommender Systems or Large Language Models), there is still plenty of room to explore, especially for graph-based models such as GNNs. While recent work has investigated bias on the item level, such as popularity bias, user-centric bias is yet to be explored. The research question in focus is two-fold: (1) Can we identify unwanted bias related to sensitive user attributes in the node features of a GNN-based recommender system? And if yes, (2) how can we mitigate unwanted bias in user embeddings for a GNN-based recommender system?
Advisor: David Penz (david.penz@tuwien.ac.at)
Type: Mostly implementing and training GNNs with PyTorch (Geometric); some reviewing of related work to collect interesting generalization bounds
Recommended Skills: (Some) experience with machine learning in Python. Familiarity with mathematical proofs is helpful to understand the related work, but not strictly required for this project.
Question: How (empirically) tight are generalization bounds for graph neural networks?
Context: In machine learning theory, we are interested in giving formal guarantees about the behavior of machine learning algorithms. One type of guarantee is about generalization, where we may want to provide an (upper) bound on the generalization error, i.e., how well does a model generalize to yet unseen data? Recently, there has been a number of theoretical contributions for generalization bounds for graph neural networks. It would be interesting to assess (1) how practically useful they are, and (2) how they compare to each other.
Suggested approach: Consult related work and choose two to three different generalization bounds for graph neural networks (this can be done in discussion with your supervisor). Assess them empirically on a number of graph benchmark datasets for at least one GNN architecture.
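A minimal sketch of the empirical side, assuming the MUTAG benchmark from TUDataset and a two-layer GCN as illustrative choices (the theoretical bounds you compare against are whichever ones you select from the related work):

import torch
import torch.nn.functional as F
from torch_geometric.datasets import TUDataset
from torch_geometric.loader import DataLoader
from torch_geometric.nn import GCNConv, global_mean_pool

class GCN(torch.nn.Module):
    def __init__(self, in_dim, hidden, n_classes):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden)
        self.conv2 = GCNConv(hidden, hidden)
        self.lin = torch.nn.Linear(hidden, n_classes)

    def forward(self, x, edge_index, batch):
        h = F.relu(self.conv1(x, edge_index))
        h = F.relu(self.conv2(h, edge_index))
        return self.lin(global_mean_pool(h, batch))

def error(model, loader):
    model.eval()
    wrong, total = 0, 0
    with torch.no_grad():
        for data in loader:
            pred = model(data.x, data.edge_index, data.batch).argmax(dim=1)
            wrong += int((pred != data.y).sum())
            total += data.num_graphs
    return wrong / total

dataset = TUDataset(root="data", name="MUTAG").shuffle()
train_ds, test_ds = dataset[:150], dataset[150:]
train_loader = DataLoader(train_ds, batch_size=32, shuffle=True)
test_loader = DataLoader(test_ds, batch_size=32)

model = GCN(dataset.num_features, 64, dataset.num_classes)
opt = torch.optim.Adam(model.parameters(), lr=0.01)
for epoch in range(100):
    model.train()
    for data in train_loader:
        opt.zero_grad()
        loss = F.cross_entropy(model(data.x, data.edge_index, data.batch), data.y)
        loss.backward()
        opt.step()

# empirical generalization gap, to be compared against the value of each chosen bound
gap = error(model, test_loader) - error(model, train_loader)
print(f"empirical generalization gap: {gap:.3f}")

The interesting comparison is then how far the numerical value of each bound sits above this empirical gap across datasets and architectures.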
Related work:
Advisor: Tamara Drucks ( tamara.drucks@tuwien.ac.at )
Type: Implementation and training of GNNs with PyTorch.
Question: Is there a privacy risk of data leakage from information contained in gradients computed during the training of a GNN?
Suggested Approach: Definition and evaluation of an attack strategy using information contained in the gradients computed during training and/or inference, adapting the approach of Zhu et al. (2019) to the graph setting.
Related Work:
Context: Information contained in the gradients computed to train a neural network can be used to infer private information on training data (Zhu et al., 2019). This risk is under-investigated in the GNN context (Zhang et al., 2022), where focus is usually put on the privacy risks of the final, trained model. It is thus interesting to investigate gradient inversion attacks (Yin et al., 2021) for GNN models and evaluate their efficacy and the possible protections offered by, e.g., differential privacy.
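A hedged sketch of a gradient-matching attack in the spirit of Zhu et al. (2019), transferred to a node-classification GCN. It assumes the attacker observes the model, the gradients of a single training step, and the graph structure (edge_index), and tries to reconstruct the private node features and labels; all sizes and names below are illustrative.

import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class GCN(torch.nn.Module):
    def __init__(self, d_in, d_hid, n_cls):
        super().__init__()
        self.conv1, self.conv2 = GCNConv(d_in, d_hid), GCNConv(d_hid, n_cls)
    def forward(self, x, edge_index):
        return self.conv2(F.relu(self.conv1(x, edge_index)), edge_index)

def soft_ce(logits, probs):  # cross-entropy with soft labels, as in deep leakage from gradients
    return -(probs * F.log_softmax(logits, dim=1)).sum(dim=1).mean()

# victim side: private node features/labels and the leaked gradients
n, d, c = 20, 8, 3
edge_index = torch.randint(0, n, (2, 60))              # stand-in for a real graph
x_true, y_true = torch.randn(n, d), torch.randint(0, c, (n,))
model = GCN(d, 16, c)
loss = F.cross_entropy(model(x_true, edge_index), y_true)
true_grads = [g.detach() for g in torch.autograd.grad(loss, model.parameters())]

# attacker side: knows model, edge_index, true_grads; optimizes dummy data to match the gradients
x_dummy = torch.randn(n, d, requires_grad=True)
y_dummy = torch.randn(n, c, requires_grad=True)        # soft labels
opt = torch.optim.LBFGS([x_dummy, y_dummy])

def closure():
    opt.zero_grad()
    dummy_loss = soft_ce(model(x_dummy, edge_index), y_dummy.softmax(dim=1))
    dummy_grads = torch.autograd.grad(dummy_loss, model.parameters(), create_graph=True)
    match = sum(((dg - tg) ** 2).sum() for dg, tg in zip(dummy_grads, true_grads))
    match.backward()
    return match

for _ in range(30):
    opt.step(closure)
print("feature reconstruction MSE:", F.mse_loss(x_dummy.detach(), x_true).item())

Running the same attack against a differentially private training procedure would then quantify the protection mentioned above.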
Advisor: Patrick
Type: Implementation and training of GNNs with PyTorch.
Question: Are GNNs which are robust against adversarial attacks less private? Are private GNNs less robust?
Suggested Approach: Implementation/adaptation of robust GNNs and evaluation of the privacy/utility trade-off on benchmark datasets against a number of privacy attacks. A possible focus can be on private/adversarial perturbations on the edges of the graphs: how does the accuracy of an adversarially attacked GNN compare to that of a private one?
Related Work:
Context: A tension between differential privacy and robustness (against adversarial examples) has been empirically observed [Song et al., 2019] and theoretically shown [Ghazi et al., 2021]. Do these results extend to structured data? Focusing on a centralised setting and possibly on perturbations of the graph structure: are more robust algorithms less private (or vice versa)?
Advisor: Patrick
Type: Implement and experiment with GNNs
Recommended Skills: Experience with deep learning using PyTorch
Task: Testing zero-one laws of Graph Neural Networks on distributions beyond Erdős-Rényi
Context: Zero-one laws state that the probability of any first-order logic sentence holding on a random graph approaches zero or one as larger and larger graphs are sampled [1]. Recent results [2] have shown that GNNs also exhibit zero-one laws on Erdős-Rényi random graphs. In this project, our goal is to empirically test this behavior of GNNs beyond the Erdős-Rényi random graph model. We are interested in checking this behavior on graphons, a very general random graph model, and on random graph models where the edge probability depends on the number of nodes in the graph. In particular, we are interested in testing distributions where zero-one laws for first-order logic do not hold.
Suggested approach: Create a synthetic dataset by randomly generating graphs w.r.t. the above-mentioned distributions, and label each graph with a first-order logic rule of your choice: 1 if the FOL rule is true on the graph and 0 otherwise. Train a GNN on those graphs. Then check the probability of a randomly generated graph being classified as 1 as the graph size increases, both with the GNN and with the first-order logic formula. A data-generation sketch follows below.
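A sketch of the data-generation step, assuming as illustrative choices the FOL rule "the graph contains a triangle", a sparse ER model with edge probability c/n, and the graphon W(u, v) = u*v:

import networkx as nx
import numpy as np
import torch
from torch_geometric.utils import from_networkx

def sample_er(n, c=2.0):
    return nx.gnp_random_graph(n, min(1.0, c / n))       # edge probability depends on n

def sample_graphon(n, W=lambda u, v: u * v):
    u = np.random.rand(n)
    A = np.random.rand(n, n) < W(u[:, None], u[None, :])
    A = np.triu(A, 1)
    A = A + A.T
    return nx.from_numpy_array(A.astype(int))

def fol_label(G):                                        # 1 iff exists x,y,z: E(x,y) & E(y,z) & E(x,z)
    return int(any(nx.triangles(G).values()))

def to_pyg(G):
    data = from_networkx(G)
    data.x = torch.ones(G.number_of_nodes(), 1)          # constant node features
    data.y = torch.tensor([fol_label(G)])
    return data

dataset = [to_pyg(sample_er(n)) for n in np.random.randint(20, 200, size=500)]
# Train a GNN on `dataset`, then evaluate P(model predicts 1) on much larger graphs
# (n = 10**3, 10**4, ...) and compare with the truth value of the FOL rule.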
Related work:
[1] Spencer. Strange Logic of Random Graphs. Springer-Verlag 2001
[2] Adam Day et al. Zero-One Laws of Graph Neural Networks. NeurIPS 2023
Advisor: Sagar Malhotra
Type: Implementation and training of GNNs with PyTorch.
Question: What is the privacy protection performance of differentially private GNNs against various privacy attacks?
Suggested Approach: Implementation/adaptation of existing differentially private GNNs and evaluation of the privacy/utility trade-off on benchmark datasets against a number of privacy attacks.
Related Work:
Context: Similarly to deep learning algorithms trained on images and text, GNNs rely on large amounts of (possibly sensitive) data. Differential privacy is one of the possible approaches to protect sensitive data, which is of particular importance when the models are trained in a centralised way. In particular, there exist differentially private techniques that aim at preserving the privacy of features, labels, or edges of the graph, mainly focusing on protection against membership inference attacks. Nevertheless, privacy protection performance against various other privacy attacks is under-investigated.
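As a concrete starting point, a simple loss-based membership inference attack against a node-classification GCN on Cora could look as follows (a sketch under illustrative choices; a differentially private variant from the related work would then be evaluated with exactly the same attack):

import torch
import torch.nn.functional as F
from sklearn.metrics import roc_auc_score
from torch_geometric.datasets import Planetoid
from torch_geometric.nn import GCNConv

data = Planetoid(root="data", name="Cora")[0]

class GCN(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = GCNConv(data.num_features, 16)
        self.conv2 = GCNConv(16, 7)                      # Cora has 7 classes
    def forward(self, x, edge_index):
        return self.conv2(F.relu(self.conv1(x, edge_index)), edge_index)

model = GCN()
opt = torch.optim.Adam(model.parameters(), lr=0.01, weight_decay=5e-4)
for _ in range(200):                                     # standard (non-private) training
    model.train()
    opt.zero_grad()
    out = model(data.x, data.edge_index)
    loss = F.cross_entropy(out[data.train_mask], data.y[data.train_mask])
    loss.backward()
    opt.step()

# attack: a node's loss under the trained model serves as the membership score
# (members = training nodes, non-members = test nodes)
model.eval()
with torch.no_grad():
    losses = F.cross_entropy(model(data.x, data.edge_index), data.y, reduction="none")
members, non_members = data.train_mask, data.test_mask
scores = torch.cat([-losses[members], -losses[non_members]])
labels = torch.cat([torch.ones(int(members.sum())), torch.zeros(int(non_members.sum()))])
print("membership inference AUC:", roc_auc_score(labels.numpy(), scores.numpy()))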
Advisor: Patrick
A tension between differential privacy and robustness (against adversarial examples) has been empirically observed [Song et al., 2019] and theoretically shown [Ghazi et al., 2021]. Do these results extend to structured data? For instance, similarity-based edge reconstruction attacks are more effective on sparse graphs: does this setting have specific robustness implications, either theoretically or empirically? Focusing on a centralized setting and possibly on perturbations of the graph structure: are more robust algorithms less private (or vice versa)? A possible approach would include the implementation/adaptation of robust GNNs and evaluation of the privacy/utility trade-off on benchmark datasets against a number of privacy attacks. A possible focus can be on private/adversarial perturbations on the edges of the graphs: how does the accuracy of an adversarially attacked GNN compare to that of a private one?
Patrick Indri ( patrick.indri@tuwien.ac.at )
Type: Implement and train deep neural networks, use off-the-shelf logical rule learners
Recommended Skills: Experience with deep learning using PyTorch
Task: Learning an explainable logical approximation of a GNN via Weisfeiler-Leman colors
Context: The expressivity of message-passing Graph Neural Networks (GNNs) is known to be bounded above by the one-dimensional Weisfeiler-Leman (1-WL) test [1], an algorithm that iteratively colors graph nodes with the goal of producing different colorings for two non-isomorphic graphs (although it does not always succeed). Hence, if two graphs have the same WL coloring, then a GNN assigns the same label to both. This implies that whatever function a GNN learns can be expressed as a function on 1-WL colors. This equivalence can be further extended to the fragment of first-order logic with two variables and counting quantifiers (C2): two graphs which agree on all C2 properties have the same WL colors, and hence are assigned the same label by a GNN. The research question in this project is: can we learn/approximate the function represented by a GNN in an explainable logical language? If successful, such an algorithm can provide explanations for a GNN's predictions, and can potentially serve as an explainable alternative to the GNN itself.
Suggested approach: Create a synthetic dataset by fixing a logical rule a priori and labelling randomly generated graphs with that rule. Train a GNN on those graphs. Then run WL on the generated graphs, which gives a set of colors for each example, and use the GNN-predicted graph label as the label for the corresponding multiset of WL colors of the graph. Off-the-shelf logical rule learners (like [2]) can then be used to learn succinct rules expressing the association between WL colors (as propositions) and GNN labels. Finally, check whether the learned logical rule is indeed equivalent to the original one. A sketch of the WL step follows below.
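A sketch of the WL-coloring and propositionalization step, assuming graphs are given as networkx objects and that gnn_label(G) (a hypothetical name) returns the trained GNN's prediction for graph G:

import networkx as nx
import numpy as np

def wl_colors(G, rounds=3):
    colors = {v: str(G.degree(v)) for v in G}            # initial colors from degrees
    for _ in range(rounds):                              # 1-WL refinement with hashing
        colors = {v: str(hash(colors[v] + "|" + "".join(sorted(colors[u] for u in G[v]))))
                  for v in G}
    return colors

def propositionalize(graphs):
    occurring = [set(wl_colors(G).values()) for G in graphs]   # which colors occur per graph
    vocab = sorted(set().union(*occurring))
    X = np.array([[c in occ for c in vocab] for occ in occurring], dtype=int)
    return X, vocab

graphs = [nx.gnp_random_graph(20, 0.2) for _ in range(200)]
X, vocab = propositionalize(graphs)
# y = np.array([gnn_label(G) for G in graphs])           # labels produced by the trained GNN
# Feed (X, y) to an off-the-shelf rule learner (e.g. the one from [2]) and compare the
# learned rule with the ground-truth rule used to generate the labels.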
Related work:
[1] Martin Grohe. The Logic of Graph Neural Networks. 2022. LICS 2022
[2] Ghosh et al. Efficient Learning of Interpretable Classification Rules. JAIR 2022
[3] Azzolin et al. Global Explainability of GNNs via Logic Combination of Learned Concepts. ICLR 2023
Advisor: Sagar and Pascal
Type: Learn theory on symmetries and parameter sharing in Neural Networks. Implementation and experimentation with symmetry-driven parameter sharing in a toy neuro-symbolic model.
Recommended Skills:
Task: This project aims to explore how symmetries in the constraints of NeuroSymbolic (NeSy) models can be leveraged to induce parameter sharing. The goal is to reduce the number of independent parameters, enhance generalization, and improve computational efficiency. Students will design and implement neural networks that enforce symmetry-driven parameter sharing [1], focusing on toy tasks such as the MNIST Sum task [2] and (optionally) extending to more complex domains where symmetries in the data can be exploited [3].
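As a toy illustration on the MNIST Sum task, the commutativity of addition (a + b = b + a) can be reflected by tying the weights of the digit encoder across both inputs and combining the two digit distributions symmetrically; the sketch below is one possible such architecture, not a prescribed one:

import torch
import torch.nn as nn
import torch.nn.functional as F

class DigitEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Flatten(), nn.Linear(16 * 14 * 14, 10))   # logits over digits 0-9
    def forward(self, img):
        return self.net(img)

class MNISTSum(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = DigitEncoder()                    # one encoder, shared across both inputs

    def forward(self, img1, img2):
        p1 = F.softmax(self.encoder(img1), dim=1)        # P(first digit)
        p2 = F.softmax(self.encoder(img2), dim=1)        # P(second digit)
        # P(sum = s) = sum over a+b=s of P(a)P(b); symmetric in (img1, img2) by construction
        p_sum = torch.zeros(p1.size(0), 19)
        for a in range(10):
            for b in range(10):
                p_sum[:, a + b] += p1[:, a] * p2[:, b]
        return p_sum

model = MNISTSum()
img1, img2 = torch.randn(4, 1, 28, 28), torch.randn(4, 1, 28, 28)
target = torch.randint(0, 19, (4,))                      # observed sums, as in the MNIST Sum task
loss = F.nll_loss(torch.log(model(img1, img2) + 1e-8), target)
loss.backward()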
Objectives:
Suggested Approach:
Related Work:
Advisor: Sagar Malhotra
Type: Implement and train deep neural networks
Recommended Skills: Experience with deep learning using PyTorch
Task: Extend research on debiasing recommender systems (see Ganhör et al. below) by experimenting with the use of Information Theory
Context: Bias in ML models is an important area of ML research that tries to identify and reduce implicit biases such as gender or racial bias. While there is a plethora of different algorithms and methods to debias deep neural networks (e.g., debiasing Recommender Systems, Large Language Models), many of those algorithms require changes and/or additions to the architecture. Using Fisher/Mutual Information as a regularization term, we hope for a method that is easy to tune and does not require changes to the existing architecture.
Suggested approach: Implement recommender systems based on VAE architectures and different data sets that include additional meta information such as gender (e.g., LFM2b, MovieLens), implement and adapt an approximation algorithm for Fisher/Mutual Information (see related work), and train and evaluate the influence of the regularization term.
Related work:
Advisor: David
Type: Implement and train deep neural networks
Recommended Skills: Experience with deep learning using PyTorch
Task: Extend research on debiasing recommender systems (see Ganhör et al. below) by experimenting with the use of Siamese Neural Networks
Context: Bias in ML models is an important area of ML research that tries to identify and reduce implicit biases such as gender or racial bias. Recent research has shown good results using a variety of different approaches (e.g., adversarial training). However, many of those approaches are difficult to fine-tune and come with a loss in performance. With a simpler training loop based on Siamese Neural Networks, we hope for a method that is easy to tune and preserves the model's performance.
Suggested approach: Implement recommender systems based on VAE architectures and different data sets that include additional meta information such as gender (e.g., LFM2b, MovieLens), extend the implementation to a Siamese Neural Network architecture, and train and evaluate the resulting model.
Related work:
Advisor: David
Type: Implement and train different algorithms designed to compute disentangled representations (PyTorch)
Recommended Skills: Experience with deep learning using PyTorch
Task: Implement different state-of-the-art algorithms for computing disentangled representations and benchmark them on a variety of data sets, preferably including data sets outside the Computer Vision domain.
Context: Many state-of-the-art deep learning models in various domains, such as Language Models and Recommender Systems, thrive on semantically rich hidden representations. Those representations, however, are most of the time highly entangled, i.e., a change of a single factor such as gender, colour, shape, age, … will change all the dimensions of the representation. Disentangling individual dimensions or vectors from each other might offer benefits such as better explainability of the predictions, new insights into what the network is learning, controllability over what information flows through the network, …
Suggested approach: Find suitable data sets, i.e., data sets that come with additional meta information such as gender and age (e.g., LFM2b, MovieLens, BIOS); implement and train a variety of different algorithms suitable for the given task (e.g., beta-VAE, FactorVAE); evaluate task performance and metrics related to disentanglement. A loss sketch follows below.
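For orientation, the beta-VAE objective only adds a weighting factor to the KL term of a standard VAE; a minimal, architecture-agnostic sketch is:

import torch
import torch.nn.functional as F

def beta_vae_loss(x, x_recon, mu, logvar, beta=4.0):
    # reconstruction term + beta-weighted KL(q(z|x) || N(0, I));
    # beta > 1 pushes the posterior towards the factorized prior, which
    # empirically encourages disentangled latent dimensions.
    recon = F.mse_loss(x_recon, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + beta * kl

FactorVAE and related methods from the literature replace or augment the second term, so this loss is a convenient common baseline for the benchmark.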
Related work:
Advisor: David
Type:
Formalize a sound algorithm for the search of prime implicant explanations in protein-ligand binding affinity predictions.
Question:
What constitutes a valid feature subset of a protein-ligand instance? How can these feature subsets be enumerated efficiently?
Context:
Prime implicant (PI) explanations in graph classification identify minimally sufficient subgraphs of an instance that support a specific decision made by a black-box classifier [1,2]. A prime implicant is a smallest part of the input graph that alone is enough to produce the classifier’s prediction. Graph classification involves assigning labels to graphs—in our case, predicting if a protein-ligand pair will bind strongly.
Our classifier is a thresholded protein-ligand binding affinity predictor, usually a machine learning model that takes a protein-ligand pair as input and returns a predicted binding affinity score. These models are often black boxes, meaning their internal reasoning is not easily understood. PI explanations provide insight by identifying ligand fragments that the model considers sufficient for predicting high binding affinity. This helps researchers understand which molecular substructures drive the model’s decisions.
Explainability is crucial for trusting and improving predictions, especially in drug discovery where understanding molecular interactions can guide better design. This project involves researching established methods for predicting protein-ligand binding affinity and extending existing work on prime implicant explanations to this new field of application.
Suggested approach:
Read about established protein-ligand binding affinity predictors (e.g. [3]). Compare and evaluate the models. Interesting aspects include the data representation (ligand fragments, pharmacophore, …), the training data used, and architectural features. After thoroughly understanding the setting, try to formalize a sound subgraph enumeration algorithm (e.g. an adaptation of [4]) for this domain.
Related work:
Type:
Implement an experiment in Python to compute prime implicant ligand fragments for a given black-box binding affinity predictor.
Question:
Are prime implicant explanations a suitable method to explain protein-ligand binding affinity predictions?
Context:
Prime implicant (PI) explanations in graph classification identify minimally sufficient subgraphs of an instance that support a specific decision made by a black-box classifier [1,2]. A prime implicant is a smallest part of the input graph that alone is enough to produce the classifier’s prediction. Graph classification involves assigning labels to graphs—in our case, predicting if a protein-ligand pair will bind strongly.
Our classifier is a thresholded protein-ligand binding affinity predictor, usually a machine learning model that takes a protein-ligand pair as input and returns a predicted binding affinity score. These models are often black boxes, meaning their internal reasoning is not easily understood. PI explanations provide insight by identifying ligand fragments that the model considers sufficient for predicting high binding affinity. This helps researchers understand which molecular substructures drive the model’s decisions.
Explainability is crucial for trusting and improving predictions, especially in drug discovery where understanding molecular interactions can guide better design. The project involves working with these explanations to analyze protein-ligand data, helping bridge the gap between complex machine learning models and interpretable biological insights.
Suggested approach:
Find two or three protein-ligand binding affinity predictors to analyze (e.g. [3]), implement the fragment-based graph extension method (a specialized version of a connected subgraph enumeration algorithm [4]), compute PI explanations for model predictions on a training dataset, measure the predictive performance of the PI explanations on a test dataset, and compare PI explanations to other explanation methods. A sketch of the sufficiency/minimality check follows below.
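A hedged sketch of the core sufficiency/minimality check, assuming ligands are handled as networkx graphs and predict_affinity(graph) (a hypothetical name) wraps the black-box predictor; a complete enumeration of all prime implicants additionally needs a systematic subgraph search as in [4]:

import networkx as nx

THRESHOLD = 7.0   # e.g. predicted affinity above which we call the prediction "binds strongly"

def sufficient(fragment, predict_affinity):
    # a fragment is sufficient if, on its own, it still triggers the thresholded prediction
    return nx.is_connected(fragment) and predict_affinity(fragment) >= THRESHOLD

def greedy_prime_implicant(graph, predict_affinity):
    """Shrink a sufficient fragment until no single atom can be removed.

    Returns *a* minimal sufficient fragment (not necessarily the smallest one).
    """
    fragment = graph.copy()
    if not sufficient(fragment, predict_affinity):
        return None                                      # the whole ligand is not sufficient
    changed = True
    while changed:
        changed = False
        for atom in list(fragment.nodes):
            candidate = fragment.copy()
            candidate.remove_node(atom)
            if candidate.number_of_nodes() > 0 and sufficient(candidate, predict_affinity):
                fragment = candidate
                changed = True
                break
    return fragment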
Related work:
Type: Experimental
Recommended Skills: Some programming experience, ideally some experience with standard machine learning libraries such as PyTorch.
Task: Reproduce the results of a machine learning paper.
Context: Reproducibility, i.e., obtaining similar output using the same data, is a critical quality in research. Unfortunately, this is often not the case.
Suggested approach: Choose a paper from one of the following venues: NeurIPS, ICML, ICLR, ECML, JMLR, MLJ. Please do not just re-use existing code! (I.e., re-implement the method in a different programming language/library, or implement methods from multiple papers to perform a comparison.)
Related work:
Advisor: Depends on paper
Question: How can we find robust hypotheses that attackers cannot easily manipulate?
Suggested Approach: Approximate Tukey center of hypotheses via iterated Radon points.
Related Works:
Context: The more (distributed) machine learning algorithms and/or the models learned by them are to be applied in critical applications where lives depend on predictions, the more trustworthy they need to be. For that, (distributed) learning algorithms need to be robust to manipulation of (training) data and communication channels. They also need to guarantee that their models have well-calibrated confidence. The Tukey center is a high-dimensional generalisation of the median and shares similar stability properties that will be exploited in this project.
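A minimal numerical sketch of the suggested approach, assuming hypotheses are represented as vectors in R^d (e.g., flattened parameter vectors) and, for simplicity, that the number of points is a power of d + 2 (points that do not fill a complete group are dropped):

import numpy as np

def radon_point(points):
    """Radon point of d+2 points in R^d."""
    d = points.shape[1]
    # find a nonzero alpha with sum(alpha) = 0 and sum(alpha_i * x_i) = 0
    A = np.vstack([np.ones((1, d + 2)), points.T])       # (d+1) x (d+2) system
    _, _, vh = np.linalg.svd(A)
    alpha = vh[-1]                                       # (approximate) null-space direction
    pos = alpha > 0
    return points[pos].T @ alpha[pos] / alpha[pos].sum()

def iterated_radon(points, rng):
    """Repeatedly replace random groups of d+2 points by their Radon point."""
    d = points.shape[1]
    while len(points) >= d + 2:
        rng.shuffle(points)
        n_groups = len(points) // (d + 2)
        points = np.array([radon_point(points[i * (d + 2):(i + 1) * (d + 2)])
                           for i in range(n_groups)])
    return points.mean(axis=0)                           # approximate Tukey center

rng = np.random.default_rng(0)
pts = rng.normal(size=((2 + 2) ** 3, 2))                 # d = 2, (d+2)^3 sample hypotheses
print("approximate Tukey center:", iterated_radon(pts, rng))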
Question: How can constraints be incorporated into meaningful low-dimensional embeddings of large data sets?
Suggested Approach: Approximate knowledge-constrained PCA via the power method.
Related Works:
Context: Traditionally, embedding techniques focus on finding one fixed embedding, which depends on some technical parameters and emphasises a single aspect of the data. In contrast, interactive data exploration aims at allowing the user to explore the structure of a dataset by observing a projection dynamically controlled by the user in an intuitive way. Ultimately, it provides a way to search for and find an embedding that emphasises the aspects the user desires to highlight. Current approaches often do not allow for real-time manipulation of large data sets.
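A minimal sketch of the power-method backbone, under the illustrative assumption that the knowledge constraint is given as a projector P onto an allowed subspace (e.g., orthogonality to directions the user wants to exclude); the constrained update simply projects after every multiplication with the covariance matrix:

import numpy as np

def constrained_top_direction(X, P, iters=200, seed=0):
    """Leading direction of the projected covariance of X via power iteration."""
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)
    C = Xc.T @ Xc / len(Xc)                              # covariance matrix
    v = rng.normal(size=C.shape[0])
    for _ in range(iters):
        v = P @ (C @ v)                                  # power step followed by projection
        v /= np.linalg.norm(v)
    return v

# example constraint: exclude the first coordinate axis from the embedding direction
d = 10
X = np.random.default_rng(1).normal(size=(1000, d))
P = np.eye(d)
P[0, 0] = 0.0                                            # projector removing axis e_1
v1 = constrained_top_direction(X, P)
print("leading constrained direction:", np.round(v1, 3))

After the first iteration the vector stays in the range of P, so this is power iteration on the symmetric matrix P C P; how to encode richer user knowledge as such constraints is exactly what the project would investigate.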
Type: Implementing topology-informed GNNs with PyTorch (Geometric)
Question: Can we leverage information arising from higher-order interactions for prediction tasks on social networks?
Suggested Approach: The concept of a filtration (of a set) has been employed in the context of graph learning to enrich message passing with topological information arising from higher-order interactions in various ways. In this project, we aim to combine (and eventually extend) existing approaches and evaluate their effectiveness on prediction tasks over social network datasets.
Related Work:
Advisor: Flaviano
Type: Theory of reinforcement learning (multiple topics)
Recommended skills: Course on reinforcement learning. Strong mathematical/theoretical background.
Task: Show convergence results for algorithms in RL. Show lower bounds on the performance of RL algorithms, i.e. PAC (probably approximately correct) estimates. Based on the PAC estimates, show computational complexity.
Context: In safety critical applications, convergence results and performance estimates are of utmost importance for reliable/safe/robust learning.
Suggested approach: Use fixed-point arguments and concentration inequalities.
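For orientation, a typical analysis combines a concentration inequality such as Hoeffding's,
\[
\Pr\!\left[\Bigl|\tfrac{1}{n}\sum_{i=1}^{n} X_i - \mathbb{E}[X_1]\Bigr| \ge \varepsilon\right] \;\le\; 2\exp\!\left(-\frac{2 n \varepsilon^{2}}{(b-a)^{2}}\right) \qquad \text{for i.i.d. } X_i \in [a, b],
\]
with the $\gamma$-contraction of the Bellman (optimality) operator,
\[
\lVert \mathcal{T} Q_1 - \mathcal{T} Q_2 \rVert_\infty \;\le\; \gamma\, \lVert Q_1 - Q_2 \rVert_\infty ,
\]
so that convergence follows from the Banach fixed-point theorem, while PAC (sample-complexity) estimates follow from applying the concentration inequality to the sampled Bellman updates.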
Related work: Markus Böck and Clemens Heitzinger. Speedy categorical distributional reinforcement learning and complexity analysis. SIAM Journal on Mathematics of Data Science, 4(2):675–693, 2022. Markus Böck, Julien Malle, Daniel Pasterk, Hrvoje Kukina, Ramin Hasani, and Clemens Heitzinger. Superhuman performance on sepsis MIMIC-III data by distributional reinforcement learning. PLOS ONE, 17(11):e0275358/1–18, 2022.
Advisor: Prof. Clemens Heitzinger
Type: Safety-critical applications of reinforcement learning (multiple topics)
Recommended skills: Course on reinforcement learning. Strong programming skills (Python and/or Julia) for applications.
Task: Implement and test modern RL algorithms and devise new algorithms.
Context: In safety critical applications, distributional RL makes it possible to quantify risk. In real-world applications, deep RL makes it possible to treat large state spaces. We have experience in medical applications and want to learn more optimal strategies on recent datasets.
Suggested approach: Distributional RL, deep RL.
Related work: Markus Böck, Julien Malle, Daniel Pasterk, Hrvoje Kukina, Ramin Hasani, and Clemens Heitzinger. Superhuman performance on sepsis MIMIC-III data by distributional reinforcement learning. PLOS ONE, 17(11):e0275358/1–18, 2022. Pierrick Lorang, Horvath Helmut, Tobias Kietreiber, Patrik Zips, Clemens Heitzinger, Matthias Scheutz. Adapting to the "open world": the utility of hybrid hierarchical reinforcement learning and symbolic planning. Proc. 2024 IEEE International Conference on Robotics and Automation (ICRA 2024), 13–17 May 2024, accepted for publication.
Advisor: Prof. Clemens Heitzinger
Describe the scientific merit and novelty of your idea. It is very important to narrow down the rough topic to a tentative research question and approach of interest to us. The research question should not have been answered previously and the answer needs to be verifiable. To answer the question, typically one has to:
This semester we want to emphasize the applied project type. In this type, you implement an application that uses machine learning at its core. We encourage you to think about interesting scenarios and are looking forward to discussing them with you.
If you choose your own topic, please briefly describe your project following this structure (check our suggested topics to get an idea):
Yes, but the project must allow this. The distribution of work and contribution must be clear. The important thing here is that the project is separable into well-defined parts that permit individual grading. In any case, the evaluation is done individually for each student.