- TISS: (link)
- contact: Patrick Indri (email)
- meeting link: https://tuwien.zoom.us/my/patrickindri
- everything important will be announced in TUWEL/TISS.

This seminar simulates a machine learning conference, where the students take on the role of authors and reviewers. It consists of multiple phases.

Attend the **mandatory** first meeting either in person or remotely (details on TUWEL).

You select two topics/papers (i.e., two bullet points) from one of the topics below. You will work with the material mentioned in the overview and the topic-specific resources.

You choose your own topic to work on. This can be an existing machine learning paper/work or your own creative idea in the context of machine learning. We strongly encourage you to start from existing papers from the following venues: NeurIPS, ICML, ICLR, COLT, AISTATS, UAI, JMLR, MLJ. Importantly, your idea has to be specific and well worked out. Nevertheless, choose one of our suggestions as well.

**Independent of the option you chose**, understand the fundamentals of your topic and try to answer the following questions:

**What** is the problem? **Why** is it an interesting problem? **How** do you plan to approach the problem? / **How** have the authors of your topic approached the problem?

Select your topics and write a short description of each, together with your answers to the questions above (~3 sentences should be sufficient), in **TUWEL**.

We can only accept your own proposals if you can answer the questions above and your topic is well worked out.

You will also act as a reviewer and bid on the peer topics you want to review. Based on the bids, we (in our role as conference chairs) will select one of each student's proposals as the topic they will work on for the rest of the semester. You **do not** need to work on your other topic anymore. Additionally, we will assign two topics from other students to you, which you will review later in the semester.

Now the actual work starts. Gain a deep understanding of your topic, write a first draft of your report, and give a 5-minute presentation. We recommend **going beyond** the given material.

You will again act as a reviewer for the conference by writing two reviews, one for each draft report assigned to you.

Based on the reviews from your peers (and our feedback), you will continue working on your topic.

Give a final presentation and submit your report.

- Understanding machine learning: from theory to algorithms. Shai Shalev-Shwartz and Shai Ben-David (pdf)
- Foundations of machine learning. Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar (pdf)
- Foundations of data science. Avrim Blum, John Hopcroft, and Ravindran Kannan (pdf)
- Mathematics for machine learning. Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong (pdf)
- Mining of massive datasets. Jure Leskovec, Anand Rajaraman, and Jeffrey D. Ullman (pdf)
- Reinforcement learning: an introduction. Richard Sutton and Andrew Barto (pdf)
- Research Methods in Machine Learning. Tom Dietterich (pdf)

You should have access to the literature and papers through Google Scholar, DBLP, the provided links, or the TU library.

- Shai Ben-David and Shai Shalev-Shwartz. Understanding Machine Learning. Chapters 2-3
- Shai Ben-David Lectures. (youtube-link) Lectures 1-3
- Kaifu Wang, Efthymia Tsamoura, Dan Roth. On learning latent models with multi-instance weak supervision. NeurIPS 2024
- Sam Adam-Day, Theodor Mihai Iliant, İsmail İlkan Ceylan. Zero-One Laws of Graph Neural Networks. 2024
- Sam Adam-Day, Michael Benedikt, İsmail İlkan Ceylan, Ben Finkelshtein. Graph neural network outputs are almost surely asymptotically constant. 2024
- Rex Ying, Dylan Bourgeois, Jiaxuan You, Marinka Zitnik, Jure Leskovec. GNNExplainer: Generating Explanations for Graph Neural Networks. 2019

**Motivation**: The ability to learn logically definable concepts from labelled data is a theoretical model of machine learning that is explainable by design and integrates ideas from both logic (especially finite model theory) and PAC learning.

**Overview**:

**Papers and topics**:

- chapter 1 “Automating inquiry” of Burr Settles’ “Active learning” book, 2012.
- introduction and recent research: Rob Nowak and Steve Hanneke - ICML 2019 tutorial (youtube-link)
- active learning with comparison queries (Kane, D. M., Lovett, S., Moran, S., & Zhang, J. “Active classification with comparison queries.” FOCS 2017)
- sample complexity in active learning (Maria-Florina Balcan, Steve Hanneke, and Jennifer Wortman Vaughan “The true sample complexity of active learning.” Machine Learning 2010)
- bounded memory active learning (M. Hopkins, D. Kane, S. Lovett & M. Moshkovitz. “Bounded memory active learning through enriched queries.” COLT 2021)

**Motivation**: In active learning, the learning algorithm is allowed to select the data points it wants to see labelled, for example, those where it is most uncertain. The goal is to reduce the labelling effort. This is useful in applications where unlabelled data is abundant yet labels are scarce, such as node classification in social networks, drug discovery, and autonomous driving.
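To make the query-selection idea concrete, here is a minimal pool-based uncertainty-sampling sketch for a toy 1-D threshold class. The setting and all names are our own illustration, not code from the listed papers: the learner repeatedly queries the label of the pool point closest to its current decision boundary.

```python
# Pool-based active learning with uncertainty sampling (illustrative sketch;
# the toy 1-D threshold setting and all names are hypothetical).

def train(labelled):
    # Toy learner: place the threshold midway between the largest point
    # labelled 0 and the smallest point labelled 1.
    zeros = [x for x, y in labelled if y == 0]
    ones = [x for x, y in labelled if y == 1]
    return (max(zeros) + min(ones)) / 2

def active_learn(pool, oracle, budget):
    pool = sorted(pool)
    # Seed with the two extreme points (assumed to carry different labels)
    # so both classes are represented from the start.
    labelled = [(pool[0], oracle(pool[0])), (pool[-1], oracle(pool[-1]))]
    unlabelled = pool[1:-1]
    for _ in range(budget):
        threshold = train(labelled)
        # Query the point closest to the current decision boundary,
        # i.e. where the learner is most uncertain.
        x = min(unlabelled, key=lambda p: abs(p - threshold))
        labelled.append((x, oracle(x)))
        unlabelled.remove(x)
    return train(labelled)

# True concept: label 1 iff x >= 0.62. Six queries already localise it well,
# whereas uniform random labelling would need far more labels on average.
oracle = lambda x: int(x >= 0.62)
learned = active_learn([i / 100 for i in range(101)], oracle, budget=6)
assert abs(learned - 0.62) < 0.05
```

Each query roughly halves the remaining uncertainty interval, which is the binary-search intuition behind the exponential label savings discussed in the tutorial material.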

**Overview**:

**Papers and topics**:

- chapter 8 “equivariant neural networks” of “Deep learning for molecules and materials” by Andrew D. White, 2021. (pdf).
- introduction to equivariance: Taco Cohen and Risi Kondor - NeurIPS 2020 tutorial (first half) (slideslive-link)
- neural network that can learn on sets (Zaheer, et al. “Deep sets.” NeurIPS 2017)
- learning equivariance from data (Zhou, et al. “Meta-learning symmetries by reparameterization.” ICLR 2021)

**Motivation**: Many data structures have an innate structure that our neural networks should respect. For example, the output of a graph neural network should not change if we permute its vertices (permutation equivariance/invariance).
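A minimal illustration of permutation invariance (our own sketch, not code from the listed papers): a plain sum readout over node features, as used in Deep Sets-style architectures, yields the same graph-level representation under any relabelling of the vertices.

```python
# Permutation invariance of a sum readout (our own minimal illustration).
# Relabelling the vertices permutes the list of node feature vectors,
# but an elementwise sum over the list is unchanged.

def sum_readout(node_features):
    # Aggregate per-node feature vectors by elementwise summation.
    dim = len(node_features[0])
    return [sum(f[i] for f in node_features) for i in range(dim)]

features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]     # one vector per node
permuted = [features[2], features[0], features[1]]  # relabelled vertices

assert sum_readout(features) == sum_readout(permuted)  # invariant to order
assert sum_readout(features) == [9.0, 12.0]
```

Sum, mean, and max all have this property; a concatenation of the node features would not, which is why order-sensitive aggregators are avoided in set and graph models.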

**Overview**:

**Papers and topics**:

- Veličković, *Everything is connected: Graph neural networks*, Current Opinion in Structural Biology, 2023
- Sanchez-Lengeling et al., *A Gentle Introduction to Graph Neural Networks*, distill.pub, 2021
- Veličković, *Intro to graph neural networks (ML Tech Talks)*, YouTube, 2021
- Baranwal et al., *Optimality of Message-Passing Architectures for Sparse Graphs*, NeurIPS, 2023
- Zhou et al., *Distance-Restricted Folklore Weisfeiler-Leman GNNs with Provable Cycle Counting Power*, NeurIPS, 2023
- Zhang et al., *A Complete Expressiveness Hierarchy for Subgraph GNNs via Subgraph Weisfeiler-Lehman Tests*, ICML, 2023
- Zhang et al., *Rethinking the Expressive Power of GNNs via Graph Biconnectivity*, ICLR, 2023
- Lim et al., *Sign and Basis Invariant Networks for Spectral Graph Representation Learning*, ICLR, 2023
- Joshi et al., *On the Expressive Power of Geometric Graph Neural Networks*, ICML, 2023
- Huang et al., *You Can Have Better Graph Neural Networks by Not Training Weights at All: Finding Untrained GNNs Tickets*, LoG, 2022
- Nils M. Kriege, *Weisfeiler and Leman go walking: Random walk kernels revisited*, NeurIPS, 2022
- Zhang et al., *Beyond Weisfeiler-Lehman: A Quantitative Framework for GNN Expressiveness*, ICLR, 2024
- Franks et al., *Weisfeiler-Leman at the margin: When more expressivity matters*, arXiv, 2024

**Motivation:** Graphs are a very general structure and can be applied to many areas: molecules and drug development, geographical maps, the spread of diseases. They can be used to model physical systems and solve partial differential equations. Even images and text can be seen as special cases of graphs. Thus, it makes sense to develop neural networks that can work with graphs. GNNs have strong connections to many classical computer science topics (algorithmics, logic, …) while also making use of neural networks. This means that work on GNNs can be very theoretical, applied, or anything in between.

**Overview:**

**Papers:**

*Note:* For very long papers we do not expect you to read the entire appendix.

- Olivier Bousquet, Stéphane Boucheron, and Gábor Lugosi: “Introduction to Statistical Learning Theory”, 2003.
- Chapters 1-6 of “Understanding machine learning”
- “Extending Generalization Theory Towards Addressing Modern Challenges in ML” by Shay Moran, talk at the HUJI ML Club, 2021 (youtube-link)
- (Basic material) Statistical Machine Learning by Ulrike von Luxburg (we recommend parts 38-41) (youtube playlist)
- partial concept classes (Alon, et al., “A theory of PAC learnability of partial concept classes”, unpublished arXiv:2107.08444)
- tight bounds (Bousquet, et al., “Proper learning, Helly number, and an optimal SVM bound” COLT 2020)
- universal learning (Bousquet, et al., “A theory of universal learning” STOC 2021)
- sample compression schemes (Moran, et al., “Sample compression schemes for VC classes” Journal of the ACM 2016)
- algorithmic stability (Bousquet, Olivier, and André Elisseeff. “Algorithmic stability and generalization performance.” NeurIPS, 2000)

**Motivation**: Learning theory studies computational and algorithmic aspects of machine learning algorithms in order to prove guarantees such as sample complexity bounds. This is important for understanding and devising novel learning algorithms. In recent years, many long-standing open questions in learning theory have been answered.
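As a small worked example of what a sample complexity bound looks like, the following sketch evaluates the standard realizable-case PAC bound for a finite hypothesis class, m ≥ (ln|H| + ln(1/δ))/ε (cf. the finite-class bound in the early chapters of “Understanding machine learning”). The function name is our own.

```python
import math

def pac_sample_bound(hypothesis_count, eps, delta):
    # Realizable-case sample complexity for a finite hypothesis class:
    #   m >= (ln|H| + ln(1/delta)) / eps
    # samples suffice to guarantee error <= eps with probability >= 1 - delta
    # (a union bound over the "bad" hypotheses with error > eps).
    return math.ceil((math.log(hypothesis_count) + math.log(1 / delta)) / eps)

# |H| = 1000 hypotheses, accuracy eps = 0.05, confidence delta = 0.01:
assert pac_sample_bound(1000, 0.05, 0.01) == 231
```

Note the logarithmic dependence on both |H| and 1/δ: doubling the hypothesis class or halving δ adds only a constant number of samples, while the dependence on 1/ε is linear.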

**Overview**:

**Papers and topics**:

- Neurosymbolic AI: The 3rd Wave, 2020 (A. Garcez, L. Lamb)
- Neural-Symbolic Cognitive Reasoning, 2009 (A. Garcez, L. Lamb)
- find your own topic :) (a starting point can be the survey from L. De Raedt, S. Dumancic, R. Manhaeve, G. Marra. “From Statistical Relational to Neuro-Symbolic Artificial Intelligence”, 2020)
- SAT solving using deep learning
- D. Selsam, M. Lamm, B. Bünz, P. Liang, D. Dill, L. de Moura. “Learning a SAT Solver from Single-Bit Supervision”, 2019
- V. Kurin, S. Godil, S. Whiteson, B. Catanzaro. “Improving SAT Solver Heuristics with Graph Networks and Reinforcement Learning”, 2019
- J. You, H. Wu, C. Barrett, R. Ramanujan, J. Leskovec. “G2SAT: Learning to Generate SAT Formulas”, 2019

**Overview**:

**Papers and topics**:

- A. Globerson: How SGD Can Succeed Despite Non-Convexity and Over-Parameterization (slides)
- generalization bounds for deep neural networks (G.K. Dziugaite, D.M. Roy, “Computing Nonvacuous Generalization Bounds for Deep (Stochastic) Neural Networks with Many More Parameters than Training Data”, 2017)
- why SGD avoids overfitting and finds global minima (A. Brutzkus et al: “SGD Learns Over-parameterized Networks that Provably Generalize on Linearly Separable Data”, 2017)
- connection between flatness of loss curve and generalisation (H. Petzka et al. “Relative Flatness and Generalization”, 2021)
- mode connectivity (Garipov, et al. “Loss surfaces, mode connectivity, and fast ensembling of dnns.” NeurIPS 2018).
- deep learning and generalisation (Zhang, et al. “Understanding deep learning (still) requires rethinking generalization.” Communications of the ACM, 2021)
- connectivity of the optimisation landscape (A. Shevchenko, M. Mondelli. “Landscape Connectivity and Dropout Stability of SGD Solutions for Over-parameterized Neural Networks”, 2020)
- SGD stability (Hardt, Moritz, Ben Recht, and Yoram Singer. “Train faster, generalize better: Stability of stochastic gradient descent.” International conference on machine learning. PMLR, 2016)
- choose one or more papers listed on page 14 in the above mentioned slides :)

**Overview**:

**Papers and topics**:

- General: What does it mean for ML to be trustworthy? (youtube-link)
- General: Trustworthy ML (Kush R. Varshney) (link)
- Differential privacy: Chapter 2 of: Dwork, Cynthia, and Aaron Roth. “The algorithmic foundations of differential privacy.” Found. Trends Theor. Comput. Sci. 2014
- private gossip protocols (Jin, Richeng, et al. “On the Privacy Guarantees of Gossip Protocols in General Networks.” IEEE Transactions on Network Science and Engineering 2023)
- private subgraph counting (Imola, Jacob, Takao Murakami, and Kamalika Chaudhuri. “Differentially private triangle and 4-cycle counting in the shuffle model.” SIGSAC 2022)
- privacy amplification with random walks (Liew, Seng Pei, et al. “Network Shuffling: Privacy Amplification via Random Walks.” SIGMOD 2022)
- differential privacy and deep learning (Chen, Xiangyi, Steven Z. Wu, and Mingyi Hong. “Understanding gradient clipping in private SGD: A geometric perspective.” NeurIPS 2020)
- data reconstruction and differential privacy (Hayes, Jamie, Saeed Mahloujifar, and Borja Balle. “Bounding Training Data Reconstruction in DP-SGD.” arXiv, 2023).
- (extensions of) gaussian mechanism (Balle, Borja, and Yu-Xiang Wang. “Improving the gaussian mechanism for differential privacy: Analytical calibration and optimal denoising.” International Conference on Machine Learning. PMLR, 2018)
- high-dimensional mean estimation which is robust and private (Narayanan, Shyam, Vahab Mirrokni, and Hossein Esfandiari. “Tight and Robust Private Mean Estimation with Few Users.” ICML. PMLR, 2022.)
- private learnability (Alon, Noga, et al. “A Unified Characterization of Private Learnability via Graph Theory.” arXiv, 2023)
- graph statistics under differential privacy (Laeuchli, Jesse, Yunior Ramírez-Cruz, and Rolando Trujillo-Rasua. “Analysis of centrality measures under differential privacy models.” Applied Mathematics and Computation, 2022)

**Motivation**: Machine learning systems are ubiquitous, and it is necessary to make sure they behave as intended. In particular, trustworthiness can be achieved by means of privacy-preserving, robust, and explainable algorithms. We focus here on the desideratum of privacy.
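To make the privacy desideratum concrete, here is a minimal sketch of the Laplace mechanism applied to a bounded mean. The mechanism itself is standard (cf. the Dwork-Roth monograph above); the function names, the fixed seed, and the toy data are our own illustration.

```python
import math
import random

def laplace_noise(scale, rng):
    # Inverse-CDF sample from Laplace(0, scale), standard library only.
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

def private_mean(values, lo, hi, eps):
    # Laplace mechanism for a bounded mean: clipping each value to [lo, hi]
    # bounds the sensitivity of the mean by (hi - lo) / n, so adding Laplace
    # noise with scale sensitivity / eps yields eps-differential privacy.
    rng = random.Random(0)  # fixed seed, only to keep this demo reproducible
    clipped = [min(max(v, lo), hi) for v in values]
    sensitivity = (hi - lo) / len(values)
    return sum(clipped) / len(values) + laplace_noise(sensitivity / eps, rng)

data = [i / 999 for i in range(1000)]  # hypothetical data with true mean 0.5
assert abs(private_mean(data, 0.0, 1.0, eps=1.0) - 0.5) < 0.05
```

With n = 1000 values in [0, 1] and ε = 1, the noise scale is only 0.001, showing how the utility cost of privacy shrinks as the dataset grows.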

**Overview**:

**Papers and topics**:

- Fang, Huang, Su, and Kasai (2023) - Wasserstein Graph Distance Based on L1–Approximated Tree Edit Distance between Weisfeiler–Lehman Subtrees. AAAI
- Kriege, Giscard, Wilson (2016) - On Valid Optimal Assignment Kernels and Applications to Graph Classification. NeurIPS
- Schulz, Horvath, Welke, Wrobel (2022) - A generalized Weisfeiler-Lehman graph kernel
- Nikolentzos, Vazirgiannis (2023) - Graph Alignment Kernels using Weisfeiler and Leman Hierarchies
- Chen, Lim, Memoli, Wan, Wang (2022) - Weisfeiler-Lehman Meets Gromov-Wasserstein
- Togninalli, Ghisu, Llinares-López, Rieck, Borgwardt (2019) - Wasserstein Weisfeiler-Lehman Graph Kernels

**Motivation:**

Graphs can be used to model many areas of interest: e.g. molecules, geographical maps, spread of diseases, or communication networks.
They can be used to model physical systems and solve partial differential equations.
Even images and text can be seen as a special case of graphs.
Learning on graphs nowadays is mostly done using message passing graph neural networks (MPNNs).
MPNNs are connected to the classical Weisfeiler-Leman graph isomorphism algorithm (WL): they always compute identical representations for two graphs that WL cannot distinguish.
Hence, current theoretical and practical work focuses on more *expressive* graph neural networks, which can distinguish more graph pairs.

However, learning (particularly with neural networks) is based on similarities between representations. We will delve into recent methods that combine Weisfeiler-Leman-based graph representations with ideas from optimal transport. Ideally, we will investigate options to transfer some of these ideas to analyse the similarities that MPNNs compute.
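The WL procedure mentioned above can be sketched in a few lines (our own illustration, not code from the listed papers): colour refinement repeatedly rehashes each node's colour together with the multiset of its neighbours' colours, and two graphs it cannot distinguish receive identical colour histograms.

```python
# 1-dimensional Weisfeiler-Leman (colour refinement) sketch (our own code).

def wl_colors(adj, rounds=3):
    # Start with a uniform colouring, then repeatedly replace each node's
    # colour by a canonical id of (own colour, sorted neighbour colours).
    colors = {v: 0 for v in adj}
    for _ in range(rounds):
        sigs = {v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
                for v in adj}
        palette = {s: i for i, s in enumerate(sorted(set(sigs.values())))}
        colors = {v: palette[sigs[v]] for v in adj}
    return sorted(colors.values())  # colour histogram of the graph

# A 6-cycle and two disjoint triangles are non-isomorphic, but both are
# 2-regular, so 1-WL (and hence any MPNN) cannot tell them apart.
cycle6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1],
             3: [4, 5], 4: [3, 5], 5: [3, 4]}
assert wl_colors(cycle6) == wl_colors(triangles)
```

The cycle-vs-triangles pair is the classic example of WL's limits: any MPNN assigns both graphs the same representation, which motivates the more expressive architectures and the WL-based similarity measures listed below.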

**Papers of Interest:**

*Note:* For very long papers we do not expect you to read the entire appendix.

**Supervisor**: Pascal