- TISS: (link)
- contact: Maximilian Thiessen
- meeting link: https://tuwien.zoom.us/my/maxthiessen
- everything important will be announced in TUWEL/TISS.

This seminar simulates a machine learning conference, where the students take on the role of authors and reviewers. It consists of multiple phases.

Attend the **mandatory** first meeting on 07.03, 15:00 (either in person at **Gußhausstraße 27-29 CA 03 13**, or remotely at **https://tuwien.zoom.us/my/maxthiessen**).

You select two projects/papers (i.e., two bullet points) from one of the topics below. You will work with the material mentioned in the overview and the project-specific resources.

You choose two different project ideas of your own to work on. Each can be an existing machine learning paper/work or your own creative idea in the context of machine learning. Importantly, it has to be specific and well worked out.

**Independent of the option you chose**, understand the fundamentals of your projects and try to answer the following questions:

**What** is the problem? **Why** is it an interesting problem? **How** do you plan to approach the problem? / **How** have the authors of your project approached the problem?

Select projects and write a short description of them together with the answers to the questions (~3 sentences should be sufficient) in **TUWEL**.

We can only accept your own proposals if you can answer the mentioned questions and have a well worked out project idea.

You will also act as reviewers and bid on the projects of your peers that you want to review. Based on the bids, we (in the role of conference chairs) will select one of each student's proposals as the actual project you will work on for the rest of this semester. You **do not** need to work on the other project anymore. Additionally, we will assign two different projects from other students to you, which you will have to review later in the semester.

Now the actual work starts. Gain a deep understanding of your project, write a first draft of your report, and give a 5-minute presentation. We recommend that you **go beyond** the given material.

You will again act as a reviewer for the conference by writing two reviews, one for each draft report assigned to you.

Based on the reviews from your peers (and our feedback) you will further work on your project.

Give a final presentation and submit your report.

- Understanding machine learning: from theory to algorithms. Shai Shalev-Shwartz and Shai Ben-David (pdf)
- Foundations of machine learning. Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar (pdf)
- Foundations of data science. Avrim Blum, John Hopcroft, and Ravindran Kannan (pdf)
- Mathematics for machine learning. Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong (pdf)
- Mining of massive datasets. Jure Leskovec, Anand Rajaraman, and Jeffrey D. Ullman (pdf)
- Reinforcement learning: an introduction. Richard Sutton and Andrew Barto (pdf)
- Research Methods in Machine Learning. Tom Dietterich (pdf)

You should have access to the literature and papers through Google Scholar, DBLP, the provided links, or the TU library.

Overview:

- "graph representation learning" by William L. Hamilton (pdf)
- Knowledge Graph Embeddings Tutorial: From Theory to Practice, 2020 (https://kge-tutorial-ecai2020.github.io/)

Papers and projects:

- Knowledge Graph Embeddings (focus on deep learning approaches)
  - Q. Wang, Z. Mao, B. Wang, L. Guo. "Knowledge Graph Embedding: A Survey of Approaches and Applications", 2017
  - Y. Dai, S. Wang, N. Xiong, W. Guo. "A Survey on Knowledge Graph Embedding: Approaches, Applications and Benchmarks", 2020
  - M. Wang, L. Qiu, X. Wang. "A Survey on Knowledge Graph Embeddings for Link Prediction", 2021

Overview:

- Neurosymbolic AI: The 3rd Wave, 2020 (A. Garcez, L. Lamb)
- Neural-Symbolic Cognitive Reasoning, 2009 (A. Garcez, L. Lamb)

Papers and projects:

- find your own topic :) (a starting point can be the survey from L. De Raedt, S. Dumancic, R. Manhaeve, G. Marra. "From Statistical Relational to Neuro-Symbolic Artificial Intelligence", 2020)
- SAT solving using deep learning
  - D. Selsam, M. Lamm, B. Bünz, P. Liang, D. Dill, L. de Moura. "Learning a SAT Solver from Single-Bit Supervision", 2019
  - V. Kurin, S. Godil, S. Whiteson, B. Catanzaro. "Improving SAT Solver Heuristics with Graph Networks and Reinforcement Learning", 2019
  - J. You, H. Wu, C. Barrett, R. Ramanujan, J. Leskovec. "G2SAT: Learning to Generate SAT Formulas", 2019

Overview:

- chapters 1-3 of "Learning with submodular functions: a convex optimization perspective" by Francis Bach, 2013.
- introduction to submodularity in machine learning: Stefanie Jegelka - MLSS 2017 (youtube-link)

Papers and projects:

- submodularity in data subset selection and active learning (Wei, et al. "Submodularity in data subset selection and active learning." ICML 2015)
- robust submodular observation selection (Krause, et al. "Robust submodular observation selection." Journal of machine learning research 2008)
- submodular function maximization (Krause and Golovin. "Submodular function maximization." 2014)
- graph cuts for image segmentation (Blum and Chawla. "Learning from labeled and unlabeled data using graph mincuts." ICML 2001 **and** Jegelka and Bilmes. "Submodularity beyond submodular energies: coupling edges in graph cuts." CVPR 2011)
- learning submodular functions (Balcan and Harvey. "Learning submodular functions." ACM symposium on theory of computing 2011)
- batch active learning using submodular optimization (Chen and Krause. "Near-optimal batch mode active learning and adaptive submodular optimization." ICML 2013)

Overview:

- first 23 pages of "A survey on graph kernels" (Applied Network Science 2019) by Nils M. Kriege, et al.
- practical motivation for graph kernels in computational biology: Karsten Borgwardt - MLSS 2013 (the first 35 minutes) (youtube-link)

Papers and topics:

- hardness and expressivity (Gärtner, et al. "On graph kernels: Hardness results and efficient alternatives." COLT 2003 **and** Ramon and Gärtner. "Expressivity versus efficiency of graph kernels." Workshop on mining graphs, trees and sequences 2003)
- (k-dimensional) Weisfeiler-Lehman kernel (Shervashidze, et al. "Weisfeiler-Lehman graph kernels." Journal of machine learning research 2011 **and** Morris, et al. "Glocalized Weisfeiler-Lehman graph kernels: Global-local feature maps of graphs." ICDM 2017)
- multiple and deep graph kernel learning (Aiolli, et al. "Multiple graph-kernel learning" **and** Yanardag and Vishwanathan. "Deep graph kernels." SIGKDD 2015)

Overview:

- chapters 1 and 2 of "Learning with kernels" by Bernhard Schölkopf and Alex Smola, 2002 (pdf)
- introduction to kernels: Bernhard Schölkopf - MLSS 2013 (youtube-link)

Papers and projects:

- Nyström method (Drineas and Mahoney. "On the Nyström method for approximating a Gram matrix for improved kernel-based learning." Journal of machine learning research 2005 **and** Kumar, et al. "Sampling methods for the Nyström method." Journal of machine learning research 2012)
- Nyström method with kernel k-means++ samples as landmarks (Drineas and Mahoney. "On the Nyström method for approximating a Gram matrix for improved kernel-based learning." Journal of machine learning research 2005 **and** Oglic and Gärtner. "Nyström method with kernel k-means++ samples as landmarks." ICML 2017)
- random features (Rahimi and Recht. "Random features for large-scale kernel machines." NIPS 2007 **and** Le, et al. "Fastfood: approximate kernel expansions in loglinear time." ICML 2013)
- neural tangent kernel (Jacot, et al. "Neural tangent kernel: convergence and generalization in neural networks." NIPS 2018)

Overview:

- first chapter/introduction of "Semi-supervised learning" (SSL) by Olivier Chapelle, Bernhard Schölkopf, and Alexander Zien, 2006 (pdf)
- introduction to semi-supervised learning: Tom Mitchell - Carnegie Mellon University 2011 (youtube-link)

Papers and projects:

- transductive support vector machines (chapter 6 in SSL by Thorsten Joachims)
- large-margin semi-supervised learning (Wang, et al. "On efficient large margin semisupervised learning: method and theory." Journal of machine learning research 2009)
- PAC model for semi-supervised learning (chapter 22 of SSL by Maria-Florina Balcan and Avrim Blum)
- generalization error bounds (Rigollet. "Generalization error bounds in semi-supervised classification under the cluster assumption." Journal of machine learning research 2007)
- regularization and semi-supervised learning on graphs (Belkin, et al. "Regularization and semi-supervised learning on large graphs." COLT 2004)
- manifold regularization (Belkin, et al. "Manifold regularization: A geometric framework for learning from labeled and unlabeled examples." Journal of machine learning research 2006)
- label propagation (Zhu, et al. "Semi-supervised learning using Gaussian fields and harmonic functions." ICML 2003 **and** Zhou, et al. "Learning with local and global consistency." NIPS 2004)
- normalized cuts (Shi and Malik. "Normalized cuts and image segmentation." IEEE TPAMI Journal 2000 **and** Joachims. "Transductive learning via spectral graph partitioning." AAAI 2003)

Overview:

- chapter 1 "Automating inquiry" of Burr Settles' "Active learning" book, 2012.
- introduction and recent research: Rob Nowak and Steve Hanneke - ICML 2019 tutorial (youtube-link)

Papers and projects:

- active learning with graph cuts (Blum and Chawla. "Learning from labeled and unlabeled data using graph mincuts." ICML 2001 **and** Guillory and Bilmes. "Label selection on graphs." NIPS 2009)
- agnostic/noisy active learning (Balcan, et al. "Agnostic active learning." Journal of computer and system sciences 2009 **and** Beygelzimer, et al. "Importance weighted active learning.")
- active nearest-neighbour learning (Kontorovich, et al. "Active nearest-neighbor learning in metric spaces." Journal of machine learning research 2017)
- active learning on trees and graphs (Cesa-Bianchi, et al. "Active learning on trees and graphs", COLT 2013)
- shortest-path-based active learning (Dasarathy, et al. "S2: an efficient graph based active learning algorithm with application to nonparametric classification." COLT 2015)

Overview:

- chapter 1 of "A modern introduction to online learning" by Francesco Orabona, 2020.
- introduction to online learning (iterative learning / streaming settings): Nicolò Cesa-Bianchi - Mediterranean Machine Learning school 2021 (youtube-link)

Papers and projects:

- weighted majority and Littlestone dimension (Littlestone and Warmuth. "The weighted majority algorithm." Information and computation 1994 **and** Littlestone. "Learning quickly when irrelevant attributes abound: A new linear-threshold algorithm." Machine Learning 1988)
- online (sub-)gradient descent (chapters 2-4 of "A modern introduction to online learning", Francesco Orabona, 2020)
- bandits and expert advice (introduction and chapters 1, 5, and 6 of "Introduction to multi-armed bandits", Aleksandrs Slivkins, 2019)
- (online) learning with partial orders (Gärtner and Garriga. "The cost of learning directed cuts." ECML 2007 **and** Missura and Gärtner. "Predicting dynamic difficulty." NIPS 2011)

Overview:

- chapters 1 and 2 of "Dimension reduction: a guided tour" by Christopher Burges, 2010, **and** chapter 22 (the introduction section before 22.1 and section 22.5) of "Understanding machine learning".
- introduction and theoretical overview on clustering: Shai Ben-David - Cheriton Symposium 2017 (youtube-link)
- introduction and overview on probabilistic dimensionality reduction: Neil Lawrence - MLSS 2012 (youtube-link)

Papers and projects:

- kernel PCA and multidimensional scaling (Schölkopf, et al. "Kernel principal component analysis." ICANN 1997 **and** Williams. "On a connection between kernel PCA and metric multidimensional scaling." Machine learning 2002)
- spectral clustering (Von Luxburg. "A tutorial on spectral clustering." Statistics and computing 2007)
- (adaptive) correlation clustering (Bansal, et al. "Correlation clustering." Machine learning 2004 **and** Bressan, Marco, et al. "Correlation clustering with adaptive similarity queries." NeurIPS 2019)
- (approximate) k-means++ (Arthur and Vassilvitskii. "k-means++: The advantages of careful seeding." Stanford, 2006 **and** Bachem, Olivier, et al. "Approximate k-means++ in sublinear time." AAAI 2016)
- clustering under approximation stability (Balcan, et al. "Clustering under approximation stability." Journal of the ACM 2013)
- auto-encoders and generative adversarial nets (Kingma and Welling. "Auto-encoding variational Bayes." ICLR 2014 **and** Goodfellow, et al. "Generative adversarial nets." NIPS 2014 **and** Tolstikhin, et al. "Wasserstein auto-encoders." ICLR 2018)

Overview:

- Olivier Bousquet, Stéphane Boucheron, and Gábor Lugosi: "Introduction to Statistical Learning Theory" 2003.
- Chapters 1-6 of "Understanding machine learning"
- "Extending Generalization Theory Towards Addressing Modern Challenges in ML" by Shay Moran, talk at the HUJI ML Club, 2021 (youtube-link)
- (Basic material) Statistical Machine Learning by Ulrike von Luxburg (we recommend parts 38-41) (youtube playlist)

Papers and projects:

- partial concept classes (Alon, et al., "A theory of PAC learnability of partial concept classes", unpublished arXiv:2107.08444)
- tight bounds (Bousquet, et al., "Proper learning, Helly number, and an optimal SVM bound" COLT 2020)
- universal learning (Bousquet, et al., "A theory of universal learning" STOC 2021)
- sample compression schemes (Moran, et al., "Sample compression schemes for VC classes" Journal of the ACM 2016).
- generalization bounds for deep neural networks (G.K. Dziugaite, D.M. Roy, "Computing Nonvacuous Generalization Bounds for Deep (Stochastic) Neural Networks with Many More Parameters than Training Data", 2017)

Overview:

- Došilović, Filip Karlo, Mario Brčić, and Nikica Hlupić. "Explainable artificial intelligence: A survey." MIPRO 2018
- Samek, Wojciech, and Klaus-Robert Müller. "Towards explainable artificial intelligence." Explainable AI: interpreting, explaining and visualizing deep learning. Springer, Cham, 2019

Papers and projects:

- interpreting model predictions with SHAP and LIME (Ribeiro, Marco Tulio, Sameer Singh, and Carlos Guestrin. "'Why should I trust you?' Explaining the predictions of any classifier." ACM SIGKDD 2016 **and** Lundberg, Scott M., and Su-In Lee. "A unified approach to interpreting model predictions." NIPS 2017)
- nonlinear classifiers (Montavon, Grégoire, et al. "Explaining nonlinear classification decisions with deep Taylor decomposition." Pattern recognition, 2017)

Overview:

- Dwork, Cynthia. "Differential privacy: A survey of results." International conference on theory and applications of models of computation. Springer, 2008
- Chapter 2 of: Dwork, Cynthia, and Aaron Roth. "The algorithmic foundations of differential privacy." Found. Trends Theor. Comput. Sci. 9.3-4 2014

Papers and projects:

- differential privacy and deep learning (Chen, Xiangyi, Steven Z. Wu, and Mingyi Hong. "Understanding gradient clipping in private SGD: A geometric perspective." NeurIPS 2020)
- (extensions of) Gaussian mechanism (Balle, Borja, and Yu-Xiang Wang. "Improving the Gaussian mechanism for differential privacy: Analytical calibration and optimal denoising." International Conference on Machine Learning. PMLR, 2018)

Overview:

- Spike-timing dependent plasticity (link)

Papers and projects:

- Spiking neural networks:
  - B. Confavreux, F. Zenke, E.J. Agnes, T. Lillicrap, T.P. Vogels. "A meta-learning approach to (re)discover plasticity rules that carve a desired function into a neural network", 2020
  - F. Zenke, S. Ganguli. "Superspike: Supervised learning in multilayer spiking neural networks", 2018
- Feedback alignment:
  - M. Refinetti et al. "Align, then memorise: the dynamics of learning with feedback alignment", 2021
  - J.M. Murray. "Local online learning in recurrent networks with random feedback", 2019

Overview:

- A. Globerson: How SGD Can Succeed Despite Non-Convexity and Over-Parameterization (slides)

Papers and projects:

- A. Brutzkus et al: "SGD Learns Over-parameterized Networks that Provably Generalize on Linearly Separable Data", 2017
- Choose one or more papers listed on page 14 in the above mentioned slides :)
- A. Shevchenko, M. Mondelli. "Landscape Connectivity and Dropout Stability of SGD Solutions for Over-parameterized Neural Networks", 2020
- H. Petzka et al. "Relative Flatness and Generalization", 2021