Bachelor Seminar Wissenschaftliches Arbeiten

General information

introductory slides: (pdf)
TISS: (link)
contact: Maximilian Thiessen
meeting link: https://gotomeet.me/maximilianthiessen

Timeline

Date	Deadline
14.10. 17:00	first meeting (GoToMeeting)
29.10.	spotlight and abstract
11.11.	bidding
02.12. 16:00	progress presentation (GoToMeeting)
08.12.	draft report
15.12.	reviewing your peers
20.01. 16:00	final presentation (GoToMeeting)
28.01.	final report

Format

This seminar simulates a machine learning conference, where the students take on the role of authors and reviewers. It consists of multiple phases.

1. Proposal phase

Option 1: our suggestions

You select two projects/papers (i.e. two bullet points) from one of the topics below. You will work with the material mentioned in the overview and the project-specific resources.

Option 2: your own projects

You propose two different project ideas of your own. Each can be an existing machine learning paper or piece of work, or your own creative idea in the context of machine learning. Importantly, each idea has to be specific and well developed.

Regardless of which option you choose, understand the fundamentals of your projects and try to answer the following questions:

What is the problem?
Why is it an interesting problem?
How do you plan to approach the problem? / How have the authors of your project approached the problem?

Send an email to Maximilian Thiessen with the subject “Seminar on Theoretical Aspects of Machine Learning Algorithms (proposal)”, containing your name, the two selected projects, and a short description of each project together with initial answers to the questions (~3 sentences should be sufficient).

We can only accept your own proposals if you can answer the questions above and have a well-developed project idea.

Attend the mandatory first meeting on 14.10. at 16:00 (https://gotomeet.me/maximilianthiessen). There you will have a chance to introduce yourself and pitch your projects. We will give preference to students who can already present some details of their projects.

By 29.10. (AoE), record a short (~30 seconds) spotlight talk for each of your two topics and upload it to TUWEL. Also write an abstract for each topic and upload both to easychair.org.

2. Bidding and assignment phase

You will also act as a reviewer and bid on the projects of your peers that you want to review. Based on the bids, we (in our role as chairs of the conference) will select one of each student’s proposals as the actual project to work on for the rest of the semester. You no longer need to work on the other project. Additionally, we will assign you two projects from other students, which you will have to review later in the semester.

3. Working phase

Now the actual work starts. Develop a deep understanding of your project, write a first draft of your report, and give a 5-minute presentation. Feel free to go beyond the given material.

4. Reviewing phase

You will again act as a reviewer for the conference by writing two reviews, one for each draft report assigned to you.

5. Writing phase

Based on the reviews from your peers (and our feedback), you will continue to work on your project and revise your report.

6. Submission phase

Give a final presentation and submit your report.

General resources (freely available books)

Understanding machine learning: from theory to algorithms. Shai Shalev-Shwartz and Shai Ben-David (pdf)
Foundations of machine learning. Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar (pdf)
Foundations of data science. Avrim Blum, John Hopcroft, and Ravindran Kannan (pdf)
Mathematics for machine learning. Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong (pdf)
Mining of massive datasets. Jure Leskovec, Anand Rajaraman, and Jeffrey D. Ullman (pdf)
Reinforcement learning: an introduction. Richard Sutton and Andrew Barto (pdf)

Topics

You should have access to the literature and papers through Google Scholar, DBLP, the provided links, or the TU library.

Kernels

Overview:
preface and introduction up to section 1.5 of “Learning with kernels” by Bernhard Schölkopf and Alex Smola, 2002 (pdf).

Papers and projects:

support vector machines (Bennett and Campbell. “Support vector machines: hype or hallelujah?.” ACM SIGKDD 2000)
one class support vector machine (Khan and Madden. “A survey of recent trends in one class classification.” Irish conference on artificial intelligence and cognitive science 2009)
string kernels (Lodhi, et al. “Text classification using string kernels.” Journal of machine learning research 2002)
kernels for distances (Schölkopf. “The kernel trick for distances.” NIPS 2001)

Online learning

Overview:
chapter 1 of “A modern introduction to online learning” by Francesco Orabona, 2020.

Papers and projects:

online (sub-)gradient descent (chapter 2 of “A modern introduction to online learning”, Francesco Orabona, 2020)
stochastic bandits (introduction and chapter 1 of “Introduction to multi-armed bandits”, Aleksandrs Slivkins, 2019)
online learning with expert advice (introduction and chapter 5 of “Introduction to multi-armed bandits”, Aleksandrs Slivkins, 2019)
adversarial bandits (introduction and chapter 6 of “Introduction to multi-armed bandits”, Aleksandrs Slivkins, 2019)
learning directed cuts (Gärtner and Garriga. “The cost of learning directed cuts.” ECML 2007)
predicting dynamic difficulty (Missura and Gärtner. “Predicting dynamic difficulty.” NIPS 2011)

Clustering

Overview:
chapter 7.1 of “Mining of massive datasets” (MMD) and chapter 22 (the introduction section before 22.1 and section 22.5) of “Understanding machine learning”.

Papers and projects:

k-means (chapter 7.3 of the MMD book)
hierarchical clustering (chapter 7.2 of the MMD book)

Dimensionality reduction

Overview:
chapters 1 and 2 of “Dimension reduction: a guided tour” by Christopher Burges, 2010.

Papers and projects:

principal component analysis (PCA) and singular value decomposition (SVD) (chapter 3 of the “Foundations of data science” book)
random projections (chapter 23.2 of “Understanding machine learning” and Dasgupta. “Experiments with random projection.” UAI 2000)

Graph kernels and graph neural networks

Overview:
chapter 1/introduction of “Graph representation learning” (GRL) by William L. Hamilton, 2020 (pdf).

Papers and projects:

graph kernels for protein prediction (Borgwardt, et al. “Protein function prediction via graph kernels.” Bioinformatics 2005)
graphlet kernels (Shervashidze, et al. “Efficient graphlet kernels for large graph comparison.” Artificial intelligence and statistics 2009)
hardness and walk-based kernels (Gärtner, et al. “On graph kernels: hardness results and efficient alternatives.” Learning theory and kernel machines 2003)
cyclic pattern kernel (Horváth, et al. “Cyclic pattern kernels for predictive graph mining.” ACM SIGKDD 2004)
graph embeddings with node2vec (Grover and Leskovec. “node2vec: scalable feature learning for networks.” ACM SIGKDD 2016)
neural message passing and graph convolutions (chapters 5.1 to 5.2.1 of the GRL book)

Reinforcement learning

Overview:
chapter 1/introduction of “Reinforcement learning: an introduction”

Papers and projects:

multi-view reinforcement learning (Li, et al. “Multi-view reinforcement learning.” NeurIPS 2019)
trust-region policy optimization (Kurutach, et al. “Model-ensemble trust-region policy optimization.” ICLR 2018)
game theory and reinforcement learning (Lanctot, et al. “A unified game-theoretic approach to multiagent reinforcement learning.” NIPS 2017)

Semi-supervised learning

Overview:
chapter 1/introduction of “Semi-supervised learning” by Olivier Chapelle, Bernhard Schölkopf, and Alexander Zien, 2006 (pdf).

Papers and projects:

graph cuts (Blum and Chawla. “Learning from labeled and unlabeled data using graph mincuts.” ICML 2001)
label propagation (Zhu, et al. “Semi-supervised learning using Gaussian fields and harmonic functions.” ICML 2003)
learning with local and global consistency (Zhou, et al. “Learning with local and global consistency.” NIPS 2004)
semi-supervised learning by entropy minimization (Grandvalet and Bengio. “Semi-supervised learning by entropy minimization.” NIPS 2005)

Active learning

Overview:
chapter 1 “Automating inquiry” of Burr Settles’ “Active learning” (AL) book, 2012.

Papers and projects:

uncertainty sampling (chapter 2 of AL)
searching through the hypothesis space (chapter 3 of AL)
minimizing expected error and variance (chapter 4 of AL)
exploiting structure in data (chapter 5 of AL)

Social networks

Overview:
chapter 10.1 “Social networks as graphs” of the book “Mining of massive datasets” (MMD).

Papers and projects:

community detection (chapter 10.5 of the MMD book)
discovering social circles (Leskovec, et al. “Learning to discover social circles in ego networks.” NIPS 2012)
link prediction (Backstrom and Leskovec. “Supervised random walks: predicting and recommending links in social networks.” ACM conference on web search and data mining 2011)
graphs over time (Leskovec, et al. “Graphs over time: densification laws, shrinking diameters and possible explanations.” ACM SIGKDD 2005)