Graph-based active learning

Task

Implement and design an efficient (python) framework to compare graph-based active learning algorithms.

We recommend using efficient graph libraries like graph-tool (https://graph-tool.skewed.de/).

The goal is to have a framework to compare different active learning algorithms more easily. Adding new active learning algorithms, new datasets, or new evaluation strategies should be as simple as possible. Data I/O, the active learning algorithms, the prediction algorithms, and the evaluations should be encapsulated and exchangeable (think of scikit-learn).

Start by reimplementing existing active learning methods:

Additionally, implementations of the following methods are of great interest for us (but are most probably out of scope for the 3 ECTS of an “algorithms and applications project” PR and rather suited for a bachelor or master thesis)

One simple (theoretically motivated) way of comparing active learning algorithms is desribed in (section 6. Experiments)

To empirically evaluate the predictive performance (e.g., the accuracy) of the active learning algorithms, you will also need prediction methods. We recommend using label propagation as described in

It is also described more generally in:

You do not have to implement these methods by yourself. Feel free to use scikit’s label propagation and/or label spreading https://scikit-learn.org/stable/modules/semi_supervised.html#label-propagation

It might also be interesting to compare it to cut-based approaches like

Regarding datasets, you could start with smaller synthetic graphs (e.g., sampled random graphs). There are also many graph datasets freely available online, e.g., snap http://snap.stanford.edu/data/index.html

Context

Even though implementations of graph-based active learning algorithms are often available, it is tedious to evaluate and compare them on real-world graphs. Additionally, most existing active learning libraries like modAL, alipy and libact focus on heuristic (e.g., uncertainty based) methods, whereas we are interested in graph-based methods with theoretical guarantees. Because of that, our goal is to design an easy-to-use python framework that enables efficient evaluations and comparisons of graph-based methods.

In the long run, it might be possible to integrate these implementations in popular machine learning libraries like scikit-learn (https://scikit-learn.org) or vowpal wabbit (https://vowpalwabbit.org/).