Bo Kang (Ghent University) Subjectively Interesting Data Representations (9/11/2021, 2pm)
Stefano Teso (University of Trento) Debugging Machine Learning Models with Explanations (16/11/2021, 2pm)
A central tenet of explainable AI is that the bugs and biases affecting a model can be uncovered by computing and analyzing explanations for the model’s predictions. Crucially, however, techniques for explaining machine learning models do not enable experts to correct the bugs that they expose. In this talk, I will give an overview of recent work on debugging machine learning models that approaches the problem by supplying corrective supervision on the model’s explanations. In particular, I will discuss approaches based on local attribute-based explanations, global explanations, and example-based explanations. Moreover, I will illustrate how these techniques can be generalized to concept-based models by mixing attribute- and concept-level supervision. I will conclude by outlining some important open issues in this flourishing research topic.
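As a minimal sketch of what "corrective supervision on explanations" can look like in practice (not the speaker's exact method), one can add a "right for the right reasons"-style penalty that drives input-gradient attributions to zero on features an expert has marked as irrelevant. PyTorch, the model, and the irrelevant_mask below are illustrative assumptions.

    # Sketch: penalize attributions on features the expert says must not be used.
    import torch

    def explanation_corrected_loss(model, x, y, irrelevant_mask, lam=1.0):
        # irrelevant_mask: 1.0 where the expert marked a feature as irrelevant
        x = x.clone().requires_grad_(True)
        logits = model(x)
        task_loss = torch.nn.functional.cross_entropy(logits, y)
        # input-gradient explanation of the (summed) log-probabilities
        grads = torch.autograd.grad(
            torch.log_softmax(logits, dim=1).sum(), x, create_graph=True)[0]
        explanation_penalty = ((irrelevant_mask * grads) ** 2).sum()
        return task_loss + lam * explanation_penalty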
Marco Bressan (University of Milan) Exact recovery of clusters in metric spaces: margins and convexities (24/9/2021, 1pm)
Sebastian Mair (Leuphana Universität Lüneburg) Computing Efficient Data Summaries (29/6/2021, 2pm)
Katrin Ullrich (Fraunhofer IWU) Binding Affinity Prediction - Multi-View Regression in Three Different Learning Scenarios (8/6/2021, 2pm)
Pascal Welke (University of Bonn) Efficient Graph Similarity Learning (27/4/2021, 2pm)
Mario Boley (Monash) Better Short Than Greedy: Interpretable Models Through Optimal Rule Boosting (20/4/2021, 11am)
Rule ensembles are designed to provide a useful trade-off between predictive accuracy and model interpretability. However, the myopic and random search components of current rule ensemble methods can compromise this goal: they often need more rules than necessary to reach a given accuracy level, or can even outright fail to accurately model a distribution that can actually be described well with a few rules. Here, we present a novel approach that fits rule ensembles of maximal predictive power for a given ensemble size (and thus model comprehensibility). In particular, we present an efficient branch-and-bound algorithm that optimally solves the per-rule objective function of the popular second-order gradient boosting framework. Our main insight is that the boosting objective can be tightly bounded in time linear in the number of covered data points. Along with an additional novel pruning technique related to rule redundancy, this leads to a computationally feasible approach for boosting optimal rules that, as we demonstrate on a wide range of common benchmark problems, consistently outperforms greedy rule boosting in predictive performance.
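For orientation, a minimal sketch of the per-rule objective the abstract refers to, assuming the standard second-order boosting statistics: for a rule covering a set of data points S, the gain is (sum of gradients over S)^2 / (sum of Hessians over S + lambda), which can be evaluated in time linear in the number of covered points (exact constants may differ from the paper).

    # Sketch of the standard second-order boosting gain for one candidate rule.
    import numpy as np

    def rule_gain(grad, hess, covered, lam=1.0):
        g = grad[covered].sum()
        h = hess[covered].sum()
        return g * g / (h + lam)

    # Toy usage: squared loss gives grad = prediction - target, hess = 1.
    y = np.array([1.0, 1.2, 5.0, 5.1])
    pred = np.zeros_like(y)                  # current ensemble prediction
    grad, hess = pred - y, np.ones_like(y)
    print(rule_gain(grad, hess, covered=np.array([2, 3])))  # rule covering the large targets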
Antoine Ledent (TU Kaiserslautern) Orthogonal Inductive Matrix Completion (13/4/2021, 2pm)
In this talk I will go over our recent method, OMIC, an interpretable approach to inductive matrix completion based on a sum of multiple orthonormal side information terms, together with nuclear-norm regularization. The approach allows us to inject prior knowledge about the eigenvectors of the ground-truth matrix, and it is optimized by a provably convergent algorithm that optimizes all components of the model simultaneously. I will go over the most relevant special cases, which apply when one wishes to include user/item biases or when community side information is available. Time permitting, I will finish by presenting an optimized implementation of the algorithm for these cases, with computational complexity comparable to that of SoftImpute.
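For context on the complexity comparison, here is a minimal sketch of one SoftImpute-style iteration, i.e. the nuclear-norm baseline mentioned at the end, not the OMIC algorithm itself: fill the missing entries with the current estimate, then soft-threshold the singular values.

    # Sketch of a single SoftImpute-style update (nuclear-norm proximal step).
    import numpy as np

    def soft_impute_step(X, mask, Z, lam):
        # X: data with zeros at missing entries; mask: 1 = observed; Z: current estimate
        filled = mask * X + (1 - mask) * Z
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        s = np.maximum(s - lam, 0.0)         # soft-threshold the singular values
        return (U * s) @ Vt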
Magda Gregorova (University of Applied Sciences-Western Switzerland, Geneva) Learned transform compression with optimized entropy encoding (30/3/2021, 2pm)
We consider the problem of learned transform compression, where we learn both the transform and the probability distribution over the discrete codes. We utilize a soft relaxation of the quantization operation to allow for back-propagation of gradients and employ vector (rather than scalar) quantization of the latent codes. Furthermore, we apply a similar relaxation in the code probability assignments, enabling direct optimization of the code entropy. To the best of our knowledge, this approach is completely novel. We conduct a set of proof-of-concept experiments confirming the potency of our approach.
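A minimal sketch of the kind of soft relaxation of vector quantization described (assumed details, not the authors' exact formulation): a softmax over negative distances to the codebook yields differentiable soft assignments, and the entropy of the average assignment probabilities gives a directly optimizable proxy for the code entropy. PyTorch is assumed.

    # Sketch: soft vector quantization with a differentiable entropy term.
    import torch

    def soft_vector_quantize(z, codebook, temperature=1.0):
        # z: (batch, d) latent vectors; codebook: (K, d) code vectors
        dists = torch.cdist(z, codebook)                     # (batch, K) distances
        assign = torch.softmax(-dists / temperature, dim=1)  # soft assignments
        z_soft = assign @ codebook                           # differentiable "quantized" latents
        probs = assign.mean(dim=0)                           # empirical code distribution
        entropy = -(probs * torch.log(probs + 1e-9)).sum()   # rate proxy to minimize
        return z_soft, entropy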
Gavin Smith (University of Nottingham) ``Model Class Reliance for Random Forests’’ (16/3/2021, 2pm)
Variable Importance (VI) has traditionally been cast as the process of estimating each variable’s contribution to a predictive model’s overall performance. Analysis of a single model instance, however, guarantees no insight into a variable’s relevance to the underlying generative processes. Recent research has sought to address this concern via analysis of Rashomon sets - sets of alternative model instances that exhibit predictive performance equivalent to some reference model but take different functional forms. Measures such as Model Class Reliance (MCR), computed over Rashomon sets, have been proposed in order to ascertain how much a variable must be relied on to make robust predictions, or whether alternatives exist. If the MCR range is tight, we have no choice but to use the variable; if the range is wide, there exist competing, perhaps fairer, models that provide alternative explanations of the phenomenon being examined. Applications are wide-ranging, from enabling the construction of ‘fairer’ models in areas such as recidivism to health analytics and ethical marketing. Tractable estimation of MCR for non-linear models is currently restricted to Kernel Regression under squared loss [7]. In this paper we introduce a new technique that extends the computation of Model Class Reliance (MCR) to Random Forest classifiers and regressors. The proposed approach addresses a number of open research questions and, in contrast to prior Kernel SVM MCR estimation, runs in linearithmic rather than polynomial time. Taking a fundamentally different approach to previous work, we provide a solution for this important model class, identifying situations where irrelevant covariates do not improve predictions.
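A minimal sketch of the MCR idea itself (not the paper's linearithmic algorithm for Random Forests): approximate the Rashomon set by models whose loss is within some tolerance of the best, and report the minimum and maximum permutation reliance on a given variable across that set. The models list, eps, and the squared-loss choice below are illustrative assumptions.

    # Sketch: (MCR-, MCR+) range of permutation reliance over an approximate Rashomon set.
    import numpy as np

    def permutation_reliance(model, X, y, j, rng):
        base = np.mean((model.predict(X) - y) ** 2)
        Xp = X.copy()
        Xp[:, j] = rng.permutation(Xp[:, j])      # break the variable's association
        return np.mean((model.predict(Xp) - y) ** 2) / base

    def mcr_range(models, X, y, j, eps=0.05):
        rng = np.random.default_rng(0)
        losses = [np.mean((m.predict(X) - y) ** 2) for m in models]
        best = min(losses)
        rashomon = [m for m, l in zip(models, losses) if l <= best * (1 + eps)]
        reliances = [permutation_reliance(m, X, y, j, rng) for m in rashomon]
        return min(reliances), max(reliances)     # (MCR-, MCR+)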
Daniel Paurat (Telekom) ``Machine Learning @ Telekom’’ (2/3/2021, 2pm)
Dino Oglic (King’s College London) on ``Parznets – Deep CNNs for Waveform-based Speech Recognition’’ (10/11/2020, 2pm)
We investigate the potential of stochastic neural networks for learning effective waveform-based acoustic models. The waveform-based setting, inherent to fully end-to-end speech recognition systems, is motivated by several comparative studies of automatic and human speech recognition that associate standard non-adaptive feature extraction techniques with information loss which can adversely affect robustness. Stochastic neural networks, on the other hand, are a class of models capable of incorporating rich regularization mechanisms into the learning process. We consider a deep convolutional neural network that first decomposes speech into frequency sub-bands via an adaptive parametric convolutional block whose filters are specified by cosine modulations of compactly supported windows. The network then employs standard non-parametric 1D convolutions to extract relevant spectro-temporal patterns while gradually compressing the structured high-dimensional representation generated by the parametric block. We rely on a probabilistic parametrization of the proposed neural architecture and learn the model using stochastic variational inference. This requires evaluation of an analytically intractable integral defining the Kullback-Leibler divergence term responsible for regularization, for which we propose an effective approximation based on Gauss-Hermite quadrature. Our empirical results demonstrate superior performance of the proposed approach over comparable waveform-based baselines and indicate that it could improve robustness. Moreover, the approach outperforms a recently proposed deep convolutional neural network for learning robust acoustic models with standard FBANK features.
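A minimal sketch of the Gauss-Hermite quadrature idea mentioned in the abstract: an analytically intractable expectation of f under a Gaussian N(mu, sigma^2) is approximated as (1/sqrt(pi)) * sum_i w_i f(mu + sqrt(2)*sigma*x_i), with (x_i, w_i) the Gauss-Hermite nodes and weights. The integrand below is a placeholder, not the actual KL term from the talk.

    # Sketch: Gauss-Hermite approximation of E_{x ~ N(mu, sigma^2)}[f(x)].
    import numpy as np

    def gauss_hermite_expectation(f, mu, sigma, n_points=20):
        nodes, weights = np.polynomial.hermite.hermgauss(n_points)
        return (weights * f(mu + np.sqrt(2.0) * sigma * nodes)).sum() / np.sqrt(np.pi)

    # Sanity check: E[x^2] for N(1, 0.5^2) is mu^2 + sigma^2 = 1.25.
    print(gauss_hermite_expectation(lambda x: x ** 2, mu=1.0, sigma=0.5))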
Linara Adilova (Fraunhofer IAIS) (27/10/2020, 2pm)
Florian Seiffarth (University of Bonn) ``Learning with Closure Spaces’’ (13/10/2020)
Fabio Vitale (Inria Lille – Nord Europe and University of Lille) on ``Fast Clustering through Pairwise Similarity Information’’ (29/9/2020, 2pm)
Michael Kamp (Monash University, Melbourne) on ``Black-Box Machine Learning’’ (1/9/2020, 1pm)