Contributions to approximate Bayesian inference for machine learning

Author:
  1. Rodriguez Santana, Simon
Supervisors:
  1. Daniel Hernández Lobato
  2. David Gómez-Ullate Oteiza

Defense university: Universidad Complutense de Madrid

Date of defense: 18 January 2022

Committee:
  1. José Ignacio Hidalgo Pérez (Chair)
  2. Juan Tinguaro Rodríguez González (Secretary)
  3. Pablo Martínez Olmos (Member)
  4. Ángela Fernández Pascual (Member)
  5. Maria Isabel Valera Martinez (Member)
Department:
  1. Física Teórica

Type: Thesis

Abstract

Machine learning (ML) methods can learn from data and then be used to make predictions on new data instances. However, some of the most popular ML methods cannot provide information about the uncertainty of their predictions, which may be crucial in many applications. The Bayesian framework for ML introduces a natural approach to formulating many ML methods, and it also has the advantage of easily incorporating and reflecting different sources of uncertainty in the final predictive distribution. These sources include uncertainty related to, for example, the data, the chosen model, and its parameters. Moreover, they can be automatically balanced and aggregated using information from the observed data. Nevertheless, in spite of this advantage, exact Bayesian inference is intractable in most ML methods, and approximate inference techniques have to be used in practice. In this thesis we propose a collection of methods for approximate inference, with specific applications to some popular approaches in supervised ML.

First, we introduce neural networks (NNs), from their most basic concepts to some of their most popular architectures. Gaussian processes (GPs), a simple but important tool in Bayesian regression, are also reviewed. Sparse GPs are presented as a clever solution to improve GPs' scalability by introducing new parameters: the inducing points. In the second half of the introductory part we also describe Bayesian inference and extend the NN formulation using a Bayesian approach, which results in an NN model capable of outputting a predictive distribution. We will see why Bayesian inference is intractable in most ML approaches, and also describe sampling-based and optimization-based methods for approximate inference. The use of α-divergences is introduced next, leading to a generalization of certain methods for approximate inference. Finally, we extend GPs to implicit processes (IPs), a more general class of stochastic processes that provide a flexible framework from which numerous models can be defined. Although promising, current IP-based ML methods fail to exploit all of their potential due to the limitations of the approximations required in their formulation...
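To make concrete the GP predictive distribution that the abstract refers to, the following is a minimal, self-contained sketch of exact GP regression with a squared-exponential kernel. It is illustrative only, not code from the thesis, and every function name and parameter value in it is an assumption made for the example.

    import numpy as np

    def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
        # Squared-exponential (RBF) kernel between two sets of 1-D inputs.
        sq_dists = (A[:, None] - B[None, :]) ** 2
        return variance * np.exp(-0.5 * sq_dists / lengthscale ** 2)

    def gp_predict(X_train, y_train, X_test, noise=0.1):
        # Exact GP regression: predictive mean and variance at X_test.
        K = rbf_kernel(X_train, X_train) + noise ** 2 * np.eye(len(X_train))
        K_star = rbf_kernel(X_train, X_test)
        # Cholesky factorisation of the N x N Gram matrix: the O(N^3) step.
        L = np.linalg.cholesky(K)
        alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
        mean = K_star.T @ alpha
        v = np.linalg.solve(L, K_star)
        var = np.diag(rbf_kernel(X_test, X_test) - v.T @ v) + noise ** 2
        return mean, var

    # Toy usage: noisy observations of a sine function.
    rng = np.random.default_rng(0)
    X = np.sort(rng.uniform(-3.0, 3.0, size=20))
    y = np.sin(X) + 0.1 * rng.standard_normal(20)
    X_new = np.linspace(-3.0, 3.0, 5)
    mu, var = gp_predict(X, y, X_new)
    print(mu)   # predictive means
    print(var)  # predictive variances: the uncertainty estimate

The cubic cost of the Cholesky factorization over the N training points is precisely the scalability bottleneck that the sparse GPs mentioned in the abstract address, by conditioning the model on M ≪ N inducing points instead of on the full training set.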
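Likewise, the α-divergences the abstract mentions can be written down explicitly. In one common convention (an assumption here, since conventions differ across papers), the α-divergence between distributions p and q is

    D_{\alpha}(p \,\|\, q) = \frac{1}{\alpha(1-\alpha)}
        \left( 1 - \int p(x)^{\alpha} \, q(x)^{1-\alpha} \, dx \right),

which recovers KL(p ‖ q) in the limit α → 1 and KL(q ‖ p) in the limit α → 0. Tuning α therefore interpolates between the divergences targeted by expectation-propagation-style and variational-style approximate inference, which is the sense in which α-divergences generalize both families of methods.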