Bayesian interpretation of kernel regularization
Within Bayesian statistics for machine learning, kernel methods arise from the assumption of an inner product space or similarity structure on inputs. For some such methods, such as support vector machines (SVMs), the original formulation and its regularization were not Bayesian in nature, but it is helpful to understand them from a Bayesian perspective. Because the kernels are not necessarily positive semidefinite, the underlying structure may not be an inner product space but a more general reproducing kernel Hilbert space. In Bayesian probability, kernel methods are a key component of Gaussian processes, where the kernel function is known as the covariance function. Kernel methods have traditionally been used in supervised learning problems where the input space is usually a space of vectors and the output space is a space of scalars. More recently these methods have been extended to problems that deal with multiple outputs, such as in multi-task learning.[1]
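To make the Gaussian-process reading of a kernel concrete, the following is a minimal sketch in which a squared-exponential kernel plays the role of the covariance function of a zero-mean prior over functions. The helper rbf_kernel, the length scale, and the other values are illustrative assumptions, not part of any particular library's API.

```python
import numpy as np

def rbf_kernel(x1, x2, length_scale=1.0):
    # Squared-exponential covariance: k(x, x') = exp(-(x - x')^2 / (2 l^2))
    d2 = (x1[:, None] - x2[None, :]) ** 2
    return np.exp(-d2 / (2.0 * length_scale ** 2))

x = np.linspace(-3.0, 3.0, 50)        # input locations
K = rbf_kernel(x, x)                  # covariance (Gram) matrix of the prior
rng = np.random.default_rng(0)

# Draw three sample paths from the zero-mean GP prior N(0, K);
# the small jitter keeps K numerically positive definite.
samples = rng.multivariate_normal(
    np.zeros(len(x)), K + 1e-10 * np.eye(len(x)), size=3
)
print(samples.shape)                  # (3, 50)
```

Here the kernel fully determines the prior: smoother kernels (larger length scales) yield smoother sample paths.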
A mathematical equivalence between the regularization and the Bayesian points of view is easily proved in cases where the reproducing kernel Hilbert space is finite-dimensional. The infinite-dimensional case raises subtle mathematical issues; here we consider the finite-dimensional case. We begin with a brief review of the main ideas underlying kernel methods for scalar learning and of the concepts of regularization and Gaussian processes, then show how both points of view arrive at essentially equivalent estimators and exhibit the connection that ties them together.
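Before the formal treatment, the equivalence can be previewed numerically. The sketch below uses the common convention in which the empirical risk is averaged over the n training points, so the kernel ridge (regularized) estimator f(x) = k(x)^T (K + nλI)^{-1} y coincides with the Gaussian-process posterior mean computed under noise variance σ² = nλ. The data, kernel, and parameter values are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(x1, x2, length_scale=1.0):
    # Same squared-exponential kernel as in the sketch above
    d2 = (x1[:, None] - x2[None, :]) ** 2
    return np.exp(-d2 / (2.0 * length_scale ** 2))

rng = np.random.default_rng(0)
n = 20
x = rng.uniform(-3.0, 3.0, size=n)               # training inputs
y = np.sin(x) + 0.1 * rng.standard_normal(n)     # noisy scalar targets
x_star = np.linspace(-3.0, 3.0, 5)               # test inputs

K = rbf_kernel(x, x)
k_star = rbf_kernel(x_star, x)
lam = 0.1                                        # regularization parameter

# Regularization view: kernel ridge regression,
#   f(x) = k(x)^T (K + n*lam*I)^{-1} y
f_reg = k_star @ np.linalg.solve(K + n * lam * np.eye(n), y)

# Bayesian view: GP posterior mean with noise variance sigma^2 = n*lam,
#   E[f(x) | y] = k(x)^T (K + sigma^2 * I)^{-1} y
sigma2 = n * lam
f_bayes = k_star @ np.linalg.solve(K + sigma2 * np.eye(n), y)

print(np.allclose(f_reg, f_bayes))               # True: identical estimators
```

The two estimators are the same linear function of the data; only the interpretation of the scalar added to the diagonal differs (a regularization weight in one view, a noise variance in the other).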
[1] Álvarez, Mauricio A.; Rosasco, Lorenzo; Lawrence, Neil D. (June 2011). "Kernels for Vector-Valued Functions: A Review". arXiv:1106.6251 [stat.ML].