{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Understanding Kernels in Gaussian Process Regression\n", "> Using GPy and some interactive visualisations to understand GPR and apply it to a real-world data set\n", "\n", "- toc: true \n", "- badges: true\n", "- comments: true\n", "- author: Nipun Batra\n", "- categories: [ML]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Disclaimer\n", "\n", "This blog post is forked from [GPSS 2019](http://gpss.cc/gpss19/) [Lab 1](https://nbviewer.jupyter.org/github/gpschool/labs/blob/2019/2019/.answers/lab_1.ipynb). It is produced only for educational purposes. All credit goes to the GPSS organisers. " ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# Support for maths\n", "import numpy as np\n", "# Plotting tools\n", "from matplotlib import pyplot as plt\n", "# We use the following to plot figures inline in Jupyter\n", "%matplotlib inline\n", "\n", "import warnings\n", "warnings.filterwarnings('ignore')\n", "\n", "# GPy: Gaussian processes library\n", "import GPy\n", "from IPython.display import display\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Covariance functions, aka kernels\n", "\n", "We will define a covariance function, from here on referred to as a kernel, using `GPy`. The most commonly used kernel in machine learning is the Gaussian-form radial basis function (RBF) kernel. 
It is also commonly referred to as the exponentiated quadratic or squared exponential kernel; all of these names refer to the same kernel.\n", "\n", "The (1-dimensional) RBF kernel has a Gaussian form, defined as:\n", "\n", "$$\n", " \\kappa_\\mathrm{rbf}(x,x') = \\sigma^2\\exp\\left(-\\frac{(x-x')^2}{2\\mathscr{l}^2}\\right)\n", "$$\n", "\n", "It has two parameters: the variance, $\\sigma^2$, and the lengthscale, $\\mathscr{l}$.\n", "\n", "In GPy, we define a kernel by passing the input dimension as the first argument; in the simplest case, `input_dim=1` for 1-dimensional regression. We can also explicitly define the parameters, but for now we will use the default values:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "
rbf. | value | constraints | priors |
---|---|---|---|
variance | 4.0 | +ve | |
lengthscale | 0.5 | +ve |