I’m guessing that most people are pretty comfortable with the concept of uncorrelated Gaussian noise. It’s the most frequently assumed noise. Even if you don’t realise it, you’re probably assuming Gaussian noise.
Quick check: Are you using a chi-squared test to fit your data? Yes? Well there you go.
Co-variate Gaussian Noise
Here I’m going to talk about multi-variate, or co-variate, Gaussian noise. Co-variate Gaussian noise is the situation where the value of one data point affects the value of another. This kind of co-variance, i.e. variance between things, is usually expressed as a covariance matrix.
If you have N data points, then your covariance matrix will have a size: N x N. The matrix is normally denoted K (or sometimes ) .
Uncorrelated, or independent, Gaussian noise is a special case of the covariance matrix where only the diagonal elements have a non-zero value, i.e.
The value of each diagonal element corresponds to the variance of a particular data point, e.g. a data point at position with a mean value would have a variance ; a data point at position with a mean value would have a variance , and so on and so forth.
The actual y-value at each x-position will be drawn from a probability distribution
which in the case of a diagonal covariance matrix reduces to
However, if we start to add in non-zero values to the other elements of the covariance matrix then this will no longer be the case and the y-value at one position will affect the y-value at another.
This is a very brief explanation of covariate Gaussian noise. For a better and more detailed description, I like these references:
- Gaussian Processes for Machine Learning, Carl Edward Rasmussen and Chris Williams, the MIT Press
- Gaussian processes for time-series modelling, S. Roberts, M. Osborne, M. Ebden, S. Reece, N. Gibson and S. Aigrain, Phil. Trans. R. Soc. A 2013 371, 20110550
The second of these has a particularly nice figure showing the effect of covariate Gaussian noise. It looks like this:
Here I’m going to explain how to recreate this figure using Python.
Covariate Gaussian Noise in Python
To simulate the effect of co-variate Gaussian noise in Python we can use the numpy library function
Note: the Normal distribution and the Gaussian distribution are the same thing.
First off, let’s load some libraries:
import numpy as np # the numpy library import pylab as pl # the matplotlib for plotting
then we can call the
# make an array of positions # these are evenly spaced, but they don’t have to be x = np.arange(0, 20.,0.01) # use numpy to draw a sample from the multi-variate # normal distribution, N(0,K), at all the positions in x y = np.random.multivariate_normal(np.zeros(len(x)),K)
multivariate_normal function takes two arguments: (1) an array of noiseless mean values for each of the x-positions, and (2) a covariance matrix for all the x-positions.
In this case, I’ve made all of the mean values equal to zero.
At the moment we haven’t specified K, so these lines of code won’t work just yet. To create K we need to build a matrix of values that are calculated by the function little k. This function is known as the covariance kernel and it defines how much of an affect one data value has on another.
The Covariance Kernel
To start with we are going to define a squared-exponential covariance kernel. This has the form:
where is the x-position of one data point and is the x-position of another. The value of the kernel is a function of how far away from each other these data points are, i.e. . It also has a couple of hyper-parameters that govern the overall shape of the kernel: and . These are referred to as hyper-parameters because they don’t really have any direct physical meaning so they’re not bog-standard normal model parameters. Here the parameter controls the normalisation of the kernel and controls the width of the kernel.
def cov_kernel(x1,x2,h,lam): """ Squared-Exponential covariance kernel """ k12 = h**2*np.exp(-1.*(x1 - x2)**2/lam**2) return k12
We can then use this kernel function to fill our covariance matrix:
def make_K(x, h, lam): """ Make covariance matrix from covariance kernel """ # for a data array of length x, make a covariance matrix x*x: K = np.zeros((len(x),len(x))) for i in range(0,len(x)): for j in range(0,len(x)): # calculate value of K for each separation: K[i,j] = cov_kernel(x[i],x[j],h,lam) return K
and we can then update our earlier call to the numpy
# make an array of 200 evenly spaced positions between 0 and 20: x = np.arange(0, 20.,0.01) # make a covariance matrix: K = make_K(x,h,lam) # draw samples from a co-variate Gaussian # distribution, N(0,K), at positions x1: y = np.random.multivariate_normal(np.zeros(len(x)),K)
Putting It All Together
We can expand this to look at what happens when we vary the hyper-parameters of the covariance kernel, in particular the width, :
# make an array of 200 evenly spaced positions between 0 and 20: x = np.arange(0, 20.,0.01) for i in range(0,3): h = 1.0 if (i==0): lam = 0.1 if (i==1): lam = 1.0 if (i==2): lam = 5.0 # make a covariance matrix: K = make_K(x,h,lam) # five realisations: for j in range(0,5): # draw samples from a co-variate Gaussian distribution, N(0,K): y = np.random.multivariate_normal(np.zeros(len(x)),K) tmp2 = '23'+str(i+3+1) pl.subplot(int(tmp2)) pl.plot(x,y) tmp1 = '23'+str(i+1) pl.subplot(int(tmp1)) pl.imshow(K) pl.title(r"$\lambda = $"+str(lam)) pl.show()
The top row of plots is showing the covariance matrix for each of the three different kernel widths. The bottom row is showing five realisations of data y-values that correspond to each of the covariance matrices. Remember that all of these realisations have zero mean so the only contribution to the y-values is (covariate) Gaussian noise.
The narrower the covariance kernel, the more diagonal the covariance matrix becomes and the more like independent noise the y-values appear.