Marginal likelihood for Bayesian linear regression#
Authors: Zeel B Patel, Nipun Batra
Bayesian linear regression is defined as below,

$$
\mathbf{y} = X\boldsymbol{\theta} + \boldsymbol{\epsilon}, \qquad \boldsymbol{\epsilon} \sim \mathcal{N}(\mathbf{0}, \sigma_n^2 I), \qquad \boldsymbol{\theta} \sim \mathcal{N}(\mathbf{m}_0, S_0),
$$

so the likelihood is \(p(\mathbf{y}|\boldsymbol{\theta}) = \mathcal{N}(X\boldsymbol{\theta}, \sigma_n^2 I)\) and the prior is \(p(\boldsymbol{\theta}) = \mathcal{N}(\mathbf{m}_0, S_0)\).
For a Gaussian random variable \(\mathbf{z} \sim \mathcal{N}(\boldsymbol{\mu}, \Sigma)\), the affine transformation \(A\mathbf{z} + \mathbf{b}\) is also a Gaussian random variable: \(A\mathbf{z} + \mathbf{b} \sim \mathcal{N}(A\boldsymbol{\mu} + \mathbf{b},\; A\Sigma A^T)\).
The marginal likelihood is \(p(\mathbf{y}) = \int p(\mathbf{y}|\boldsymbol{\theta})\,p(\boldsymbol{\theta})\,d\boldsymbol{\theta}\), so applying this property to \(\mathbf{y} = X\boldsymbol{\theta} + \boldsymbol{\epsilon}\),

$$
p(\mathbf{y}) = \mathcal{N}(X\mathbf{m}_0,\; XS_0X^T + \sigma_n^2 I).
$$
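As a quick sanity check of this result, here is a minimal Monte Carlo sketch (self-contained; the sizes and values below are arbitrary toy choices, and the names are redefined later): sample \(\boldsymbol{\theta}\) and \(\boldsymbol{\epsilon}\), form \(\mathbf{y}\), and compare the empirical moments against the closed form.

import numpy as np

rng = np.random.default_rng(0)
N, D, S = 3, 2, 200_000                  # toy sizes; S Monte Carlo samples
X = rng.normal(size=(N, D))
m0 = rng.normal(size=D)
S0 = np.eye(D)                           # illustrative prior covariance
sigma_n = 0.5

theta = rng.multivariate_normal(m0, S0, size=S)  # theta ~ N(m0, S0)
eps = rng.normal(scale=sigma_n, size=(S, N))     # eps ~ N(0, sigma_n^2 I)
y = theta@X.T + eps                              # rows are samples of y = X theta + eps

print(np.abs(y.mean(axis=0) - X@m0).max())                            # -> small (Monte Carlo error)
print(np.abs(np.cov(y.T) - (X@S0@X.T + sigma_n**2*np.eye(N))).max())  # -> small (Monte Carlo error)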
Multiplication of two Gaussians#
We need both Gaussian pdfs to be over the same variable before we can multiply them, so let us rewrite the likelihood \(p(\mathbf{y}|\boldsymbol{\theta}) = \mathcal{N}(\mathbf{y}|X\boldsymbol{\theta}, \sigma_n^2 I)\) as a Gaussian over \(\boldsymbol{\theta}\). Completing the square in \(\boldsymbol{\theta}\) gives

$$
\mathcal{N}(\mathbf{y}|X\boldsymbol{\theta}, \sigma_n^2 I) \propto \mathcal{N}\!\left(\boldsymbol{\theta}\,\middle|\,(X^TX)^{-1}X^T\mathbf{y},\; \sigma_n^2(X^TX)^{-1}\right),
$$

where the proportionality constant does not depend on \(\boldsymbol{\theta}\).
Now we have both \(p(\mathbf{y}|\boldsymbol{\theta})\) and \(p(\boldsymbol{\theta})\) expressed in terms of \(\boldsymbol{\theta}\), so we can apply the rules from Section 6.5.2 of the MML book, writing our results in its terminology.
We know that, for two Gaussians over the same variable,

$$
\mathcal{N}(\mathbf{x}|\mathbf{a}, A)\,\mathcal{N}(\mathbf{x}|\mathbf{b}, B) = c\,\mathcal{N}(\mathbf{x}|\mathbf{c}, C),
$$

where

$$
C = (A^{-1} + B^{-1})^{-1}, \qquad \mathbf{c} = C(A^{-1}\mathbf{a} + B^{-1}\mathbf{b}), \qquad c = \mathcal{N}(\mathbf{a}\,|\,\mathbf{b}, A + B).
$$
In the Bayesian setting,

$$
p(\boldsymbol{\theta}|\mathbf{y}) = \frac{p(\mathbf{y}|\boldsymbol{\theta})\,p(\boldsymbol{\theta})}{p(\mathbf{y})}.
$$
Let us evaluate the posterior. With \(\mathbf{a} = (X^TX)^{-1}X^T\mathbf{y}\), \(A = \sigma_n^2(X^TX)^{-1}\), \(\mathbf{b} = \mathbf{m}_0\), and \(B = S_0\), the product rule gives \(p(\boldsymbol{\theta}|\mathbf{y}) = \mathcal{N}(M_n, S_n)\), where

$$
S_n = \left(\frac{X^TX}{\sigma_n^2} + S_0^{-1}\right)^{-1}, \qquad M_n = S_n\left(\frac{X^T\mathbf{y}}{\sigma_n^2} + S_0^{-1}\mathbf{m}_0\right).
$$
Now, we evaluate the marginal likelihood. Completing the square over \(\boldsymbol{\theta}\) in \(\int p(\mathbf{y}|\boldsymbol{\theta})\,p(\boldsymbol{\theta})\,d\boldsymbol{\theta}\) yields

$$
\log p(\mathbf{y}) = -\frac{N}{2}\log 2\pi - \frac{N}{2}\log \sigma_n^2 - \frac{1}{2}\log\frac{|S_0|}{|S_n|} - \frac{\mathbf{y}^T\mathbf{y}}{2\sigma_n^2} - \frac{\mathbf{m}_0^T S_0^{-1}\mathbf{m}_0}{2} + \frac{M_n^T S_n^{-1} M_n}{2}.
$$
Another formulation follows from the product rule directly: the marginal likelihood is the normalizer \(c = \mathcal{N}(\mathbf{a}\,|\,\mathbf{m}_0, A + S_0)\), multiplied by the constant that was dropped when \(p(\mathbf{y}|\boldsymbol{\theta})\) was rewritten as a Gaussian over \(\boldsymbol{\theta}\). We derive that constant below.
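Completing the square makes the dropped constant explicit (here \(H = X(X^TX)^{-1}X^T\) is the hat matrix, a symbol introduced only for brevity):

$$
\mathcal{N}(\mathbf{y}\,|\,X\boldsymbol{\theta}, \sigma_n^2 I) = Z\,\mathcal{N}(\boldsymbol{\theta}\,|\,\mathbf{a}, A), \qquad Z = (2\pi\sigma_n^2)^{-N/2}\,(2\pi)^{D/2}\,|A|^{1/2}\exp\!\left(-\frac{\mathbf{y}^T(I - H)\mathbf{y}}{2\sigma_n^2}\right),
$$

so that \(p(\mathbf{y}) = Z\,\mathcal{N}(\mathbf{a}\,|\,\mathbf{m}_0, A + S_0)\).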
Let us verify empirically that these formulations agree,
import numpy as np
import scipy.stats

np.random.seed(0)

def ML1(X, y, m0, S0, sigma_n):
    # Closed form: p(y) = N(y | X m0, X S0 X^T + sigma_n^2 I)
    N = len(y)
    return scipy.stats.multivariate_normal.pdf(y.ravel(), (X@m0).squeeze(), X@S0@X.T + np.eye(N)*sigma_n**2)

def ML2(X, y, m0, S0, sigma_n):
    # Product of Gaussians over theta: p(y) = Z * N(a | m0, A + B)
    N, D = X.shape
    a = np.linalg.inv(X.T@X)@X.T@y       # mean of the likelihood seen as a Gaussian over theta
    A = np.linalg.inv(X.T@X)*sigma_n**2  # covariance of the likelihood over theta
    b, B = m0, S0
    # Z: theta-independent constant dropped when rewriting N(y | X theta, sigma_n^2 I) as N(theta | a, A)
    H = X@np.linalg.inv(X.T@X)@X.T
    Z = ((2*np.pi*sigma_n**2)**(-N/2) * (2*np.pi)**(D/2) * np.linalg.det(A)**0.5
         * np.exp(-0.5*(y.ravel()@(np.eye(N) - H)@y.ravel())/sigma_n**2))
    return Z * scipy.stats.multivariate_normal.pdf(a.ravel(), b.ravel(), A+B)

def ML3(X, y, m0, S0, sigma_n):
    # Log-domain formula obtained by completing the square over theta
    N = len(y)
    S0_inv = np.linalg.inv(S0)
    Sn = np.linalg.inv((X.T@X)/(sigma_n**2) + S0_inv)
    Mn = Sn@((X.T@y)/(sigma_n**2) + S0_inv@m0)
    LML = (-0.5*N*np.log(2*np.pi) - 0.5*N*np.log(sigma_n**2)
           - 0.5*np.log(np.linalg.det(S0)/np.linalg.det(Sn))
           - 0.5*(y.T@y)/sigma_n**2
           - 0.5*(m0.T@S0_inv@m0)   # prior quadratic term
           + 0.5*(Mn.T@np.linalg.inv(Sn)@Mn))
    return np.exp(LML)
X = np.random.rand(10, 2)
m0 = np.random.rand(2, 1)   # prior mean
s0 = np.random.rand(2, 2)
S0 = s0@s0.T                # prior covariance (positive semi-definite by construction)
sigma_n = 10                # noise std
y = np.random.rand(10, 1)
ml1 = ML1(X, y, m0, S0, sigma_n)
ml2 = ML2(X, y, m0, S0, sigma_n)
ml3 = ML3(X, y, m0, S0, sigma_n).item()
ml1, np.allclose(ml2/ml1, 1.0), np.allclose(ml3/ml1, 1.0)
(9.577110083272389e-15, True, True)
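These pdf values underflow quickly as \(N\) grows, so in practice the marginal likelihood is handled in the log domain. A minimal sketch using scipy.stats.multivariate_normal.logpdf on the same data:

log_ml = scipy.stats.multivariate_normal.logpdf(y.ravel(), (X@m0).squeeze(), X@S0@X.T + np.eye(len(y))*sigma_n**2)
np.allclose(log_ml, np.log(ml1))  # -> True; logpdf stays finite even when pdf would underflow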
Products of Gaussian PDFs#
The product of two Gaussian pdfs over the same variable \(\mathbf{x}\), \(\mathcal{N}(\mathbf{x}|\boldsymbol{\mu}_0, \Sigma_0)\) and \(\mathcal{N}(\mathbf{x}|\boldsymbol{\mu}_1, \Sigma_1)\), is an unnormalized Gaussian:

$$
\mathcal{N}(\mathbf{x}|\boldsymbol{\mu}_0, \Sigma_0)\,\mathcal{N}(\mathbf{x}|\boldsymbol{\mu}_1, \Sigma_1) = c\,\mathcal{N}(\mathbf{x}|\boldsymbol{\mu}_c, \Sigma_c).
$$
We need to figure out the value of \(c\) in order to integrate the product.
We can compare the exponent terms on both sides directly, which gives

$$
\Sigma_c = (\Sigma_0^{-1} + \Sigma_1^{-1})^{-1}, \qquad \boldsymbol{\mu}_c = \Sigma_c(\Sigma_0^{-1}\boldsymbol{\mu}_0 + \Sigma_1^{-1}\boldsymbol{\mu}_1).
$$
Now, solving for the normalizing constant \(c\),

$$
c = \frac{1}{(2\pi)^{D/2}|\Sigma_0 + \Sigma_1|^{1/2}}\exp\!\left(-\frac{1}{2}(\boldsymbol{\mu}_0 - \boldsymbol{\mu}_1)^T(\Sigma_0 + \Sigma_1)^{-1}(\boldsymbol{\mu}_0 - \boldsymbol{\mu}_1)\right) = \mathcal{N}(\boldsymbol{\mu}_0\,|\,\boldsymbol{\mu}_1, \Sigma_0 + \Sigma_1).
$$
If we have two Gaussians \(\mathcal{N}(\mathbf{a}, A)\) and \(\mathcal{N}(\mathbf{b}, B)\) over the same random variable \(\mathbf{x}\), the integral of their product is therefore

$$
\int \mathcal{N}(\mathbf{x}|\mathbf{a}, A)\,\mathcal{N}(\mathbf{x}|\mathbf{b}, B)\,d\mathbf{x} = c = \mathcal{N}(\mathbf{a}\,|\,\mathbf{b}, A + B).
$$
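As a quick numerical illustration of this identity, here is a self-contained 1-D sketch (the means and variances are arbitrary toy values):

import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

# 1-D check: integral of N(x|mu0,v0) * N(x|mu1,v1) over x equals N(mu0 | mu1, v0+v1)
mu0, v0 = 0.3, 0.5**2   # illustrative mean/variance
mu1, v1 = 1.1, 0.8**2
integral, _ = quad(lambda t: norm.pdf(t, mu0, np.sqrt(v0))*norm.pdf(t, mu1, np.sqrt(v1)), -20, 20)
print(np.isclose(integral, norm.pdf(mu0, mu1, np.sqrt(v0+v1))))  # -> True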
Here, we have two Gaussians over the same random variable \(\boldsymbol{\theta}\): the prior \(\mathcal{N}(\mathbf{m}_0, \sigma^2 I)\) and the likelihood rewritten as \(\mathcal{N}((X^TX)^{-1}X^T\mathbf{y},\; \sigma_n^2(X^TX)^{-1})\). The integral of their product is \(\mathcal{N}(\mathbf{a}\,|\,\mathbf{m}_0, A + B)\); as in the first section, this equals the marginal likelihood only after multiplying back the constant \(Z\) dropped in the rewriting.
import numpy as np
import matplotlib.pyplot as plt
import scipy.stats

np.random.seed(0)

N = 10         # number of observations
D = 5          # number of features
sigma_n = 0.1  # noise std
sigma = 1      # prior std of the parameters

m0 = np.random.rand(D)    # prior mean
S0 = np.eye(D)*sigma**2   # prior covariance
x = np.random.rand(N, D)
theta = np.random.rand(D, 1)  # true parameters
y = x@theta + np.random.multivariate_normal(np.zeros(N), np.eye(N)*sigma_n**2, size=1).T

plt.scatter(x[:,0], x[:,1], c=y.ravel())
x.shape, theta.shape, y.shape
((10, 5), (5, 1), (10, 1))
a = np.linalg.inv(x.T@x)@x.T@y       # mean of the likelihood as a Gaussian over theta (OLS solution)
b = m0.reshape(-1,1)                 # prior mean
A = np.linalg.inv(x.T@x)*sigma_n**2  # covariance of the likelihood over theta
B = S0                               # prior covariance
A_inv = np.linalg.inv(A)
B_inv = np.linalg.inv(B)
c_cov = np.linalg.inv(A_inv + B_inv)  # covariance of the product Gaussian
c_mean = c_cov@(A_inv@a + B_inv@b)    # mean of the product Gaussian
a.shape, A.shape, b.shape, B.shape, c_mean.shape, c_cov.shape
((5, 1), (5, 5), (5, 1), (5, 5), (5, 1), (5, 5))
# Peak heights of the three Gaussian pdfs; the product Gaussian is the
# narrowest (smallest determinant), so its peak is the highest.
c_denom = 1/(((2*np.pi)**(D/2))*(np.linalg.det(c_cov)**0.5))
b_denom = 1/(((2*np.pi)**(D/2))*(np.linalg.det(B)**0.5))
a_denom = 1/(((2*np.pi)**(D/2))*(np.linalg.det(A)**0.5))
c_denom > a_denom, c_denom > b_denom
(True, True)
# Normalizer c computed explicitly and via N(a | b, A+B); note the symmetry in a and b
normalizer_c = (1/(((2*np.pi)**(D/2))*(np.linalg.det(A+B)**0.5)))*np.exp(-0.5*((a-b).T@np.linalg.inv(A+B)@(a-b)))
norm_c_a_given_b = scipy.stats.multivariate_normal.pdf(a.squeeze(), b.squeeze(), A+B)
norm_c_b_given_a = scipy.stats.multivariate_normal.pdf(b.squeeze(), a.squeeze(), A+B)
np.allclose(normalizer_c, norm_c_a_given_b), np.allclose(norm_c_a_given_b, norm_c_b_given_a)
(True, True)
# Verify the product identity N(theta|a,A) N(theta|b,B) = c N(theta|c_mean, c_cov) at a test point
a_pdf = scipy.stats.multivariate_normal.pdf(theta.squeeze(), a.squeeze(), A)
b_pdf = scipy.stats.multivariate_normal.pdf(theta.squeeze(), b.squeeze(), B)
c_pdf = scipy.stats.multivariate_normal.pdf(theta.squeeze(), c_mean.squeeze(), c_cov)
np.allclose(a_pdf*b_pdf, normalizer_c*c_pdf)
True
K = x@S0@x.T + np.eye(N)*sigma_n**2  # covariance of y under the closed form
marginal_Likelihood_closed_form = scipy.stats.multivariate_normal.pdf(y.squeeze(), (x@m0).squeeze(), K)
marginal_Likelihood_closed_form
1.8288404157840938
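Note that the product normalizer \(c\) alone should not match this value: rewriting \(p(\mathbf{y}|\boldsymbol{\theta})\) as a Gaussian over \(\boldsymbol{\theta}\) dropped the constant \(Z\) derived in the first section, and the marginal likelihood is \(Z \cdot c\). A quick check, reusing x, y, A, and normalizer_c from above:

# Constant dropped when rewriting N(y | x theta, sigma_n^2 I) as N(theta | a, A)
H = x@np.linalg.inv(x.T@x)@x.T   # hat matrix
Z = ((2*np.pi*sigma_n**2)**(-N/2) * (2*np.pi)**(D/2) * np.linalg.det(A)**0.5
     * np.exp(-0.5*(y.ravel()@(np.eye(N) - H)@y.ravel())/sigma_n**2))
np.allclose(marginal_Likelihood_closed_form, Z*normalizer_c)  # -> True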
from sklearn.model_selection import KFold
from sklearn.linear_model import LinearRegression

splitter = KFold(n_splits=5)  # n_splits cannot exceed the number of samples (N=10)
scores = []
for train_ind, test_ind in splitter.split(x):
    train_x, train_y = x[train_ind], y[train_ind]
    test_x, test_y = x[test_ind], y[test_ind]
    model = LinearRegression()
    model.fit(train_x, train_y)
    scores.append(model.score(test_x, test_y))  # held-out R^2 for each fold