# Constrained Non-negative matrix factorisation using CVXPY

> Constrained NMF using CVXPY!

- toc: true
- badges: true
- comments: true
- author: Nipun Batra
- categories: [ML]
In a previous post, we saw how to use CVXPY to perform non-negative matrix factorisation (NMF). In this post, I'll show how to add constraints that arise from the problem domain. As a simple example, I'll consider constraints that encode a less-than relationship among entries of a factor matrix: for instance, we may want to enforce that certain movies are always rated higher than others. We'll create a matrix of 30 users and 12 items, and enforce the constraint that the rating of each of the first 6 items is at least twice that of the corresponding item among the last 6.
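Written out, this is standard NMF with additional linear inequality constraints on the item factor matrix $X$:

$$
\begin{aligned}
\min_{X,\, Y}\quad & \lVert A - YX \rVert_F \\
\text{subject to}\quad & X \ge 0,\; Y \ge 0, \\
& X_{:,\, i} \ge 2\, X_{:,\, i + N/2}, \quad i = 0, \ldots, N/2 - 1,
\end{aligned}
$$

where $A \in \mathbb{R}^{M \times N}$ is the ratings matrix, $Y \in \mathbb{R}^{M \times K}$ holds the user factors and $X \in \mathbb{R}^{K \times N}$ the item factors. Because the extra constraints are linear in $X$, each alternating least squares step remains a convex problem.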
## Creating a ratings matrix
We will now create a matrix in which this relationship among the items holds by construction.
```python
import numpy as np
import pandas as pd

K, N, M = 2, 12, 30
Y_gen = np.random.rand(M, K)
X_1 = np.random.rand(K, N // 2)
# Add a non-negative offset so the first half is at least twice the second half
X_2 = 2 * X_1 + np.random.rand(K, N // 2)
X_gen = np.hstack([X_2, X_1])
# Normalising (preserves the "at least twice" relationship)
X_gen = X_gen / np.max(X_gen)
# Creating A (ratings matrix of size M x N)
A = np.dot(Y_gen, X_gen)
pd.DataFrame(A).head()
```
|   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
|---|---|---|---|---|---|---|---|---|---|---|----|----|
| 0 | 0.732046 | 0.613565 | 0.961128 | 0.920089 | 0.244323 | 0.506472 | 0.280477 | 0.251049 | 0.324418 | 0.378219 | 0.075556 | 0.131750 |
| 1 | 0.903630 | 0.340956 | 0.784109 | 0.919741 | 0.190856 | 0.433635 | 0.321932 | 0.135134 | 0.290862 | 0.394680 | 0.052976 | 0.081148 |
| 2 | 0.972145 | 0.576558 | 1.046197 | 1.098279 | 0.261103 | 0.562996 | 0.358574 | 0.233405 | 0.368118 | 0.460967 | 0.077286 | 0.128344 |
| 3 | 0.292231 | 0.263864 | 0.401968 | 0.377116 | 0.102567 | 0.210890 | 0.113070 | 0.108163 | 0.134489 | 0.154266 | 0.031993 | 0.056299 |
| 4 | 0.694038 | 0.803459 | 1.125454 | 0.987344 | 0.290605 | 0.582178 | 0.278848 | 0.331075 | 0.365935 | 0.397023 | 0.093088 | 0.168300 |
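Before factorising, we can sanity-check that the generated matrix really does satisfy the intended relationship; since `Y_gen` is non-negative, the column-wise relation on `X_gen` carries over to `A`. A quick assertion (not part of the original pipeline):

```python
# The first N//2 columns of A should be at least twice the last N//2 columns,
# because A = Y_gen @ X_gen with Y_gen >= 0.
print(np.all(A[:, : N // 2] >= 2 * A[:, N // 2 :]))  # expected: True
```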
We can see that, for each user, the 0th item has a higher rating than the 6th, the 1st a higher rating than the 7th, and so on. In our alternating least squares (ALS) implementation, we factorise A as Y.X, where X has dimensions (K, N). To enforce the relationship among the items, we put constraints on X of the form X[:, 0] >= 2 * X[:, 6] and so on. We will build these with a simple for loop.
e = "["
for a in range(N/2):
e+="X[:,%d] > 2 * X[:,%d]," %(a, a+N/2)
e = e[:-1] + "]"
e
```
'[X[:,0] >= 2 * X[:,6],X[:,1] >= 2 * X[:,7],X[:,2] >= 2 * X[:,8],X[:,3] >= 2 * X[:,9],X[:,4] >= 2 * X[:,10],X[:,5] >= 2 * X[:,11]]'
```
As we can see, we now have 6 constraints that we can feed into the optimisation routine. Whenever we solve for X in the ALS iterations, we apply these constraints.
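As an aside, eval'ing a string keeps the routine below generic, but the same constraints can also be constructed directly as a list of CVXPY expressions once X is a Variable. A minimal sketch, using a hypothetical `pairwise_constraints` helper (not in the original post):

```python
import cvxpy as cvx

def pairwise_constraints(X, n):
    # X is a cvxpy Variable of shape (k, n); return the
    # "first half at least twice the second half" constraints.
    return [X[:, a] >= 2 * X[:, a + n // 2] for a in range(n // 2)]
```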
## CVX routine for handling input constraints
```python
import cvxpy as cvx

def nmf_features(A, k, MAX_ITERS=30, input_constraints_X=None, input_constraints_Y=None):
    np.random.seed(0)
    m, n = A.shape
    # Boolean mask of observed (non-NaN) entries.
    mask = ~np.isnan(A)
    # Initialise Y randomly.
    Y = np.random.rand(m, k)
    # Perform alternating minimisation.
    residual = np.zeros(MAX_ITERS)
    for iter_num in range(1, 1 + MAX_ITERS):
        # For odd iterations, treat Y as constant and optimise over X.
        if iter_num % 2 == 1:
            X = cvx.Variable((k, n))
            constraint = [X >= 0]
            if input_constraints_X:
                # The constraint string is eval'd here, where X is in scope.
                constraint.extend(eval(input_constraints_X))
        # For even iterations, treat X as constant and optimise over Y.
        else:
            Y = cvx.Variable((m, k))
            constraint = [Y >= 0]
            if input_constraints_Y:
                constraint.extend(eval(input_constraints_Y))
        # Solve the convex problem for this half-step.
        error = A[mask] - (Y @ X)[mask]
        obj = cvx.Minimize(cvx.norm(error, 2))
        prob = cvx.Problem(obj, constraint)
        prob.solve(solver=cvx.SCS)
        if prob.status != cvx.OPTIMAL:
            print("Iteration %d ended with status %s" % (iter_num, prob.status))
        residual[iter_num - 1] = prob.value
        # Fix the variable we just solved for at its optimal value.
        if iter_num % 2 == 1:
            X = X.value
        else:
            Y = Y.value
    return X, Y, residual
```
```python
# Without constraints
X, Y, r = nmf_features(A, 3, MAX_ITERS=20)
# With constraints
X_c, Y_c, r_c = nmf_features(A, 3, MAX_ITERS=20, input_constraints_X=e)
```
```python
pd.DataFrame(X)
```
|   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
|---|---|---|---|---|---|---|---|---|---|---|----|----|
| 0 | 0.749994 | 0.112355 | 0.485850 | 0.674801 | 0.113004 | 0.281371 | 0.257239 | 0.04056 | 0.196474 | 0.297978 | 0.02745 | 0.033952 |
| 1 | 0.102384 | 0.222149 | 0.266055 | 0.199361 | 0.070403 | 0.133510 | 0.047174 | 0.09233 | 0.081233 | 0.076518 | 0.02375 | 0.045097 |
| 2 | 0.567213 | 0.558638 | 0.825066 | 0.756059 | 0.211427 | 0.430690 | 0.222174 | 0.22944 | 0.273260 | 0.307475 | 0.06659 | 0.118371 |
```python
pd.DataFrame(X_c)
```
|   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
|---|---|---|---|---|---|---|---|---|---|---|----|----|
| 0 | 0.749882 | 0.112384 | 0.485923 | 0.674778 | 0.113027 | 0.281399 | 0.257206 | 0.040566 | 0.196489 | 0.297960 | 0.027461 | 0.033971 |
| 1 | 0.102366 | 0.222080 | 0.266058 | 0.199353 | 0.070404 | 0.133511 | 0.047168 | 0.092298 | 0.081233 | 0.076513 | 0.023751 | 0.045091 |
| 2 | 0.567363 | 0.558700 | 0.825253 | 0.756242 | 0.211473 | 0.430789 | 0.222234 | 0.229470 | 0.273319 | 0.307549 | 0.066604 | 0.118382 |
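The two solutions look close because the unconstrained solution already respects the relationship baked into the data, so the constraints are (almost) inactive. We can confirm that the constrained solution satisfies them, allowing a small tolerance since SCS is a first-order solver (this check is not in the original post):

```python
# X_c is a plain numpy array at this point.
print(np.all(X_c[:, : N // 2] + 1e-6 >= 2 * X_c[:, N // 2 :]))
```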
OK, the obtained X matrix looks fairly similar. What happens if we reverse the constraints?
e_rev = "["
for a in range(N/2):
e_rev+=" 2* X[:,%d] < X[:,%d]," %(a, a+N/2)
e_rev = e_rev[:-1] + "]"
e_rev
```
'[2 * X[:,0] <= X[:,6],2 * X[:,1] <= X[:,7],2 * X[:,2] <= X[:,8],2 * X[:,3] <= X[:,9],2 * X[:,4] <= X[:,10],2 * X[:,5] <= X[:,11]]'
```
```python
X_c_rev, Y_c_rev, r_c_rev = nmf_features(A, 3, MAX_ITERS=20, input_constraints_X=e_rev)
pd.DataFrame(X_c_rev)
```
|   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
|---|---|---|---|---|---|---|---|---|---|---|----|----|
| 0 | 0.250945 | 0.038070 | 0.174189 | 0.252085 | 0.033251 | 0.069176 | 0.502026 | 0.076147 | 0.348450 | 0.504277 | 0.066521 | 0.138405 |
| 1 | 0.030757 | 0.088033 | 0.085947 | 0.065135 | 0.024395 | 0.045976 | 0.061398 | 0.176002 | 0.171773 | 0.130146 | 0.048760 | 0.091882 |
| 2 | 0.220256 | 0.183292 | 0.269014 | 0.282814 | 0.065713 | 0.128120 | 0.440553 | 0.366600 | 0.538065 | 0.565669 | 0.131436 | 0.256263 |
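This time the reversed constraints work against the relationship in the data, so they are strongly active and the learned item factors differ markedly from the unconstrained ones. A quick check (again with a small solver tolerance) confirms they hold:

```python
print(np.all(2 * X_c_rev[:, : N // 2] <= X_c_rev[:, N // 2 :] + 1e-6))
```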
There you go! We have now learnt latent factors that conform to our constraints.