In this notebook, I'll give some high-level intuition for why the cross-entropy loss is convex and the squared loss is non-convex for logistic regression.

import numpy as np
import matplotlib.pyplot as plt

def sigm(z):
    # Sigmoid function
    return 1./(1+np.exp(-z))

def sq_sigm(z):
    # Square of the sigmoid function
    return 1./(1+np.exp(-z))**2
First, let us create a simple dataset: two points, \(x = -1\) with label 0 and \(x = 1\) with label 1.

X = np.array([-1, 1])
y = np.array([0, 1])

plt.scatter(X, y, c=y, s=100)

[Figure: scatter plot of the two data points]

We will make a very simple assumption: there is no intercept term, so the model is simply \(\hat{y} = \sigma(x\theta)\), where \(\sigma\) is the sigmoid defined above.

def y_hat(x, theta):
    # Model prediction: sigmoid of x * theta (no intercept)
    return sigm(x*theta)

Let us now write the squared loss and cross-entropy loss functions.

def cost_sq(X, theta):
    # Sum of squared errors between predictions and true labels
    y_pred = y_hat(X, theta)
    y_true = y
    return (y_pred - y_true)@(y_pred - y_true)

def cost_cross(X, theta):
    # Binary cross-entropy summed over the dataset
    y_pred = y_hat(X, theta)
    y_true = y
    return (-y_true*np.log(y_pred) - (1-y_true)*np.log(1-y_pred)).sum()
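As a quick sanity check (the values here are just illustrative), at \(\theta = 0\) both predictions equal 0.5, so the squared loss should be \(0.5^2 + 0.5^2 = 0.5\) and the cross-entropy loss should be \(2\log 2 \approx 1.386\):

# Sanity check at theta = 0, where both predictions are 0.5
print(cost_sq(X, 0))     # 0.5
print(cost_cross(X, 0))  # 2*log(2) ≈ 1.386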

Let us plot these two losses as \(\theta\) varies from -3.2 to 3.2.

theta_lin = np.linspace(-3.2, 3.2, 100)
costs_square_vals = np.zeros_like(theta_lin)
costs_cross_vals = np.zeros_like(theta_lin)


for i, th in enumerate(theta_lin):
    costs_square_vals[i] = cost_sq(X, th)
    costs_cross_vals[i] = cost_cross(X, th)


plt.plot(theta_lin, costs_square_vals, label="Squared")
plt.plot(theta_lin, costs_cross_vals, label="Cross-Entropy")
plt.legend()
[Figure: squared loss and cross-entropy loss as functions of \(\theta\)]

Let us look at the squared loss curve in more detail. If we draw a chord joining the points on the curve at \(\theta = -3\) and \(\theta = 1\), part of the curve lies above the chord and part lies below it. A convex function must lie on or below every chord joining two points on its graph, so the squared loss is not convex in \(\theta\).

plt.plot(theta_lin, costs_square_vals, label="Squared")

plt.plot([-3, 1], [cost_sq(X, -3), cost_sq(X, 1)], color='k')
[Figure: squared loss curve with a chord joining \(\theta = -3\) and \(\theta = 1\)]
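We can make the chord argument concrete with a midpoint check (a minimal numerical sketch; t1 and t2 are just local names for the chord endpoints): a convex function evaluated at the midpoint of two points can never exceed the average of its values at those points.

# Midpoint convexity check for the chord endpoints theta = -3 and theta = 1
t1, t2 = -3, 1
mid = (t1 + t2) / 2

# Squared loss: the midpoint value (~1.07) exceeds the chord midpoint (~0.98) -> not convex
print(cost_sq(X, mid), (cost_sq(X, t1) + cost_sq(X, t2)) / 2)

# Cross-entropy loss: the midpoint value (~2.63) stays below the chord midpoint (~3.36)
print(cost_cross(X, mid), (cost_cross(X, t1) + cost_cross(X, t2)) / 2)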

We can also understand this from the formulas themselves.

Let us first discuss the cross-entropy loss. For our data, the predictions are:

\(\hat{y}_0 = \frac{1}{1+e^{\theta}}\)

\(\hat{y}_1 = \frac{1}{1+e^{-\theta}}\)

Loss_cross = \(-\log(1-\hat{y}_0) - \log(\hat{y}_1)\), since \(y_0 = 0\) and \(y_1 = 1\).

Noting that \(1-\hat{y}_0 = \hat{y}_1 = \frac{1}{1+e^{-\theta}}\), this simplifies to Loss_cross = \(2\log(1+e^{-\theta})\).
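We can verify this closed form numerically against the cost_cross function (a quick sketch reusing the theta_lin grid from above; closed_form is just a local name here):

# The closed form 2*log(1 + exp(-theta)) should match cost_cross on the grid
closed_form = 2*np.log(1 + np.exp(-theta_lin))
print(np.allclose(closed_form, costs_cross_vals))  # True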

From https://www.sarthaks.com/354259/expand-log-1-e-x-in-ascending-powers-of-x-up-to-the-term-containing-x-4 we can see that \(\log(1+e^{x}) = \log 2 + \frac{x}{2} + \frac{x^2}{8} - \frac{x^4}{192} + \cdots\), which already hints at the upward-curving shape. More directly, differentiating twice gives \(\frac{d^2}{d\theta^2}\, 2\log(1+e^{-\theta}) = 2\,\sigma(\theta)\,\sigma(-\theta) > 0\) for every \(\theta\), so the cross-entropy loss is convex.
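To back this up numerically, here is a rough finite-difference sketch on the theta_lin grid (h, d2_cross, and d2_square are just local names): the discrete second derivative of the cross-entropy loss is non-negative everywhere, while the squared loss has regions of negative curvature.

# Approximate second derivatives with second-order differences on the uniform grid
h = theta_lin[1] - theta_lin[0]
d2_cross = np.diff(costs_cross_vals, 2) / h**2
d2_square = np.diff(costs_square_vals, 2) / h**2

print((d2_cross >= 0).all())   # True: cross-entropy curves upward everywhere
print((d2_square >= 0).all())  # False: squared loss bends downward for theta < -log(2)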

For the squared loss we can do the same simplification: \(\hat{y}_0 = 1 - \hat{y}_1 = \frac{1}{1+e^{\theta}}\), so Loss_sq = \(\hat{y}_0^2 + (\hat{y}_1-1)^2 = \frac{2}{(1+e^{\theta})^2} = 2\,\sigma(-\theta)^2\). Plotting the sigmoid alongside the squared sigmoid shows that the squared sigmoid flattens out on both sides: it is convex on one side of its inflection point and concave on the other, which is exactly why the squared loss curve above is not convex.

x2 = np.linspace(-10, 10, 100)
plt.plot(x2, sigm(x2), label="sigmoid")
plt.plot(x2, sq_sigm(x2), label="squared sigmoid")
plt.legend()
[Figure: the sigmoid and the squared sigmoid]
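As a final check (a quick sketch reusing the theta_lin grid), the squared-loss values computed earlier coincide with \(2\,\sigma(-\theta)^2\):

# The squared loss equals twice the squared sigmoid evaluated at -theta
print(np.allclose(costs_square_vals, 2*sq_sigm(-theta_lin)))  # True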