02 November 2019

Optimization

When we do not have a closed-form solution, we can use optimization to estimate the parameters (for example, the maximum-likelihood estimates of a logistic regression have no closed form). To set up an optimization, you need a loss function and a procedure that minimizes the value of the loss function over the parameter values.

Contents

Newton-Raphson method
Gradient Descent

Newton-Raphson method

The Newton-Raphson algorithm is one of the oldest iterative algorithms for approximately finding the roots of a real-valued function (such as a loss function), i.e., for solving $f(x) = 0$. The idea behind it is a simple linear approximation: given a starting point $x_0$, we repeatedly solve for the root of the tangent line,

$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}$,

and with each iteration we get closer to the solution.

The simple and plain version of the Newton-Raphson algorithm, written with a for loop, is given below.
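
A minimal sketch of this for-loop version (the names newton_for and n_iter are illustrative choices, not from the original post; f and df stand for the function and its derivative, which are defined further below):

def newton_for(f, df, x0, n_iter=20):
    x = x0
    for _ in range(n_iter):
        # Linearize f at x and jump to the root of the tangent line
        x = x - f(x) / df(x)
    return x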

Since we do not know in advance how many iterations are needed, it is better to replace the for loop with a while loop that stops once the iterates stop changing. Below, we present the algorithm in Python.
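
A sketch of the while-loop variant, which stops once two successive iterates are closer than a tolerance (again, newton_while and its arguments are illustrative names):

def newton_while(f, df, x0, precision=0.001, max_ite=1000):
    x, diff, i = x0, precision + 1, 0
    while diff > precision and i < max_ite:
        x_new = x - f(x) / df(x)   # Newton-Raphson update
        diff = abs(x_new - x)      # how far the iterate moved
        x, i = x_new, i + 1
    return x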

To write the algorithm in code, we consider $f(x) = x^4 - 3x^3 + 2$, which is also used in Wikipedia.

Define the function and its derivative:

def f(x):
    # f(x) = x^4 - 3x^3 + 2
    return x**4 - 3 * x**3 + 2

def df(x):
    # f'(x) = 4x^3 - 9x^2
    return 4 * x**3 - 9 * x**2
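
As a quick sanity check (not in the original post), df can be compared against a central finite-difference approximation of f at a few points:

for x in [-2.0, 1.0, 3.0]:
    h = 1e-5
    approx = (f(x + h) - f(x - h)) / (2 * h)  # central difference
    print(x, df(x), approx)  # the two derivative values should nearly agree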

To see the shape of the function and where its minimum lies, plot it:

from matplotlib import pyplot as plt
import numpy as np

x_ran_0 = np.linspace(-7, 7, 100)
plt.plot(x_ran_0, f(x_ran_0))
plt.xlabel('x')
plt.ylabel('f(x)')
plt.show(block=False)

To see how the algorithm works, we can plot the tangent line at each point as the iterations run:

import matplotlib as mpl

x_s = -4  # Starting point
cur_x = x_s
diff_cur_prev = 1
precision = 0.001
max_ite = 1000  # Maximum number of iterations

# Choose a different color for each tangent line (colormap values must lie in [0, 1])
colors = mpl.cm.autumn(np.linspace(0, 1, max_ite))

plt.plot(x_ran_0, f(x_ran_0))  # plot f(x) again

i = 1
while diff_cur_prev > precision and i < max_ite:
    # Plot the current point as a black dot with a white center
    plt.scatter(cur_x, f(cur_x), color='black', s=10, zorder=2)
    plt.scatter(cur_x, f(cur_x), color='white', s=5, zorder=2)
    # Plot the tangent line of f at the current point
    x_ran_1 = np.linspace(cur_x - 5, cur_x + 5, 10)
    plt.plot(x_ran_1, df(cur_x) * (x_ran_1 - cur_x) + f(cur_x), color=colors[i], zorder=1)
    prev_x = cur_x
    cur_x = prev_x - f(prev_x) / df(prev_x)  # Newton-Raphson update
    diff_cur_prev = abs(cur_x - prev_x)
    print("Run: %d, Current: %f, Previous: %f, Difference: %f" % (i, cur_x, prev_x, diff_cur_prev))
    i += 1

plt.xlim([-10,10])
plt.xlabel('x')
plt.ylabel('f(x)')
plt.show(block=False)
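
As a check (not in the original post), the result can be compared with SciPy's implementation of Newton's method, assuming SciPy is installed:

from scipy import optimize

root = optimize.newton(f, -4, fprime=df)
print(root, f(root))  # f(root) should be essentially zero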

Gradient Descent

Gradient Descent is a variant of Newton-Raphson that can be used to find the minimum of a differentiable function. A local minimum can be obtained by iterating

$x_{n+1} = x_n - \gamma f'(x_n)$.

We often use $\theta$ instead of $x$: $\theta_{n+1} = \theta_n - \gamma \nabla L(\theta_n)$, where $\nabla L(\theta_n)$ is the derivative of the loss function at the given parameter values.

Here $\gamma$ is called the step size or learning rate, see Wikipedia. To avoid taking steps that are too big, we use a small $\gamma$. Obviously, the update rule is different from that of the Newton-Raphson algorithm: the derivative is multiplied by $\gamma$ rather than dividing the function value.
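
A minimal sketch of this update rule as a reusable function (gradient_descent and its argument names are illustrative, not from the original post):

def gradient_descent(grad, x0, gamma=0.001, precision=0.001, max_ite=1000):
    x = x0
    for _ in range(max_ite):
        step = gamma * grad(x)  # gamma * f'(x)
        x = x - step            # move against the gradient
        if abs(step) < precision:
            break
    return x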

To write a simple piece of code that finds the minimum using Gradient Descent, consider $f(x) = x^4 - 3x^3 + 2$, which was already used for explaining the Newton-Raphson method.

The plot of the gradient steps can be obtained by running the following script:

import matplotlib as mpl

x_s = 6  # Starting point (from x_s = -4 the iteration stalls near the flat point at x = 0)
cur_x = x_s
diff_cur_prev = 1
gamma = 0.001  # Step size (learning rate)
precision = 0.001
max_ite = 1000  # Maximum number of iterations

# Choose a different color for each gradient line (colormap values must lie in [0, 1])
colors = mpl.cm.autumn(np.linspace(0, 1, max_ite))

plt.plot(x_ran_0, f(x_ran_0))  # plot f(x) again

i = 1
while diff_cur_prev > precision and i < max_ite:
    # Plot the current point as a black dot with a white center
    plt.scatter(cur_x, f(cur_x), color='black', s=10, zorder=2)
    plt.scatter(cur_x, f(cur_x), color='white', s=5, zorder=2)
    # Plot the tangent line of f at the current point
    x_ran_1 = np.linspace(cur_x - 5, cur_x + 5, 10)
    plt.plot(x_ran_1, df(cur_x) * (x_ran_1 - cur_x) + f(cur_x), color=colors[i], zorder=1)
    prev_x = cur_x
    cur_x = prev_x - gamma * df(prev_x)  # Gradient Descent update
    diff_cur_prev = abs(cur_x - prev_x)
    print("Run: %d, Current: %f, Previous: %f, Difference: %f" % (i, cur_x, prev_x, diff_cur_prev))
    i += 1

plt.xlim([-10,10])
plt.xlabel('x')
plt.ylabel('f(x)')
plt.show(block=False)
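
As a check (again not in the original post), the minimizer can also be found analytically: $f'(x) = 4x^3 - 9x^2 = x^2(4x - 9) = 0$ gives the critical points $x = 0$ (a flat inflection point) and $x = 9/4 = 2.25$ (the minimum). SciPy agrees, assuming it is installed:

from scipy import optimize

res = optimize.minimize_scalar(f)
print(res.x, f(res.x))  # should be close to x = 2.25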

⬆ back to top

License

Copyright (c) 2019 Saeid Amiri