Simulated annealing in Python

Optimization is a crucial aspect of many fields, as it helps us find the best possible solution to a problem. In statistics, for example, it is common to maximize the likelihood function or minimize the norm of the residuals; in microeconomics, optimization is used to study the behaviour of economic agents, who are assumed to maximize their utility subject to various constraints.

There are many different types of optimization problems, including linear programming, nonlinear programming, convex optimization, and integer programming, to name a few. Each type of optimization problem requires a different approach and a different set of algorithms to solve it.

In this post, I will talk about simulated annealing, a well-known algorithm that is nonetheless still exotic to the uninitiated. For the sake of simplicity, I'll focus on minimization problems, since seeking the maximum of a function \(f\) is equivalent to seeking the minimum of \(-f\).

Simulated annealing

Simulated annealing is an iterative method for solving unconstrained and bound-constrained optimization problems. The algorithm draws its inspiration from the physical process of heating a material and then slowly lowering its temperature.

At each iteration of the simulated annealing algorithm, a new point \(x_i\) is randomly generated (if you don't know how computers deal with randomness, see this article). As we'll see in a minute, the distance of the new point \(x_i\) from the current point \(x_{i-1}\) is proportional to the temperature and based on a certain probability distribution. The algorithm accepts every new point \(x_i\) such that \(f(x_i) \leq f(x_{i-1})\), where \(f\) is the objective function (i.e. the function to be minimized), but it also accepts, with a certain probability, points \(x_i\) such that \(f(x_i) > f(x_{i-1})\). This property is significant because it prevents the algorithm from being trapped in local minima.
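
Just to fix ideas, here is a minimal sketch of that acceptance rule in Python; the names y_new, y_curr and temp are placeholders, and the full algorithm is built step by step below:

import math
import random as rd

def accept(y_new, y_curr, temp):
    # a candidate that does not worsen the objective is always accepted
    if y_new <= y_curr:
        return True
    # a worse candidate is accepted with probability exp(-(y_new - y_curr) / temp):
    # the higher the temperature, the more likely the jump
    return rd.random() < math.exp(-(y_new - y_curr) / temp)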

Simulated annealing with Python

First of all, we need to load some packages:

import math
import random as rd

We now define the parameters we need:

  • an objective function \(f\);

  • a domain (where the algorithm should look for a solution);

  • an initial temperature;

  • an initial point (which is usually selected randomly);

  • a step size;

  • a maximum number of iterations.

# 1) the objective function
def f(x):
    return x**3 - 8

# 2) the domain
domain = [-10., 10.]

# 3) initial temperature
start_temp = 100

# 4) starting value
x_0 = rd.uniform(domain[0], domain[1])

# 5) the step size
step_size = 2

# 6) maximum number of iterations
max_iter = 1000
iteration = 0

First, we evaluate \(x_0\) and assign \(x_0\) and \(y_0\) to x_best and y_best (the best values so far) and to x_curr and y_curr (the current solution).

y_0 = f(x_0)

x_curr, y_curr = x_0, y_0
x_best, y_best = x_0, y_0

The first step of the algorithm is to generate a new candidate solution \(x_1\) from the current solution \(x_0\). We also increment the iteration counter (this step is crucial, otherwise the algorithm would run forever).

x_1 = x_curr + step_size * rd.uniform(-1, 1)
y_1 = f(x_1)

iteration += 1

Since we are looking for a minimum, if y_1 is smaller than y_best, we assign x_1 and y_1 to x_best and y_best. We then calculate the difference between y_1 and y_curr.

if y_1 < y_best:
    x_best, y_best = x_1, y_1

diff = y_1 - y_curr

Here comes the most exciting part: we update the temperature (using a fast annealing schedule) and use this value to calculate the Metropolis acceptance criterion

\( p = \exp\left( -\frac{\Delta y}{t} \right) \)

where \(\Delta y\) is diff and \(t\) is temp. This number is the probability of accepting the transition from \(x_i\) to \(x_{i+1}\) and is what allows the algorithm to escape local minima.

temp = start_temp / (iteration + 1.)
metropolis = math.exp(-diff / temp)

if diff <= 0 or rd.random() < metropolis:
    x_curr, y_curr = x_1, y_1

And this is the last step. After that, the algorithm generates \(x_2\) and \(y_2\), evaluates them against x_best and y_best and x_curr and y_curr, and repeats itself until iteration == max_iter.
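
Spelled out, the repetition looks like the sketch below, which simply reuses the variables defined above and loops until iteration reaches max_iter:

while iteration < max_iter:
    # generate a new candidate from the current point
    x_i = x_curr + step_size * rd.uniform(-1, 1)
    y_i = f(x_i)
    iteration += 1

    # keep track of the best point found so far
    if y_i < y_best:
        x_best, y_best = x_i, y_i

    # temperature schedule and Metropolis criterion
    diff = y_i - y_curr
    temp = start_temp / (iteration + 1.)
    metropolis = math.exp(-diff / temp)

    # accept improvements, and worse points with probability metropolis
    if diff <= 0 or rd.random() < metropolis:
        x_curr, y_curr = x_i, y_i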

Making a function for simulated annealing

Since the algorithm simply repeats these steps, we may want to wrap it up in a function.

import math
import numpy as np
import random as rd

def simulated_annealing(f, domain, step_size, start_temp, max_iter = 1000):

    x_0 = rd.uniform(domain[0], domain[1])
    y_0 = f(x_0)
    x_curr, y_curr = x_0, y_0
    x_best, y_best = x_0, y_0

    for n in range(max_iter):
        # generate a new candidate from the current point
        x_i = x_curr + step_size * rd.uniform(-1, 1)
        y_i = f(x_i)

        # keep track of the best point found so far
        if y_i < y_best:
            x_best, y_best = x_i, y_i

        diff = y_i - y_curr

        # fast annealing schedule and Metropolis criterion
        temp = start_temp / float(n + 1)
        metropolis = math.exp(-diff / temp)

        # accept improvements, and worse points with probability metropolis
        if diff <= 0 or rd.random() < metropolis:
            x_curr, y_curr = x_i, y_i

    return [y_best, x_best]

Note that we no longer have to count the iterations manually, since we are using a for loop.

If we test the function, we see that for well-chosen parameters the algorithm finds the minimum with a good approximation.

def fun(x):
    return x**2 + np.sin(x**4)

simulated_annealing(f = fun, domain = [-3, 3], step_size = 1, start_temp = 100, max_iter = 1000)

#> [9.915548806706291e-08, -0.00031488962865622305]
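
Since the algorithm is stochastic, repeated runs return slightly different values. If you need a reproducible result, you can fix the seed of the random module before calling the function (the seed value below is arbitrary):

rd.seed(42)  # any integer works; fixing it makes the run reproducible
simulated_annealing(f = fun, domain = [-3, 3], step_size = 1, start_temp = 100, max_iter = 1000)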

Finally, we plot what we got (the blue line is the real minimum while the red one is our result):

import numpy as np
from matplotlib import pyplot as plt

def fun(x):
    return x**2 + np.sin(x**4)

x = np.linspace(-3, 3, 1000)
y = fun(x)

plt.plot(x, y)

plt.axvline(x = 0, color = "blue", label = "real minimum")
plt.axvline(x = -0.00031488962865622305, color = "red", label = "approximate minimum")
plt.legend(bbox_to_anchor = (1.0, 1), loc = "upper left")

plt.show()

In the picture, the approximate minimum overlaps the real minimum (they are so close that only the approximate minimum is visible).

Beyond 2D

Of course, the algorithm also works in more than one dimension, but the function needs some adjustments. In particular, we have to define a domain for y as well:

def simulated_annealing_3d(f, domain_x, domain_y, step_size, start_temp, max_iter = 1000):

    x_0 = rd.uniform(domain_x[0], domain_x[1])
    y_0 = rd.uniform(domain_y[0], domain_y[1])
    z_0 = f(x_0, y_0)
    x_curr, y_curr, z_curr = x_0, y_0, z_0
    x_best, y_best, z_best = x_0, y_0, z_0

    for n in range(max_iter):
        x_i = x_curr + step_size * rd.uniform(-1, 1)
        y_i = y_curr + step_size * rd.uniform(-1, 1)
        z_i = f(x_i, y_i)

        if z_i < z_best:
            x_best, y_best, z_best = x_i, y_i, z_i

        diff = z_i - z_curr

        temp = start_temp / (n + 1)       
        metropolis = math.exp(-diff / temp)

        if diff <= 0 or rd.random() < metropolis:
            x_curr, y_curr, z_curr = x_i, y_i, z_i

    return [z_best, y_best, x_best]

Let's test the function:

def fun_3d(x, y):
    return (x-y)**2 + (x+y)**2

simulated_annealing_3d(f = fun_3d, domain_x = [-5, 5], domain_y = [-5, 5], step_size = 1, start_temp = 1000, max_iter = 10000)

#> [0.0007833147844967029, 0.018319454959260906, 0.007486986192318135]

If we plot the result:

import numpy as np
from matplotlib import pyplot as plt

x = np.linspace(-.1, .1, 20)
y = np.linspace(-.1, .1, 20)
X, Y = np.meshgrid(x, y)
Z = fun_3d(X,Y)

# vertical line at the real minimum (x = 0, y = 0)
a = np.repeat(0, 50)
b = np.repeat(0, 50)
c = np.arange(0, .05, .001)

# vertical line at the approximate minimum found above
a_ = np.repeat(0.007486986192318135, 50)
b_ = np.repeat(0.018319454959260906, 50)
c_ = np.arange(0, .05, .001)

fig = plt.figure(figsize=(4,4))
ax = fig.add_subplot(111, projection='3d')
ax.plot_wireframe(X, Y, Z, color = "red", linewidth = .3)
ax.plot(a, b, c, color = "blue", label = "real minimum")
ax.plot(a_, b_, c_, color = "green", label = "approximate minimum")


ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_zlabel("z")
plt.legend(bbox_to_anchor = (1.0, 1), loc = "upper left") 

plt.show()

Zooming in, we can appreciate the error.


And that's it for this article.

Thanks for reading.

For any question or suggestion related to what I covered in this article, please add it as a comment. For special needs, you can contact me here.