Download the notebook here

[1]:
import numpy as np
import pandas as pd

Numerical optimization#

This tutorial shows how to run an optimization with estimagic, using simple examples. More details on the topics covered here can be found in the how-to guides.

Basic usage of minimize#

[2]:
from estimagic import minimize


def sphere(params):
    return params @ params


res = minimize(
    criterion=sphere,
    params=np.arange(5),
    algorithm="scipy_lbfgsb",
)

res.params.round(5)
[2]:
array([ 0., -0., -0., -0., -0.])

params do not have to be vectors#

In estimagic, params can be arbitrary pytrees. Examples are (nested) dictionaries of numbers, arrays, and pandas objects.

[3]:
def dict_sphere(params):
    return params["a"] ** 2 + params["b"] ** 2 + (params["c"] ** 2).sum()


res = minimize(
    criterion=dict_sphere,
    params={"a": 0, "b": 1, "c": pd.Series([2, 3, 4])},
    algorithm="scipy_powell",
)

res.params
[3]:
{'a': -6.821706323446569e-25,
 'b': 2.220446049250313e-16,
 'c': 0    8.881784e-16
 1    8.881784e-16
 2    1.776357e-15
 dtype: float64}
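Under the hood, pytree params are flattened into a single vector before the optimizer sees them, and the result is unflattened back into the original structure. The toy sketch below illustrates the idea for a flat dictionary of scalars and pandas Series; estimagic's real pytree registry handles far more cases, so treat this only as an illustration of the concept.

```python
import numpy as np
import pandas as pd


def flatten_params(params):
    """Flatten a dict of scalars and pandas Series into one vector.

    This is a simplified sketch of pytree flattening, not estimagic's
    actual implementation.
    """
    leaves = []
    for key in sorted(params):
        value = params[key]
        if isinstance(value, pd.Series):
            leaves.append(value.to_numpy(dtype=float))
        else:
            leaves.append(np.atleast_1d(np.asarray(value, dtype=float)))
    return np.concatenate(leaves)


flat = flatten_params({"a": 0, "b": 1, "c": pd.Series([2, 3, 4])})
```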

The result contains all you need to know#

[4]:
res = minimize(
    criterion=dict_sphere,
    params={"a": 0, "b": 1, "c": pd.Series([2, 3, 4])},
    algorithm="scipy_neldermead",
)
res
[4]:
Minimize with 5 free parameters terminated successfully after 805 criterion evaluations and 507 iterations.

The value of criterion improved from 30.0 to 1.6760003634613059e-16.

The scipy_neldermead algorithm reported: Optimization terminated successfully.

Independent of the convergence criteria used by scipy_neldermead, the strength of convergence can be assessed by the following criteria:

                             one_step     five_steps
relative_criterion_change  1.968e-15***  2.746e-15***
relative_params_change     9.834e-08*    1.525e-07*
absolute_criterion_change  1.968e-16***  2.746e-16***
absolute_params_change     9.834e-09**   1.525e-08*

(***: change <= 1e-10, **: change <= 1e-8, *: change <= 1e-5. Change refers to a change between accepted steps. The first column only considers the last step. The second column considers the last five steps.)
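To build intuition for these diagnostics, the sketch below computes one-step and five-step criterion changes from a history of accepted criterion values. The exact formulas estimagic uses may differ (see its documentation); this is just a plausible version of the idea.

```python
import numpy as np


def criterion_changes(history, n_steps):
    """Absolute and relative criterion change over the last n_steps.

    A sketch of the kind of convergence diagnostic shown above; not
    estimagic's actual formulas.
    """
    history = np.asarray(history, dtype=float)
    absolute = abs(history[-1] - history[-1 - n_steps])
    # Guard the denominator so the relative change is defined near zero.
    relative = absolute / max(abs(history[-1]), 1e-8)
    return absolute, relative


history = [30.0, 4.0, 0.5, 0.01, 1e-6, 2e-16]
abs_one, rel_one = criterion_changes(history, n_steps=1)
abs_five, rel_five = criterion_changes(history, n_steps=5)
```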

You can visualize the convergence#

[5]:
from estimagic import criterion_plot, params_plot

fig = criterion_plot(res, max_evaluations=300)
fig.show(renderer="png")
[image: criterion plot of the optimization]
[6]:
fig = params_plot(
    res,
    max_evaluations=300,
    # optionally select a subset of parameters to plot
    selector=lambda params: params["c"],
)
fig.show(renderer="png")
[image: params plot for the selected parameters]

There are many optimizers#

If you install some optional dependencies, you can choose from a large (and growing) set of optimization algorithms – all with the same interface!

For example, we wrap optimizers from scipy.optimize, nlopt, cyipopt, pygmo, fides, tao and others.

We also have some optimizers that are not part of other packages. Examples are a parallel Nelder-Mead algorithm, the BHHH algorithm, and a parallel Pounders algorithm.

The full list is here

You can add bounds#

[7]:
res = minimize(
    criterion=sphere,
    params=np.arange(5),
    algorithm="scipy_lbfgsb",
    lower_bounds=np.arange(5) - 2,
    upper_bounds=np.array([10, 10, 10, np.inf, np.inf]),
)

res.params.round(5)
[7]:
array([0., 0., 0., 1., 2.])

You can fix parameters#

[8]:
res = minimize(
    criterion=sphere,
    params=np.arange(5),
    algorithm="scipy_lbfgsb",
    constraints=[{"loc": [1, 3], "type": "fixed"}],
)

res.params.round(5)
[8]:
array([0., 1., 0., 3., 0.])

Or impose other constraints#

As an example, let’s impose the constraint that the first three parameters are valid probabilities, i.e. they are between zero and one and sum to one:

[9]:
res = minimize(
    criterion=sphere,
    params=np.array([0.1, 0.5, 0.4, 4, 5]),
    algorithm="scipy_lbfgsb",
    constraints=[{"loc": [0, 1, 2], "type": "probability"}],
)

res.params.round(5)
[9]:
array([ 0.33334,  0.33333,  0.33333, -0.     ,  0.     ])

For a full overview of the constraints we support and the syntax, check out the documentation.

Note that "scipy_lbfgsb" is not a constrained optimizer. If you want to know how we achieve this, check out the explanations.
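One generic way to make an unconstrained optimizer respect a probability constraint is to optimize over free variables and map them onto the simplex, for example with a softmax. The sketch below demonstrates this with scipy.optimize; it is only an illustration of the reparametrization idea, and estimagic's internal transformation may differ.

```python
import numpy as np
from scipy.optimize import minimize as scipy_minimize


def softmax(z):
    e = np.exp(z - z.max())  # shift for numerical stability
    return e / e.sum()


def criterion(x):
    probs = softmax(x[:3])  # first three free variables -> probabilities
    rest = x[3:]            # remaining entries stay unconstrained
    return probs @ probs + rest @ rest


res = scipy_minimize(criterion, x0=np.array([0.1, 0.5, 0.4, 4.0, 5.0]))
probs = softmax(res.x[:3])
# probs always sum to one, so the optimizer never leaves the simplex.
```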

There is also maximize#

If you ever forgot to switch back the sign of your criterion function after doing a maximization with scipy.optimize.minimize, there is good news:

[10]:
from estimagic import maximize


def upside_down_sphere(params):
    return -params @ params


res = maximize(
    criterion=upside_down_sphere,
    params=np.arange(5),
    algorithm="scipy_bfgs",
)

res.params.round(5)
[10]:
array([ 0., -0., -0.,  0., -0.])

You can provide closed form derivatives#

[11]:
def sphere_gradient(params):
    return 2 * params


res = minimize(
    criterion=sphere,
    params=np.arange(5),
    algorithm="scipy_lbfgsb",
    derivative=sphere_gradient,
)
res.params.round(5)
[11]:
array([ 0., -0., -0., -0., -0.])
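Before handing a closed-form derivative to an optimizer, it is worth checking it against finite differences. The pure-numpy sketch below compares the gradient above with a central finite-difference approximation (estimagic also offers numerical differentiation tools for this; here we keep the check self-contained).

```python
import numpy as np


def sphere(params):
    return params @ params


def sphere_gradient(params):
    return 2 * params


def finite_difference_gradient(fun, params, eps=1e-6):
    """Central finite-difference gradient, one coordinate at a time."""
    grad = np.zeros_like(params, dtype=float)
    for i in range(len(params)):
        step = np.zeros_like(params, dtype=float)
        step[i] = eps
        grad[i] = (fun(params + step) - fun(params - step)) / (2 * eps)
    return grad


x = np.arange(5, dtype=float)
numeric = finite_difference_gradient(sphere, x)
analytic = sphere_gradient(x)
```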

Or use parallelized numerical derivatives#

[12]:
res = minimize(
    criterion=sphere,
    params=np.arange(5),
    algorithm="scipy_lbfgsb",
    numdiff_options={"n_cores": 6},
)

res.params.round(5)
[12]:
array([ 0., -0., -0., -0., -0.])
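Numerical derivatives parallelize well because the function evaluations at the perturbed points are independent. The sketch below dispatches them to a thread pool; estimagic's `n_cores` option uses its own parallelization, so this is only meant to illustrate why the speedup is possible.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor


def sphere(params):
    return params @ params


def parallel_fd_gradient(fun, params, eps=1e-6, max_workers=2):
    """Central finite differences with parallel function evaluations."""
    points = []
    for i in range(len(params)):
        step = np.zeros_like(params, dtype=float)
        step[i] = eps
        points += [params + step, params - step]
    # All evaluations are independent, so they can run concurrently.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        values = list(pool.map(fun, points))
    return np.array(
        [(values[2 * i] - values[2 * i + 1]) / (2 * eps) for i in range(len(params))]
    )


grad = parallel_fd_gradient(sphere, np.arange(5, dtype=float))
```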

Turn local optimizers global with multistart#

[13]:
res = minimize(
    criterion=sphere,
    params=np.arange(10),
    algorithm="scipy_neldermead",
    soft_lower_bounds=np.full(10, -5),
    soft_upper_bounds=np.full(10, 15),
    multistart=True,
    multistart_options={"convergence.max_discoveries": 5},
)
res.params.round(5)
[13]:
array([ 0.,  0., -0., -0.,  0., -0.,  0., -0., -0., -0.])
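Conceptually, multistart samples start points within the soft bounds, runs a local optimizer from each, and keeps the best result. The simplified scipy-based sketch below shows the idea; estimagic's actual multistart is more sophisticated (e.g. it stops exploring after `convergence.max_discoveries` distinct optima are found).

```python
import numpy as np
from scipy.optimize import minimize as scipy_minimize


def sphere(x):
    return x @ x


# Sample start points within the soft bounds used above.
rng = np.random.default_rng(0)
starts = rng.uniform(low=-5, high=15, size=(5, 10))

# Run a local optimization from each start and keep the best result.
results = [scipy_minimize(sphere, x0=start) for start in starts]
best = min(results, key=lambda r: r.fun)
```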

And plot the criterion history of all local optimizations#

[14]:
fig = criterion_plot(res)
fig.show(renderer="png")
[image: criterion plot of all multistart local optimizations]

Exploit the structure of your optimization problem#

Many estimation problems have a least-squares structure. If so, specialized optimizers that exploit this structure can be faster than standard optimizers. Other problems have at least a sum structure (e.g. likelihood functions) that some optimizers can exploit.

If you define your criterion function accordingly, you can seamlessly switch between least-squares, sum-structure, and standard optimizers.

[15]:
def general_sphere(params):
    contribs = params**2
    out = {
        # root_contributions are the least-squares residuals;
        # squaring and summing them yields the criterion value
        "root_contributions": params,
        # summing the contributions also yields the criterion value
        "contributions": contribs,
        # this is the standard scalar output
        "value": contribs.sum(),
    }
    return out


res = minimize(
    criterion=general_sphere,
    params=np.arange(5),
    algorithm="pounders",
)
res.params.round(5)
[15]:
array([ 0.,  0., -0.,  0., -0.])
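The three representations returned by `general_sphere` are consistent by construction: summing the squared least-squares residuals, or summing the contributions, both reproduce the scalar value. A quick pure-numpy check:

```python
import numpy as np


def general_sphere(params):
    contribs = params**2
    return {
        "root_contributions": params,   # least-squares residuals
        "contributions": contribs,      # summands of the criterion
        "value": contribs.sum(),        # standard scalar output
    }


out = general_sphere(np.array([1.0, 2.0, 3.0]))
sum_of_squares = (out["root_contributions"] ** 2).sum()
sum_of_contribs = out["contributions"].sum()
```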

Using and reading persistent logging#

For long-running and difficult optimizations, it can be helpful to store the progress in a persistent log file. You can do this by providing a path as the logging argument:

[16]:
res = minimize(
    criterion=sphere,
    params=np.arange(5),
    algorithm="scipy_lbfgsb",
    logging="my_log.db",
    log_options={"if_database_exists": "replace"},
)

You can read the entries in the log file (while the optimization is still running or afterwards) as follows:

[17]:
from estimagic import OptimizeLogReader

reader = OptimizeLogReader("my_log.db")
reader.read_history().keys()
[17]:
dict_keys(['params', 'criterion', 'runtime'])

For more info on what you can do with the log and the log reader, check out the logging tutorial.

The persistent log file is updated as soon as the optimizer tries a new parameter vector. This is very handy if an optimization has to be aborted and you want to extract the current status. It is also used by the estimagic dashboard.

Customize your optimizer#

Most algorithms have a few optional arguments. Examples are convergence criteria or tuning parameters. You can find an overview of supported arguments here.

[18]:
algo_options = {
    "convergence.relative_criterion_tolerance": 1e-9,
    "stopping.max_iterations": 100_000,
}

res = minimize(
    criterion=sphere,
    params=np.arange(5),
    algorithm="scipy_lbfgsb",
    algo_options=algo_options,
)
res.params.round(5)
[18]:
array([ 0., -0., -0., -0., -0.])