
How to do multistart optimizations#

How to turn on multistart#

Turning on multistart literally just means adding multistart=True when you call maximize or minimize and adding soft sampling bounds to the params DataFrame. Of course, you can configure every aspect of the multistart optimization if you want. But if you don't, we pick good defaults for you.

Let’s look at the well-known “sphere” example again:

[12]:
import pandas as pd

from estimagic import minimize


def sphere(params):
    return params["value"] @ params["value"]


params = pd.DataFrame()
params["value"] = [1, 2, 3]
params["soft_lower_bound"] = -5
params["soft_upper_bound"] = 10
params
[12]:
   value  soft_lower_bound  soft_upper_bound
0      1                -5                10
1      2                -5                10
2      3                -5                10
[13]:
res = minimize(
    criterion=sphere,
    params=params,
    algorithm="scipy_lbfgsb",
    multistart=True,
)
res["solution_params"]
[13]:
   lower_bound  soft_lower_bound  soft_upper_bound  upper_bound         value
0         -inf                -5                10          inf -2.236443e-09
1         -inf                -5                10          inf -2.064518e-09
2         -inf                -5                10          inf -5.864428e-10

Understanding multistart results#

The results of a multistart optimization are exactly the same as the results of a standard optimization, but with additional information:

  • res["multistart_info"]["local_optima"] is a list with the results from all optimizations that were performed

  • res["multistart_info"]["start_parameters"] is a list with the start parameters from those optimizations

  • res["multistart_info"]["exploration_sample"] is a list with parameter vectors at which the criterion function was evaluated in an initial exploration phase.

  • res["multistart_info"]["exploration_results"] are the corresponding criterion values.

What does multistart mean in estimagic?#

The way we do multistart optimizations is inspired by the TikTak algorithm. Our multistart optimizations consist of the following steps:

  1. Draw a large sample of parameter vectors, randomly or using a low-discrepancy sequence. The sample size, sampling method, distribution, and more can be configured.

  2. Evaluate the criterion function in parallel on all parameter vectors.

  3. Sort the parameter vectors from best to worst.

  4. Run local optimizations. The first local optimization is started from the best parameter vector in the sample. All subsequent ones are started from a convex combination of the currently best known parameter vector and the next sample point.
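To make step 4 concrete, here is an illustrative sketch of how a start vector can be formed as a convex combination of the currently best known parameter vector and the next sample point. The function name and the linear weight schedule below are illustrative assumptions, not estimagic's internal implementation:

import numpy as np


def illustrative_start_point(best_x, next_sample_x, iteration, n_iterations,
                             min_weight=0.1, max_weight=0.995):
    # Weight on the currently best parameter vector, growing linearly
    # from min_weight to max_weight over the course of the local optimizations.
    # (Illustrative schedule; estimagic's default is the "tiktak" weighting.)
    weight = min_weight + (max_weight - min_weight) * iteration / max(n_iterations - 1, 1)
    # Convex combination of the best known point and the next sample point.
    return weight * np.asarray(best_x) + (1 - weight) * np.asarray(next_sample_x)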

How to configure multistart?#

As you can imagine from the above description, there are many details that can be configured. This can be done by passing a dictionary as multistart_options when calling minimize or maximize. Let’s look at an extreme example where we manually set everything to its default value:

[14]:
options = {
    # Set the number of points at which the criterion is evaluated
    # in the exploration phase
    "n_samples": 10 * len(params),
    # Pass in a DataFrame or array with a custom sample
    # for the exploration phase.
    "sample": None,
    # Determine number of optimizations, relative to n_samples
    "share_optimizations": 0.1,
    # Determine distribution from which sample is drawn
    # Allowed: ["uniform", "triangle"]
    "sampling_distribution": "uniform",
    # Determine sampling method. Allowed: ["sobol", "random",
    # "halton", "hammersley", "korobov", "latin_hypercube"]
    "sampling_method": "sobol",
    # Determine how start parameters for local optimizations are
    # calculated. Allowed: ["tiktak", "linear"] or a custom
    # function with arguments iteration, n_iterations, min_weight,
    # and max_weight
    "mixing_weight_method": "tiktak",
    # Determine bounds on mixing weights.
    "mixing_weight_bounds": (0.1, 0.995),
    # Determine after how many re-discoveries of the currently best
    # local optimum the multistart optimization converges.
    "convergence.max_discoveries": 2,
    # Determine the maximum relative distance two parameter vectors
    # can have to be considered equal for convergence purposes:
    "convergence.relative_params_tolerance": 0.01,
    # Determine how many cores are used
    "n_cores": 1,
    # Determine which batch_evaluator is used:
    "batch_evaluator": "joblib",
    # Determine the batch size. It must be at least as large as n_cores.
    # Setting the batch size larger than n_cores allows to reproduce
    # the exact results of a highly parallel optimization on a smaller
    # machine.
    "batch_size": 1,
    # Set the random seed:
    "seed": None,
    # Set how errors are handled during the exploration phase:
    # Allowed: ["raise", "continue"]
    "exploration_error_handling": "continue",
    # Set how errors are handled during the optimization phase:
    # Allowed: ["raise", "continue"]
    "optimization_error_handling": "continue",
}

res = minimize(
    criterion=sphere,
    params=params,
    algorithm="scipy_lbfgsb",
    multistart=True,
    multistart_options=options,
)
res["solution_params"]
[14]:
   lower_bound  soft_lower_bound  soft_upper_bound  upper_bound         value
0         -inf                -5                10          inf -2.236443e-09
1         -inf                -5                10          inf -2.064518e-09
2         -inf                -5                10          inf -5.864428e-10