How to do multistart optimizations#

How to turn on multistart#

Turning on multistart literally just means adding multistart=True when you call maximize or minimize and adding sampling bounds to the params DataFrame. Of course, you can configure every aspect of the multistart optimization if you want. But if you don’t, we pick good defaults for you.

Let’s look at the well known “sphere” example again:

import estimagic as em
import pandas as pd
def sphere(params):
    return params["value"] @ params["value"]
params = pd.DataFrame()
params["value"] = [1, 2, 3]
params["soft_lower_bound"] = -5
params["soft_upper_bound"] = 10
params
value soft_lower_bound soft_upper_bound
0 1 -5 10
1 2 -5 10
2 3 -5 10
res = em.minimize(
    criterion=sphere,
    params=params,
    algorithm="scipy_lbfgsb",
    multistart=True,
)
res.params.round(5)
value soft_lower_bound soft_upper_bound
0 -0.0 -5 10
1 -0.0 -5 10
2 0.0 -5 10

Understanding multistart results#

The OptimizeResult result object of a multistart optimization has exactly the same structure as OptimizeResult from a standard optimization but with additional information.

  • res.multistart_info["local_optima"] is a list with the results from all optimizations that were performed

  • res.multistart_info["start_parameters"] is a list with the start parameters from those optimizations

  • res.multistart_info["exploration_sample"] is a list with parameter vectors at which the criterion function was evaluated in an initial exploration phase.

  • res.multistart_info["exploration_results"] are the corresponding criterion values.

What does multistart mean in estimagic?#

The way we do multistart optimizations is inspired by the TikTak algorithm. Our multistart optimizations consist of the following steps:

  1. Draw a large sample of parameter vectors, randomly or using a low-discrepancy sequence. The size, sampling method, distribution, and more things can be configured.

  2. Evaluate the criterion function in parallel on all parameter vectors.

  3. Sort the parameter vectors from best to worst.

  4. Run local optimizations. The first local optimization is started from the best parameter vector in the sample. All subsequent ones are started from a convex combination of the currently best known parameter vector and the next sample point.

How to configure mutlistart?#

As you can imagine from the above description, there are many details that can be configured. This can be done by adding a dictionary with multistart_options when calling minimize or maximize. Let’s look at an extreme example where we manually set everything to it’s default value:

options = {
    # Set the number of points at which criterion is evaluated
    # in the exploration phase
    "n_samples": 10 * len(params),
    # Pass in a DataFrame or array with a custom sample
    # for the exploration phase.
    "sample": None,
    # Determine number of optimizations, relative to n_samples
    "share_optimizations": 0.1,
    # Determine distribution from which sample is drawn
    "sampling_distribution": "uniform",
    # Determine sampling method. Allowed: ["sobol", "random",
    # "halton", "hammersley", "korobov", "latin_hypercube"]
    "sampling_method": "sobol",
    # Determine how start parameters for local optimizations are
    # calculated. Allowed: ["tiktak", "linear"] or a custom
    # function with arguments iteration, n_iterations, min_weight,
    # and max_weight
    "mixing_weight_method": "tiktak",
    # Determine bounds on mixing weights.
    "mixing_weight_bounds": (0.1, 0.995),
    # Determine after how many re-discoveries of the currently best
    # local optimum the multistart optimization converges.
    "convergence.max_discoveries": 2,
    # Determine the maximum relative distance two parameter vectors
    # can have to be considered equal for convergence purposes:
    "convergence.relative_params_tolerance": 0.01,
    # Determine how many cores are used
    "n_cores": 1,
    # Determine which batch_evaluator is used:
    "batch_evaluator": "joblib",
    # Determine the batch size. It must be larger than n_cores.
    # Setting the batch size larger than n_cores allows to reproduce
    # the exact results of a highly parallel optimization on a smaller
    # machine.
    "batch_size": 1,
    # Set the random seed:
    "seed": None,
    # Set how errors are handled during the exploration phase:
    "exploration_error_handling": "continue",
    # Set how errors are handled during the optimization phase:
    "optimization_error_handling": "continue",
}
res = em.minimize(
    criterion=sphere,
    params=params,
    algorithm="scipy_lbfgsb",
    multistart=True,
    multistart_options=options,
)

res
Minimize with 3 free parameters terminated successfully after 6 criterion evaluations, 6 derivative evaluations and 4 iterations.

The value of criterion improved from 14.0 to 6.38558109434918e-18.

The multistart_scipy_lbfgsb algorithm reported: CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL

Independent of the convergence criteria used by multistart_scipy_lbfgsb, the strength of convergence can be assessed by the following criteria:

                             one_step     five_steps 
relative_criterion_change   3.06e-14***   3.06e-14***
relative_params_change     5.482e-07*    5.482e-07*  
absolute_criterion_change   3.06e-15***   3.06e-15***
absolute_params_change     5.482e-08*    5.482e-08*  

(***: change <= 1e-10, **: change <= 1e-8, *: change <= 1e-5. Change refers to a change between accepted steps. The first column only considers the last step. The second column considers the last five steps.)
res.params.round(5)
value soft_lower_bound soft_upper_bound
0 -0.0 -5 10
1 -0.0 -5 10
2 0.0 -5 10
res.multistart_info["local_optima"]
[Minimize with 3 free parameters terminated successfully after 3 criterion evaluations, 3 derivative evaluations and 2 iterations.
 
 The scipy_lbfgsb algorithm reported: CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL
 
 Independent of the convergence criteria used by scipy_lbfgsb, the strength of convergence can be assessed by the following criteria:
 
                            one_step five_steps
 relative_criterion_change  15.17      49.8    
 relative_params_change     12.32     22.32    
 absolute_criterion_change  1.517      4.98    
 absolute_params_change     1.232     2.232    
 
 (***: change <= 1e-10, **: change <= 1e-8, *: change <= 1e-5. Change refers to a change between accepted steps. The first column only considers the last step. The second column considers the last five steps.),
 Minimize with 3 free parameters terminated successfully after 3 criterion evaluations, 3 derivative evaluations and 2 iterations.
 
 The scipy_lbfgsb algorithm reported: CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL
 
 Independent of the convergence criteria used by scipy_lbfgsb, the strength of convergence can be assessed by the following criteria:
 
                             one_step  five_steps
 relative_criterion_change   0.4662     14.78    
 relative_params_change       2.159     12.16    
 absolute_criterion_change  0.04662     1.478    
 absolute_params_change      0.2159     1.216    
 
 (***: change <= 1e-10, **: change <= 1e-8, *: change <= 1e-5. Change refers to a change between accepted steps. The first column only considers the last step. The second column considers the last five steps.)]
res.multistart_info["start_parameters"]
[    value  soft_lower_bound  soft_upper_bound
 0 -0.3125                -5                10
 1 -2.1875                -5                10
 2 -0.3125                -5                10,
       value  soft_lower_bound  soft_upper_bound
 0 -0.330195                -5                10
 1 -0.330195                -5                10
 2 -1.122663                -5                10]
res.multistart_info["start_parameters"]
[    value  soft_lower_bound  soft_upper_bound
 0 -0.3125                -5                10
 1 -2.1875                -5                10
 2 -0.3125                -5                10,
       value  soft_lower_bound  soft_upper_bound
 0 -0.330195                -5                10
 1 -0.330195                -5                10
 2 -1.122663                -5                10]
res.multistart_info["exploration_results"]
array([  4.98046875,   8.27636719,  14.        ,  18.75      ,
        19.04296875,  19.921875  ,  21.16699219,  22.92480469,
        29.296875  ,  30.76171875,  42.1875    ,  48.70605469,
        64.52636719,  66.87011719,  67.45605469,  74.48730469,
        75.65917969,  75.65917969,  79.6875    ,  82.32421875,
        87.01171875,  94.921875  , 106.12792969, 110.44921875,
       113.74511719, 120.19042969, 126.85546875, 131.54296875,
       141.796875  , 196.94824219])