How to handle errors during optimization#
Try to avoid errors#
Often, optimizers try quite extreme parameter vectors, which then can raise errors in your criterion function or derivative. Even though estimagic makes it simple to restart your optimization from the last parameter value, this is annoying. Below is a very short list of things you can do to avoid this behavior:
Set bounds for your parameters that prevent extreme parameter combinations.
Use the bounds_distance option with a not too small value for covariance and sdcorr constraints.
Use robust_cholesky() instead of normal Cholesky decompositions, or try to avoid Cholesky decompositions by restructuring your algorithm.
Avoid taking np.exp without further safeguards. With 64-bit floating point numbers, the exponential function is only well defined roughly between -700 and 700. Below that range it underflows to 0, above it overflows to inf. Sometimes you can use scipy.special.logsumexp to avoid unsafe evaluations of the exponential function. Otherwise you can avoid problems by setting bounds. In the worst case, use clipping. Note, however, that clipping leads to flat regions in your criterion function, which can lead to erroneous convergence.
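The idea behind the log-sum-exp safeguard can be sketched in pure Python. This is only a minimal illustration of the trick, not the implementation of scipy.special.logsumexp; the helper name stable_logsumexp is hypothetical. Note that math.exp raises an OverflowError where np.exp would return inf.

```python
import math


def stable_logsumexp(values):
    """Compute log(sum(exp(v) for v in values)) without overflow.

    Hypothetical helper for illustration only; in practice you would
    use scipy.special.logsumexp.
    """
    shift = max(values)
    # After subtracting the maximum, every exponent is <= 0,
    # so exp() cannot overflow.
    total = sum(math.exp(v - shift) for v in values)
    return shift + math.log(total)


values = [1000.0, 1000.0]

# The naive formula fails because exp(1000) exceeds the float64 range.
try:
    naive = math.log(sum(math.exp(v) for v in values))
except OverflowError:
    naive = None

# The shifted version evaluates to 1000 + log(2).
stable = stable_logsumexp(values)
```

Shifting by the maximum changes nothing mathematically, but it keeps all intermediate values in a range where the exponential function is safe to evaluate.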
The two levels of error handling#
Despite all efforts, some errors cannot be avoided. Therefore, you have a lot of control over error handling during the optimization. The two levels on which you can configure it are:
The error_handling and error_penalty arguments of maximize and minimize#
These two arguments determine whether the optimization algorithm sees an error that occurs during a criterion or gradient evaluation.
error_handling takes the values "raise" and "continue". If "raise", the error is not caught and the optimizer will stop or handle it. If "continue", we replace the criterion value by a penalty term that can be fixed or parameter dependent, and the optimizer will never know that an error occurred. Note that you will still be warned about all errors.
The default error handling is "raise".
error_penalty is a dict with the entries "constant" (float) and "slope" (float), which determine the value of the penalty. The penalty function is then calculated as
constant + slope * norm(params - start_params)
where norm is the Euclidean norm, so the penalty changes with the Euclidean distance from the start parameters.
Making the penalty parameter dependent via the slope is meant to avoid flat spots in the penalized region. For minimization problems, a positive slope guides the optimizer back towards the start parameters until it reaches a valid region again. The same holds for a negative slope in maximization problems.
You can deactivate this by setting the slope to 0.
The default constant is f0 + abs(f0) + 100 for minimizations and f0 - abs(f0) - 100 for maximizations, where f0 is the criterion value at start parameters. The default slope is 0.1 for minimization problems and -0.1 for maximization problems.
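The penalty formula and the defaults described above can be sketched as a small wrapper around a criterion function. This is an illustration of the documented behavior under "continue", not estimagic's actual implementation; the names default_penalty, penalized and the toy criterion are hypothetical.

```python
import math
import warnings


def default_penalty(f0, direction):
    """Default constant and slope as described in the text (hypothetical helper)."""
    if direction == "minimize":
        return {"constant": f0 + abs(f0) + 100, "slope": 0.1}
    return {"constant": f0 - abs(f0) - 100, "slope": -0.1}


def penalized(criterion, start_params, error_penalty):
    """Wrap a criterion so that errors are replaced by a penalty value."""

    def wrapped(params):
        try:
            return criterion(params)
        except Exception as err:
            # The user is still warned, but the optimizer only sees a number.
            warnings.warn(f"Criterion failed at {params}: {err!r}")
            distance = math.dist(params, start_params)  # Euclidean distance
            return error_penalty["constant"] + error_penalty["slope"] * distance

    return wrapped


def criterion(params):
    # Toy criterion that fails if the first parameter is negative.
    if params[0] < 0:
        raise ValueError("first parameter must be non-negative")
    return sum(p**2 for p in params)


start_params = [1.0, 2.0]
f0 = criterion(start_params)  # 1 + 4 = 5.0
penalty = default_penalty(f0, "minimize")  # constant 110.0, slope 0.1
safe_criterion = penalized(criterion, start_params, penalty)

with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    # Distance to the start parameters is 5.0, so the penalty is 110.0 + 0.1 * 5.0.
    value = safe_criterion([-2.0, 6.0])
```

Because the penalty grows with the distance from the start parameters, a minimizer that wanders into the failing region sees a slope pointing back towards valid parameters instead of a flat plateau.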
The error_handling entry in the batch_evaluator_options#
This argument determines when you get notified about a failed optimization.
It is mainly relevant when using estimagic’s ability to run several optimizations in parallel.
It can take the values "raise" and "continue". If "raise", you will get an error as soon as any optimization fails. Otherwise, estimagic runs all optimizations even if some of them fail. The result of the failing optimizations will contain the traceback of the error.
The default value is "continue" if more than one optimization is run and "raise" if you run only one optimization.
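The difference between the two settings can be illustrated with a simplified, sequential stand-in for a batch evaluator. This is a sketch of the semantics described above, not estimagic's batch evaluator; run_batch and toy_optimization are hypothetical names, and a real batch evaluator would run the tasks in parallel.

```python
import traceback


def run_batch(func, arguments, error_handling="continue"):
    """Evaluate func on each argument, mimicking the two error handling modes."""
    results = []
    for arg in arguments:
        try:
            results.append(func(arg))
        except Exception:
            if error_handling == "raise":
                # Stop at the first failure.
                raise
            # Keep going and store the traceback as the result of this task.
            results.append(traceback.format_exc())
    return results


def toy_optimization(start):
    # Stand-in for a full optimization that fails for negative start values.
    if start < 0:
        raise ValueError("invalid start value")
    return start * 0.5


results = run_batch(toy_optimization, [2.0, -1.0, 4.0], error_handling="continue")
# results[0] and results[2] are numbers, results[1] is a traceback string.
```

With "continue", one failing start value does not discard the work done for the others; you inspect the stored tracebacks afterwards.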