How supported optimization algorithms are tested#

estimagic provides a unified interface that supports a large number of optimization algorithms from different libraries. Additionally, it allows putting constraints on the optimization problem. To test the external interface of all supported algorithms, we consider different criterion (benchmark) functions and test each algorithm with every type of constraint.

Benchmark functions for testing#

Trid function#

\(f({x}) = \Sigma^{D}_{i=1}(x_{i} - 1)^2 - \Sigma^{D}_{i=2}(x_i x_{i-1})\)

Rotated Hyper Ellipsoid function#

\(f({x}) = \Sigma^{D}_{i=1} \Sigma^{i}_{j=1}x_j^2\)

Rosenbrock function#

\(\Sigma^{D-1}_{i=1}(100(x_i+1 - x_i^2)^2 + (x_i - 1)^2)\)

Sphere function#

\(f({x}) = \Sigma^{D}_{i=1} ix_{i}^2\)

How testcases are implemented#

We consider different implementations of each criterion and its gradient. All algorithms accept criterion functions specified in a dictionary, while a subset also accepts the criterion specified in scalar form. Likewise, if specified, the gradient of a criterion can be an np.ndarray or a pandas object. We test for all possible cases. For instance, for rotated hyper ellipsoid, we implement the following functions:

rotated_hyper_ellipsoid_scalar_criterion
rotated_hyper_ellipsoid_dict_criterion: This provides a dictionary wherein the contributions and root_contributions keys present the criterion as a least squares problem, relevant when we are testing a least squares algorithm.
rotated_hyper_ellipsoid_gradient
rotated_hyper_ellipsoid_pandas_gradient: Computes the gradient of the rotated hyper ellipsoid function, as a pandas object.
rotated_hyper_ellipsoid_criterion_and_gradient

These criterion functions are specified in the examples directory. For an overview of all constraints supported in estimagic, please see this how-to guide.

We write several test functions, each corresponding to the case of one constraint. Given the constraint, the test function considers all possible combinations of the algorithm, whether to maximize or to minimize, criterion function implementation, gradient implementation for that criterion (if provided), and whether criterion_and_derivative has been provided or not.

Below, we show the calculations behind the true values, for each testcase (one criterion and one constraint).

Trid: Solutions for three-dimension case#

\(f({x}) = (x_1-1)^2 + (x_2-1)^2 + (x_3-1)^2 - x_2 x_1 - x_3 x_2\)

Rotated Hyper Ellipsoid: Solutions for three-dimension case#

\(f({x}) = x^2_1 + (x^2_1 + x^2_2) + (x^2_1 + x^2_2 + x^2_3)\)
No constraints
constraints = []
\(x* = (0, 0, 0)\)
Fixed constraints
constraints = [{"loc": "x_1", "type": "fixed", "value": 1}]
\(x_{1} = 1 \rightarrow x* = (1, 0, 0)\)
Probability constraints
constraints = [{"loc": ["x_1", "x_2"], "type": "probability"}]
\(x_{1} + x_{2} = 1, \quad 0 \leq x_1 \leq 1, \quad 0 \leq x_2 \leq 1 \\ \mathcal{L}({x_i}) = x^2_1 + (x^2_1 + x^2_2) + (x^2_1 + x^2_2 + x^2_3)\\ -\lambda(x_1 +x_2-1)\\ \Rightarrow \frac{\delta \mathcal{L}}{\delta x_1}\\ = 6x_1 - \lambda = 0\\ \Rightarrow \frac{\delta \mathcal{L}}{\delta x_2}\\ = 4x_2 - \lambda = 0\\ \Rightarrow \frac{\delta \mathcal{L}}{\delta x_3}\\ = 2 x_3 = 0\\ \Rightarrow \frac{\delta \mathcal{L}}{\delta \lambda} \\ = -x_1 - x_2 + 1 = 0\\ \rightarrow x* = (\frac{2}{5}, \frac{3}{5}, 0),\\ \quad f({x*}) = \frac{6}{5}\)
Increasing constraints
constraints = [{"loc": ["x_2", "x_3"], "type": "increasing"}]
Not binding \(\rightarrow x* = (0, 0, 0)\)
Decreasing constraints
constraints = [{"loc": ["x_1", "x_2"], "type": "decreasing"}]
Not binding \(\rightarrow x* = (0, 0, 0)\)
Equality constraints
constraints = [{"loc": ["x_1", "x_2", "x_3"], "type": "equality"}]
Not binding \(\rightarrow x* = (0, 0, 0)\)
Pairwise equality constraints
constraints = [{"locs": ["x_1", "x_2"], "type": "pairwise_equality"}]
Not binding \(\rightarrow x* = (0, 0, 0)\)
Covariance constraints
constraints = [{"loc": ["x_1", "x_2", "x_3"], "type": "covariance"}]
Not binding \(\rightarrow x* = (0, 0, 0)\)
sdcorr constraints
constraints = [{"loc": ["x_1", "x_2", "x_3"], "type": "sdcorr"}]
Not binding \(\rightarrow x* = (0, 0, 0)\)
Linear constraints
constraints = [{"loc": ["x_1", "x_2"], "type": "linear", "weights": [1, 2], "value": 4}]
\(x_1 + 2x_2 = 4\\\mathcal{L}({x_i}) = x^2_1 + (x^2_1 + x^2_2) + (x^2_1 + x^2_2 + x^2_3) -\lambda(x_1 +2x_2-4)\\ \Rightarrow \frac{\delta\mathcal{L}}{\delta x_1} = 6x_1 - \lambda = 0\\ \Rightarrow \frac{\delta \\ \mathcal{L}}{\delta x_2} = 4x_2 - 2\lambda = 0\\ \Rightarrow \frac{\delta \\ \mathcal{L}}{\delta x_3} = 2 x_3 = 0\\ \Rightarrow \frac{\delta \\ \mathcal{L}}{\delta \lambda} = -x_1 - 2x_2 + 4 = 0\\ \rightarrow x* = (\frac{4}{7}, \frac{12}{7}, 0)\)

Rosenbrock: Solutions for three-dimension case#

\(f({x}) = 100(x_2 - x_1^2) + (x_1 - 1)^2\)

Global minima: \(x* = (1, 1, 1)\)

No constraints
constraints = []
\(x* = (1, 1, 1)\)
Fixed constraints
constraints = [{"loc": "x_1", "type": "fixed", "value": 1}]
\(x_{1} = 1 \rightarrow x* = (1, 1, 1)\)
Fixed constraints
constraints = [{"loc": ["x_1", "x_2"], "type": "probability"}]
No solution available.
Increasing constraints
constraints = [{"loc": ["x_2", "x_3"], "type": "increasing"}]
Not binding \(\rightarrow x* = (1, 1, 1)\)
Decreasing constraints
constraints = [{"loc": ["x_1", "x_2"], "type": "decreasing"}]
Not binding \(\rightarrow x* = (1, 1, 1)\)
Equality constraints
constraints = [{"loc": ["x_1", "x_2", "x_3"], "type": "equality"}]
Not binding \(\rightarrow x* = (1, 1, 1)\)
Pairwise equality constraints
constraints = [{"locs": ["x_1", "x_2"], "type": "pairwise_equality"}]
Not binding \(\rightarrow x* = (1, 1, 1)\)
Covariance constraints
constraints = [{"loc": ["x_1", "x_2", "x_3"], "type": "covariance"}]
Not binding \(\rightarrow x* = (1, 1, 1)\)
sdcorr constraints
constraints = [{"loc": ["x_1", "x_2", "x_3"], "type": "sdcorr"}]
Not binding \(\rightarrow x* = (1, 1, 1)\)
Linear constraints
constraints = [{"loc": ["x_1", "x_2"], "type": "linear", "weights": [1, 2], "value": 4}]
No solution available.