Download the notebook here!

[1]:
import numpy as np
import pandas as pd

How to Specify Constraints

General Structure of Constraints

Each type of constraint is described in more detail below.

Tutorial

Below we show you how to specify constraints with simplified examples.

For the simpler constraints, we re-use the completely made-up params DataFrame that we used to explain how to select parameters on the previous page. If you are unfamiliar with DataFrame.loc and DataFrame.query, make sure you read that explanation first!

Some other examples are inspired by real projects. In any case, you don’t have to understand any of the examples in detail. Just look at the index of their params DataFrame to see how you can use the constraints in your own projects.

[2]:
index = pd.MultiIndex.from_product(
    [["a", "b"], np.arange(3)], names=["category", "number"]
)

df = pd.DataFrame(
    data=[0.1, 0.45, 0.55, 0.75, 0.85, -1.0], index=index, columns=["value"]
)
df
[2]:
value
category number
a 0 0.10
1 0.45
2 0.55
b 0 0.75
1 0.85
2 -1.00

fixed constraints

To diagnose what goes wrong in difficult optimizations, you often want to fix some of the parameters. Of course, you could just remove them from your parameter vector, but it is very handy if the parameter vector that arrives in your utility function always looks exactly the same. Therefore, estimagic can fix the parameters for you. A good example of a parameter that is usually fixed is the discount factor in a structural model. Assume this parameter is called delta and we want to fix it at 0.95. Then the constraint is:

[3]:
constraints = [{"loc": "delta", "type": "fixed", "value": 0.95}]

Note that "value" is optional. If it is not specified, the parameter is fixed at the value specified in the DataFrame.

probability constraints

Probability constraints are similar to sum constraints, but the selected parameters always sum to 1 and each of them has to be between zero and one. Probability constraints are therefore also practical for shares or parameters of certain production functions. Let’s assume we have a params DataFrame with “shares” in the first index level. As you can probably guess by now, the constraint looks as follows:

[4]:
constraints = [{"loc": "shares", "type": "probability"}]

increasing and decreasing constraints

As the name suggests, increasing constraints ensure that the selected parameters are increasing. The prime example is cutoffs in ordered choice models, for example the ordered logit model (see the Ordered Logit Example).

The constraint then looks as follows:

[5]:
constraints = [{"loc": "cutoffs", "type": "increasing"}]

Decreasing constraints are defined analogously.
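You can check whether hypothetical start values satisfy an increasing constraint with a one-liner (the cutoff values below are made up):

```python
import numpy as np

# hypothetical start values for the cutoffs of an ordered logit
cutoffs = np.array([-0.5, 0.3, 1.2])

np.all(np.diff(cutoffs) > 0)  # True: each cutoff exceeds the previous one
```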

equality constraints

Equality constraints ensure that all selected parameters are equal. This sounds useless because one could simply leave out all but one of the parameters. However, it very often makes parsing the parameter vector much easier, for example in dynamic models where you sometimes want to keep parameters time-invariant and sometimes not. The code often becomes much simpler if you do not need if-conditions to handle those two (or potentially many more) cases and instead let estimagic handle them for you. An example could be the simple DataFrame from the very beginning, where “a” could be the name of a parameter and “number” could enumerate periods in the model.

[6]:
# make sure the equality constraint is satisfied
df = df.copy()
df.loc["b", "value"] = 2
df
[6]:
value
category number
a 0 0.10
1 0.45
2 0.55
b 0 2.00
1 2.00
2 2.00

Keeping the parameter group “b” time-invariant would be as simple as:

[7]:
constraints = [{"loc": "b", "type": "equality"}]

Under the hood this will optimize over just one b-parameter and set the other b-parameters equal to that one parameter.
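The idea behind this reduction can be sketched as follows (an illustration only; estimagic's actual internals may differ in detail):

```python
import numpy as np

# the external params vector the utility function sees: a0, a1, a2, b0, b1, b2
external = np.array([0.1, 0.45, 0.55, 2.0, 2.0, 2.0])

# the internal vector the optimizer works on: the three b-parameters
# collapse into a single free value
internal = np.array([0.1, 0.45, 0.55, 2.0])

def to_external(internal):
    # broadcast the single free b-value back to all three b-positions
    return np.concatenate([internal[:3], np.repeat(internal[3], 3)])

to_external(internal)  # reproduces the full six-element vector
```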

pairwise_equality constraints

Pairwise equality constraints are different from all other constraints because they correspond to several sets of parameters. Let’s assume we want to keep the parameters “a” and “b” pairwise equal, then the constraint looks like this:

[8]:
constraints = [{"locs": ["a", "b"], "type": "pairwise_equality"}]

Alternatively, you could have an entry “queries” where the corresponding value is a list of query strings. Both “locs” and “queries” can have any number of entries.
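A sketch of the "queries" variant; the query strings are assumptions that match the example DataFrame's "category" index level:

```python
# equivalent selection of the "a" and "b" groups via query strings
constraints = [
    {
        "queries": ["category == 'a'", "category == 'b'"],
        "type": "pairwise_equality",
    }
]
```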

Covariance Constraints

In maximum likelihood estimation, you often have to estimate a covariance matrix, for example of the shocks in your model.

Of course, such a covariance matrix has to be valid, i.e. positive semi-definite. This is where the “covariance” constraint comes in handy. The covariance constraint assumes that the parameters selected by its "loc" or "query" field correspond to the lower triangle of a covariance matrix. The elements are ordered in C-order, i.e. starting with the single element of the first row, then the first and second elements of the second row, and so on.
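The ordering can be sketched with a small helper (an illustration only, not estimagic's actual function):

```python
import numpy as np

def lower_triangle_to_cov(flat):
    # infer the dimension from the number of lower-triangle elements,
    # which is n * (n + 1) / 2
    dim = int((np.sqrt(8 * len(flat) + 1) - 1) / 2)
    cov = np.zeros((dim, dim))
    # np.tril_indices yields the lower triangle in C-order (row by row)
    cov[np.tril_indices(dim)] = flat
    # mirror the strict lower triangle to make the matrix symmetric
    cov += np.tril(cov, -1).T
    return cov

# six elements -> a 3x3 covariance matrix
lower_triangle_to_cov([1.0, 0.0, 1.0, -0.2, 0.0, 1.0])
```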

It’s easier to see this in an example taken from the respy package. The toy model represents the Robinson Crusoe economy in the setting of a discrete choice dynamic programming model. In the model, Robinson can choose between fishing, relaxing in a hammock, and improving his fishing skills by talking to Friday. The reward of each alternative is subject to a shock distributed according to a covariance matrix.

[9]:
params = pd.read_csv("robinson-crusoe-covariance.csv").set_index(["category", "name"])
params
[9]:
value
category name
delta delta 0.950000
wage_fishing exp_fishing 0.100000
contemplation_with_friday 0.400000
nonpec_fishing constant -1.000000
nonpec_friday constant -1.000000
not_fishing_last_period -1.000000
nonpec_hammock constant 2.500000
not_fishing_last_period -1.000000
shocks_cov var_fishing 1.000000
cov_friday_fishing 0.000000
var_friday 1.000000
cov_hammock_fishing -0.200000
cov_hammock_friday 0.000000
var_hammock 1.000000
lagged_choice_1_hammock constant 1.000000
meas_error sd_fishing 0.000001

The parameters that form the covariance matrix are the ones where category equals "shocks_cov". The constraint could not be easier to express:

[10]:
constraints = [{"loc": "shocks_cov", "type": "covariance"}]

That’s all. To look at the resulting covariance matrix, we can use another nice function from estimagic:

[11]:
from estimagic.optimization.utilities import cov_params_to_matrix

cov_params_to_matrix(params.loc["shocks_cov", "value"])
[11]:
array([[ 1. ,  0. , -0.2],
       [ 0. ,  1. ,  0. ],
       [-0.2,  0. ,  1. ]])

Note that the names in the index are not used at all to determine which element goes where. Otherwise estimagic would have to make assumptions on your index and we don’t want to do that.

Covariance constraints are not compatible with any other type of constraint, including box constraints. You don’t have to add box constraints to keep the variances positive because estimagic does this for you.

Some optimizers are more aggressive than others and test more extreme parameters. This is especially problematic for variance-covariance matrices, which have to be positive semi-definite, something that might not hold for every proposed parameterization. Internally, estimagic uses the Cholesky factor \(C\) of the variance-covariance matrix, a lower-triangular matrix, to do unconstrained optimization and rebuilds the variance-covariance matrix as \(\Omega = C C^T\). To keep the diagonal elements of the Cholesky factor farther away from zero, you can add {"bounds_distance": 1e-6} to your constraint. The complete constraint with a distance to the bounds is:

[12]:
constraints = [{
    "loc": "shocks_cov",
    "type": "covariance",
    "bounds_distance": 1e-6
}]
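The Cholesky reparameterization described above can be sketched in a few lines; the particular factor below is made up to reproduce the covariance matrix from earlier:

```python
import numpy as np

# a hypothetical lower-triangular Cholesky factor C
chol = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [-0.2, 0.0, np.sqrt(0.96)],
])

# Omega = C @ C.T is positive semi-definite by construction,
# no matter which values the optimizer proposes for C
omega = chol @ chol.T
```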

sdcorr Constraints

Most of the time, it is more intuitive to look at standard deviations and correlations than at covariance matrices. In that case, you want to use an "sdcorr" constraint instead of the "covariance" constraint. The sdcorr constraint assumes that the first elements are standard deviations and the rest is the lower triangle (excluding the diagonal) of a correlation matrix. Again, the names in the index are ignored by estimagic.

Let’s look at the same example:

[13]:
params = pd.read_csv("robinson-crusoe-sdcorr.csv")
params.set_index(["category", "name"], inplace=True)
params
[13]:
value
category name
delta delta 0.950000
wage_fishing exp_fishing 0.100000
contemplation_with_friday 0.400000
nonpec_fishing constant -1.000000
nonpec_friday constant -1.000000
not_fishing_last_period -1.000000
nonpec_hammock constant 2.500000
not_fishing_last_period -1.000000
shocks_sdcorr sd_fishing 1.000000
sd_friday 1.000000
sd_hammock 1.000000
corr_friday_fishing 0.000000
corr_hammock_fishing -0.200000
corr_hammock_friday 0.000000
lagged_choice_1_hammock constant 1.000000
meas_error sd_fishing 0.000001

The constraint is then just:

[14]:
constraints = [{"loc": "shocks_sdcorr", "type": "sdcorr"}]

And, of course, there is another helper function in the utilities module:

[15]:
from estimagic.optimization.utilities import sdcorr_params_to_sds_and_corr
[16]:
sds, corr = sdcorr_params_to_sds_and_corr(
    params.loc["shocks_sdcorr", "value"])
sds
[16]:
array([1., 1., 1.])
[17]:
corr
[17]:
array([[ 1. ,  0. , -0.2],
       [ 0. ,  1. ,  0. ],
       [-0.2,  0. ,  1. ]])

Note that the "bounds_distance" option is also available for "sdcorr" constraints. See the previous section on covariance constraints for more information.

linear constraints

“linear” constraints have many of the above constraints as special cases. They are a bit more complicated to write but can be very powerful. You should only write a linear constraint if your constraint can’t be expressed as one of the special cases.

They can be used to express constraints of the form:

lower <= weights.dot(x) <= upper

or

weights.dot(x) = value

where x are the selected parameters.

Linear constraints have the following additional fields besides the loc or query and type fields:

  • weights: This is used to construct the weight vector. It can be a numpy array, pandas Series, list, or a float, in which case the weights for all selected parameters are equal to that number.

  • value: float

  • lower: float

  • upper: float

You can specify either value or lower and upper bounds. Suppose you have the following params DataFrame:

[18]:
params = pd.DataFrame(
    index=pd.MultiIndex.from_product([["a", "b", "c"], [0, 1, 2]]),
    data=[[2], [1], [0], [1], [3], [4], [1], [1], [1]],
    columns=["value"]
)
params
[18]:
value
a 0 2
1 1
2 0
b 0 1
1 3
2 4
c 0 1
1 1
2 1

Suppose you want to express the following constraints:

  1. The first parameter in the a category is two times the second parameter in that category.

  2. The mean of the b parameters is larger than 3.

  3. The sum of the last three parameters is between 0 and 5.

Then the constraints would look as follows:

[19]:
constraints = [
    {"loc": "a", "type": "linear", "weights": [1, -2, 0], "value": 0},
    {"loc": "b", "type": "linear", "weights": 1 / 3, "lower": 3},
    {"loc": "c", "type": "linear", "weights": 1, "lower": 0, "upper": 5},
]
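To see what the weights do, you can evaluate each weighted sum at the start values by hand (a quick check, not estimagic code):

```python
import numpy as np

# start values from the params DataFrame above
a = np.array([2.0, 1.0, 0.0])
b = np.array([1.0, 3.0, 4.0])
c = np.array([1.0, 1.0, 1.0])

np.dot([1, -2, 0], a)     # 0.0, equal to the required value
np.dot([1 / 3] * 3, b)    # 8/3, the mean of the b parameters
np.dot([1, 1, 1], c)      # 3.0, inside [0, 5]
```

Note that the mean of the b parameters at these start values is 8/3, below the lower bound of 3, so the start values would have to be adjusted if the optimizer requires feasible start parameters.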

Constraint killers

All constraints can have an additional key called “id”. An example could be:

[20]:
constraints = [
    {"loc": "a", "type": "equality", "id": 0},
    {"loc": "b", "type": "increasing", "id": 1}
]

In structural economic models, the list of constraints can become quite large and cumbersome to write. Therefore, packages that implement such models will often write the constraints for you and only allow you to complement them with additional user constraints. But what if you want to relax some of the constraints they implement automatically? For this we have constraint killers. They take the following form:

[21]:
killer = {"kill": 0}

For example, the following two lists of constraints will be equivalent:

[22]:
constraints1 = [
    {"loc": "a", "type": "equality", "id": 0},
    {"loc": "b", "type": "increasing", "id": 1},
    {"kill": 0}
]
constraints2 = [{"loc": "b", "type": "increasing", "id": 1}]

If you write a package that implements constraints for the user, the following are best practices:

  1. Give the user the chance to add additional constraints.

  2. Add “id” entries to all constraints.

  3. Give the user the possibility to look at the constraints that were constructed automatically.
