
Ordered Logit Example

Let’s suppose, completely hypothetically, that we are not big fans of Stata or simply want to learn the mechanics behind an ordered logit model by coding it up ourselves.

In this notebook we show you how estimagic can help you implement such a model very easily. Implementing an ordered logit model consists of four basic steps:

  1. Processing the user input into inputs for the likelihood function

  2. Writing the likelihood function of an ordered logit model

  3. Maximizing the likelihood function

  4. Calculating standard errors

The first two have to be done by the user, the last two are done by estimagic.

To be very clear: Estimagic is not a package to estimate logit models or other models that are implemented in Stata, statsmodels or anywhere else. Its purpose is to estimate parameters with custom likelihood or method of simulated moments functions. We just use an ordered logit model as an example of a very simple likelihood function.

The example we will use to test our model is taken from the Stata Documentation.

[1]:
import numpy as np
import pandas as pd
from patsy import dmatrices
from scipy import stats

from estimagic import maximize
from estimagic.inference.likelihood_inference import do_likelihood_inference

Process the user input

First we have to take the formula and dataset, extract all relevant information about the model and construct the inputs for the likelihood function.

We will need four inputs:

  1. A DataFrame with start parameters for the optimization.

  2. An array with the dependent variable.

  3. A 2d array with explanatory variables.

  4. Constraints for the optimization that keep the cutoffs increasing.

[2]:
def ordered_logit_processing(formula, data):
    """Process user input for an ordered logit model."""
    # extract data arrays
    y, x = dmatrices(formula + " - 1", data, return_type="dataframe")
    y = y[y.columns[0]]

    # extract dimensions
    num_choices = len(y.unique())
    beta_names = list(x.columns)
    num_betas = len(beta_names)
    num_cutoffs = num_choices - 1

    # set-up index for params_df
    names = beta_names + list(range(num_cutoffs))
    categories = ["beta"] * num_betas + ["cutoff"] * num_cutoffs
    index = pd.MultiIndex.from_tuples(zip(categories, names), names=["type", "name"])

    # make params_df
    np.random.seed(5471)
    start_params = pd.DataFrame(index=index)
    start_params["value"] = np.hstack(
        [
            np.random.uniform(low=-0.5, high=0.5, size=len(x.columns)),
            np.arange(num_cutoffs) * 2,
        ]
    )
    start_params["group"] = start_params.index.get_level_values("type")

    # make constraints
    constr = [{"loc": "cutoff", "type": "increasing"}]

    # turn pandas objects into numpy arrays
    y_arr = y.to_numpy().astype(int)
    x_arr = x.to_numpy()

    return start_params, y_arr, x_arr, constr
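
To get a feeling for these outputs, we can run the function on a tiny hand-made dataset. All numbers below are made up purely for illustration; the real data is loaded further down:

toy_data = pd.DataFrame(
    {
        "apply": [0, 1, 2, 1],
        "pared": [0, 1, 0, 1],
        "public": [1, 0, 0, 1],
        "gpa": [2.5, 3.0, 3.5, 3.9],
    }
)
toy_start, toy_y, toy_x, toy_constr = ordered_logit_processing(
    "apply ~ pared + public + gpa", toy_data
)
toy_start  # one row per beta and one row per cutoff; toy_constr keeps the cutoffs increasing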

Calculate the Likelihood

Next, we want to evaluate the likelihood function, given parameters and data. There are more efficient ways of calculating the likelihood of an ordered logit model, but this implementation was chosen for brevity and readability.

[3]:
def ordered_logit_loglike(params, y, x):
    """Likelihood function of an orderd logit model."""
    # parse the parameter vector into its quantities
    beta = params.loc["beta", "value"].to_numpy()
    cutoffs = params.loc["cutoff", "value"].to_numpy()

    # calculate deterministic part of utilities
    xb = x.dot(beta)

    # evaluate likelihood
    upper_cutoffs = np.hstack([cutoffs, np.inf])[y]
    lower_cutoffs = np.hstack([-np.inf, cutoffs])[y]
    upper_cdf = stats.logistic.cdf(upper_cutoffs - xb)
    lower_cdf = stats.logistic.cdf(lower_cutoffs - xb)

    contributions = np.log(upper_cdf - lower_cdf)

    res = {"contributions": contributions, "value": contributions.sum()}

    return res

If you have never programmed an estimator before, you might be surprised how much code is spent on processing compared to calculating the actual likelihood. This will almost always be the case - at least if you try to make your estimator flexible and user friendly. Estimagic is there to shorten this type of code as much as possible.

Another peculiarity you might notice is that the likelihood function does not just return a scalar value, but also the likelihood contributions of each individual. This is because some optimizers (e.g. bhhh) can actually use the information in the contributions. Moreover, you will need the contributions to calculate standard errors by the outer product of gradients.

All estimagic functions (whether for numerical differentiation, standard error calculation or optimization) will simply pick from the dictionary what they need!
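
As a quick illustration of this interface, we can evaluate the likelihood once at the toy inputs constructed above (again, purely illustrative numbers):

toy_res = ordered_logit_loglike(toy_start, toy_y, toy_x)
print(toy_res["value"])          # scalar log likelihood, used by most optimizers
print(toy_res["contributions"])  # one value per observation, used by bhhh and OPG standard errors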

Maximizing the likelihood

[4]:
data = pd.read_pickle("ologit.pickle")
formula = "apply ~ pared + public + gpa"
start_params, y, x, constraints = ordered_logit_processing(formula, data)

res = maximize(
    criterion=ordered_logit_loglike,
    params=start_params,
    algorithm="scipy_lbfgsb",
    constraints=constraints,
    criterion_kwargs={"y": y, "x": x},
    logging="ordered_logit.db",
)
[5]:
params = res["solution_params"]
params
[5]:
               group  lower_bound  upper_bound     value
type   name
beta   pared    beta         -inf          inf  1.047661
       public   beta         -inf          inf -0.058685
       gpa      beta         -inf          inf  0.615744
cutoff 0      cutoff         -inf          inf  2.203318
       1      cutoff         -inf          inf  4.298762

Calculate standard errors

As an aside: estimagic’s numpy_interface decorator wraps the likelihood function so that it can be called with a flat numpy array of the free parameters instead of a params DataFrame. do_likelihood_inference below works with the original function directly, so we only demonstrate the conversion:

[6]:
from estimagic.decorators import numpy_interface

loglike_numpy = numpy_interface(
    ordered_logit_loglike, params=params, constraints=constraints
)
loglike_numpy
[6]:
<function __main__.ordered_logit_loglike(params, y, x)>
[7]:
inference, free_cov = do_likelihood_inference(
    loglike=ordered_logit_loglike,
    params=params,
    loglike_kwargs={"x": x, "y": y},
    n_samples=10_000,
    constraints=constraints,
)

inference.round(3)
[7]:
               value  standard_error  p_value  ci_lower  ci_upper stars
type   name
beta   pared   1.048           0.276    0.000     0.507     1.589   ***
       public -0.059           0.269    0.811    -0.587     0.469
       gpa     0.616           0.275    0.025     0.077     1.155    **
cutoff 0       2.203           0.822    0.007     0.592     3.815   ***
       1       4.299           0.846    0.000     2.641     5.956   ***

Compare to Stata’s results

[8]:
stata_results = pd.read_csv("stata_ologit_results.csv")
stata_results.round(3)
[8]:
     name  stata_value  stata_standard_error  stata_p_value  stata_ci_lower  stata_ci_upper
0   pared        1.048                 0.266          0.000           0.527           1.569
1  public       -0.059                 0.298          0.844          -0.642           0.525
2     gpa        0.616                 0.261          0.018           0.105           1.127
3    cut1        2.203                 0.780            NaN           0.675           3.731
4    cut2        4.299                 0.804            NaN           2.722           5.875

This looks pretty good! The parameter estimates line up perfectly. We actually had to try three optimizers to get even one digit of difference, which makes the result more credible; the other optimizers hit it on all digits.

Note that standard error calculation, especially in combination with constraints, is still considered experimental in estimagic.

Use the dashboard for monitoring the optimization

Often you may want to monitor an optimization, either to see how far the algorithm has moved away from the start values while it runs, or to see how it arrived at its solution after it has finished.

Both can be done using the estimagic dashboard.

To use the dashboard, we need to activate logging, which we have deactivated up until now. To activate logging, simply supply a database path to maximize via its logging argument.

To start the dashboard, make sure you have the estimagic environment installed and activated.

Then all you need to do is navigate to the directory containing the database in your command line, run the cell below, and enter the following into your command line once the optimization has started:

estimagic dashboard {db_path}

This should open a page in your browser where you can press “Start Updating from Database” to start watching the optimization.
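
With the database path defined in the next cell, the command is, for example:

estimagic dashboard ./logging.db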

[9]:
db_path = "./logging.db"

res = maximize(
    criterion=ordered_logit_loglike,
    params=start_params,
    algorithm="scipy_lbfgsb",
    constraints=constraints,
    criterion_kwargs={"y": y, "x": x},
    logging=db_path,
)