API Reference
Estimation
estimate_ml
- estimagic.estimation.estimate_ml.estimate_ml(loglike, params, optimize_options, *, constraints=None, logging=False, log_options=None, loglike_kwargs=None, derivative=None, derivative_kwargs=None, loglike_and_derivative=None, loglike_and_derivative_kwargs=None, numdiff_options=None, jacobian=None, jacobian_kwargs=None, hessian=False, hessian_kwargs=None, ci_level=0.95, n_samples=10000, bounds_handling='raise', design_info=None)
Do a maximum likelihood (ml) estimation.
This is a high level interface of our lower level functions for maximization, numerical differentiation and inference. It does the full workflow for maximum likelihood estimation with just one function call.
While we have good defaults, you can still configure each aspect of each step via the optional arguments of this function. If you find it easier to do the “difficult” steps (mainly maximization and calculating numerical derivatives of a potentially noisy function) separately, you can do so and just provide those results as params, jacobian and hessian.
The docstring is aspirational and not all options are supported yet.
- Parameters
loglike (callable) – Likelihood function that takes a params DataFrame (and potentially other keyword arguments) and returns a dictionary that has at least the entries “value” (a scalar float) and “contributions” (a 1d numpy array or pandas Series) with the log likelihood contribution per individual.
params (pd.DataFrame) – DataFrame where the “value” column contains the estimated or start parameters of a likelihood model. See How to specify start parameters for details. If the supplied parameters are estimated parameters, set optimize_options to False.
optimize_options (dict or False) – Keyword arguments that govern the numerical optimization. Valid entries are all arguments of minimize() except for criterion, derivative, criterion_and_derivative and params. If you pass False as optimize_options you signal that params are already the optimal parameters and no numerical optimization is needed.
constraints (list) – List with constraint dictionaries. See How to use constraints.
logging (pathlib.Path, str or False) – Path to an sqlite3 file (which typically has the file extension .db). If the file does not exist, it will be created. The dashboard can only be used when logging is used.
log_options (dict) – Additional keyword arguments to configure the logging.
- “fast_logging”: A boolean that determines if “unsafe” settings are used to speed up write processes to the database. This should only be used for very short running criterion functions where the main purpose of the log is a real-time dashboard and it would not be catastrophic to get a corrupted database in case of a sudden system shutdown. If one evaluation of the criterion function (and gradient if applicable) takes more than 100 ms, the logging overhead is negligible.
- “if_table_exists” (str): One of “extend”, “replace”, “raise”. What to do if the tables we want to write to already exist. Default “extend”.
- “if_database_exists” (str): One of “extend”, “replace”, “raise”. What to do if the database we want to write to already exists. Default “extend”.
loglike_kwargs (dict) – Additional keyword arguments for loglike.
derivative (callable) – Function that takes params and potentially other keyword arguments and calculates the first derivative of loglike. It can either return a numpy array or pandas Series/DataFrame with the derivative or a dictionary with derivatives of each output of loglike. If loglike returns a dict but derivative does not, it is your responsibility to make sure that the correct derivative for the numerical optimizers you are using is returned.
derivative_kwargs (dict) – Additional keyword arguments for derivative.
loglike_and_derivative (callable) – Function that returns a tuple consisting of the result of loglike and the result of derivative. Only use this if you can exploit synergies in the calculation of loglike and derivative.
loglike_and_derivative_kwargs (dict) – Additional keyword arguments for loglike_and_derivative.
numdiff_options (dict) – Keyword arguments for the calculation of numerical derivatives for the calculation of standard errors. See Derivatives for details.
jacobian (callable or pandas.DataFrame or False) – A function that takes params and potentially other keyword arguments and returns the Jacobian of loglike[“contributions”] with respect to the params. Alternatively, you can pass a pandas.DataFrame with the Jacobian at the optimal parameters. This is only possible if you pass optimize_options=False. Note that you only need to pass a Jacobian function if you have a closed form Jacobian but decided not to return it as part of derivative (e.g. because you use a scalar optimizer and can calculate a gradient in a way that is faster than calculating and summing the Jacobian). If you pass None, a numerical Jacobian will be calculated. If you pass False, you signal that no Jacobian should be calculated. Thus, no result that requires the Jacobian will be calculated.
jacobian_kwargs (dict) – Additional keyword arguments for the Jacobian function.
hessian (callable, pandas.DataFrame or False) – A function that takes params and potentially other keyword arguments and returns the Hessian of loglike[“value”] with respect to the params. Alternatively, you can pass a pandas.DataFrame with the Hessian at the optimal parameters. This is only possible if you pass optimize_options=False. If you pass None, a numerical Hessian will be calculated. If you pass False, you signal that no Hessian should be calculated. Thus, no result that requires the Hessian will be calculated.
hessian_kwargs (dict) – Additional keyword arguments for the Hessian function.
ci_level (float) – Confidence level for the calculation of confidence intervals. The default is 0.95.
n_samples (int) – Number of samples used to transform the covariance matrix of the internal parameter vector into the covariance matrix of the external parameters. For background information about internal and external params see How constraints are implemented. This is only used if you have specified constraints.
bounds_handling (str) – One of “clip”, “raise”, “ignore”. Determines how bounds are handled. If “clip”, confidence intervals are clipped at the bounds. Standard errors are only adjusted if a sampling step is necessary due to additional constraints. If “raise” and any lower or upper bound is binding, we raise an Error. If “ignore”, boundary problems are simply ignored.
design_info (pandas.DataFrame) – DataFrame with one row per observation that contains some or all of the variables “psu” (primary sampling unit), “stratum” and “fpc” (finite population correction). See Robust Likelihood inference for details.
- Returns
The estimated parameters, standard errors and covariance matrix of the parameters.
- Return type
dict
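For orientation, a minimal sketch of a call, estimating the mean and standard deviation of a normal distribution by maximum likelihood. The model, data and parameter names are illustrative, not part of estimagic:

    import numpy as np
    import pandas as pd
    from scipy import stats

    from estimagic.estimation.estimate_ml import estimate_ml

    # Simulated data for a normal model with unknown mean and standard deviation.
    rng = np.random.default_rng(seed=0)
    data = rng.normal(loc=1.0, scale=2.0, size=500)

    def loglike(params):
        # Per-observation log likelihood contributions and their sum.
        contribs = stats.norm.logpdf(
            data,
            loc=params.loc["mean", "value"],
            scale=params.loc["sd", "value"],
        )
        return {"value": contribs.sum(), "contributions": contribs}

    start_params = pd.DataFrame(
        {"value": [0.0, 1.0], "lower_bound": [-np.inf, 1e-8]},
        index=["mean", "sd"],
    )

    result = estimate_ml(
        loglike=loglike,
        params=start_params,
        optimize_options={"algorithm": "scipy_lbfgsb"},
    )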
estimate_msm
- estimagic.estimation.estimate_msm.estimate_msm(simulate_moments, empirical_moments, moments_cov, params, optimize_options, *, constraints=None, logging=False, log_options=None, simulate_moments_kwargs=None, weights='diagonal', numdiff_options=None, jacobian=None, jacobian_kwargs=None, simulate_moments_and_jacobian=None, simulate_moments_and_jacobian_kwargs=None, ci_level=0.95, n_samples=10000, bounds_handling='raise')
Do a method of simulated moments or indirect inference estimation.
This is a high level interface for our lower level functions for minimization, numerical differentiation, inference and sensitivity analysis. It does the full workflow for MSM or indirect inference estimation with just one function call.
While we have good defaults, you can still configure each aspect of each step via the optional arguments of this function. If you find it easier to do the “difficult” steps (mainly minimization and calculating numerical derivatives of a potentially noisy function) separately, you can do so and just provide those results as params and jacobian.
The docstring is aspirational and not all options are supported yet.
- Parameters
simulate_moments (callable) – Function that takes params and potentially other keyword arguments and returns simulated moments as a pandas Series. Alternatively, the function can return a dict with any number of entries as long as one of those entries is “simulated_moments”.
empirical_moments (pandas.Series) – A pandas Series with the empirical equivalents of the simulated moments.
moments_cov (pandas.DataFrame) – A quadratic pandas DataFrame with the covariance matrix of the empirical moments. This is typically calculated with our get_moments_cov function. The index and columns need to be the same as the index of empirical_moments.
params (pandas.DataFrame) – Start params for the optimization. See How to specify start parameters for details.
simulate_moments_kwargs (dict) – Additional keyword arguments for simulate_moments.
weights (str or pandas.DataFrame) – Either a DataFrame with a positive semi-definite weighting matrix or a string that specifies one of the pre-implemented weighting matrices: “diagonal” (default), “identity” or “optimal”. Note that “optimal” refers to the asymptotically optimal weighting matrix and is often not a good choice due to large finite sample bias.
constraints (list) – List with constraint dictionaries. See How to use constraints.
logging (pathlib.Path, str or False) – Path to an sqlite3 file (which typically has the file extension .db). If the file does not exist, it will be created. The dashboard can only be used when logging is used.
log_options (dict) – Additional keyword arguments to configure the logging.
- “fast_logging”: A boolean that determines if “unsafe” settings are used to speed up write processes to the database. This should only be used for very short running criterion functions where the main purpose of the log is a real-time dashboard and it would not be catastrophic to get a corrupted database in case of a sudden system shutdown. If one evaluation of the criterion function (and gradient if applicable) takes more than 100 ms, the logging overhead is negligible.
- “if_table_exists” (str): One of “extend”, “replace”, “raise”. What to do if the tables we want to write to already exist. Default “extend”.
- “if_database_exists” (str): One of “extend”, “replace”, “raise”. What to do if the database we want to write to already exists. Default “extend”.
optimize_options (dict or False) – Keyword arguments that govern the numerical optimization. Valid entries are all arguments of minimize() except for criterion, derivative, criterion_and_derivative and params. If you pass False as optimize_options you signal that params are already the optimal parameters and no numerical optimization is needed.
numdiff_options (dict) – Keyword arguments for the calculation of numerical derivatives for the calculation of standard errors. See Derivatives for details. Note that by default we increase the step_size by a factor of 2 compared to the rule of thumb for optimal step sizes. This is because many MSM criterion functions are slightly noisy.
jacobian (callable or pandas.DataFrame) – A function that takes params and potentially other keyword arguments and returns the Jacobian of simulate_moments with respect to the params. Alternatively, you can pass a pandas.DataFrame with the Jacobian at the optimal parameters. This is only possible if you pass optimize_options=False.
jacobian_kwargs (dict) – Additional keyword arguments for the Jacobian function.
simulate_moments_and_jacobian (callable) – A function that takes params and potentially other keyword arguments and returns a tuple with simulated moments and the Jacobian of simulated moments with respect to params.
simulate_moments_and_jacobian_kwargs (dict) – Additional keyword arguments for simulate_moments_and_jacobian.
ci_level (float) – Confidence level for the calculation of confidence intervals. The default is 0.95.
n_samples (int) – Number of samples used to transform the covariance matrix of the internal parameter vector into the covariance matrix of the external parameters. For background information about internal and external params see How constraints are implemented. This is only used if you have specified constraints.
bounds_handling (str) – One of “clip”, “raise”, “ignore”. Determines how bounds are handled. If “clip”, confidence intervals are clipped at the bounds. Standard errors are only adjusted if a sampling step is necessary due to additional constraints. If “raise” and any lower or upper bound is binding, we raise an error. If “ignore”, boundary problems are simply ignored.
- Returns
The estimated parameters, standard errors, sensitivity measures and covariance matrix of the parameters.
- Return type
dict
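For orientation, a minimal sketch that fits the mean and standard deviation of a normal distribution to the empirical mean and variance. Fixed base draws keep the simulated moments smooth in the parameters; all names are illustrative:

    import numpy as np
    import pandas as pd

    from estimagic.estimation.estimate_msm import estimate_msm
    from estimagic.estimation.msm_weighting import get_moments_cov

    rng = np.random.default_rng(seed=0)
    observed = pd.DataFrame({"y": rng.normal(loc=1.0, scale=2.0, size=1_000)})

    def calculate_moments(data):
        return pd.Series({"mean": data["y"].mean(), "var": data["y"].var()})

    empirical_moments = calculate_moments(observed)
    moments_cov = get_moments_cov(observed, calculate_moments)

    # Common random numbers: fixed draws make simulate_moments a smooth
    # deterministic function of the parameters.
    base_draws = rng.normal(size=10_000)

    def simulate_moments(params):
        sim = params.loc["mu", "value"] + params.loc["sigma", "value"] * base_draws
        return pd.Series({"mean": sim.mean(), "var": sim.var()})

    start_params = pd.DataFrame({"value": [0.0, 1.0]}, index=["mu", "sigma"])

    result = estimate_msm(
        simulate_moments=simulate_moments,
        empirical_moments=empirical_moments,
        moments_cov=moments_cov,
        params=start_params,
        optimize_options={"algorithm": "scipy_lbfgsb"},
    )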
get_moments_cov
- estimagic.estimation.msm_weighting.get_moments_cov(data, calculate_moments, moment_kwargs=None, bootstrap_kwargs=None)
Bootstrap the covariance matrix of the moment conditions.
- Parameters
data (pandas.DataFrame) – DataFrame with empirical data.
calculate_moments (callable) – Function that takes data and moment_kwargs as arguments and returns a 1d numpy array or pandas Series with moment conditions.
moment_kwargs (dict) – Additional keyword arguments for calculate_moments.
bootstrap_kwargs (dict) – Additional keyword arguments that govern the bootstrapping. Allowed arguments are “n_draws”, “seed”, “n_cores”, “batch_evaluator”, “cluster” and “error_handling”. For details see the bootstrap function.
- Returns
The covariance matrix of the moment conditions for MSM estimation.
- Return type
numpy.ndarray or pandas.DataFrame
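A minimal sketch, bootstrapping the covariance of a mean and a variance moment; the data and moment function are illustrative:

    import numpy as np
    import pandas as pd

    from estimagic.estimation.msm_weighting import get_moments_cov

    rng = np.random.default_rng(seed=0)
    data = pd.DataFrame({"y": rng.normal(size=500)})

    def calculate_moments(data):
        return pd.Series({"mean": data["y"].mean(), "var": data["y"].var()})

    moments_cov = get_moments_cov(
        data,
        calculate_moments,
        bootstrap_kwargs={"n_draws": 500, "seed": 1234},
    )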
lollipop_plot
- estimagic.visualization.lollipop_plot.lollipop_plot(data, sharex=True, plot_bar=True, pairgrid_kws=None, stripplot_kws=None, barplot_kws=None, style='whitegrid', dodge=True)
Make a lollipop plot.
- Parameters
data (pandas.DataFrame) – The datapoints to be plotted. In contrast to many seaborn functions, the whole data will be plotted. Thus if you want to plot just some variables or rows you need to restrict the dataset before passing it.
sharex (bool) – Whether the x-axis is shared across variables, default True.
plot_bar (bool) – Whether thin bars are plotted, default True.
pairgrid_kws (dict) – Keyword arguments for the creation of a Seaborn PairGrid. Most notably, “height” and “aspect” to control the sizes.
stripplot_kws (dict) – Keyword arguments to plot the dots of the lollipop plot via the stripplot function. Most notably, “color” and “size”.
barplot_kws (dict) – Keyword arguments to plot the lines of the lollipop plot via the barplot function. Most notably, “color” and “alpha”. In contrast to seaborn, we allow for a “width” argument.
style (str) – A seaborn style.
dodge (bool) – Whether the lollipops for different datasets are plotted with an offset or on top of each other.
- Returns
seaborn.PairGrid
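A minimal sketch under the assumption that each column of data is one plotted variable and each row one dataset (e.g. one model specification); this layout and all names are illustrative guesses, not a documented contract:

    import pandas as pd

    from estimagic.visualization.lollipop_plot import lollipop_plot

    # One row per dataset, one column per variable, purely for illustration.
    data = pd.DataFrame(
        {"alpha": [0.2, 0.3], "beta": [0.5, 0.4], "gamma": [0.9, 1.1]},
        index=["spec_a", "spec_b"],
    )
    grid = lollipop_plot(data, dodge=True)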
plot_univariate_effects
- estimagic.visualization.univariate_effects.plot_univariate_effects(criterion, params, n_gridpoints=21, n_random_values=2, plots_per_row=2, seed=5471)
Plot criterion along coordinates at given and random values.
- Parameters
criterion (callable) – criterion function. Takes a DataFrame and returns a scalar value or dictionary with the entry “value”.
params (pandas.DataFrame) – See How to specify start parameters. Must contain finite lower and upper bounds for all parameters.
n_gridpoints (int) – Number of gridpoints on which the criterion function is evaluated. This is the number per plotted line.
n_random_values (int) – Number of random parameter vectors that are used as center of the plots.
plots_per_row (int) – How many plots are plotted per row.
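A minimal sketch with a sphere criterion; note that finite lower and upper bounds are required for all parameters:

    import pandas as pd

    from estimagic.visualization.univariate_effects import plot_univariate_effects

    def criterion(params):
        return (params["value"] ** 2).sum()

    params = pd.DataFrame(
        {"value": [0.5, -0.5], "lower_bound": [-1.0, -1.0], "upper_bound": [1.0, 1.0]},
        index=["a", "b"],
    )
    plot_univariate_effects(criterion, params, n_gridpoints=11)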
Optimization
maximize
- estimagic.optimization.optimize.maximize(criterion, params, algorithm, *, criterion_kwargs=None, constraints=None, algo_options=None, derivative=None, derivative_kwargs=None, criterion_and_derivative=None, criterion_and_derivative_kwargs=None, numdiff_options=None, logging=False, log_options=None, error_handling='raise', error_penalty=None, cache_size=100, scaling=False, scaling_options=None, multistart=False, multistart_options=None)
Maximize criterion using algorithm subject to constraints.
- Parameters
criterion (callable) – A function that takes a pandas DataFrame (see How to specify start parameters) as first argument and returns one of the following:
- a scalar floating point or a numpy.ndarray (depending on the algorithm)
- a dictionary that contains the entries “value” (a scalar float), “contributions” or “root_contributions” (depending on the algorithm) and any number of additional entries. The additional dict entries will be logged and (if supported) displayed in the dashboard. Check the documentation of your algorithm to see which entries or output type are required.
params (pandas.DataFrame) – A DataFrame with a column called “value” and optional additional columns. See How to specify start parameters for detail.
algorithm (str or callable) – Specifies the optimization algorithm. For supported algorithms this is a string with the name of the algorithm. Otherwise it can be a callable with the estimagic algorithm interface. See How to specify algorithms and algorithm specific options.
criterion_kwargs (dict) – Additional keyword arguments for criterion.
constraints (list) – List with constraint dictionaries. See How to use constraints.
algo_options (dict) – Algorithm specific configuration of the optimization. See Available optimizers and their options for supported options of each algorithm.
derivative (callable, optional) – Function that calculates the first derivative of criterion. For most algorithms, this is the gradient of the scalar output (or “value” entry of the dict). However some algorithms (e.g. bhhh) require the Jacobian of the “contributions” entry of the dict. You will get an error if you provide the wrong type of derivative.
derivative_kwargs (dict) – Additional keyword arguments for derivative.
criterion_and_derivative (callable) – Function that returns criterion and derivative as a tuple. This can be used to exploit synergies in the evaluation of both functions. The first element of the tuple has to be exactly the same as the output of criterion. The second has to be exactly the same as the output of derivative.
criterion_and_derivative_kwargs (dict) – Additional keyword arguments for criterion and derivative.
numdiff_options (dict) – Keyword arguments for the calculation of numerical derivatives. See Derivatives for details. Note that the default method is changed to “forward” for speed reasons.
logging (pathlib.Path, str or False) – Path to an sqlite3 file (which typically has the file extension .db). If the file does not exist, it will be created. When doing parallel optimizations and logging is provided, you have to provide a different path for each optimization you are running. You can disable logging completely by setting it to False, but we highly recommend not to do so. The dashboard can only be used when logging is used.
log_options (dict) – Additional keyword arguments to configure the logging.
- “fast_logging”: A boolean that determines if “unsafe” settings are used to speed up write processes to the database. This should only be used for very short running criterion functions where the main purpose of the log is a real-time dashboard and it would not be catastrophic to get a corrupted database in case of a sudden system shutdown. If one evaluation of the criterion function (and gradient if applicable) takes more than 100 ms, the logging overhead is negligible.
- “if_table_exists” (str): One of “extend”, “replace”, “raise”. What to do if the tables we want to write to already exist. Default “extend”.
- “if_database_exists” (str): One of “extend”, “replace”, “raise”. What to do if the database we want to write to already exists. Default “extend”.
error_handling (str) – Either “raise” or “continue”. Note that “continue” does not absolutely guarantee that no error is raised but we try to handle as many errors as possible in that case without aborting the optimization.
error_penalty (dict) – Dict with the entries “constant” (float) and “slope” (float). If the criterion or gradient raise an error and error_handling is “continue”, we return constant + slope * norm(params - start_params), where norm is the Euclidean distance, as criterion value and adjust the derivative accordingly. This is meant to guide the optimizer back into a valid region of parameter space (in direction of the start parameters). Note that the constant has to be high enough to ensure that the penalty is actually a bad function value. The default constant is f0 + abs(f0) + 100 for minimizations and f0 - abs(f0) - 100 for maximizations, where f0 is the criterion value at the start parameters. The default slope is 0.1.
cache_size (int) – Number of criterion and derivative evaluations that are cached in memory in case they are needed.
scaling (bool) – If True, the parameter vector is rescaled internally for better performance with scale sensitive optimizers.
scaling_options (dict or None) – Options to configure the internal scaling of the parameter vector. See How to scale optimization problems for details and recommendations.
multistart (bool) – Whether to do the optimization from multiple starting points. Requires the params to have the columns “soft_lower_bound” and “soft_upper_bound” with finite values for all parameters, unless the standard bounds are already finite for all parameters.
multistart_options (dict) – Options to configure the optimization from multiple starting values. The dictionary has the following entries (all of which are optional):
- n_samples (int): Number of sampled points on which to do one function evaluation. Default is 10 * n_params.
- sample (pandas.DataFrame or numpy.ndarray): A user defined sample. If this is provided, n_samples, sampling_method and sampling_distribution are not used.
- share_optimizations (float): Share of sampled points that is used to construct a starting point for a local optimization. Default 0.1.
- sampling_distribution (str): One of “uniform”, “triangle”. Default is “uniform”, as in the original tiktak algorithm.
- sampling_method (str): One of “random”, “sobol”, “halton”, “hammersley”, “korobov”, “latin_hypercube” or a numpy array or DataFrame with custom points. Default is “sobol” for problems with up to 30 parameters and “random” for problems with more than 30 parameters.
- mixing_weight_method (str or callable): Specifies how much weight is put on the currently best point when calculating a new starting point for a local optimization out of the currently best point and the next random starting point. Either “tiktak” or “linear” or a callable that takes the arguments iteration, n_iterations, min_weight, max_weight. Default “tiktak”.
- mixing_weight_bounds (tuple): A tuple consisting of a lower and upper bound on mixing weights. Default (0.1, 0.995).
- convergence_max_discoveries (int): The multistart optimization converges if the currently best local optimum has been discovered independently in convergence_max_discoveries many local optimizations. Default 2.
- convergence.relative_params_tolerance (float): Determines the maximum relative distance two parameter vectors can have to be considered equal for convergence purposes.
- n_cores (int): Number of cores used to evaluate the criterion function in parallel during exploration stages and number of parallel local optimizations in optimization stages. Default 1.
- batch_evaluator (str or callable): See Batch evaluators for details. Default “joblib”.
- batch_size (int): If n_cores is larger than one, several starting points for local optimizations are created with the same weight and from the same currently best point. The batch_size argument is a way to reproduce this behavior on a small machine where fewer cores are available. By default the batch_size is equal to n_cores. It can never be smaller than n_cores.
- seed (int): Random seed for the creation of starting values. Default None.
- exploration_error_handling (str): One of “raise” or “continue”. Default is “continue”, which means that failed function evaluations are simply discarded from the sample.
- optimization_error_handling (str): One of “raise” or “continue”. Default is “continue”, which means that failed optimizations are simply discarded.
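A minimal sketch of a maximize call with default options; the criterion and parameter names are illustrative:

    import numpy as np
    import pandas as pd

    from estimagic.optimization.optimize import maximize

    def criterion(params):
        # Concave function with its maximum at value == 1 for every parameter.
        x = params["value"].to_numpy()
        return -np.sum((x - 1.0) ** 2)

    params = pd.DataFrame({"value": [0.0, 0.0, 0.0]}, index=["a", "b", "c"])

    res = maximize(criterion=criterion, params=params, algorithm="scipy_lbfgsb")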
minimize
- estimagic.optimization.optimize.minimize(criterion, params, algorithm, *, criterion_kwargs=None, constraints=None, algo_options=None, derivative=None, derivative_kwargs=None, criterion_and_derivative=None, criterion_and_derivative_kwargs=None, numdiff_options=None, logging=False, log_options=None, error_handling='raise', error_penalty=None, cache_size=100, scaling=False, scaling_options=None, multistart=False, multistart_options=None)
Minimize criterion using algorithm subject to constraints.
- Parameters
criterion (callable) – A function that takes a pandas DataFrame (see How to specify start parameters) as first argument and returns one of the following:
- a scalar floating point or a numpy array (depending on the algorithm)
- a dictionary that contains the entries “value” (a scalar float), “contributions” or “root_contributions” (depending on the algorithm) and any number of additional entries. The additional dict entries will be logged and (if supported) displayed in the dashboard. Check the documentation of your algorithm to see which entries or output type are required.
params (pandas.DataFrame) – A DataFrame with a column called “value” and optional additional columns. See How to specify start parameters for detail.
algorithm (str or callable) – Specifies the optimization algorithm. For supported algorithms this is a string with the name of the algorithm. Otherwise it can be a callable with the estimagic algorithm interface. See How to specify algorithms and algorithm specific options.
criterion_kwargs (dict) – Additional keyword arguments for criterion.
constraints (list) – List with constraint dictionaries. See How to use constraints.
algo_options (dict) – Algorithm specific configuration of the optimization. See Available optimizers and their options for supported options of each algorithm.
derivative (callable, optional) – Function that calculates the first derivative of criterion. For most algorithms, this is the gradient of the scalar output (or “value” entry of the dict). However some algorithms (e.g. bhhh) require the Jacobian of the “contributions” entry of the dict. You will get an error if you provide the wrong type of derivative.
derivative_kwargs (dict) – Additional keyword arguments for derivative.
criterion_and_derivative (callable) – Function that returns criterion and derivative as a tuple. This can be used to exploit synergies in the evaluation of both functions. The first element of the tuple has to be exactly the same as the output of criterion. The second has to be exactly the same as the output of derivative.
criterion_and_derivative_kwargs (dict) – Additional keyword arguments for criterion and derivative.
numdiff_options (dict) – Keyword arguments for the calculation of numerical derivatives. See Derivatives for details. Note that the default method is changed to “forward” for speed reasons.
logging (pathlib.Path, str or False) – Path to an sqlite3 file (which typically has the file extension .db). If the file does not exist, it will be created. When doing parallel optimizations and logging is provided, you have to provide a different path for each optimization you are running. You can disable logging completely by setting it to False, but we highly recommend not to do so. The dashboard can only be used when logging is used.
log_options (dict) – Additional keyword arguments to configure the logging.
- “fast_logging”: A boolean that determines if “unsafe” settings are used to speed up write processes to the database. This should only be used for very short running criterion functions where the main purpose of the log is a real-time dashboard and it would not be catastrophic to get a corrupted database in case of a sudden system shutdown. If one evaluation of the criterion function (and gradient if applicable) takes more than 100 ms, the logging overhead is negligible.
- “if_table_exists” (str): One of “extend”, “replace”, “raise”. What to do if the tables we want to write to already exist. Default “extend”.
- “if_database_exists” (str): One of “extend”, “replace”, “raise”. What to do if the database we want to write to already exists. Default “extend”.
error_handling (str) – Either “raise” or “continue”. Note that “continue” does not absolutely guarantee that no error is raised but we try to handle as many errors as possible in that case without aborting the optimization.
error_penalty (dict) – Dict with the entries “constant” (float) and “slope” (float). If the criterion or gradient raise an error and error_handling is “continue”, we return constant + slope * norm(params - start_params), where norm is the Euclidean distance, as criterion value and adjust the derivative accordingly. This is meant to guide the optimizer back into a valid region of parameter space (in direction of the start parameters). Note that the constant has to be high enough to ensure that the penalty is actually a bad function value. The default constant is f0 + abs(f0) + 100 for minimizations and f0 - abs(f0) - 100 for maximizations, where f0 is the criterion value at the start parameters. The default slope is 0.1.
cache_size (int) – Number of criterion and derivative evaluations that are cached in memory in case they are needed.
scaling (bool) – If True, the parameter vector is rescaled internally for better performance with scale sensitive optimizers.
scaling_options (dict or None) – Options to configure the internal scaling of the parameter vector. See How to scale optimization problems for details and recommendations.
multistart (bool) – Whether to do the optimization from multiple starting points. Requires the params to have the columns “soft_lower_bound” and “soft_upper_bound” with finite values for all parameters, unless the standard bounds are already finite for all parameters.
multistart_options (dict) – Options to configure the optimization from multiple starting values (see the sketch after this list). The dictionary has the following entries (all of which are optional):
- n_samples (int): Number of sampled points on which to do one function evaluation. Default is 10 * n_params.
- sample (pandas.DataFrame or numpy.ndarray): A user defined sample. If this is provided, n_samples, sampling_method and sampling_distribution are not used.
- share_optimizations (float): Share of sampled points that is used to construct a starting point for a local optimization. Default 0.1.
- sampling_distribution (str): One of “uniform”, “triangle”. Default is “uniform”, as in the original tiktak algorithm.
- sampling_method (str): One of “random”, “sobol”, “halton”, “hammersley”, “korobov”, “latin_hypercube” or a numpy array or DataFrame with custom points. Default is “sobol” for problems with up to 30 parameters and “random” for problems with more than 30 parameters.
- mixing_weight_method (str or callable): Specifies how much weight is put on the currently best point when calculating a new starting point for a local optimization out of the currently best point and the next random starting point. Either “tiktak” or “linear” or a callable that takes the arguments iteration, n_iterations, min_weight, max_weight. Default “tiktak”.
- mixing_weight_bounds (tuple): A tuple consisting of a lower and upper bound on mixing weights. Default (0.1, 0.995).
- convergence_max_discoveries (int): The multistart optimization converges if the currently best local optimum has been discovered independently in convergence_max_discoveries many local optimizations. Default 2.
- convergence.relative_params_tolerance (float): Determines the maximum relative distance two parameter vectors can have to be considered equal for convergence purposes.
- n_cores (int): Number of cores used to evaluate the criterion function in parallel during exploration stages and number of parallel local optimizations in optimization stages. Default 1.
- batch_evaluator (str or callable): See Batch evaluators for details. Default “joblib”.
- batch_size (int): If n_cores is larger than one, several starting points for local optimizations are created with the same weight and from the same currently best point. The batch_size argument is a way to reproduce this behavior on a small machine where fewer cores are available. By default the batch_size is equal to n_cores. It can never be smaller than n_cores.
- seed (int): Random seed for the creation of starting values. Default None.
- exploration_error_handling (str): One of “raise” or “continue”. Default is “continue”, which means that failed function evaluations are simply discarded from the sample.
- optimization_error_handling (str): One of “raise” or “continue”. Default is “continue”, which means that failed optimizations are simply discarded.
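A minimal sketch of a multistart minimization; the soft bound columns satisfy the requirement described in the multistart entry above, and all names are illustrative:

    import pandas as pd

    from estimagic.optimization.optimize import minimize

    def sphere(params):
        return (params["value"] ** 2).sum()

    params = pd.DataFrame(
        {
            "value": [1.0, 2.0],
            "soft_lower_bound": [-5.0, -5.0],
            "soft_upper_bound": [5.0, 5.0],
        },
        index=["a", "b"],
    )

    res = minimize(
        criterion=sphere,
        params=params,
        algorithm="scipy_lbfgsb",
        multistart=True,
        multistart_options={"n_samples": 100, "seed": 0},
    )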
Bootstrap
bootstrap
- estimagic.inference.bootstrap.bootstrap(data, outcome, outcome_kwargs=None, n_draws=1000, cluster_by=None, ci_method='percentile', alpha=0.05, seed=None, n_cores=1, error_handling='continue', batch_evaluator=<function joblib_batch_evaluator>)
Calculate bootstrap estimates, standard errors and confidence intervals for a statistic of interest in a given original sample.
- Parameters
data (pandas.DataFrame) – original dataset.
outcome (callable) – function of the data calculating statistic of interest. Needs to return a pandas Series.
outcome_kwargs (dict) – Additional keyword arguments for outcome.
n_draws (int) – number of bootstrap samples to draw.
cluster_by (str) – column name of variable to cluster by or None.
ci_method (str) – method of choice for confidence interval computation.
alpha (float) – significance level of choice.
seed (int) – Random seed for drawing the bootstrap samples, default None.
n_cores (int) – number of jobs for parallelization.
error_handling (str) – One of “continue”, “raise”. Default “continue” which means that bootstrap estimates are only calculated for those samples where no errors occur and a warning is produced if any error occurs.
batch_evaluator (str or Callable) – Name of a pre-implemented batch evaluator (currently ‘joblib’ and ‘pathos_mp’) or Callable with the same interface as the estimagic batch_evaluators. See Batch evaluators.
- Returns
DataFrame where the k’th row contains the mean estimate, standard error, and confidence interval of the k’th parameter.
- Return type
pandas.DataFrame
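A minimal sketch, bootstrapping the mean of a single column; the data and outcome function are illustrative:

    import numpy as np
    import pandas as pd

    from estimagic.inference.bootstrap import bootstrap

    rng = np.random.default_rng(seed=0)
    df = pd.DataFrame({"y": rng.normal(loc=1.0, scale=2.0, size=200)})

    def outcome(data):
        # The statistic of interest must come back as a pandas Series.
        return pd.Series({"mean_y": data["y"].mean()})

    results = bootstrap(data=df, outcome=outcome, n_draws=1_000, seed=123)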
bootstrap_from_outcomes
- estimagic.inference.bootstrap.bootstrap_from_outcomes(data, outcome, bootstrap_outcomes, ci_method='percentile', alpha=0.05, n_cores=1)
Set up results table containing mean, standard deviation and confidence interval for each estimated parameter.
- Parameters
data (pandas.DataFrame) – original dataset.
outcome (callable) – function of the data calculating statistic of interest. Needs to return a pandas Series.
bootstrap_outcomes (pandas.DataFrame) – DataFrame of outcome values computed on the bootstrap samples.
ci_method (str) – method of choice for confidence interval computation.
n_cores (int) – number of jobs for parallelization.
alpha (float) – significance level of choice.
- Returns
Table of results.
- Return type
pandas.DataFrame
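A minimal sketch, assuming the bootstrap outcomes were computed separately, here with a plain pandas resampling loop:

    import numpy as np
    import pandas as pd

    from estimagic.inference.bootstrap import bootstrap_from_outcomes

    rng = np.random.default_rng(seed=0)
    df = pd.DataFrame({"y": rng.normal(size=200)})

    def outcome(data):
        return pd.Series({"mean_y": data["y"].mean()})

    # Outcomes computed on manually drawn bootstrap samples.
    outcomes = pd.DataFrame(
        [outcome(df.sample(frac=1, replace=True, random_state=i)) for i in range(500)]
    )

    results = bootstrap_from_outcomes(
        data=df,
        outcome=outcome,
        bootstrap_outcomes=outcomes,
    )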
Derivatives
first_derivative
- estimagic.differentiation.derivatives.first_derivative(func, params, func_kwargs=None, method='central', n_steps=1, base_steps=None, scaling_factor=1, lower_bounds=None, upper_bounds=None, step_ratio=2, min_steps=None, f0=None, n_cores=1, error_handling='continue', batch_evaluator='joblib', return_func_value=False, return_info=True, key=None)
Evaluate first derivative of func at params according to method and step options.
Internally, the function is converted such that it maps from a 1d array to a 1d array. Then the Jacobian of that function is calculated. The resulting derivative estimate is always a numpy.ndarray.
The parameters and the function output can be pandas objects (Series or DataFrames with a “value” column). In that case the output of first_derivative is also a pandas object with appropriate index and columns.
For a detailed description of all options that influence the step size, as well as an explanation of how steps are adjusted to bounds in case of a conflict, see generate_steps().
- Parameters
func (callable) – Function of which the derivative is calculated.
params (numpy.ndarray, pandas.Series or pandas.DataFrame) – 1d numpy array or pandas.DataFrame with parameters at which the derivative is calculated. If it is a DataFrame, it can contain the columns “lower_bound” and “upper_bound” for bounds. See How to specify start parameters.
func_kwargs (dict) – Additional keyword arguments for func, optional.
method (str) – One of [“central”, “forward”, “backward”], default “central”.
n_steps (int) – Number of steps needed. For central methods, this is the number of steps per direction. It is 1 if no Richardson extrapolation is used.
base_steps (numpy.ndarray, optional) – 1d array of the same length as params. base_steps * scaling_factor is the absolute value of the first (and possibly only) step used in the finite differences approximation of the derivative. If base_steps * scaling_factor conflicts with bounds, the actual steps will be adjusted. If base_steps is not provided, it will be determined according to a rule of thumb as long as this does not conflict with min_steps.
scaling_factor (numpy.ndarray or float) – Scaling factor which is applied to base_steps. If it is a numpy.ndarray, it needs to be as long as params. scaling_factor is useful if you want to increase or decrease the base_step relative to the rule-of-thumb or user provided base_step, for example to benchmark the effect of the step size. Default 1.
lower_bounds (numpy.ndarray) – 1d array with lower bounds for each parameter. If params is a DataFrame and has the column “lower_bound”, this will be taken as lower_bounds if no lower_bounds have been provided explicitly.
upper_bounds (numpy.ndarray) – 1d array with upper bounds for each parameter. If params is a DataFrame and has the column “upper_bound”, this will be taken as upper_bounds if no upper_bounds have been provided explicitly.
step_ratio (float or numpy.array) – Ratio between two consecutive Richardson extrapolation steps in the same direction. Default 2.0. Has to be larger than one. The step ratio is only used if n_steps > 1.
min_steps (numpy.ndarray) – Minimal possible step sizes that can be chosen to accommodate bounds. Must have the same length as params. By default min_steps is equal to base_steps, i.e. the step size is not decreased beyond what is optimal according to the rule of thumb.
f0 (numpy.ndarray) – 1d numpy array with func(x), optional.
n_cores (int) – Number of processes used to parallelize the function evaluations. Default 1.
error_handling (str) – One of “continue” (catch errors and continue to calculate derivative estimates. In this case, some derivative estimates can be missing but no errors are raised), “raise” (catch errors and continue to calculate derivative estimates at first but raise an error if all evaluations for one parameter failed) and “raise_strict” (raise an error as soon as a function evaluation fails).
batch_evaluator (str or callable) – Name of a pre-implemented batch evaluator (currently ‘joblib’ and ‘pathos_mp’) or Callable with the same interface as the estimagic batch_evaluators.
return_func_value (bool) – If True, return function value at params, stored in output dict under “func_value”. Default False. This is useful when using first_derivative during optimization.
return_info (bool) – If True, return additional information on function evaluations and internal derivative candidates, stored in output dict under “func_evals” and “derivative_candidates”. Derivative candidates are only returned if n_steps > 1. Default True.
key (str) – If func returns a dictionary, take the derivative of func(params)[key].
- Returns
Result dictionary with keys:
- “derivative” (numpy.ndarray, pandas.Series or pandas.DataFrame): The estimated first derivative of func at params. The shape of the output depends on the dimension of params and func(params):
  - f: R -> R leads to shape (1,), usually called derivative
  - f: R^m -> R leads to shape (m,), usually called Gradient
  - f: R -> R^n leads to shape (n, 1), usually called Jacobian
  - f: R^m -> R^n leads to shape (n, m), usually called Jacobian
- “func_value” (numpy.ndarray, pandas.Series or pandas.DataFrame): Function value at params, returned if return_func_value is True.
- “func_evals” (pandas.DataFrame): Function evaluations produced by the internal derivative method, returned if return_info is True.
- “derivative_candidates” (pandas.DataFrame): Derivative candidates from Richardson extrapolation, returned if return_info is True and n_steps > 1.
- Return type
dict
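A minimal sketch, computing the gradient of a sphere function at a point with the default central differences:

    import numpy as np

    from estimagic.differentiation.derivatives import first_derivative

    def sphere(x):
        return (x ** 2).sum()

    res = first_derivative(
        func=sphere,
        params=np.array([1.0, 2.0, 3.0]),
        return_func_value=True,
    )
    gradient = res["derivative"]  # approximately array([2., 4., 6.])
    value = res["func_value"]     # 14.0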
derivative_plot
- estimagic.visualization.derivative_plot.derivative_plot(derivative_result)
Plot evaluations and derivative estimates.
The resulting grid plot displays function evaluations and derivatives. The derivatives are visualized as a first-order Taylor approximation. Bands are drawn indicating the area in which forward and backward derivatives are located. This is done by filling the area between the derivative estimate with lowest and highest step size, respectively. Do not confuse these bands with statistical errors.
This function does not require the params vector as plots are displayed relative to the point at which the derivative is calculated.
- Parameters
derivative_result (dict) – The result dictionary of a call to first_derivative() with return_info and return_func_value set to True.
- Returns
The figure.
The figure.
- Return type
matplotlib.pyplot.figure
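A minimal sketch that produces a derivative result with the required flags and plots it; n_steps is increased so that the bands are informative:

    import numpy as np

    from estimagic.differentiation.derivatives import first_derivative
    from estimagic.visualization.derivative_plot import derivative_plot

    def sphere(x):
        return (x ** 2).sum()

    res = first_derivative(
        func=sphere,
        params=np.array([1.0, 2.0]),
        n_steps=4,  # several step sizes make the bands informative
        return_func_value=True,
        return_info=True,
    )
    fig = derivative_plot(res)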
Benchmarks
get_benchmark_problems
- estimagic.benchmarking.get_benchmark_problems.get_benchmark_problems(name, additive_noise=False, additive_noise_options=None, multiplicative_noise=False, multiplicative_noise_options=None)
Get a dictionary of test problems for a benchmark.
- Parameters
name (str) – The name of the set of test problems. Currently “more_wild” is the only supported one.
additive_noise (bool) – Whether to add additive noise to the problem. Default False.
additive_noise_options (dict or None) – Specifies the amount and distribution of the additive noise added to the problem. Has the entries:
- distribution (str): One of “normal”, “gumbel”, “uniform”, “logistic”, “laplace”. Default “normal”.
- std (float): The standard deviation of the noise. This works for all distributions, even if those distributions are normally not specified via a standard deviation (e.g. uniform).
- correlation (float): Number between 0 and 1 that specifies the auto correlation of the noise.
multiplicative_noise (bool) – Whether to add multiplicative noise to the problem. Default False.
multiplicative_noise_options (dict or None) – Specifies the amount and distribution of the multiplicative noise added to the problem. Has the entries:
- distribution (str): One of “normal”, “gumbel”, “uniform”, “logistic”, “laplace”. Default “normal”.
- std (float): The standard deviation of the noise. This works for all distributions, even if those distributions are normally not specified via a standard deviation (e.g. uniform).
- correlation (float): Number between 0 and 1 that specifies the auto correlation of the noise.
- clipping_value (float): A non-negative float. Multiplicative noise becomes zero if the function value is zero. To avoid this, we do not implement multiplicative noise as f_noisy = f * epsilon but as f_noisy = f + (epsilon - 1) * f_clipped, where f_clipped is bounded away from zero from both sides by the clipping value.
- Returns
Nested dictionary with benchmark problems of the structure: {“name”: {“inputs”: {…}, “solution”: {…}, “info”: {…}}}, where “inputs” are keyword arguments for minimize such as the criterion function and start parameters, “solution” contains the entries “params” and “value”, and “info” might contain information about the test problem.
- Return type
dict
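A minimal sketch, loading the “more_wild” suite with default additive noise; the unpacking at the end only illustrates the documented dictionary structure:

    from estimagic.benchmarking.get_benchmark_problems import get_benchmark_problems

    problems = get_benchmark_problems("more_wild", additive_noise=True)

    # Each value follows the {"inputs": ..., "solution": ..., "info": ...} layout.
    name, problem = next(iter(problems.items()))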
run_benchmark
- estimagic.benchmarking.run_benchmark.run_benchmark(problems, optimize_options, logging_directory, batch_evaluator='joblib', n_cores=1, error_handling='continue', fast_logging=True, seed=None)
Run problems with different optimize options.
- Parameters
problems (dict) – Nested dictionary with benchmark problems of the structure: {“name”: {“inputs”: {…}, “solution”: {…}, “info”: {…}}}, where “inputs” are keyword arguments for minimize such as the criterion function and start parameters, “solution” contains the entries “params” and “value”, and “info” might contain information about the test problem.
optimize_options (list or dict) – Either a list of algorithms or a nested dictionary that maps a name for optimizer settings (e.g. "lbfgsb_strict_criterion") to a dictionary of keyword arguments for minimize (e.g. {"algorithm": "scipy_lbfgsb", "algo_options": {"convergence.relative_criterion_tolerance": 1e-12}}). Alternatively, the values can just be an algorithm which is then benchmarked at default settings.
batch_evaluator (str or callable) – See Batch evaluators.
logging_directory (pathlib.Path) – Directory in which the log databases are saved.
n_cores (int) – Number of optimizations that are run in parallel. Note that, in addition, an optimizer might itself parallelize.
error_handling (str) – One of “raise”, “continue”.
fast_logging (bool) – Whether the slightly unsafe but much faster database configuration is chosen.
- Returns
Nested dictionary with information on the benchmark run. The outer keys are tuples where the first entry is the name of the problem and the second the name of the optimize options. The values are dicts with the entries: “runtime”, “params_history”, “criterion_history”, “solution”.
- Return type
dict
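A minimal sketch, benchmarking two optimizers on the “more_wild” suite; the logging directory name is illustrative:

    from pathlib import Path

    from estimagic.benchmarking.get_benchmark_problems import get_benchmark_problems
    from estimagic.benchmarking.run_benchmark import run_benchmark

    problems = get_benchmark_problems("more_wild")

    results = run_benchmark(
        problems=problems,
        optimize_options=["scipy_lbfgsb", "scipy_neldermead"],
        logging_directory=Path("benchmark_logs"),
    )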
profile_plot
- estimagic.visualization.profile_plot.profile_plot(problems=None, results=None, runtime_measure='n_evaluations', normalize_runtime=False, stopping_criterion='y', x_precision=0.0001, y_precision=0.0001)
Compare optimizers over a problem set.
This plot answers the question: What percentage of problems can each algorithm solve within a certain runtime budget?
The runtime budget is plotted on the x axis and the share of problems each algorithm solved on the y axis.
Thus, algorithms that are very specialized and perform well on some share of problems but are not able to solve more problems with a larger computational budget will have steep increases and then flat lines. Algorithms that are robust but slow will have low shares in the beginning but eventually reach very high shares.
Note that failing to converge according to the given stopping_criterion and precisions is scored as needing an infinite computational budget.
For details, see the description of performance and data profiles by Moré and Wild (2009).
- Parameters
problems (dict) – estimagic benchmarking problems dictionary. Keys are the problem names. Values contain information on the problem, including the solution value.
results (dict) – estimagic benchmarking results dictionary. Keys are tuples of the form (problem, algorithm), values are dictionaries of the collected information on the benchmark run, including ‘criterion_history’ and ‘time_history’.
runtime_measure (str) – “n_evaluations” or “walltime”. This is the runtime until the desired convergence was reached by an algorithm. This is called performance measure by Moré and Wild (2009).
normalize_runtime (bool) – If True the runtime each algorithm needed for each problem is scaled by the time the fastest algorithm needed. If True, the resulting plot is what Moré and Wild (2009) called data profiles.
stopping_criterion (str) – One of “x_and_y”, “x_or_y”, “x”, “y”. Determines how convergence is determined from the two precisions.
x_precision (float or None) – how close an algorithm must have gotten to the true parameter values (as percent of the Euclidean distance between start and solution parameters) before the criterion for clipping and convergence is fulfilled.
y_precision (float or None) – how close an algorithm must have gotten to the true criterion values (as percent of the distance between start and solution criterion value) before the criterion for clipping and convergence is fulfilled.
- Returns
fig
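A minimal, self-contained sketch that produces benchmark results and plots a performance profile; the algorithm choices and directory name are illustrative:

    from pathlib import Path

    from estimagic.benchmarking.get_benchmark_problems import get_benchmark_problems
    from estimagic.benchmarking.run_benchmark import run_benchmark
    from estimagic.visualization.profile_plot import profile_plot

    problems = get_benchmark_problems("more_wild")
    results = run_benchmark(
        problems=problems,
        optimize_options=["scipy_lbfgsb", "scipy_neldermead"],
        logging_directory=Path("benchmark_logs"),
    )

    fig = profile_plot(
        problems=problems,
        results=results,
        runtime_measure="n_evaluations",
        stopping_criterion="y",
    )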
convergence_plot
- estimagic.visualization.convergence_plot.convergence_plot(problems=None, results=None, problem_subset=None, algorithm_subset=None, n_cols=2, distance_measure='criterion', monotone=True, normalize_distance=True, runtime_measure='n_evaluations', stopping_criterion='y', x_precision=0.0001, y_precision=0.0001)
Plot convergence of optimizers for a set of problems.
This creates a grid of plots, showing the convergence of the different algorithms on each problem. The faster a line falls, the faster the algorithm improved on the problem. The algorithm converged where its line reaches 0 (if normalize_distance is True) or the horizontal blue line labeled “true solution”.
Each plot shows on the x axis the runtime_measure, which can be walltime or number of evaluations. Each algorithm’s convergence is a line in the plot. Convergence can be measured by the criterion value at the particular time/evaluation. The convergence can be made monotone (i.e. always taking the best value so far) or normalized such that the distance from the start to the true solution is one.
- Parameters
problems (dict) – estimagic benchmarking problems dictionary. Keys are the problem names. Values contain information on the problem, including the solution value.
results (dict) – estimagic benchmarking results dictionary. Keys are tuples of the form (problem, algorithm), values are dictionaries of the collected information on the benchmark run, including ‘criterion_history’ and ‘time_history’.
problem_subset (list, optional) – List of problem names. These must be a subset of the keys of the problems dictionary. If provided the convergence plot is only created for the problems specified in this list.
algorithm_subset (list, optional) – List of algorithm names. These must be a subset of the keys of the optimizer_options passed to run_benchmark. If provided only the convergence of the given algorithms are shown.
n_cols (int) – number of columns in the plot of grids. The number of rows is determined automatically.
distance_measure (str) – One of “criterion”, “parameter_distance”.
monotone (bool) – If True the best found criterion value so far is plotted. If False the particular criterion evaluation of that time is used.
normalize_distance (bool) – If True the progress is scaled by the total distance between the start value and the optimal value, i.e. 1 means the algorithm is as far from the solution as the start value and 0 means the algorithm has reached the solution value.
runtime_measure (str) – “n_evaluations” or “walltime”.
stopping_criterion (str) – “x_and_y”, “x_or_y”, “x”, “y” or None. If None, no clipping is done.
x_precision (float or None) – how close an algorithm must have gotten to the true parameter values (as percent of the Euclidean distance between start and solution parameters) before the criterion for clipping and convergence is fulfilled.
y_precision (float or None) – how close an algorithm must have gotten to the true criterion values (as percent of the distance between start and solution criterion value) before the criterion for clipping and convergence is fulfilled.
- Returns
fig
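A minimal, self-contained sketch analogous to the profile_plot example above; again, the algorithm choices and directory name are illustrative:

    from pathlib import Path

    from estimagic.benchmarking.get_benchmark_problems import get_benchmark_problems
    from estimagic.benchmarking.run_benchmark import run_benchmark
    from estimagic.visualization.convergence_plot import convergence_plot

    problems = get_benchmark_problems("more_wild")
    results = run_benchmark(
        problems=problems,
        optimize_options=["scipy_lbfgsb", "scipy_neldermead"],
        logging_directory=Path("benchmark_logs"),
    )

    fig = convergence_plot(
        problems=problems,
        results=results,
        n_cols=2,
        monotone=True,
    )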