The logging and log_options Arguments

Estimagic can keep a persistent log of the parameter and criterion values tried out by an optimizer. For this we use an sqlite database, which makes it easy to read from and write to the log-file from several processes or threads. Moreover, it is possible to retrieve data from the log-file without loading the whole file into memory, which can matter for very long-running optimizations.

The log-file is updated instantly when new information becomes available. Thus, no data is lost when an optimization has to be aborted or a server is shut down for maintenance.
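The two properties above (instant updates and reading without loading everything into memory) can be illustrated with Python's built-in sqlite3 module. The following is only a minimal sketch: the file name, the toy table with an iteration and a value column, and the numbers are assumptions made for illustration, and the database is written by the script itself, not by estimagic.

```python
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical stand-in for a log-file; estimagic did not write this database.
path = Path(tempfile.mkdtemp()) / "toy_log.db"

# Writer side: commit after every evaluation so each row is on disk at once
# and survives if the process is stopped later.
writer = sqlite3.connect(str(path))
writer.execute(
    "CREATE TABLE criterion_history (iteration INTEGER PRIMARY KEY, value REAL)"
)
for iteration, value in enumerate([10.0, 4.0, 2.5, 2.4]):
    writer.execute(
        "INSERT INTO criterion_history VALUES (?, ?)", (iteration, value)
    )
    writer.commit()
writer.close()

# Reader side: a separate connection iterates over the cursor row by row,
# so the full history is never held in memory at the same time.
reader = sqlite3.connect(str(path))
best_iteration, best_value = None, float("inf")
query = "SELECT iteration, value FROM criterion_history ORDER BY iteration"
for iteration, value in reader.execute(query):
    if value < best_value:
        best_iteration, best_value = iteration, value
reader.close()

print(best_iteration, best_value)  # prints: 3 2.4
```

Because the reader only keeps the best row seen so far, the same loop would work unchanged on a history with millions of rows.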

The sqlite database is also used to exchange data between the optimization and the dashboard.

In addition to parameters and criterion values, we also save all arguments to maximize or minimize in the database, as well as other information that can help to reproduce an optimization result.

The logging Argument

logging can be a string or pathlib.Path that specifies the path to a sqlite3 database. Typically, those files have the file extension .db. If the file does not exist, it will be created for you. If it exists, we will create and potentially overwrite tables that are used to log the optimization. The details of what estimagic will do with your database file are documented in the following function.

estimagic.logging.create_database.prepare_database(path, params, comparison_plot_data=None, dash_options=None, constraints=None, optimization_status='scheduled', gradient_status=0)

Return database metadata object with all relevant tables for the optimization.

This should always be used to create entirely new databases or to create the tables needed during optimization in an existing database.

A new database is created if path does not exist yet. Otherwise the existing database is loaded and all tables needed to log the optimization are overwritten. Other tables remain unchanged.

The resulting database has the following tables:

  • params_history: the complete history of parameters from the optimization. The index column is “iteration”, the remaining columns are parameter names taken from params[“name”].

  • gradient_history: the complete history of gradient evaluations from the optimization. Same columns as params_history.

  • criterion_history: the complete history of criterion values from the optimization. The index column is “iteration”, the second column is “value”.

  • time_stamps: timestamps from the end of each criterion evaluation. Same columns as criterion_history.

  • convergence_history: the complete history of convergence criteria from the optimization. The index column is “iteration”, the other columns are “ftol”, “gtol” and “xtol”.

  • start_params: copy of user provided params. This is not just the first entry of params_history because it contains all columns and has a different index.

  • optimization_status: table with one row and one column called “value” which takes the values “scheduled”, “running”, “success” or “failure”. Initialized to optimization_status.

  • gradient_status: table with one row and one column called “value” which can be any float between 0 and 1 and indicates the progress of the gradient calculation. Initialized to gradient_status.

  • dash_options: table with one row and one column called “value”. It contains a dictionary with the dashboard options. Internally this is a PickleType, so the dictionary must be pickle serializable. Initialized to dash_options.

  • exceptions: table with one column called “value” that holds the exceptions.

  • constraints: table with one row and one column called “value”. It contains the list of constraints. Internally this is a PickleType, so the list must be pickle serializable.

Parameters
  • path (str or pathlib.Path) – location of the database file. If the file does not exist, it will be created.

  • params (pd.DataFrame) – see The params Argument.

  • comparison_plot_data (numpy.ndarray or pandas.Series or pandas.DataFrame) – Contains the data for the comparison plot. Later updates will only deliver the value column, whereas this input has an index and other invariant information.

  • dash_options (dict) – Dictionary with the dashboard options.

  • optimization_status (str) – One of “scheduled”, “running”, “success”, “failure”.

  • gradient_status (float) – Progress of gradient calculation between 0 and 1.

  • constraints (list) – List of constraints.

Returns

database (sqlalchemy.MetaData). The engine that connects to the database can be accessed via database.bind.
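The requirement that dash_options and constraints be pickle serializable can be checked before starting an optimization. The sketch below imitates, in plain sqlite3, what a pickled column does conceptually: the object is serialized to bytes, stored in a one-row table with a “value” column, and restored on reading. It is not estimagic's own code, and the option keys are hypothetical.

```python
import pickle
import sqlite3

# Hypothetical dashboard options; the keys are made up for illustration.
dash_options = {"port": 5006, "no_browser": True}

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE dash_options (value BLOB)")

# Serialize the dictionary to bytes and store it in the single "value" column.
con.execute(
    "INSERT INTO dash_options VALUES (?)", (pickle.dumps(dash_options),)
)

# Read the bytes back and rebuild the dictionary.
(blob,) = con.execute("SELECT value FROM dash_options").fetchone()
restored = pickle.loads(blob)
con.close()

# A dictionary that holds a lambda cannot be pickled, so it could not be
# stored this way.
try:
    pickle.dumps({"callback": lambda x: x})
    picklable = True
except (pickle.PicklingError, AttributeError, TypeError):
    picklable = False
```

Plain dictionaries and lists of numbers, strings, and other built-in containers round-trip without problems; objects such as lambdas or open file handles do not.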

The log_options Argument

log_options is a dictionary with keyword arguments that influence the logging behavior. The following options are available:

  • "readme": A string with a description of the optimization. This can be helpful to send a message to your future self who might have forgotten why (s)he ran this particular optimization.
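As a small illustration, the two arguments described on this page might be prepared as follows. The file name and the readme text are hypothetical; both objects would then be handed to maximize or minimize as the logging and log_options arguments.

```python
from pathlib import Path

# Hypothetical location of the log database; a plain string works as well.
log_path = Path("my_optimization.db")

# The only option described above is "readme", a free-text description.
log_options = {
    "readme": "Baseline run with default start values, kept for comparison."
}
```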