{ "cells": [ { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import estimagic as em\n", "import numpy as np\n", "import pandas as pd" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Numerical differentiation\n", "\n", "Using simple examples, This tutorial shows you how to numerical differentiation with estimagic. More details on the topics covered here can be found in the [how to guides](../how_to_guides/index.md)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Basic usage of `first_derivative`" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "def sphere(params):\n", " return params @ params" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([0., 2., 4., 6., 8.])" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fd = em.first_derivative(\n", " func=sphere,\n", " params=np.arange(5),\n", ")\n", "\n", "fd[\"derivative\"]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Basic usage of `second_derivative`" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[2.001, 0. , 0. , 0. , 0. ],\n", " [0. , 2. , 0. , 0. , 0. ],\n", " [0. , 0. , 2. , 0. , 0. ],\n", " [0. , 0. , 0. , 2. , 0. ],\n", " [0. , 0. , 0. , 0. , 2. ]])" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sd = em.second_derivative(\n", " func=sphere,\n", " params=np.arange(5),\n", ")\n", "\n", "sd[\"derivative\"].round(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## `params` do not have to be vectors\n", "\n", "In estimagic, params can be arbitrary [pytrees](https://jax.readthedocs.io/en/latest/pytrees.html). Examples are (nested) dictionaries of numbers, arrays and pandas objects. " ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "def dict_sphere(params):\n", " return params[\"a\"] ** 2 + params[\"b\"] ** 2 + (params[\"c\"] ** 2).sum()" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'a': array(0.),\n", " 'b': array(2.),\n", " 'c': 0 4.0\n", " 1 6.0\n", " 2 8.0\n", " dtype: float64}" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fd = em.first_derivative(\n", " func=dict_sphere,\n", " params={\"a\": 0, \"b\": 1, \"c\": pd.Series([2, 3, 4])},\n", ")\n", "\n", "fd[\"derivative\"]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Description of the output\n", "\n", "The output of `first_derivative` when using a general pytree is straight-forward. Nevertheless, this explanation requires terminolgy of pytrees. Please refer to the [JAX documentation of pytrees](https://jax.readthedocs.io/en/latest/pytrees.html).\n", "\n", "The output tree of `first_derivative` has the same structure as the params tree. Equivalent to the numpy case, where the gradient is a vector of shape `(len(params),)`. If, however, the params tree contains non-scalar entries like `numpy.ndarray`'s, `pandas.Series`', or `pandas.DataFrame`'s, the output is not expanded but a block is created instead. In the above example, the entry `params[\"c\"]` is a `pandas.Series` with 3 entries. Thus, the first derivative output contains the corresponding 3x1-block of the gradient at the position `[\"c\"]`:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 4.0\n", "1 6.0\n", "2 8.0\n", "dtype: float64" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fd[\"derivative\"][\"c\"]" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'a': {'a': array(2.00072215),\n", " 'b': array(0.),\n", " 'c': 0 0.0\n", " 1 0.0\n", " 2 0.0\n", " dtype: float64},\n", " 'b': {'a': array(0.),\n", " 'b': array(1.9999955),\n", " 'c': 0 0.0\n", " 1 0.0\n", " 2 0.0\n", " dtype: float64},\n", " 'c': {'a': 0 0.0\n", " 1 0.0\n", " 2 0.0\n", " dtype: float64,\n", " 'b': 0 0.0\n", " 1 0.0\n", " 2 0.0\n", " dtype: float64,\n", " 'c': 0 1 2\n", " 0 1.999995 0.000004 -0.000003\n", " 1 0.000004 2.000001 0.000000\n", " 2 -0.000003 0.000000 1.999997}}" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sd = em.second_derivative(\n", " func=dict_sphere,\n", " params={\"a\": 0, \"b\": 1, \"c\": pd.Series([2, 3, 4])},\n", ")\n", "\n", "sd[\"derivative\"]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Description of the output\n", "\n", "The output of `second_derivative` when using a general pytrees looks more complex but is easy once we remember that the second derivative is equivalent to applying the first derivative twice. This explanation requires terminolgy of pytrees. Please refer to the [JAX documentation of pytrees](https://jax.readthedocs.io/en/latest/pytrees.html).\n", "\n", "The output tree is a product of the params tree with itself. This is equivalent to the numpy case, where the hessian is a matrix of shape `(len(params), len(params))`. If, however, the params tree contains non-scalar entries like `numpy.ndarray`'s, `pandas.Series`', or `pandas.DataFrame`'s, the output is not expanded but a block is created instead. In the above example, the entry `params[\"c\"]` is a 3-dimensional `pandas.Series`. Thus, the second derivative output contains the corresponding 3x3-block of the hessian at the position `[\"c\"][\"c\"]`:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
012
02.00.0-0.0
10.02.00.0
2-0.00.02.0
\n", "
" ], "text/plain": [ " 0 1 2\n", "0 2.0 0.0 -0.0\n", "1 0.0 2.0 0.0\n", "2 -0.0 0.0 2.0" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sd[\"derivative\"][\"c\"][\"c\"].round(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## There are many options\n", "\n", "You can choose which finite difference method to use, whether we should respect parameter bounds, or whether to evaluate the function in parallel. Let's go through some basic examples. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## You can choose the difference method" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([-0. , 2. , 4. , 6. , 7.99999994])" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fd = em.first_derivative(\n", " func=sphere, params=np.arange(5), method=\"backward\" # default: 'central'\n", ")\n", "\n", "fd[\"derivative\"]" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[2.006, 0. , 0. , 0. , 0. ],\n", " [0. , 2. , 0. , 0. , 0. ],\n", " [0. , 0. , 2. , 0. , 0. ],\n", " [0. , 0. , 0. , 2. , 0. ],\n", " [0. , 0. , 0. , 0. , 2. ]])" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sd = em.second_derivative(\n", " func=sphere, params=np.arange(5), method=\"forward\" # default: 'central_cross'\n", ")\n", "\n", "sd[\"derivative\"].round(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## You can add bounds " ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([0. , 2. , 4. , 6. , 8.00000006])" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "params = np.arange(5)\n", "\n", "fd = em.first_derivative(\n", " func=sphere,\n", " params=params,\n", " lower_bounds=params, # forces first_derivative to use forward differences\n", " upper_bounds=params + 1,\n", ")\n", "\n", "fd[\"derivative\"]" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[2.006, 0. , 0. , 0. , 0. ],\n", " [0. , 2. , 0. , 0. , 0. ],\n", " [0. , 0. , 2. , 0. , 0. ],\n", " [0. , 0. , 0. , 2. , 0. ],\n", " [0. , 0. , 0. , 0. , 2. ]])" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sd = em.second_derivative(\n", " func=sphere,\n", " params=params,\n", " lower_bounds=params, # forces first_derivative to use forward differences\n", " upper_bounds=params + 1,\n", ")\n", "\n", "sd[\"derivative\"].round(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Or use parallelized numerical derivatives" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([0., 2., 4., 6., 8.])" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fd = em.first_derivative(\n", " func=sphere,\n", " params=np.arange(5),\n", " n_cores=4,\n", ")\n", "\n", "fd[\"derivative\"]" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[2.001, 0. , 0. , 0. , 0. ],\n", " [0. , 2. , 0. , 0. , 0. ],\n", " [0. , 0. , 2. , 0. , 0. ],\n", " [0. , 0. , 0. , 2. , 0. ],\n", " [0. , 0. , 0. , 0. , 2. ]])" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sd = em.second_derivative(\n", " func=sphere,\n", " params=params,\n", " n_cores=4,\n", ")\n", "\n", "sd[\"derivative\"].round(3)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.8" }, "vscode": { "interpreter": { "hash": "40d3a090f54c6569ab1632332b64b2c03c39dcf918b08424e98f38b5ae0af88f" } } }, "nbformat": 4, "nbformat_minor": 4 }