{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/seap-udea/multimin/blob/master/examples/multimin_cmog.ipynb)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "

\n", "\n", "# High-performance MoG calculations\n", "\n", "The idea of fitting a MoG to a dataset is to have a semi-analytical description of the probability distribution of the dataset, even though the original distribution is unknown. Although **Multimin** allows generating nearly pure functions for the MoGs fitted through the `get_function` method, these routines are written in Python, which is a less efficient language than, for example, C. **Multimin** includes an optional tool that allows creating optimized routines with C code. To run these routines, it is necessary to download and install the [GNU Scientific Library](https://ftp.wayne.edu/gnu/gsl/)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Installation and importing\n", "\n", "If you're running this in Google Colab or need to install the package, uncomment and run the following cell:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "execution": { "iopub.execute_input": "2026-02-15T18:15:21.886866Z", "iopub.status.busy": "2026-02-15T18:15:21.886696Z", "iopub.status.idle": "2026-02-15T18:15:22.250164Z", "shell.execute_reply": "2026-02-15T18:15:22.249706Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Not running in Colab, skipping installation\n" ] } ], "source": [ "try:\n", " from google.colab import drive\n", " %pip install -Uq multimin\n", "except ImportError:\n", " print(\"Not running in Colab, skipping installation\")\n", " %load_ext autoreload\n", " %autoreload 2\n", "!mkdir -p gallery/\n", "\n", "# Uncomment to install from GitHub (development version)\n", "# !pip install git+https://github.com/seap-udea/MultiMin.git" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "execution": { "iopub.execute_input": "2026-02-15T18:15:22.253434Z", "iopub.status.busy": "2026-02-15T18:15:22.253306Z", "iopub.status.idle": "2026-02-15T18:15:23.802503Z", "shell.execute_reply": "2026-02-15T18:15:23.802060Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Welcome to MultiMin v0.11.2. \u00a1Al infinito y m\u00e1s all\u00e1!\n" ] } ], "source": [ "import multimin as mn\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "\n", "import warnings\n", "warnings.filterwarnings(\"ignore\")\n", "\n", "deg = np.pi / 180\n", "figprefix = \"cmog\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Verifying that GSL is installed" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To verify that the library is installed, you can use the `imultimin` command from the command line:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "execution": { "iopub.execute_input": "2026-02-15T18:15:23.810376Z", "iopub.status.busy": "2026-02-15T18:15:23.810216Z", "iopub.status.idle": "2026-02-15T18:15:25.092028Z", "shell.execute_reply": "2026-02-15T18:15:25.091185Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Verifying MultiMin installation...\r\n", "Welcome to MultiMin v0.11.2. \u00a1Al infinito y m\u00e1s all\u00e1!\r\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u2705 MultiMin package found: /Users/jzuluaga/dev/multimin/src/multimin/__init__.py\r\n", "\u2705 Shared library found: /Users/jzuluaga/dev/multimin/src/multimin/lib/multimin.dylib\r\n" ] } ], "source": [ "!imultimin --verify" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If it is not installed you can try to install it:\n", "\n", "```bash\n", "!imultimin --install-gsl \n", "```\n", "\n", "And follow the instructions." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3D distribution with one variable on a finite domain\n", "\n", "Let's try it in an infinite domain:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "execution": { "iopub.execute_input": "2026-02-15T18:15:25.095275Z", "iopub.status.busy": "2026-02-15T18:15:25.094785Z", "iopub.status.idle": "2026-02-15T18:15:25.418141Z", "shell.execute_reply": "2026-02-15T18:15:25.417758Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "MixtureOfGaussians.rvs executed in 0.2905158996582031 seconds\n" ] } ], "source": [ "weights = [0.5,0.5]\n", "mus = [[1.0, 0.5, -0.5], [1.0, -0.5, +0.5],]\n", "sigmas = [[1, 1.2, 2.3], [0.8, 0.2, 3.3]]\n", "angles = [\n", " [10*deg, 30*deg, 20*deg],\n", " [-20*deg, 0*deg, 30*deg],\n", "] \n", "Sigmas = mn.Stats.calc_covariance_from_rotation(sigmas, angles)\n", "\n", "MoG = mn.MixtureOfGaussians(mus=mus, weights=weights, Sigmas=Sigmas)\n", "\n", "sample = MoG.rvs(5000)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Normally:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "execution": { "iopub.execute_input": "2026-02-15T18:15:25.419319Z", "iopub.status.busy": "2026-02-15T18:15:25.419243Z", "iopub.status.idle": "2026-02-15T18:15:25.427123Z", "shell.execute_reply": "2026-02-15T18:15:25.426754Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "from multimin import Util\n", "\n", "def mog(X):\n", "\n", " mu1_1 = 1.0\n", " mu1_2 = 0.5\n", " mu1_3 = -0.5\n", " mu1 = [mu1_1, mu1_2, mu1_3]\n", " Sigma1 = [[1.132931, -0.33952, 0.223726], [-0.33952, 2.282436, -1.675999], [0.223726, -1.675999, 4.314632]]\n", " n1 = Util.nmd(X, mu1, Sigma1)\n", "\n", " mu2_1 = 1.0\n", " mu2_2 = -0.5\n", " mu2_3 = 0.5\n", " mu2 = [mu2_1, mu2_2, mu2_3]\n", " Sigma2 = [[0.621908, 0.102606, 0.0], [0.102606, 0.058092, 0.0], [0.0, 0.0, 10.89]]\n", " n2 = Util.nmd(X, mu2, Sigma2)\n", "\n", " w1 = 0.5\n", " w2 = 0.5\n", "\n", " return (\n", " w1*n1\n", " + w2*n2\n", " )\n" ] } ], "source": [ "function, pymog = MoG.get_function()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can get the **C-MoG** version:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "execution": { "iopub.execute_input": "2026-02-15T18:15:25.428207Z", "iopub.status.busy": "2026-02-15T18:15:25.428141Z", "iopub.status.idle": "2026-02-15T18:15:25.435475Z", "shell.execute_reply": "2026-02-15T18:15:25.435127Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "import numpy as np\n", "from multimin import Util, cmog\n", "\n", "def mog(X):\n", " weights = np.array([0.5, 0.5], dtype=np.float64)\n", " mus = np.array([\n", " [1, 0.5, -0.5],\n", " [1, -0.5, 0.5],\n", " ], dtype=np.float64)\n", " Sigmas = np.array([\n", " [\n", " [1.13293, -0.33952, 0.223726],\n", " [-0.33952, 2.28244, -1.676],\n", " [0.223726, -1.676, 4.31463],\n", " ],\n", " [\n", " [0.621908, 0.102606, 0],\n", " [0.102606, 0.0580922, 0],\n", " [0, 0, 10.89],\n", " ],\n", " ], dtype=np.float64)\n", " return cmog.mog_c(X, weights, mus, Sigmas)\n" ] } ], "source": [ "function, cmog = MoG.get_function(cmog=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And test it:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "execution": { "iopub.execute_input": "2026-02-15T18:15:25.436624Z", "iopub.status.busy": "2026-02-15T18:15:25.436561Z", "iopub.status.idle": "2026-02-15T18:15:25.465579Z", "shell.execute_reply": "2026-02-15T18:15:25.465229Z" } }, "outputs": [ { "data": { "text/plain": [ "(array([0.00478285]), np.float64(0.004782856043975975))" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X = np.array([[0.0, 0.3, 1.2]])\n", "cmog(X), pymog(X)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For a single point let's see the performance of both approaches:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "execution": { "iopub.execute_input": "2026-02-15T18:15:25.466805Z", "iopub.status.busy": "2026-02-15T18:15:25.466729Z", "iopub.status.idle": "2026-02-15T18:15:43.479305Z", "shell.execute_reply": "2026-02-15T18:15:43.476486Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "69.4 \u03bcs \u00b1 6.67 \u03bcs per loop (mean \u00b1 std. dev. of 7 runs, 10,000 loops each)\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "13 \u03bcs \u00b1 392 ns per loop (mean \u00b1 std. dev. of 7 runs, 100,000 loops each)\n" ] } ], "source": [ "X = [0.2, 0.3, 0.5]\n", "%timeit pymog(X)\n", "%timeit cmog(X)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And for many points:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "execution": { "iopub.execute_input": "2026-02-15T18:15:43.514401Z", "iopub.status.busy": "2026-02-15T18:15:43.513817Z", "iopub.status.idle": "2026-02-15T18:15:50.408129Z", "shell.execute_reply": "2026-02-15T18:15:50.407524Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "454 \u03bcs \u00b1 20.2 \u03bcs per loop (mean \u00b1 std. dev. of 7 runs, 1,000 loops each)\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "339 \u03bcs \u00b1 47.5 \u03bcs per loop (mean \u00b1 std. dev. of 7 runs, 1,000 loops each)\n" ] } ], "source": [ "X = np.random.normal(size=(10000,3))\n", "%timeit pymog(X)\n", "%timeit cmog(X)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Fitting 3D data with a finite domain on one variable\n", "\n", "It also works for truncated MoGs." ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "execution": { "iopub.execute_input": "2026-02-15T18:15:50.418459Z", "iopub.status.busy": "2026-02-15T18:15:50.418302Z", "iopub.status.idle": "2026-02-15T18:15:50.468475Z", "shell.execute_reply": "2026-02-15T18:15:50.467991Z" } }, "outputs": [], "source": [ "weights = [0.5, 0.5]\n", "mus = [[0.0, 0.3, 0.0], [0.0, 0.7, 0.0]] # two bumps along the bounded variable (y)\n", "sigmas = [[0.6, 0.15, 0.6], [0.6, 0.15, 0.6]]\n", "Sigmas = [np.diag(s)**2 for s in sigmas]\n", "\n", "MoG_3d = mn.MixtureOfGaussians(\n", " mus=mus,\n", " weights=weights,\n", " Sigmas=Sigmas,\n", " domain=[None, [0, 1], None], # only variable 1 in [0, 1]\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Normally:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "execution": { "iopub.execute_input": "2026-02-15T18:15:50.472802Z", "iopub.status.busy": "2026-02-15T18:15:50.472679Z", "iopub.status.idle": "2026-02-15T18:15:50.583099Z", "shell.execute_reply": "2026-02-15T18:15:50.582619Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "import numpy as np\n", "from multimin import Util\n", "\n", "def mog(X):\n", "\n", " a = [-np.inf, 0.0, -np.inf]\n", " b = [np.inf, 1.0, np.inf]\n", "\n", " mu1_1 = 0.0\n", " mu1_2 = 0.3\n", " mu1_3 = 0.0\n", " mu1 = [mu1_1, mu1_2, mu1_3]\n", " Sigma1 = [[0.36, 0.0, 0.0], [0.0, 0.0225, 0.0], [0.0, 0.0, 0.36]]\n", " Z1 = 0.977248\n", " n1 = Util.tnmd(X, mu1, Sigma1, a, b, Z=Z1)\n", "\n", " mu2_1 = 0.0\n", " mu2_2 = 0.7\n", " mu2_3 = 0.0\n", " mu2 = [mu2_1, mu2_2, mu2_3]\n", " Sigma2 = [[0.36, 0.0, 0.0], [0.0, 0.0225, 0.0], [0.0, 0.0, 0.36]]\n", " Z2 = 0.977248\n", " n2 = Util.tnmd(X, mu2, Sigma2, a, b, Z=Z2)\n", "\n", " w1 = 0.5\n", " w2 = 0.5\n", "\n", " return (\n", " w1*n1\n", " + w2*n2\n", " )\n" ] } ], "source": [ "function, pymog = MoG_3d.get_function()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can get the **C-MoG** version:" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "execution": { "iopub.execute_input": "2026-02-15T18:15:50.588903Z", "iopub.status.busy": "2026-02-15T18:15:50.588658Z", "iopub.status.idle": "2026-02-15T18:15:50.610707Z", "shell.execute_reply": "2026-02-15T18:15:50.610293Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "import numpy as np\n", "from multimin import Util, cmog\n", "\n", "def mog(X):\n", " weights = np.array([0.5, 0.5], dtype=np.float64)\n", " mus = np.array([\n", " [0, 0.3, 0],\n", " [0, 0.7, 0],\n", " ], dtype=np.float64)\n", " Sigmas = np.array([\n", " [\n", " [0.36, 0, 0],\n", " [0, 0.0225, 0],\n", " [0, 0, 0.36],\n", " ],\n", " [\n", " [0.36, 0, 0],\n", " [0, 0.0225, 0],\n", " [0, 0, 0.36],\n", " ],\n", " ], dtype=np.float64)\n", " a = np.array([-np.inf, 0, -np.inf], dtype=np.float64)\n", " b = np.array([np.inf, 1, np.inf], dtype=np.float64)\n", " Zs = np.array([0.977248, 0.977248], dtype=np.float64)\n", " return cmog.tmog_c(X, weights, mus, Sigmas, a, b, Zs)" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "function, cmog = MoG_3d.get_function(cmog=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And test it:" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "execution": { "iopub.execute_input": "2026-02-15T18:15:50.611919Z", "iopub.status.busy": "2026-02-15T18:15:50.611755Z", "iopub.status.idle": "2026-02-15T18:15:50.628737Z", "shell.execute_reply": "2026-02-15T18:15:50.628356Z" } }, "outputs": [ { "data": { "text/plain": [ "array([0.08374225])" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X = np.array([[0.0, 0.3, 1.2]])\n", "cmog(X)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For a single point let's see the performance of both approaches:" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "execution": { "iopub.execute_input": "2026-02-15T18:15:50.634870Z", "iopub.status.busy": "2026-02-15T18:15:50.634779Z", "iopub.status.idle": "2026-02-15T18:15:58.857248Z", "shell.execute_reply": "2026-02-15T18:15:58.856742Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "80.8 \u03bcs \u00b1 6.14 \u03bcs per loop (mean \u00b1 std. dev. of 7 runs, 10,000 loops each)\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "17.4 \u03bcs \u00b1 181 ns per loop (mean \u00b1 std. dev. of 7 runs, 10,000 loops each)\n" ] } ], "source": [ "X = [0.2, 0.3, 0.5]\n", "%timeit pymog(X)\n", "%timeit cmog(X)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And for many points:" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "execution": { "iopub.execute_input": "2026-02-15T18:15:58.864513Z", "iopub.status.busy": "2026-02-15T18:15:58.864396Z", "iopub.status.idle": "2026-02-15T18:16:08.442142Z", "shell.execute_reply": "2026-02-15T18:16:08.439176Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "948 \u03bcs \u00b1 58.4 \u03bcs per loop (mean \u00b1 std. dev. of 7 runs, 1,000 loops each)\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "224 \u03bcs \u00b1 22.9 \u03bcs per loop (mean \u00b1 std. dev. of 7 runs, 1,000 loops each)\n" ] } ], "source": [ "X = np.random.normal(size=(10000,3))\n", "%timeit pymog(X)\n", "%timeit cmog(X)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "**MultiMin** \u2013 Multivariate Gaussian fitting with finite domains \n", "\u00a9 2026 Jorge I. Zuluaga" ] } ], "metadata": { "kernelspec": { "display_name": ".multimin", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.5" } }, "nbformat": 4, "nbformat_minor": 4 }