Open In Colab

ec9b18d74380482e99fa860e42403c74

High-performance MoG calculations

The idea of fitting a MoG to a dataset is to have a semi-analytical description of the probability distribution of the dataset, even though the original distribution is unknown. Although Multimin allows generating nearly pure functions for the MoGs fitted through the get_function method, these routines are written in Python, which is a less efficient language than, for example, C. Multimin includes an optional tool that allows creating optimized routines with C code. To run these routines, it is necessary to download and install the GNU Scientific Library.

Installation and importing

If you’re running this in Google Colab or need to install the package, uncomment and run the following cell:

[1]:
try:
    from google.colab import drive
    %pip install -Uq multimin
except ImportError:
    print("Not running in Colab, skipping installation")
    %load_ext autoreload
    %autoreload 2
!mkdir -p gallery/

# Uncomment to install from GitHub (development version)
# !pip install git+https://github.com/seap-udea/MultiMin.git
Not running in Colab, skipping installation
[2]:
import multimin as mn
import numpy as np
import matplotlib.pyplot as plt

import warnings
warnings.filterwarnings("ignore")

deg = np.pi / 180
figprefix = "cmog"
Welcome to MultiMin v0.11.2. ¡Al infinito y más allá!

Verifying that GSL is installed

To verify that the library is installed, you can use the imultimin command from the command line:

[3]:
!imultimin --verify
Verifying MultiMin installation...
Welcome to MultiMin v0.11.2. ¡Al infinito y más allá!
✅ MultiMin package found: /Users/jzuluaga/dev/multimin/src/multimin/__init__.py
✅ Shared library found: /Users/jzuluaga/dev/multimin/src/multimin/lib/multimin.dylib

If it is not installed you can try to install it:

!imultimin --install-gsl

And follow the instructions.

3D distribution with one variable on a finite domain

Let’s try it in an infinite domain:

[4]:
weights = [0.5,0.5]
mus = [[1.0, 0.5, -0.5], [1.0, -0.5, +0.5],]
sigmas = [[1, 1.2, 2.3], [0.8, 0.2, 3.3]]
angles = [
    [10*deg, 30*deg, 20*deg],
    [-20*deg, 0*deg, 30*deg],
]
Sigmas = mn.Stats.calc_covariance_from_rotation(sigmas, angles)

MoG = mn.MixtureOfGaussians(mus=mus, weights=weights, Sigmas=Sigmas)

sample = MoG.rvs(5000)
MixtureOfGaussians.rvs executed in 0.2905158996582031 seconds

Normally:

[5]:
function, pymog = MoG.get_function()
from multimin import Util

def mog(X):

    mu1_1 = 1.0
    mu1_2 = 0.5
    mu1_3 = -0.5
    mu1 = [mu1_1, mu1_2, mu1_3]
    Sigma1 = [[1.132931, -0.33952, 0.223726], [-0.33952, 2.282436, -1.675999], [0.223726, -1.675999, 4.314632]]
    n1 = Util.nmd(X, mu1, Sigma1)

    mu2_1 = 1.0
    mu2_2 = -0.5
    mu2_3 = 0.5
    mu2 = [mu2_1, mu2_2, mu2_3]
    Sigma2 = [[0.621908, 0.102606, 0.0], [0.102606, 0.058092, 0.0], [0.0, 0.0, 10.89]]
    n2 = Util.nmd(X, mu2, Sigma2)

    w1 = 0.5
    w2 = 0.5

    return (
        w1*n1
        + w2*n2
    )

Now we can get the C-MoG version:

[6]:
function, cmog = MoG.get_function(cmog=True)
import numpy as np
from multimin import Util, cmog

def mog(X):
    weights = np.array([0.5, 0.5], dtype=np.float64)
    mus = np.array([
        [1, 0.5, -0.5],
        [1, -0.5, 0.5],
    ], dtype=np.float64)
    Sigmas = np.array([
        [
            [1.13293, -0.33952, 0.223726],
            [-0.33952, 2.28244, -1.676],
            [0.223726, -1.676, 4.31463],
        ],
        [
            [0.621908, 0.102606, 0],
            [0.102606, 0.0580922, 0],
            [0, 0, 10.89],
        ],
    ], dtype=np.float64)
    return cmog.mog_c(X, weights, mus, Sigmas)

And test it:

[7]:
X = np.array([[0.0, 0.3, 1.2]])
cmog(X), pymog(X)
[7]:
(array([0.00478285]), np.float64(0.004782856043975975))

For a single point let’s see the performance of both approaches:

[8]:
X = [0.2, 0.3, 0.5]
%timeit pymog(X)
%timeit cmog(X)
69.4 μs ± 6.67 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
13 μs ± 392 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

And for many points:

[9]:
X = np.random.normal(size=(10000,3))
%timeit pymog(X)
%timeit cmog(X)
454 μs ± 20.2 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
339 μs ± 47.5 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

Fitting 3D data with a finite domain on one variable

It also works for truncated MoGs.

[10]:
weights = [0.5, 0.5]
mus = [[0.0, 0.3, 0.0], [0.0, 0.7, 0.0]]  # two bumps along the bounded variable (y)
sigmas = [[0.6, 0.15, 0.6], [0.6, 0.15, 0.6]]
Sigmas = [np.diag(s)**2 for s in sigmas]

MoG_3d = mn.MixtureOfGaussians(
    mus=mus,
    weights=weights,
    Sigmas=Sigmas,
    domain=[None, [0, 1], None],  # only variable 1 in [0, 1]
)

Normally:

[11]:
function, pymog = MoG_3d.get_function()
import numpy as np
from multimin import Util

def mog(X):

    a = [-np.inf, 0.0, -np.inf]
    b = [np.inf, 1.0, np.inf]

    mu1_1 = 0.0
    mu1_2 = 0.3
    mu1_3 = 0.0
    mu1 = [mu1_1, mu1_2, mu1_3]
    Sigma1 = [[0.36, 0.0, 0.0], [0.0, 0.0225, 0.0], [0.0, 0.0, 0.36]]
    Z1 = 0.977248
    n1 = Util.tnmd(X, mu1, Sigma1, a, b, Z=Z1)

    mu2_1 = 0.0
    mu2_2 = 0.7
    mu2_3 = 0.0
    mu2 = [mu2_1, mu2_2, mu2_3]
    Sigma2 = [[0.36, 0.0, 0.0], [0.0, 0.0225, 0.0], [0.0, 0.0, 0.36]]
    Z2 = 0.977248
    n2 = Util.tnmd(X, mu2, Sigma2, a, b, Z=Z2)

    w1 = 0.5
    w2 = 0.5

    return (
        w1*n1
        + w2*n2
    )

Now we can get the C-MoG version:

[12]:
function, cmog = MoG_3d.get_function(cmog=True)
import numpy as np
from multimin import Util, cmog

def mog(X):
    weights = np.array([0.5, 0.5], dtype=np.float64)
    mus = np.array([
        [0, 0.3, 0],
        [0, 0.7, 0],
    ], dtype=np.float64)
    Sigmas = np.array([
        [
            [0.36, 0, 0],
            [0, 0.0225, 0],
            [0, 0, 0.36],
        ],
        [
            [0.36, 0, 0],
            [0, 0.0225, 0],
            [0, 0, 0.36],
        ],
    ], dtype=np.float64)
    a = np.array([-np.inf, 0, -np.inf], dtype=np.float64)
    b = np.array([np.inf, 1, np.inf], dtype=np.float64)
    Zs = np.array([0.977248, 0.977248], dtype=np.float64)
    return cmog.tmog_c(X, weights, mus, Sigmas, a, b, Zs)

And test it:

[13]:
X = np.array([[0.0, 0.3, 1.2]])
cmog(X)
[13]:
array([0.08374225])

For a single point let’s see the performance of both approaches:

[14]:
X = [0.2, 0.3, 0.5]
%timeit pymog(X)
%timeit cmog(X)
80.8 μs ± 6.14 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
17.4 μs ± 181 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

And for many points:

[15]:
X = np.random.normal(size=(10000,3))
%timeit pymog(X)
%timeit cmog(X)
948 μs ± 58.4 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
224 μs ± 22.9 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

MultiMin – Multivariate Gaussian fitting with finite domains
© 2026 Jorge I. Zuluaga