pysindy.optimizers.EvidenceGreedy

class pysindy.optimizers.EvidenceGreedy(alpha: float = 1.0, _sigma2: float = np.float64(4.930380657631324e-32), max_iter: int | None = None, normalize_columns: bool = True, copy_X: bool = True, initial_guess: ndarray | None = None, unbias: bool = False, verbose: bool = False)[source]

Sparse regression by maximizing the Bayesian evidence through greedy elimination of features.

This optimizer performs backward model selection (i.e., feature elimination) driven by the Bayesian log evidence for a linear Gaussian model with an isotropic Gaussian prior on the coefficients. For each target dimension y_{tgt}, we assume

\[\begin{split}w &\sim \mathcal{N}\!\left(0,\ \alpha^{-1} I\right), \\ y_{tgt} \mid w &\sim \mathcal{N}\!\left(\Theta w,\ \sigma^2 I\right),\end{split}\]

where alpha is the prior precision on the coefficients (sigma_p^{-2}) and _sigma2 is the observation noise variance (sigma^2).
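For reference, the model above admits a closed-form log evidence. The expression below is the standard textbook result for Bayesian linear regression, written for a candidate support \(S\) with \(k\) active terms and \(N\) samples; the symbols \(S\), \(k\), \(N\), \(A_S\) and \(m_S\) are introduced here for illustration, and the implementation may arrange or scale the terms differently.

\[\begin{split}\log p(y_{tgt} \mid \alpha, \sigma^2, S) &= \frac{k}{2}\log\alpha - \frac{N}{2}\log\!\left(2\pi\sigma^2\right) - \frac{1}{2}\log\det A_S - \frac{1}{2\sigma^2}\left(y_{tgt}^\top y_{tgt} - y_{tgt}^\top \Theta_S\, m_S\right), \\ A_S &= \alpha I + \sigma^{-2}\,\Theta_S^\top \Theta_S, \qquad m_S = \sigma^{-2} A_S^{-1}\,\Theta_S^\top y_{tgt},\end{split}\]

where \(\Theta_S\) keeps only the columns of \(\Theta\) in \(S\), and \(m_S\) is the posterior mean of the active coefficients.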

The algorithm:

  1. Start from the full support (all library terms active).

  2. At each step, temporarily remove each active term in turn.

  3. For each candidate support, compute the Bayesian log evidence \(\log p(y_{tgt} \mid \alpha, \sigma^2, \mathrm{support})\) using the precomputed statistics \(G=\Theta^\top\Theta\) and \(b_{tgt}=\Theta^\top y_{tgt}\).

  4. Accept the removal that yields the largest increase in evidence.

  5. Stop when no single removal increases the evidence.
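The steps above can be sketched in plain NumPy for a single target. This is a minimal illustration and not the PySINDy implementation: the function names `log_evidence` and `greedy_backward`, the synthetic data, and the choice `sigma2=1e-2` are all assumptions made for the example, and normalization and unbiasing are left out.

```python
import numpy as np


def log_evidence(G, b, yty, n_samples, support, alpha, sigma2):
    """Log evidence of a linear Gaussian model restricted to `support`.

    G = Theta.T @ Theta, b = Theta.T @ y and yty = y @ y are precomputed once.
    """
    idx = np.flatnonzero(support)
    k = idx.size
    beta = 1.0 / sigma2
    # Posterior precision and posterior mean on the active coefficients.
    A = alpha * np.eye(k) + beta * G[np.ix_(idx, idx)]
    m = beta * np.linalg.solve(A, b[idx])
    _, logdet_A = np.linalg.slogdet(A)
    return (
        0.5 * k * np.log(alpha)
        - 0.5 * n_samples * np.log(2.0 * np.pi * sigma2)
        - 0.5 * logdet_A
        - 0.5 * beta * (yty - b[idx] @ m)
    )


def greedy_backward(theta, y, alpha=1.0, sigma2=1e-2, max_iter=None):
    """Backward elimination: drop one term at a time while evidence increases."""
    n_samples, n_features = theta.shape
    G, b, yty = theta.T @ theta, theta.T @ y, float(y @ y)
    support = np.ones(n_features, dtype=bool)  # step 1: full support
    best = log_evidence(G, b, yty, n_samples, support, alpha, sigma2)
    n_steps = n_features - 1 if max_iter is None else min(max_iter, n_features - 1)
    for _ in range(n_steps):
        # Steps 2-3: score every support obtained by removing one active term.
        candidates = []
        for j in np.flatnonzero(support):
            trial = support.copy()
            trial[j] = False
            score = log_evidence(G, b, yty, n_samples, trial, alpha, sigma2)
            candidates.append((score, j))
        score, j = max(candidates)
        if score <= best:  # step 5: no removal increases the evidence
            break
        support[j] = False  # step 4: accept the best removal
        best = score
    return support, best


# Synthetic data: only columns 0 and 2 enter the target.
rng = np.random.default_rng(0)
theta = rng.normal(size=(200, 5))
y = 2.0 * theta[:, 0] - 3.0 * theta[:, 2] + 0.1 * rng.normal(size=200)
support, best = greedy_backward(theta, y, alpha=1.0, sigma2=1e-2)
print(support, best)
```

Because every candidate is scored from `G`, `b` and `yty` alone, the cost per candidate depends on the support size and not on the number of samples.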

Parameters:
  • alpha (float, default=1.0) – Prior precision on the coefficients (sigma_p^{-2}). Must be positive. The prior is defined in the feature space actually used by the optimizer. In particular, when normalize_columns=True, alpha controls an isotropic Gaussian prior on the coefficients in the normalized library. Changing normalize_columns without retuning alpha will generally change the effective strength of the regularization.

  • _sigma2 (float, default=machine epsilon squared, approximately 4.93e-32) – Observation noise variance (sigma^2). Must be positive.

  • max_iter (int or None) – Maximum number of elimination steps. If None, at most n_features - 1 removals are allowed.

  • normalize_columns (bool, default=True) –

    Passed to BaseOptimizer. If True, BOTH the columns of the library matrix and the target variables are normalized before regression. The Bayesian prior and ridge penalty are then applied in this normalized space. The learned coefficients are mapped back to the original scale when stored in coef_.

    Note that when normalize_columns=True, alpha is typically of order 1.0.

  • copy_X (bool, default=True) – Passed to BaseOptimizer. If True, input data are copied.

  • initial_guess (array-like of shape (n_targets, n_features) or None, default=None) – Currently ignored by the greedy algorithm; present for API compatibility with BaseOptimizer.

  • unbias (bool, default=False) – Whether to perform an additional unregularized refit after support selection. For a Bayesian evidence interpretation the regularized posterior mean is natural, so the default is False.

  • verbose (bool, default=False) – If True, prints a short trace of evidence values during backward elimination for each target dimension.
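The interaction between alpha and normalize_columns can be checked numerically. The sketch below assumes that normalization divides each library column by its Euclidean norm, which the text above does not confirm, and it ignores the normalization of the targets; `posterior_mean` is a hypothetical helper, not a PySINDy function. Under that assumption, a prior of precision alpha in the normalized space corresponds to a per-coefficient precision of `alpha * scale**2` in the original space.

```python
import numpy as np


def posterior_mean(theta, y, alpha, sigma2):
    """Posterior mean of w for the prior N(0, I / alpha) and noise variance sigma2."""
    k = theta.shape[1]
    A = alpha * np.eye(k) + theta.T @ theta / sigma2
    return np.linalg.solve(A, theta.T @ y / sigma2)


rng = np.random.default_rng(0)
# Library columns with very different magnitudes.
theta = rng.normal(size=(100, 3)) * np.array([1.0, 10.0, 100.0])
y = theta @ np.array([1.0, 0.1, 0.01]) + 0.1 * rng.normal(size=100)
alpha, sigma2 = 1.0, 1e-2

# Assumed normalization: Euclidean norm of each column.
scale = np.linalg.norm(theta, axis=0)

# Same alpha, applied in the raw space and in the normalized space.
w_raw = posterior_mean(theta, y, alpha, sigma2)
w_norm = posterior_mean(theta / scale, y, alpha, sigma2) / scale  # mapped back

# Equivalent prior in the original space: precision alpha * scale_j**2 per term.
A_equiv = np.diag(alpha * scale**2) + theta.T @ theta / sigma2
w_equiv = np.linalg.solve(A_equiv, theta.T @ y / sigma2)

print(np.allclose(w_norm, w_equiv))
print(np.abs(w_raw - w_norm).max())
```

The two fits with the same alpha generally differ, which is why alpha should be retuned when normalize_columns is toggled.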

Attributes:
  • coef_ (ndarray of shape (n_targets, n_features)) – Final coefficient matrix Xi. Row i contains the coefficients for the i-th target variable, with zeros outside the selected support.

  • ind_ (ndarray of bool of shape (n_targets, n_features)) – Boolean support mask corresponding to coef_. ind_[i, j] is True if the j-th library function is active in the equation for the i-th target.

  • history_ (list of ndarray) – Minimal coefficient history kept for compatibility with other optimizers. By convention history_[-1] is the final coefficient matrix coef_.

  • evidence_history_ (list of list of dict) – Per-target evidence traces. evidence_history_[i] is a list of dictionaries recording the support size and log evidence at each backward-elimination step for the i-th target, e.g.:

    {"step": k,
     "removed": tgt,
     "support_size": (number of active features after removal),
     "log_evidence": value}
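
A trace in this format can be condensed into one row per target. The helper below is a hypothetical convenience function, not part of PySINDy, and the example trace contains made-up values that only follow the dictionary layout documented above.

```python
def summarize_evidence_history(evidence_history):
    """Return (target, n_steps, final_support_size, final_log_evidence) per target."""
    rows = []
    for i, trace in enumerate(evidence_history):
        if not trace:
            # No step was recorded for this target.
            rows.append((i, 0, None, None))
            continue
        last = trace[-1]
        rows.append((i, len(trace), last["support_size"], last["log_evidence"]))
    return rows


# Hypothetical trace for a single target, for illustration only.
example_history = [
    [
        {"step": 0, "removed": 4, "support_size": 4, "log_evidence": -12.5},
        {"step": 1, "removed": 1, "support_size": 3, "log_evidence": -10.0},
    ]
]
summary = summarize_evidence_history(example_history)
print(summary)
```

With a fitted optimizer `opt`, the same helper would be called as `summarize_evidence_history(opt.evidence_history_)`.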
    

Examples

>>> import numpy as np
>>> from scipy.integrate import odeint
>>> from pysindy import SINDy
>>> from pysindy.optimizers import EvidenceGreedy
>>>
>>> # Lorenz system
>>> lorenz = lambda z, t: [
...     10 * (z[1] - z[0]),
...     z[0] * (28 - z[2]) - z[1],
...     z[0] * z[1] - 8 / 3 * z[2],
... ]
>>> t = np.arange(0, 10, 0.01)
>>> x = odeint(lorenz, [-8, 8, 27], t)
>>>
>>> # Add noise to the measurements
>>> sigma_x = 1e-2
>>> x = x + sigma_x * np.random.normal(size=x.shape)
>>>
>>> opt = EvidenceGreedy(alpha=1e-6, max_iter=20, normalize_columns=False)
>>> model = SINDy(optimizer=opt)
>>> model.fit(x, t=t[1] - t[0])
>>> model.print()

Example output:

(x0)' = -9.979 x0 + 9.980 x1
(x1)' = 27.807 x0 - 0.963 x1 - 0.995 x0 x2
(x2)' = -2.658 x2 + 0.997 x0 x1

Methods

set_fit_request

Configure whether metadata should be requested to be passed to the fit method.

set_score_request

Configure whether metadata should be requested to be passed to the score method.

Attributes

max_iter

normalize_columns

initial_guess

copy_X

unbias

coef_

intercept_

set_fit_request(*, sample_weight: bool | None | str = '$UNCHANGED$', x_: bool | None | str = '$UNCHANGED$') EvidenceGreedy

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

  • x_ (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x_ parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') EvidenceGreedy

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object