pysindy package

Subpackages

Submodules

pysindy.pysindy module

class pysindy.pysindy.SINDy(optimizer=None, feature_library=None, differentiation_method=None, feature_names=None, t_default=1, discrete_time=False)[source]

Bases: BaseEstimator

Sparse Identification of Nonlinear Dynamical Systems (SINDy). Uses sparse regression to learn a dynamical systems model from measurement data.

Parameters:
  • optimizer (optimizer object, optional) – Optimization method used to fit the SINDy model. This must be a class extending pysindy.optimizers.BaseOptimizer. The default is STLSQ.

  • feature_library (feature library object, optional) – Feature library object used to specify candidate right-hand side features. This must be a class extending pysindy.feature_library.base.BaseFeatureLibrary. The default option is PolynomialLibrary.

  • differentiation_method (differentiation object, optional) – Method for differentiating the data. This must be a class extending pysindy.differentiation_methods.base.BaseDifferentiation class. The default option is centered difference.

  • feature_names (list of string, length n_input_features, optional) – Names for the input features (e.g. ['x', 'y', 'z']). If None, will use ['x0', 'x1', ...].

  • t_default (float, optional (default 1)) – Default value for the time step.

  • discrete_time (boolean, optional (default False)) – If True, dynamical system is treated as a map. Rather than predicting derivatives, the right hand side functions step the system forward by one time step. If False, dynamical system is assumed to be a flow (right-hand side functions predict continuous time derivatives).

Attributes:
  • model (sklearn.multioutput.MultiOutputRegressor object) – The fitted SINDy model.

  • n_input_features_ (int) – The total number of input features.

  • n_output_features_ (int) – The total number of output features. This number is a function of self.n_input_features and the feature library being used.

  • n_control_features_ (int) – The total number of control input features.

Examples

>>> import numpy as np
>>> from scipy.integrate import solve_ivp
>>> from pysindy import SINDy
>>> lorenz = lambda z,t : [10*(z[1] - z[0]),
>>>                        z[0]*(28 - z[2]) - z[1],
>>>                        z[0]*z[1] - 8/3*z[2]]
>>> t = np.arange(0,2,.002)
>>> x = solve_ivp(lorenz, [-8,8,27], t)
>>> model = SINDy()
>>> model.fit(x, t=t[1]-t[0])
>>> model.print()
x0' = -10.000 1 + 10.000 x0
x1' = 27.993 1 + -0.999 x0 + -1.000 1 x1
x2' = -2.666 x1 + 1.000 1 x0
>>> model.coefficients()
array([[ 0.        ,  0.        ,  0.        ],
       [-9.99969193, 27.99344519,  0.        ],
       [ 9.99961547, -0.99905338,  0.        ],
       [ 0.        ,  0.        , -2.66645651],
       [ 0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.99990257],
       [ 0.        , -0.99980268,  0.        ],
       [ 0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.        ]])
>>> model.score(x, t=t[1]-t[0])
0.999999985520653
>>> import numpy as np
>>> from scipy.integrate import solve_ivp
>>> from pysindy import SINDy
>>> u = lambda t : np.sin(2 * t)
>>> lorenz_c = lambda z,t : [
            10 * (z[1] - z[0]) + u(t) ** 2,
            z[0] * (28 - z[2]) - z[1],
            z[0] * z[1] - 8 / 3 * z[2],
    ]
>>> t = np.arange(0,2,0.002)
>>> x = solve_ivp(lorenz_c, [-8,8,27], t)
>>> u_eval = u(t)
>>> model = SINDy()
>>> model.fit(x, u_eval, t=t[1]-t[0])
>>> model.print()
x0' = -10.000 x0 + 10.000 x1 + 1.001 u0^2
x1' = 27.994 x0 + -0.999 x1 + -1.000 x0 x2
x2' = -2.666 x2 + 1.000 x0 x1
>>> model.coefficients()
array([[ 0.        , -9.99969851,  9.99958359,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  1.00120331],
       [ 0.        , 27.9935177 , -0.99906375,  0.        ,  0.        ,
         0.        ,  0.        , -0.99980455,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.        , -2.666437  ,  0.        ,
         0.        ,  0.99990137,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ]])
>>> model.score(x, u_eval, t=t[1]-t[0])
0.9999999855414495
fit(x, t=None, x_dot=None, u=None)[source]

Fit a SINDy model.

Parameters:
  • x (array-like or list of array-like, shape (n_samples, n_input_features)) – Training data. If training data contains multiple trajectories, x should be a list containing data for each trajectory. Individual trajectories may contain different numbers of samples.

  • t (float, numpy array of shape (n_samples,), or list of numpy arrays, optional (default None)) – If t is a float, it specifies the timestep between each sample. If array-like, it specifies the time at which each sample was collected. In this case the values in t must be strictly increasing. In the case of multi-trajectory training data, t may also be a list of arrays containing the collection times for each individual trajectory. If None, the default time step t_default will be used.

  • x_dot (array-like or list of array-like, shape (n_samples, n_input_features), optional (default None)) – Optional pre-computed derivatives of the training data. If not provided, the time derivatives of the training data will be computed using the specified differentiation method. If x_dot is provided, it must match the shape of the training data and these values will be used as the time derivatives.

  • u (array-like or list of array-like, shape (n_samples, n_control_features), optional (default None)) – Control variables/inputs. Include this variable to use sparse identification for nonlinear dynamical systems for control (SINDYc). If training data contains multiple trajectories (i.e. if x is a list of array-like), then u should be a list containing control variable data for each trajectory. Individual trajectories may contain different numbers of samples.

Returns:

self

Return type:

a fitted SINDy instance

predict(x, u=None)[source]

Predict the time derivatives using the SINDy model.

Parameters:
  • x (array-like or list of array-like, shape (n_samples, n_input_features)) – Samples.

  • u (array-like or list of array-like, shape(n_samples, n_control_features), (default None)) – Control variables. If multiple_trajectories==True then u must be a list of control variable data from each trajectory. If the model was fit with control variables then u is not optional.

Returns:

x_dot – Predicted time derivatives

Return type:

array-like or list of array-like, shape (n_samples, n_input_features)

equations(precision=3)[source]

Get the right hand sides of the SINDy model equations.

Parameters:

precision (int, optional (default 3)) – Number of decimal points to include for each coefficient in the equation.

Returns:

equations – List of strings representing the SINDy model equations for each input feature.

Return type:

list of strings

print(lhs=None, precision=3)[source]

Print the SINDy model equations.

Parameters:
  • lhs (list of strings, optional (default None)) – List of variables to print on the left-hand sides of the learned equations. By default self.input_features are used.

  • precision (int, optional (default 3)) – Precision to be used when printing out model coefficients.

score(x, t=None, x_dot=None, u=None, metric=<function r2_score>, **metric_kws)[source]

Returns a score for the time derivative prediction produced by the model.

Parameters:
  • x (array-like or list of array-like, shape (n_samples, n_input_features)) – Samples from which to make predictions.

  • t (float, numpy array of shape (n_samples,), or list of numpy arrays, optional (default None)) – Time step between samples or array of collection times. Optional, used to compute the time derivatives of the samples if x_dot is not provided. If None, the default time step t_default will be used.

  • x_dot (array-like or list of array-like, shape (n_samples, n_input_features), optional (default None)) – Optional pre-computed derivatives of the samples. If provided, these values will be used to compute the score. If not provided, the time derivatives of the training data will be computed using the specified differentiation method.

  • u (array-like or list of array-like, shape(n_samples, n_control_features), optional (default None)) – Control variables. If multiple_trajectories==True then u must be a list of control variable data from each trajectory. If the model was fit with control variables then u is not optional.

  • metric (callable, optional) – Metric function with which to score the prediction. Default is the R^2 coefficient of determination. See Scikit-learn for more options.

  • metric_kws (dict, optional) – Optional keyword arguments to pass to the metric function.

Returns:

score – Metric function value for the model prediction of x_dot.

Return type:

float

differentiate(x, t=None)[source]

Apply the model’s differentiation method (self.differentiation_method) to data.

Parameters:
  • x (array-like or list of array-like, shape (n_samples, n_input_features)) – Data to be differentiated.

  • t (int, numpy array of shape (n_samples,), or list of numpy arrays, optional (default None)) – Time step between samples or array of collection times. If None, the default time step t_default will be used.

Returns:

x_dot – Time derivatives computed by using the model’s differentiation method

Return type:

array-like or list of array-like, shape (n_samples, n_input_features)

coefficients()[source]

Get an array of the coefficients learned by SINDy model.

Returns:

coef – Learned coefficients of the SINDy model. Equivalent to \(\Xi^\top\) in the literature.

Return type:

np.ndarray, shape (n_input_features, n_output_features)

get_feature_names()[source]

Get a list of names of features used by SINDy model.

Returns:

feats – A list of strings giving the names of the features in the feature library, self.feature_library.

Return type:

list

simulate(x0, t, u=None, integrator='solve_ivp', stop_condition=None, interpolator=None, integrator_kws={'atol': 1e-12, 'method': 'LSODA', 'rtol': 1e-12}, interpolator_kws={})[source]

Simulate the SINDy model forward in time.

Parameters:
  • x0 (numpy array, size [n_features]) – Initial condition from which to simulate.

  • t (int or numpy array of size [n_samples]) – If the model is in continuous time, t must be an array of time points at which to simulate. If the model is in discrete time, t must be an integer indicating how many steps to predict.

  • u (function from R^1 to R^{n_control_features} or list/array, optional (default None)) – Control inputs. If the model is continuous time, i.e. self.discrete_time == False, this function should take in a time and output the values of each of the n_control_features control features as a list or numpy array. Alternatively, if the model is continuous time, u can also be an array of control inputs at each time step. In this case the array is fit with the interpolator specified by interpolator. If the model is discrete time, i.e. self.discrete_time == True, u should be a list (with len(u) == t) or array (with u.shape[0] == 1) giving the control inputs at each step.

  • integrator (string, optional (default solve_ivp)) – Function to use to integrate the system. Default is scipy.integrate.solve_ivp. The only options currently supported are solve_ivp and odeint.

  • stop_condition (function object, optional) – If model is in discrete time, optional function that gives a stopping condition for stepping the simulation forward.

  • interpolator (callable, optional (default interp1d)) – Function used to interpolate control inputs if u is an array. Default is scipy.interpolate.interp1d.

  • integrator_kws (dict, optional (default {})) – Optional keyword arguments to pass to the integrator

  • interpolator_kws (dict, optional (default {})) – Optional keyword arguments to pass to the control input interpolator

Returns:

x – Simulation results

Return type:

numpy array, shape (n_samples, n_features)

property complexity

Complexity of the model measured as the number of nonzero parameters.

set_fit_request(*, t: bool | None | str = '$UNCHANGED$', u: bool | None | str = '$UNCHANGED$', x: bool | None | str = '$UNCHANGED$', x_dot: bool | None | str = '$UNCHANGED$') SINDy

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
  • t (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for t parameter in fit.

  • u (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for u parameter in fit.

  • x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x parameter in fit.

  • x_dot (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x_dot parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_predict_request(*, u: bool | None | str = '$UNCHANGED$', x: bool | None | str = '$UNCHANGED$') SINDy

Request metadata passed to the predict method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to predict.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
  • u (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for u parameter in predict.

  • x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x parameter in predict.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, metric: bool | None | str = '$UNCHANGED$', t: bool | None | str = '$UNCHANGED$', u: bool | None | str = '$UNCHANGED$', x: bool | None | str = '$UNCHANGED$', x_dot: bool | None | str = '$UNCHANGED$') SINDy

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
  • metric (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for metric parameter in score.

  • t (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for t parameter in score.

  • u (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for u parameter in score.

  • x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x parameter in score.

  • x_dot (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x_dot parameter in score.

Returns:

self – The updated object.

Return type:

object

pysindy.version module

Module contents

class pysindy.SINDy(optimizer=None, feature_library=None, differentiation_method=None, feature_names=None, t_default=1, discrete_time=False)[source]

Bases: BaseEstimator

Sparse Identification of Nonlinear Dynamical Systems (SINDy). Uses sparse regression to learn a dynamical systems model from measurement data.

Parameters:
  • optimizer (optimizer object, optional) – Optimization method used to fit the SINDy model. This must be a class extending pysindy.optimizers.BaseOptimizer. The default is STLSQ.

  • feature_library (feature library object, optional) – Feature library object used to specify candidate right-hand side features. This must be a class extending pysindy.feature_library.base.BaseFeatureLibrary. The default option is PolynomialLibrary.

  • differentiation_method (differentiation object, optional) – Method for differentiating the data. This must be a class extending pysindy.differentiation_methods.base.BaseDifferentiation class. The default option is centered difference.

  • feature_names (list of string, length n_input_features, optional) – Names for the input features (e.g. ['x', 'y', 'z']). If None, will use ['x0', 'x1', ...].

  • t_default (float, optional (default 1)) – Default value for the time step.

  • discrete_time (boolean, optional (default False)) – If True, dynamical system is treated as a map. Rather than predicting derivatives, the right hand side functions step the system forward by one time step. If False, dynamical system is assumed to be a flow (right-hand side functions predict continuous time derivatives).

Attributes:
  • model (sklearn.multioutput.MultiOutputRegressor object) – The fitted SINDy model.

  • n_input_features_ (int) – The total number of input features.

  • n_output_features_ (int) – The total number of output features. This number is a function of self.n_input_features and the feature library being used.

  • n_control_features_ (int) – The total number of control input features.

Examples

>>> import numpy as np
>>> from scipy.integrate import solve_ivp
>>> from pysindy import SINDy
>>> lorenz = lambda z,t : [10*(z[1] - z[0]),
>>>                        z[0]*(28 - z[2]) - z[1],
>>>                        z[0]*z[1] - 8/3*z[2]]
>>> t = np.arange(0,2,.002)
>>> x = solve_ivp(lorenz, [-8,8,27], t)
>>> model = SINDy()
>>> model.fit(x, t=t[1]-t[0])
>>> model.print()
x0' = -10.000 1 + 10.000 x0
x1' = 27.993 1 + -0.999 x0 + -1.000 1 x1
x2' = -2.666 x1 + 1.000 1 x0
>>> model.coefficients()
array([[ 0.        ,  0.        ,  0.        ],
       [-9.99969193, 27.99344519,  0.        ],
       [ 9.99961547, -0.99905338,  0.        ],
       [ 0.        ,  0.        , -2.66645651],
       [ 0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.99990257],
       [ 0.        , -0.99980268,  0.        ],
       [ 0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.        ]])
>>> model.score(x, t=t[1]-t[0])
0.999999985520653
>>> import numpy as np
>>> from scipy.integrate import solve_ivp
>>> from pysindy import SINDy
>>> u = lambda t : np.sin(2 * t)
>>> lorenz_c = lambda z,t : [
            10 * (z[1] - z[0]) + u(t) ** 2,
            z[0] * (28 - z[2]) - z[1],
            z[0] * z[1] - 8 / 3 * z[2],
    ]
>>> t = np.arange(0,2,0.002)
>>> x = solve_ivp(lorenz_c, [-8,8,27], t)
>>> u_eval = u(t)
>>> model = SINDy()
>>> model.fit(x, u_eval, t=t[1]-t[0])
>>> model.print()
x0' = -10.000 x0 + 10.000 x1 + 1.001 u0^2
x1' = 27.994 x0 + -0.999 x1 + -1.000 x0 x2
x2' = -2.666 x2 + 1.000 x0 x1
>>> model.coefficients()
array([[ 0.        , -9.99969851,  9.99958359,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  1.00120331],
       [ 0.        , 27.9935177 , -0.99906375,  0.        ,  0.        ,
         0.        ,  0.        , -0.99980455,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.        , -2.666437  ,  0.        ,
         0.        ,  0.99990137,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ]])
>>> model.score(x, u_eval, t=t[1]-t[0])
0.9999999855414495
fit(x, t=None, x_dot=None, u=None)[source]

Fit a SINDy model.

Parameters:
  • x (array-like or list of array-like, shape (n_samples, n_input_features)) – Training data. If training data contains multiple trajectories, x should be a list containing data for each trajectory. Individual trajectories may contain different numbers of samples.

  • t (float, numpy array of shape (n_samples,), or list of numpy arrays, optional (default None)) – If t is a float, it specifies the timestep between each sample. If array-like, it specifies the time at which each sample was collected. In this case the values in t must be strictly increasing. In the case of multi-trajectory training data, t may also be a list of arrays containing the collection times for each individual trajectory. If None, the default time step t_default will be used.

  • x_dot (array-like or list of array-like, shape (n_samples, n_input_features), optional (default None)) – Optional pre-computed derivatives of the training data. If not provided, the time derivatives of the training data will be computed using the specified differentiation method. If x_dot is provided, it must match the shape of the training data and these values will be used as the time derivatives.

  • u (array-like or list of array-like, shape (n_samples, n_control_features), optional (default None)) – Control variables/inputs. Include this variable to use sparse identification for nonlinear dynamical systems for control (SINDYc). If training data contains multiple trajectories (i.e. if x is a list of array-like), then u should be a list containing control variable data for each trajectory. Individual trajectories may contain different numbers of samples.

Returns:

self

Return type:

a fitted SINDy instance

predict(x, u=None)[source]

Predict the time derivatives using the SINDy model.

Parameters:
  • x (array-like or list of array-like, shape (n_samples, n_input_features)) – Samples.

  • u (array-like or list of array-like, shape(n_samples, n_control_features), (default None)) – Control variables. If multiple_trajectories==True then u must be a list of control variable data from each trajectory. If the model was fit with control variables then u is not optional.

Returns:

x_dot – Predicted time derivatives

Return type:

array-like or list of array-like, shape (n_samples, n_input_features)

equations(precision=3)[source]

Get the right hand sides of the SINDy model equations.

Parameters:

precision (int, optional (default 3)) – Number of decimal points to include for each coefficient in the equation.

Returns:

equations – List of strings representing the SINDy model equations for each input feature.

Return type:

list of strings

print(lhs=None, precision=3)[source]

Print the SINDy model equations.

Parameters:
  • lhs (list of strings, optional (default None)) – List of variables to print on the left-hand sides of the learned equations. By default self.input_features are used.

  • precision (int, optional (default 3)) – Precision to be used when printing out model coefficients.

score(x, t=None, x_dot=None, u=None, metric=<function r2_score>, **metric_kws)[source]

Returns a score for the time derivative prediction produced by the model.

Parameters:
  • x (array-like or list of array-like, shape (n_samples, n_input_features)) – Samples from which to make predictions.

  • t (float, numpy array of shape (n_samples,), or list of numpy arrays, optional (default None)) – Time step between samples or array of collection times. Optional, used to compute the time derivatives of the samples if x_dot is not provided. If None, the default time step t_default will be used.

  • x_dot (array-like or list of array-like, shape (n_samples, n_input_features), optional (default None)) – Optional pre-computed derivatives of the samples. If provided, these values will be used to compute the score. If not provided, the time derivatives of the training data will be computed using the specified differentiation method.

  • u (array-like or list of array-like, shape(n_samples, n_control_features), optional (default None)) – Control variables. If multiple_trajectories==True then u must be a list of control variable data from each trajectory. If the model was fit with control variables then u is not optional.

  • metric (callable, optional) –

    Metric function with which to score the prediction. Default is the R^2 coefficient of determination. See Scikit-learn for more options.

  • metric_kws (dict, optional) – Optional keyword arguments to pass to the metric function.

Returns:

score – Metric function value for the model prediction of x_dot.

Return type:

float

differentiate(x, t=None)[source]

Apply the model’s differentiation method (self.differentiation_method) to data.

Parameters:
  • x (array-like or list of array-like, shape (n_samples, n_input_features)) – Data to be differentiated.

  • t (int, numpy array of shape (n_samples,), or list of numpy arrays, optional (default None)) – Time step between samples or array of collection times. If None, the default time step t_default will be used.

Returns:

x_dot – Time derivatives computed by using the model’s differentiation method

Return type:

array-like or list of array-like, shape (n_samples, n_input_features)

coefficients()[source]

Get an array of the coefficients learned by SINDy model.

Returns:

coef – Learned coefficients of the SINDy model. Equivalent to \(\Xi^\top\) in the literature.

Return type:

np.ndarray, shape (n_input_features, n_output_features)

get_feature_names()[source]

Get a list of names of features used by SINDy model.

Returns:

feats – A list of strings giving the names of the features in the feature library, self.feature_library.

Return type:

list

simulate(x0, t, u=None, integrator='solve_ivp', stop_condition=None, interpolator=None, integrator_kws={'atol': 1e-12, 'method': 'LSODA', 'rtol': 1e-12}, interpolator_kws={})[source]

Simulate the SINDy model forward in time.

Parameters:
  • x0 (numpy array, size [n_features]) – Initial condition from which to simulate.

  • t (int or numpy array of size [n_samples]) – If the model is in continuous time, t must be an array of time points at which to simulate. If the model is in discrete time, t must be an integer indicating how many steps to predict.

  • u (function from R^1 to R^{n_control_features} or list/array, optional (default None)) – Control inputs. If the model is continuous time, i.e. self.discrete_time == False, this function should take in a time and output the values of each of the n_control_features control features as a list or numpy array. Alternatively, if the model is continuous time, u can also be an array of control inputs at each time step. In this case the array is fit with the interpolator specified by interpolator. If the model is discrete time, i.e. self.discrete_time == True, u should be a list (with len(u) == t) or array (with u.shape[0] == 1) giving the control inputs at each step.

  • integrator (string, optional (default solve_ivp)) – Function to use to integrate the system. Default is scipy.integrate.solve_ivp. The only options currently supported are solve_ivp and odeint.

  • stop_condition (function object, optional) – If model is in discrete time, optional function that gives a stopping condition for stepping the simulation forward.

  • interpolator (callable, optional (default interp1d)) – Function used to interpolate control inputs if u is an array. Default is scipy.interpolate.interp1d.

  • integrator_kws (dict, optional (default {})) – Optional keyword arguments to pass to the integrator

  • interpolator_kws (dict, optional (default {})) – Optional keyword arguments to pass to the control input interpolator

Returns:

x – Simulation results

Return type:

numpy array, shape (n_samples, n_features)

property complexity

Complexity of the model measured as the number of nonzero parameters.

set_fit_request(*, t: bool | None | str = '$UNCHANGED$', u: bool | None | str = '$UNCHANGED$', x: bool | None | str = '$UNCHANGED$', x_dot: bool | None | str = '$UNCHANGED$') SINDy

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
  • t (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for t parameter in fit.

  • u (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for u parameter in fit.

  • x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x parameter in fit.

  • x_dot (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x_dot parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_predict_request(*, u: bool | None | str = '$UNCHANGED$', x: bool | None | str = '$UNCHANGED$') SINDy

Request metadata passed to the predict method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to predict.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
  • u (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for u parameter in predict.

  • x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x parameter in predict.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, metric: bool | None | str = '$UNCHANGED$', t: bool | None | str = '$UNCHANGED$', u: bool | None | str = '$UNCHANGED$', x: bool | None | str = '$UNCHANGED$', x_dot: bool | None | str = '$UNCHANGED$') SINDy

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
  • metric (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for metric parameter in score.

  • t (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for t parameter in score.

  • u (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for u parameter in score.

  • x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x parameter in score.

  • x_dot (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x_dot parameter in score.

Returns:

self – The updated object.

Return type:

object

class pysindy.AxesArray(input_array: ndarray[Any, dtype[ScalarType]], axes: Dict[str, int | List[int]])[source]

Bases: NDArrayOperatorsMixin, ndarray

A numpy-like array that keeps track of the meaning of its axes.

Limitations:

  • Not all numpy functions, such as np.flatten(), have an implementation for AxesArray. In such cases a regular numpy array is returned.

  • For functions that are implemented for AxesArray, such as np.reshape(), use the numpy function rather than the bound method (e.g. arr.reshape)

  • Such functions may raise ValueError where numpy would not, when it is impossible to determine the output axis labels.

Current array function implementations:

  • np.concatenate

  • np.reshape

  • np.transpose

  • np.linalg.solve

  • np.einsum

  • np.tensordot

Indexing:

AxesArray supports all of the basic and advanced indexing of numpy arrays, with the addition that new axes can be inserted with a string name for the axis. E.g. arr = arr[..., "lineno"] will add a length-one axis at the end, along with the properties arr.ax_lineno and arr.n_lineno. If None or np.newaxis are passed, the axis is named “unk”.

Parameters:
  • input_array – the data to create the array.

  • axes – A dictionary of axis labels to shape indices. Axes labels must be of the format “ax_name”. indices can be either an int or a list of ints.

Attributes:
  • axes – dictionary of axis name to dimension index/indices

  • ax_<ax_name> – lookup ax_name in axes

  • n_<ax_name> – lookup shape of subarray defined by ax_name

Raises:
  • AxesWarning if axes does not match shape of input_array.

  • ValueError if assigning the same axis index to multiple meanings or – assigning an axis beyond ndim.

property axes
property shape

Shape of array. Unlike numpy ndarray, this is not assignable.

insert_axis(axis: Collection[int] | int, new_name: str) Dict[str, int | List[int]][source]

Create the constructor axes dict from this array, with new axis/axes

remove_axis(axis: Collection[int] | int) Dict[str, int | List[int]][source]

Create the constructor axes dict from this array, without axis/axes

class pysindy.BaseDifferentiation[source]

Bases: BaseEstimator

Base class for differentiation methods.

Simply forces differentiation methods to implement a _differentiate function.

Attributes:

smoothed_x_ – Methods that smooth x before differentiating save that value here. Methods that do not simply save x here.

class pysindy.FiniteDifference(order=2, d: int = 1, axis=0, is_uniform=False, drop_endpoints=False, periodic=False)[source]

Bases: BaseDifferentiation

Finite difference derivatives.

Parameters:
  • order (int, optional (default 2)) – The order of the finite difference method to be used. Currently only centered differences are implemented, for even order and left-off-centered differences for odd order.

  • d (int, optional (default 1)) – The order of derivative to take. Must be positive integer.

  • axis (int, optional (default 0)) – The axis to differentiate along.

  • is_uniform (boolean, optional (default False)) – Parameter to tell the differentiation that, although a N-dim grid is passed, it is uniform so can use dx instead of the full grid array.

  • drop_endpoints (boolean, optional (default False)) – Whether or not derivatives are computed for endpoints. If False, endpoints will be set to np.nan. Note that which points are endpoints depends on the method being used.

  • periodic (boolean, optional (default False)) – Whether to use periodic boundary conditions for endpoints. Use forward differences for periodic=False and periodic boundaries with centered differences for periodic=True on the boundaries. No effect if drop_endpoints=True

Examples

>>> import numpy as np
>>> from pysindy.differentiation import FiniteDifference
>>> t = np.linspace(0, 1, 5)
>>> X = np.vstack((np.sin(t), np.cos(t))).T
>>> fd = FiniteDifference()
>>> fd._differentiate(X, t)
array([[ 1.00114596,  0.00370551],
       [ 0.95885108, -0.24483488],
       [ 0.8684696 , -0.47444711],
       [ 0.72409089, -0.67456051],
       [ 0.53780339, -0.84443737]])
class pysindy.SINDyDerivative(save_smooth=True, **kwargs)[source]

Bases: BaseDifferentiation

Wrapper class for differentiation classes from the derivative package. This class is meant to provide all the same functionality as the dxdt method.

This class also has _differentiate and __call__ methods which are used by PySINDy.

Parameters:

derivative_kws (dictionary, optional) –

Keyword arguments to be passed to the dxdt method.

Notes

See the derivative documentation for acceptable keywords.

set_params(**params)[source]

Set the parameters of this estimator. Modification of the pysindy method to allow unknown kwargs. This allows using the full range of derivative parameters that are not defined as member variables in sklearn grid search.

Return type:

self

get_params(deep=True)[source]

Get parameters.

class pysindy.SmoothedFiniteDifference(smoother=<function savgol_filter>, smoother_kws={}, save_smooth=True, **kwargs)[source]

Bases: FiniteDifference

Smoothed finite difference derivatives.

Perform differentiation by smoothing input data then applying a finite difference method.

Parameters:
  • smoother (function, optional (default savgol_filter)) – Function to perform smoothing. Must be compatible with the following call signature: x_smoothed = smoother(x, **smoother_kws)

  • smoother_kws (dict, optional (default {})) – Arguments passed to smoother when it is invoked.

  • save_smooth (bool) – Whether to save the smoothed coordinate values or not.

  • **kwargs (kwargs) – Additional parameters passed to the pysindy.FiniteDifference.__init__ function.

Examples

>>> import numpy as np
>>> from pysindy.differentiation import SmoothedFiniteDifference
>>> t = np.linspace(0,1,10)
>>> X = np.vstack((np.sin(t),np.cos(t))).T
>>> sfd = SmoothedFiniteDifference(smoother_kws={'window_length': 5})
>>> sfd._differentiate(X, t)
array([[ 1.00013114e+00,  7.38006789e-04],
       [ 9.91779070e-01, -1.10702304e-01],
       [ 9.73376491e-01, -2.20038119e-01],
       [ 9.43001496e-01, -3.26517615e-01],
       [ 9.00981354e-01, -4.29066632e-01],
       [ 8.47849424e-01, -5.26323977e-01],
       [ 7.84260982e-01, -6.17090177e-01],
       [ 7.11073255e-01, -7.00180971e-01],
       [ 6.29013295e-01, -7.74740601e-01],
       [ 5.39752150e-01, -8.41980082e-01]])
class pysindy.SpectralDerivative(d=1, axis=0)[source]

Bases: BaseDifferentiation

Spectral derivatives. Assumes uniform grid, and utilizes FFT to approximate a derivative. Works well for derivatives in periodic dimensions. Equivalent to a maximal-order finite difference, but runs in O(NlogN).

Parameters:
  • d (int) – The order of derivative to take

  • axis (int, optional (default 0)) – The axis to differentiate along

Examples

>>> import numpy as np
>>> from pysindy.differentiation import SpectralDerivative
>>> t = np.arange(0,1,0.1)
>>> X = np.vstack((np.sin(t), np.cos(t))).T
>>> sd = SpectralDerivative()
>>> sd._differentiate(X, t)
array([[ 6.28318531e+00,  2.69942771e-16],
   [ 5.08320369e+00, -3.69316366e+00],
   [ 1.94161104e+00, -5.97566433e+00],
   [-1.94161104e+00, -5.97566433e+00],
   [-5.08320369e+00, -3.69316366e+00],
   [-6.28318531e+00,  7.10542736e-16],
   [-5.08320369e+00,  3.69316366e+00],
   [-1.94161104e+00,  5.97566433e+00],
   [ 1.94161104e+00,  5.97566433e+00],
   [ 5.08320369e+00,  3.69316366e+00]])
class pysindy.ConcatLibrary(libraries: list)[source]

Bases: BaseFeatureLibrary

Concatenate multiple libraries into one library. All settings provided to individual libraries will be applied.

Parameters:

libraries (list of libraries) – Library instances to be applied to the input matrix.

Attributes:
  • n_features_in_ (int) – The total number of input features.

  • n_output_features_ (int) – The total number of output features. The number of output features is the sum of the numbers of output features for each of the concatenated libraries.

Examples

>>> import numpy as np
>>> from pysindy.feature_library import FourierLibrary, CustomLibrary
>>> from pysindy.feature_library import ConcatLibrary
>>> x = np.array([[0.,-1],[1.,0.],[2.,-1.]])
>>> functions = [lambda x : np.exp(x), lambda x,y : np.sin(x+y)]
>>> lib_custom = CustomLibrary(library_functions=functions)
>>> lib_fourier = FourierLibrary()
>>> lib_concat = ConcatLibrary([lib_custom, lib_fourier])
>>> lib_concat.fit()
>>> lib.transform(x)
fit(x_full, y=None)[source]

Compute number of output features.

Parameters:

x (array-like, shape (n_samples, n_features)) – The data.

Returns:

self

Return type:

instance

transform(x_full)[source]

Transform data with libs provided below.

Parameters:

x (array-like, shape [n_samples, n_features]) – The data to transform, row by row.

Returns:

xp – The matrix of features, where NP is the number of features generated from applying the custom functions to the inputs.

Return type:

np.ndarray, shape [n_samples, NP]

get_feature_names(input_features=None)[source]

Return feature names for output features.

Parameters:

input_features (list of string, length n_features, optional) – String names for input features if available. By default, “x0”, “x1”, … “xn_features” is used.

Returns:

output_feature_names

Return type:

list of string, length n_output_features

calc_trajectory(diff_method, x, t)[source]
class pysindy.TensoredLibrary(libraries: list, inputs_per_library: Sequence[Sequence[int]] | None = None)[source]

Bases: BaseFeatureLibrary

Tensor multiple libraries together into one library. All settings provided to individual libraries will be applied.

Parameters:
  • libraries (list of libraries) – Library instances to be applied to the input matrix.

  • inputs_per_library (Sequence of Sequences of ints (default None)) – list that specifies which input indexes should be passed as inputs for each of the individual feature libraries. length must equal the number of feature libraries. Default is that all inputs are used for every library.

Attributes:
  • libraries_ (list of libraries) – Library instances to be applied to the input matrix.

  • n_features_in_ (int) – The total number of input features.

  • n_output_features_ (int) – The total number of output features. The number of output features is the product of the numbers of output features for each of the libraries that were tensored together.

Examples

>>> import numpy as np
>>> from pysindy.feature_library import FourierLibrary, CustomLibrary
>>> from pysindy.feature_library import TensoredLibrary
>>> x = np.array([[0.,-1],[1.,0.],[2.,-1.]])
>>> functions = [lambda x : np.exp(x), lambda x,y : np.sin(x+y)]
>>> lib_custom = CustomLibrary(library_functions=functions)
>>> lib_fourier = FourierLibrary()
>>> lib_tensored = lib_custom * lib_fourier
>>> lib_tensored.fit(x)
>>> lib_tensored.transform(x)
fit(x_full, y=None)[source]

Compute number of output features.

Parameters:

x (array-like, shape (n_samples, n_features)) – The data.

Returns:

self

Return type:

instance

transform(x_full)[source]

Transform data with libs provided below.

Parameters:

x (array-like, shape [n_samples, n_features]) – The data to transform, row by row.

Returns:

xp – The matrix of features, where NP is the number of features generated from applying the custom functions to the inputs.

Return type:

np.ndarray, shape [n_samples, NP]

get_feature_names(input_features=None)[source]

Return feature names for output features.

Parameters:

input_features (list of string, length n_features, optional) – String names for input features if available. By default, “x0”, “x1”, … “xn_features” is used.

Returns:

output_feature_names

Return type:

list of string, length n_output_features

calc_trajectory(diff_method, x, t)[source]
class pysindy.GeneralizedLibrary(libraries: list, tensor_array=None, inputs_per_library: Sequence[Sequence[int]] | None = None, exclude_libraries=[])[source]

Bases: BaseFeatureLibrary

Put multiple libraries into one library. All settings provided to individual libraries will be applied. Note that this class allows one to specifically choose which input variables are used for each library, and take tensor products of any pair of libraries. Tensored libraries inherit the same input variables specified for the individual libraries.

Parameters:
  • libraries (list of libraries) – Library instances to be applied to the input matrix.

  • tensor_array (2D list of booleans, optional, (default None)) – Default is to not tensor any of the libraries together. Shape equal to the # of tensor libraries and the # feature libraries. Indicates which pairs of libraries to tensor product together and add to the overall library. For instance if you have 5 libraries, and want to do two tensor products, you could use the list [[1, 0, 0, 1, 0], [0, 1, 0, 1, 1]] to indicate that you want two tensored libraries from tensoring libraries 0 and 3 and libraries 1, 3, and 4.

  • inputs_per_library (Sequence of Seqeunces of ints (default None)) – list that specifies which input indexes should be passed as inputs for each of the individual feature libraries. length must equal the number of feature libraries. Default is that all inputs are used for every library.

Attributes:
  • self.libraries_full_ (list[BaseFeatureLibrary]) – The fitted libraries

  • n_features_in_ (int) – The total number of input features.

  • n_output_features_ (int) – The total number of output features. The number of output features is the sum of the numbers of output features for each of the concatenated libraries.

Examples

>>> import numpy as np
>>> from pysindy.feature_library import FourierLibrary, CustomLibrary
>>> from pysindy.feature_library import GeneralizedLibrary
>>> x = np.array([[0.,-1],[1.,0.],[2.,-1.]])
>>> functions = [lambda x : np.exp(x), lambda x,y : np.sin(x+y)]
>>> lib_custom = CustomLibrary(library_functions=functions)
>>> lib_fourier = FourierLibrary()
>>> lib_generalized = GeneralizedLibrary([lib_custom, lib_fourier])
>>> lib_generalized.fit(x)
>>> lib_generalized.transform(x)
fit(x_full, y=None)[source]

Compute number of output features.

Parameters:

x (array-like, shape (n_samples, n_features)) – The data.

Returns:

self

Return type:

instance

transform(x_full)[source]

Transform data with libs provided below.

Parameters:

x (array-like, shape [n_samples, n_features]) – The data to transform, row by row.

Returns:

xp – The matrix of features, where NP is the number of features generated from applying the custom functions to the inputs.

Return type:

np.ndarray, shape [n_samples, NP]

get_feature_names(input_features=None)[source]

Return feature names for output features.

Parameters:

input_features (list of string, length n_features, optional) – String names for input features if available. By default, “x0”, “x1”, … “xn_features” is used.

Returns:

output_feature_names

Return type:

list of string, length n_output_features

calc_trajectory(diff_method, x, t)[source]
get_spatial_grid()[source]
class pysindy.CustomLibrary(library_functions, function_names=None, interaction_only=True, include_bias=False)[source]

Bases: BaseFeatureLibrary

Generate a library with custom functions.

Parameters:
  • library_functions (list of mathematical functions) – Functions to include in the library. Default is to use same functions for all variables. Can also be used so that each variable has an associated library, in this case library_functions is shape (n_input_features, num_library_functions)

  • function_names (list of functions, optional (default None)) – List of functions used to generate feature names for each library function. Each name function must take a string input (representing a variable name), and output a string depiction of the respective mathematical function applied to that variable. For example, if the first library function is sine, the name function might return \(\sin(x)\) given \(x\) as input. The function_names list must be the same length as library_functions. If no list of function names is provided, defaults to using \([ f_0(x),f_1(x), f_2(x), \ldots ]\).

  • interaction_only (boolean, optional (default True)) – Whether to omit self-interaction terms. If True, function evaulations of the form \(f(x,x)\) and \(f(x,y,x)\) will be omitted, but those of the form \(f(x,y)\) and \(f(x,y,z)\) will be included. If False, all combinations will be included.

  • include_bias (boolean, optional (default False)) – If True (default), then include a bias column, the feature in which all polynomial powers are zero (i.e. a column of ones - acts as an intercept term in a linear model). This is hard to do with just lambda functions, because if the system is not 1D, lambdas will generate duplicates.

Attributes:
  • functions (list of functions) – Mathematical library functions to be applied to each input feature.

  • function_names (list of functions) – Functions for generating string representations of each library function.

  • n_features_in_ (int) – The total number of input features.

  • n_output_features_ (int) – The total number of output features. The number of output features is the product of the number of library functions and the number of input features.

Examples

>>> import numpy as np
>>> from pysindy.feature_library import CustomLibrary
>>> x = np.array([[0.,-1],[1.,0.],[2.,-1.]])
>>> functions = [lambda x : np.exp(x), lambda x,y : np.sin(x+y)]
>>> lib = CustomLibrary(library_functions=functions).fit(x)
>>> lib.transform(x)
array([[ 1.        ,  0.36787944, -0.84147098],
       [ 2.71828183,  1.        ,  0.84147098],
       [ 7.3890561 ,  0.36787944,  0.84147098]])
>>> lib.get_feature_names()
['f0(x0)', 'f0(x1)', 'f1(x0,x1)']
get_feature_names(input_features=None)[source]

Return feature names for output features.

Parameters:

input_features (list of string, length n_features, optional) – String names for input features if available. By default, “x0”, “x1”, … “xn_features” is used.

Returns:

output_feature_names

Return type:

list of string, length n_output_features

fit(x_full, y=None)[source]

Compute number of output features.

Parameters:

x (array-like, shape (n_samples, n_features)) – Measurement data.

Returns:

self

Return type:

instance

transform(x_full)[source]

Transform data to custom features

Parameters:

x (array-like, shape (n_samples, n_features)) – The data to transform, row by row.

Returns:

xp – The matrix of features, where n_output_features is the number of features generated from applying the custom functions to the inputs.

Return type:

np.ndarray, shape (n_samples, n_output_features)

class pysindy.FourierLibrary(n_frequencies=1, include_sin=True, include_cos=True)[source]

Bases: BaseFeatureLibrary

Generate a library with trigonometric functions.

Parameters:
  • n_frequencies (int, optional (default 1)) – Number of frequencies to include in the library. The library will include functions \(\sin(x), \sin(2x), \dots \sin(n_{frequencies}x)\) for each input feature \(x\) (depending on which of sine and/or cosine features are included).

  • include_sin (boolean, optional (default True)) – If True, include sine terms in the library.

  • include_cos (boolean, optional (default True)) – If True, include cosine terms in the library.

Attributes:
  • n_features_in_ (int) – The total number of input features.

  • n_output_features_ (int) – The total number of output features. The number of output features is 2 * n_input_features_ * n_frequencies if both sines and cosines are included. Otherwise it is n_input_features * n_frequencies.

Examples

>>> import numpy as np
>>> from pysindy.feature_library import FourierLibrary
>>> x = np.array([[0.],[1.],[2.]])
>>> lib = FourierLibrary(n_frequencies=2).fit(x)
>>> lib.transform(x)
array([[ 0.        ,  1.        ,  0.        ,  1.        ],
       [ 0.84147098,  0.54030231,  0.90929743, -0.41614684],
       [ 0.90929743, -0.41614684, -0.7568025 , -0.65364362]])
>>> lib.get_feature_names()
['sin(1 x0)', 'cos(1 x0)', 'sin(2 x0)', 'cos(2 x0)']
get_feature_names(input_features=None)[source]

Return feature names for output features

Parameters:

input_features (list of string, length n_features, optional) – String names for input features if available. By default, “x0”, “x1”, … “xn_features” is used.

Returns:

output_feature_names

Return type:

list of string, length n_output_features

fit(x_full, y=None)[source]

Compute number of output features.

Parameters:

x (array-like, shape (n_samples, n_features)) – The data.

Returns:

self

Return type:

instance

transform(x_full)[source]

Transform data to Fourier features

Parameters:

x (array-like, shape (n_samples, n_features)) – The data to transform, row by row.

Returns:

xp – The matrix of features, where n_output_features is the number of Fourier features generated from the inputs.

Return type:

np.ndarray, shape (n_samples, n_output_features)

class pysindy.IdentityLibrary[source]

Bases: BaseFeatureLibrary

Generate an identity library which maps all input features to themselves.

Attributes:
  • n_features_in_ (int) – The total number of input features.

  • n_output_features_ (int) – The total number of output features. The number of output features is equal to the number of input features.

Examples

>>> import numpy as np
>>> from pysindy.feature_library import IdentityLibrary
>>> x = np.array([[0,-1],[0.5,-1.5],[1.,-2.]])
>>> lib = IdentityLibrary().fit(x)
>>> lib.transform(x)
array([[ 0. , -1. ],
       [ 0.5, -1.5],
       [ 1. , -2. ]])
>>> lib.get_feature_names()
['x0', 'x1']
get_feature_names(input_features=None)[source]

Return feature names for output features

Parameters:

input_features (list of string, length n_features, optional) – String names for input features if available. By default, “x0”, “x1”, … “xn_features” is used.

Returns:

output_feature_names

Return type:

list of string, length n_output_features

fit(x_full, y=None)[source]

Compute number of output features.

Parameters:

x (array-like, shape (n_samples, n_features)) – The data.

Returns:

self

Return type:

instance

transform(x_full)[source]

Perform identity transformation (return a copy of the input).

Parameters:

x (array-like, shape (n_samples, n_features)) – The data to transform, row by row.

Returns:

x – The matrix of features, which is just a copy of the input data.

Return type:

np.ndarray, shape (n_samples, n_features)

class pysindy.PolynomialLibrary(degree=2, include_interaction=True, interaction_only=False, include_bias=True, order='C')[source]

Bases: PolynomialFeatures, BaseFeatureLibrary

Generate polynomial and interaction features.

This is the same as sklearn.preprocessing.PolynomialFeatures, but also adds the option to omit interaction features from the library.

Parameters:
  • degree (integer, optional (default 2)) – The degree of the polynomial features.

  • include_interaction (boolean, optional (default True)) – Determines whether interaction features are produced. If false, features are all of the form x[i] ** k.

  • interaction_only (boolean, optional (default False)) – If true, only interaction features are produced: features that are products of at most degree distinct input features (so not x[1] ** 2, x[0] * x[2] ** 3, etc.).

  • include_bias (boolean, optional (default True)) – If True (default), then include a bias column, the feature in which all polynomial powers are zero (i.e. a column of ones - acts as an intercept term in a linear model).

  • order (str in {'C', 'F'}, optional (default 'C')) – Order of output array in the dense case. ‘F’ order is faster to compute, but may slow down subsequent estimators.

Attributes:
  • powers_ (array, shape (n_output_features, n_input_features)) – powers_[i, j] is the exponent of the jth input in the ith output.

  • n_features_in_ (int) – The total number of input features.

  • n_output_features_ (int) – The total number of output features. This number is computed by iterating over all appropriately sized combinations of input features.

property powers_

Exponent for each of the inputs in the output.

get_feature_names(input_features=None)[source]

Return feature names for output features.

Parameters:

input_features (list of string, length n_features, optional) – String names for input features if available. By default, “x0”, “x1”, … “xn_features” is used.

Returns:

output_feature_names

Return type:

list of string, length n_output_features

fit(x_full, y=None)[source]

Compute number of output features.

Parameters:

x (array-like, shape (n_samples, n_features)) – The data.

Returns:

self

Return type:

instance

transform(x_full)[source]

Transform data to polynomial features.

Parameters:

x_full ({array-like, sparse matrix} of shape (n_samples, n_features)) – The data to transform, row by row.

Returns:

xp – shape (n_samples, n_output_features) The matrix of features, where n_output_features is the number of polynomial features generated from the combination of inputs.

Return type:

np.ndarray or CSR/CSC sparse matrix,

set_fit_request(*, x_full: bool | None | str = '$UNCHANGED$') PolynomialLibrary

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

x_full (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x_full parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_transform_request(*, x_full: bool | None | str = '$UNCHANGED$') PolynomialLibrary

Request metadata passed to the transform method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to transform.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

x_full (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x_full parameter in transform.

Returns:

self – The updated object.

Return type:

object

class pysindy.PDELibrary(function_library: ~pysindy.feature_library.base.BaseFeatureLibrary | None = None, derivative_order=0, spatial_grid=None, temporal_grid=None, interaction_only=True, include_bias=False, include_interaction=True, implicit_terms=False, multiindices=None, differentiation_method=<class 'pysindy.differentiation.finite_difference.FiniteDifference'>, diff_kwargs={}, is_uniform=None, periodic=None)[source]

Bases: BaseFeatureLibrary

Generate a PDE library with custom functions.

Parameters:
  • function_library (BaseFeatureLibrary, optional (default) – PolynomialLibrary(degree=3,include_bias=False)) SINDy library with output features representing library_functions to include in the library, in place of library_functions.

  • derivative_order (int, optional (default 0)) – Order of derivative to take on each input variable, can be arbitrary non-negative integer.

  • spatial_grid (np.ndarray, optional (default None)) – The spatial grid for computing derivatives

  • temporal_grid (np.ndarray, optional (default None)) – The temporal grid if using SINDy-PI with PDEs.

  • interaction_only (boolean, optional (default True)) – Whether to omit self-interaction terms. If True, function evaulations of the form \(f(x,x)\) and \(f(x,y,x)\) will be omitted, but those of the form \(f(x,y)\) and \(f(x,y,z)\) will be included. If False, all combinations will be included.

  • include_bias (boolean, optional (default False)) – If True (default), then include a bias column, the feature in which all polynomial powers are zero (i.e. a column of ones - acts as an intercept term in a linear model). This is hard to do with just lambda functions, because if the system is not 1D, lambdas will generate duplicates.

  • include_interaction (boolean, optional (default True)) – This is a different than the use for the PolynomialLibrary. If true, it generates all the mixed derivative terms. If false, the library will consist of only pure no-derivative terms and pure derivative terms, with no mixed terms.

  • implicit_terms (boolean) – Flag to indicate if SINDy-PI (temporal derivatives) is being used for the right-hand side of the SINDy fit.

  • multiindices (list of integer arrays, (default None)) – Overrides the derivative_order to customize the included derivative orders. Each integer array indicates the order of differentiation along the corresponding axis for each derivative term.

  • differentiation_method (callable, (default FiniteDifference)) –

    Spatial differentiation method.

    diff_kwargs: dictionary, (default {})

    Keyword options to supply to differtiantion_method.

Attributes:
  • n_features_in_ (int) – The total number of input features.

  • n_output_features_ (int) – The total number of output features. The number of output features is the product of the number of library functions and the number of input features.

Examples

>>> import numpy as np
>>> from pysindy.feature_library import PDELibrary
get_feature_names(input_features=None)[source]

Return feature names for output features.

Parameters:

input_features (list of string, length n_features, optional) – String names for input features if available. By default, “x0”, “x1”, … “xn_features” is used.

Returns:

output_feature_names

Return type:

list of string, length n_output_features

fit(x_full, y=None)[source]

Compute number of output features.

Parameters:

x (array-like, shape (n_samples, n_features)) – Measurement data.

Returns:

self

Return type:

instance

transform(x_full)[source]

Transform data to pde features

Parameters:

x (array-like, shape (n_samples, n_features)) – The data to transform, row by row.

Returns:

xp – The matrix of features, where n_output_features is the number of features generated from the tensor product of the derivative terms and the library_functions applied to combinations of the inputs.

Return type:

np.ndarray, shape (n_samples, n_output_features)

get_spatial_grid()[source]
class pysindy.WeakPDELibrary(function_library: ~pysindy.feature_library.base.BaseFeatureLibrary | None = None, derivative_order=0, spatiotemporal_grid=None, interaction_only=True, include_bias=False, include_interaction=True, K=100, H_xt=None, p=4, num_pts_per_domain=None, implicit_terms=False, multiindices=None, differentiation_method=<class 'pysindy.differentiation.finite_difference.FiniteDifference'>, diff_kwargs={}, is_uniform=None, periodic=None)[source]

Bases: BaseFeatureLibrary

Generate a weak formulation library with custom functions and,

optionally, any spatial derivatives in arbitrary dimensions.

The features in the weak formulation are integrals of derivatives of input data multiplied by a test function phi, which are evaluated on K subdomains randomly sampled across the spatiotemporal grid. Each subdomain is initial generated with a size H_xt along each axis, and is then shrunk such that the left and right boundaries lie on spatiotemporal grid points. The expressions are integrated by parts to remove as many derivatives from the input data as possible and put the derivatives onto the test functions.

The weak integral features are calculated assuming the function f(x) to integrate against derivatives of the test function dphi(x) is linear between grid points provided by the data: f(x)=f_i+(x-x_i)/(x_{i+1}-x_i)*(f_{i+1}-f_i) Thus f(x)*dphi(x) is approximated as a piecewise polynomial. The piecewise components are integrated analytically. To improve performance, the complete integral is expressed as a dot product of weights against the input data f_i, which enables vectorized evaulations.

Parameters:
  • function_library (BaseFeatureLibrary, optional (default) – PolynomialLibrary(degree=3,include_bias=False)) SINDy library with output features representing library_functions to include in the library, in place of library_functions.

  • derivative_order (int, optional (default 0)) – Order of derivative to take on each input variable, can be arbitrary non-negative integer.

  • spatiotemporal_grid (np.ndarray (default None)) – The spatiotemporal grid for computing derivatives. This variable must be specified with at least one dimension corresponding to a temporal grid, so that integration by parts can be done in the weak formulation.

  • interaction_only (boolean, optional (default True)) – Whether to omit self-interaction terms. If True, function evaulations of the form \(f(x,x)\) and \(f(x,y,x)\) will be omitted, but those of the form \(f(x,y)\) and \(f(x,y,z)\) will be included. If False, all combinations will be included.

  • include_bias (boolean, optional (default False)) – If True (default), then include a bias column, the feature in which all polynomial powers are zero (i.e. a column of ones - acts as an intercept term in a linear model). This is hard to do with just lambda functions, because if the system is not 1D, lambdas will generate duplicates.

  • include_interaction (boolean, optional (default True)) – This is a different than the use for the PolynomialLibrary. If true, it generates all the mixed derivative terms. If false, the library will consist of only pure no-derivative terms and pure derivative terms, with no mixed terms.

  • K (int, optional (default 100)) – Number of domain centers, corresponding to subdomain squares of length Hxt. If K is not specified, defaults to 100.

  • H_xt (array of floats, optional (default None)) – Half of the length of the square subdomains in each spatiotemporal direction. If H_xt is not specified, defaults to H_xt = L_xt / 20, where L_xt is the length of the full domain in each spatiotemporal direction. If H_xt is specified as a scalar, this value will be applied to all dimensions of the subdomains.

  • p (int, optional (default 4)) – Positive integer to define the polynomial degree of the spatial weights used for weak/integral SINDy.

  • num_pts_per_domain (int, deprecated (default None)) – Included here to retain backwards compatibility with older code that uses this parameter. However, it merely raises a DeprecationWarning and then is ignored.

  • implicit_terms (boolean) – Flag to indicate if SINDy-PI (temporal derivatives) is being used for the right-hand side of the SINDy fit.

  • multiindices (list of integer arrays, (default None)) – Overrides the derivative_order to customize the included derivative orders. Each integer array indicates the order of differentiation along the corresponding axis for each derivative term.

  • differentiation_method (callable, (default FiniteDifference)) –

    Spatial differentiation method.

    diff_kwargs: dictionary, (default {})

    Keyword options to supply to differtiantion_method.

Attributes:
  • n_features_in_ (int) – The total number of input features.

  • n_output_features_ (int) – The total number of output features. The number of output features is the product of the number of library functions and the number of input features.

Examples

>>> import numpy as np
>>> from pysindy.feature_library import WeakPDELibrary
>>> x = np.array([[0.,-1],[1.,0.],[2.,-1.]])
>>> functions = [lambda x : np.exp(x), lambda x,y : np.sin(x+y)]
>>> lib = WeakPDELibrary(library_functions=functions).fit(x)
>>> lib.transform(x)
array([[ 1.        ,  0.36787944, -0.84147098],
       [ 2.71828183,  1.        ,  0.84147098],
       [ 7.3890561 ,  0.36787944,  0.84147098]])
>>> lib.get_feature_names()
['f0(x0)', 'f0(x1)', 'f1(x0,x1)']
convert_u_dot_integral(u)[source]

Takes a full set of spatiotemporal fields u(x, t) and finds the weak form of u_dot.

get_feature_names(input_features=None)[source]

Return feature names for output features.

Parameters:

input_features (list of string, length n_features, optional) – String names for input features if available. By default, “x0”, “x1”, … “xn_features” is used.

Returns:

output_feature_names

Return type:

list of string, length n_output_features

fit(x_full, y=None)[source]

Compute number of output features.

Parameters:

x (array-like, shape (n_samples, n_features)) – Measurement data.

Returns:

self

Return type:

instance

transform(x_full)[source]

Transform data to custom features

Parameters:

x (array-like, shape (n_samples, n_features)) – The data to transform, row by row.

Returns:

xp – The matrix of features, where n_output_features is the number of features generated from applying the custom functions to the inputs.

Return type:

np.ndarray, shape (n_samples, n_output_features)

calc_trajectory(diff_method, x, t)[source]
class pysindy.SINDyPILibrary(library_functions=None, t=None, x_dot_library_functions=None, function_names=None, interaction_only=True, differentiation_method=None, include_bias=False)[source]

Bases: BaseFeatureLibrary

WARNING: This library is deprecated in PySINDy versions > 1.7. Please use the PDE or WeakPDE libraries instead.

Generate a library with custom functions. The Library takes custom libraries for X and Xdot respectively, and then tensor-products them together. For a 3D system, a library of constant and linear terms in x_dot, i.e. [1, x_dot0, …, x_dot3], is good enough for most problems and implicit terms. The function names list should include both X and Xdot functions, without the mixed terms.

Parameters:
  • library_functions (list of mathematical functions) – Functions to include in the library. Each function will be applied to each input variable x.

  • x_dot_library_functions (list of mathematical functions) – Functions to include in the library. Each function will be applied to each input variable x_dot.

  • t (np.ndarray of time slices) – Time base to compute Xdot from X for the implicit terms

  • differentiation_method (differentiation object, optional) – Method for differentiating the data. This must be a class extending pysindy.differentiation_methods.base.BaseDifferentiation class. The default option is centered difference.

  • function_names (list of functions, optional (default None)) – List of functions used to generate feature names for each library function. Each name function must take a string input (representing a variable name), and output a string depiction of the respective mathematical function applied to that variable. For example, if the first library function is sine, the name function might return \(\sin(x)\) given \(x\) as input. The function_names list must be the same length as library_functions. If no list of function names is provided, defaults to using \([ f_0(x),f_1(x), f_2(x), \ldots ]\). For SINDy-PI, function_names should include the names of the functions in both the x and x_dot libraries (library_functions and x_dot_library_functions), but not the mixed terms, which are computed in the code.

  • interaction_only (boolean, optional (default True)) – Whether to omit self-interaction terms. If True, function evaulations of the form \(f(x,x)\) and \(f(x,y,x)\) will be omitted, but those of the form \(f(x,y)\) and \(f(x,y,z)\) will be included. If False, all combinations will be included.

  • include_bias (boolean, optional (default False)) – If True (default), then include a bias column, the feature in which all polynomial powers are zero (i.e. a column of ones - acts as an intercept term in a linear model). This is hard to do with just lambda functions, because if the system is not 1D, lambdas will generate duplicates.

Attributes:
  • functions (list of functions) – Mathematical library functions to be applied to each input feature.

  • function_names (list of functions) – Functions for generating string representations of each library function.

  • n_features_in_ (int) – The total number of input features.

  • n_output_features_ (int) – The total number of output features. The number of output features is the product of the number of library functions and the number of input features.

Examples

>>> import numpy as np
>>> from pysindy.feature_library import SINDyPILibrary
>>> t = np.linspace(0, 1, 5)
>>> x = np.ones((5, 2))
>>> functions = [lambda x: 1, lambda x : np.exp(x),
                 lambda x,y : np.sin(x+y)]
>>> x_dot_functions = [lambda x: 1, lambda x : x]
>>> function_names = [lambda x: '',
                      lambda x : 'exp(' + x + ')',
                      lambda x, y : 'sin(' + x + y + ')',
                      lambda x: '',
              lambda x : x]
>>> lib = ps.SINDyPILibrary(library_functions=functions,
                            x_dot_library_functions=x_dot_functions,
                            function_names=function_names, t=t
                            ).fit(x)
>>> lib.transform(x)
        [[ 1.00000000e+00  2.71828183e+00  2.71828183e+00  9.09297427e-01
           2.22044605e-16  6.03579815e-16  6.03579815e-16  2.01904588e-16
           2.22044605e-16  6.03579815e-16  6.03579815e-16  2.01904588e-16]
         [ 1.00000000e+00  2.71828183e+00  2.71828183e+00  9.09297427e-01
           0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
           0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00]
         [ 1.00000000e+00  2.71828183e+00  2.71828183e+00  9.09297427e-01
           0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
           0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00]
         [ 1.00000000e+00  2.71828183e+00  2.71828183e+00  9.09297427e-01
           0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00
           0.00000000e+00  0.00000000e+00  0.00000000e+00  0.00000000e+00]
         [ 1.00000000e+00  2.71828183e+00  2.71828183e+00  9.09297427e-01
          -2.22044605e-16 -6.03579815e-16 -6.03579815e-16 -2.01904588e-16
          -2.22044605e-16 -6.03579815e-16 -6.03579815e-16 -2.01904588e-16]]
>>> lib.get_feature_names()
    ['', 'exp(x0)', 'exp(x1)', 'sin(x0x1)', 'x0_dot', 'exp(x0)x0_dot',
     'exp(x1)x0_dot', 'sin(x0x1)x0_dot', 'x1_dot', 'exp(x0)x1_dot',
     'exp(x1)x1_dot', 'sin(x0x1)x1_dot']
get_feature_names(input_features=None)[source]

Return feature names for output features.

Parameters:

input_features (list of string, length n_features, optional) – String names for input features if available. By default, “x0”, “x1”, … “xn_features” is used.

Returns:

output_feature_names

Return type:

list of string, length n_output_features

fit(x_full, y=None)[source]

Compute number of output features.

Parameters:

x (array-like, shape (n_samples, n_features)) – Measurement data.

Returns:

self

Return type:

instance

transform(x_full)[source]

Transform data to custom features

Parameters:

x (array-like, shape (n_samples, n_features)) – The data to transform, row by row.

Returns:

xp – The matrix of features, where n_output_features is the number of features generated from applying the custom functions to the inputs.

Return type:

np.ndarray, shape (n_samples, n_output_features)

class pysindy.ParameterizedLibrary(parameter_library: BaseFeatureLibrary = PolynomialLibrary(degree=1), feature_library: BaseFeatureLibrary = PolynomialLibrary(), num_parameters: int = 3, num_features: int = 3)[source]

Bases: GeneralizedLibrary

A tensor product of two libraries with different inputs. Typically, this is a feature library of the input data and a parameter library of input control, making the SINDyCP method. If the input libraries are weak, the temporal derivatives are automatically rescaled by the appropriate domain volumes.

Parameters:
  • parameter_library (BaseFeatureLibrary, optional (default PolynomialLibrary).) –

  • features. (Specifies the library function to apply to the input data) –

  • feature_library (BaseFeatureLibrary, optional (default PolynomialLibrary).) –

  • features.

  • num_parameters (int, optional (default 3)) –

  • control. (Specifies the number of features in the input) –

  • num_features (int, optional (default 3)) –

  • data. (Specifies the number of features in the input) –

Attributes:

see GeneralizedLibrary

Examples

>>> import numpy as np
>>> from pysindy.feature_library import ParameterizedLibrary,PolynomialLibrary
>>> from pysindy import AxesArray
>>> xs=[np.random.random((5,3)) for n in range(3)]
>>> us=[np.random.random((5,3)) for n in range(3)]
>>> feature_lib=PolynomialLibrary(degree=3)
>>> parameter_lib=PolynomialLibrary(degree=1)
>>> lib=ParameterizedLibrary(feature_library=feature_lib,
>>>     parameter_library=parameter_lib,num_features=3,num_parameters=3)
>>> xus=[AxesArray(np.concatenate([xs[i],us[i]],axis=-1)) for i in range(3)]
>>> lib.fit(xus)
>>> lib.transform(xus)
calc_trajectory(diff_method, x, t)[source]
class pysindy.BaseOptimizer(max_iter=20, normalize_columns=False, initial_guess=None, copy_X=True, unbias: bool = True)[source]

Bases: LinearRegression, ComplexityMixin

Base class for SINDy optimizers. Subclasses must implement a _reduce method for carrying out the bulk of the work of fitting a model.

Parameters:
  • normalize_columns (boolean, optional (default False)) – Normalize the columns of x (the SINDy library terms) before regression by dividing by the L2-norm.

  • copy_X (boolean, optional (default True)) – If True, X will be copied; else, it may be overwritten.

  • initial_guess (np.ndarray, shape (n_features,) or (n_targets, n_features),) – optional (default None) Initial guess for coefficients coef_. If None, the initial guess is obtained via a least-squares fit.

  • unbias (Whether to perform an extra step of unregularized linear) – regression to unbias the coefficients for the identified support. If an optimizer (self.optimizer) applies any type of regularization, that regularization may bias coefficients, improving the conditioning of the problem but harming the quality of the fit. Setting unbias==True enables an extra step wherein unregularized linear regression is applied, but only for the coefficients in the support identified by the optimizer. This helps to remove the bias introduced by regularization.

Attributes:
  • coef_ (array, shape (n_features,) or (n_targets, n_features)) – Weight vector(s).

  • ind_ (array, shape (n_features,) or (n_targets, n_features)) – Array of bools indicating which coefficients of the weight vector have not been masked out.

  • history_ (list) – History of coef_ over iterations of the optimization algorithm.

  • Theta_ (np.ndarray, shape (n_samples, n_features)) – The Theta matrix to be used in the optimization. We save it as an attribute because access to the full library of terms is sometimes needed for various applications.

fit(x_, y, sample_weight=None, **reduce_kws)[source]

Fit the model.

Parameters:
  • x (array-like, shape (n_samples, n_features)) – Training data

  • y (array-like, shape (n_samples,) or (n_samples, n_targets)) – Target values

  • sample_weight (float or numpy array of shape (n_samples,), optional) – Individual weights for each sample

  • reduce_kws (dict) – Optional keyword arguments to pass to the _reduce method (implemented by subclasses)

Returns:

self

Return type:

returns an instance of self

set_fit_request(*, sample_weight: bool | None | str = '$UNCHANGED$', x_: bool | None | str = '$UNCHANGED$') BaseOptimizer

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

  • x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x_ parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') BaseOptimizer

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

class pysindy.EnsembleOptimizer(opt: BaseOptimizer, bagging: bool = False, library_ensemble: bool = False, n_models: int = 20, n_subset: int | None = None, n_candidates_to_drop: int = 1, replace: bool = True, ensemble_aggregator: Callable | None = None)[source]

Bases: BaseOptimizer

Wrapper class for ensembling methods.

Parameters:
  • opt (BaseOptimizer) – The underlying optimizer to run on each ensemble

  • bagging (boolean, optional (default False)) – This parameter is used to allow for “ensembling”, i.e. the generation of many SINDy models (n_models) by choosing a random temporal subset of the input data (n_subset) for each sparse regression. This often improves robustness because averages (bagging) or medians (bragging) of all the models are usually quite high-performing. The user can also generate “distributions” of many models, and calculate how often certain library terms are included in a model.

  • library_ensemble (boolean, optional (default False)) – This parameter is used to allow for “library ensembling”, i.e. the generation of many SINDy models (n_models) by choosing a random subset of the candidate library terms to truncate. So, n_models are generated by solving n_models sparse regression problems on these “reduced” libraries. Once again, this often improves robustness because averages (bagging) or medians (bragging) of all the models are usually quite high-performing. The user can also generate “distributions” of many models, and calculate how often certain library terms are included in a model.

  • n_models (int, optional (default 20)) – Number of models to generate via ensemble

  • n_subset (int, optional (default len(time base))) – Number of time points to use for ensemble

  • n_candidates_to_drop (int, optional (default 1)) – Number of candidate terms in the feature library to drop during library ensembling.

  • replace (boolean, optional (default True)) – If ensemble true, whether or not to time sample with replacement.

  • ensemble_aggregator (callable, optional (default numpy.median)) – Method to aggregate model coefficients across different samples. This method argument is only used if ensemble or library_ensemble is True. The method should take in a list of 2D arrays and return a 2D array of the same shape as the arrays in the list. Example: lambda x: np.median(x, axis=0)

Attributes:
  • coef_ (array, shape (n_features,) or (n_targets, n_features)) – Regularized weight vector(s). This is the v in the objective function.

  • coef_full_ (array, shape (n_features,) or (n_targets, n_features)) – Weight vector(s) that are not subjected to the regularization. This is the w in the objective function.

set_fit_request(*, sample_weight: bool | None | str = '$UNCHANGED$', x_: bool | None | str = '$UNCHANGED$') EnsembleOptimizer

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

  • x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x_ parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') EnsembleOptimizer

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

class pysindy.WrappedOptimizer(optimizer, *args, **kwargs)[source]

Bases: BaseOptimizer

Wrapper class for generic optimizers/sparse regression methods

Enables single target regressors (i.e. those whose predictions are 1-dimensional) to perform multi target regression (i.e. predictions are 2-dimensional). Also allows unbiasing & normalization for optimizers that would otherwise not include it.

Parameters:
  • optimizer (estimator object) – wrapped optimizer/sparse regression method

  • optimizer – The optimizer/sparse regressor to be wrapped, implementing fit and predict. optimizer should also have the attribute coef_. Any optimizer that supports a fit_intercept argument should be initialized to False.

predict(x)[source]

Predict using the linear model.

Parameters:

X (array-like or sparse matrix, shape (n_samples, n_features)) – Samples.

Returns:

C – Returns predicted values.

Return type:

array, shape (n_samples,)

property complexity
set_fit_request(*, sample_weight: bool | None | str = '$UNCHANGED$', x_: bool | None | str = '$UNCHANGED$') WrappedOptimizer

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

  • x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x_ parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_predict_request(*, x: bool | None | str = '$UNCHANGED$') WrappedOptimizer

Request metadata passed to the predict method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to predict.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x parameter in predict.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') WrappedOptimizer

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

class pysindy.SR3(threshold=0.1, thresholds=None, nu=1.0, tol=1e-05, thresholder='L0', trimming_fraction=0.0, trimming_step_size=1.0, max_iter=30, copy_X=True, initial_guess=None, normalize_columns=False, verbose=False, unbias=False)[source]

Bases: BaseOptimizer

Sparse relaxed regularized regression.

Attempts to minimize the objective function

\[0.5\|y-Xw\|^2_2 + \lambda R(u) + (0.5 / \nu)\|w-u\|^2_2\]

where \(R(u)\) is a regularization function. See the following references for more details:

Zheng, Peng, et al. “A unified framework for sparse relaxed regularized regression: SR3.” IEEE Access 7 (2018): 1404-1423.

Champion, K., Zheng, P., Aravkin, A. Y., Brunton, S. L., & Kutz, J. N. (2020). A unified sparse optimization framework to learn parsimonious physics-informed models from data. IEEE Access, 8, 169259-169271.

Parameters:
  • threshold (float, optional (default 0.1)) – Determines the strength of the regularization. When the regularization function R is the L0 norm, the regularization is equivalent to performing hard thresholding, and lambda is chosen to threshold at the value given by this parameter. This is equivalent to choosing lambda = threshold^2 / (2 * nu).

  • nu (float, optional (default 1)) – Determines the level of relaxation. Decreasing nu encourages w and v to be close, whereas increasing nu allows the regularized coefficients v to be farther from w.

  • tol (float, optional (default 1e-5)) – Tolerance used for determining convergence of the optimization algorithm.

  • thresholder (string, optional (default 'L0')) – Regularization function to use. Currently implemented options are ‘L0’ (L0 norm), ‘L1’ (L1 norm), ‘L2’ (L2 norm) and ‘CAD’ (clipped absolute deviation). Note by ‘L2 norm’ we really mean the squared L2 norm, i.e. ridge regression

  • trimming_fraction (float, optional (default 0.0)) – Fraction of the data samples to trim during fitting. Should be a float between 0.0 and 1.0. If 0.0, trimming is not performed.

  • trimming_step_size (float, optional (default 1.0)) – Step size to use in the trimming optimization procedure.

  • max_iter (int, optional (default 30)) – Maximum iterations of the optimization algorithm.

  • initial_guess (np.ndarray, shape (n_features) or (n_targets, n_features), optional (default None)) – Initial guess for coefficients coef_. If None, least-squares is used to obtain an initial guess.

  • normalize_columns (boolean, optional (default False)) – Normalize the columns of x (the SINDy library terms) before regression by dividing by the L2-norm. Note that the ‘normalize’ option in sklearn is deprecated in sklearn versions >= 1.0 and will be removed.

  • copy_X (boolean, optional (default True)) – If True, X will be copied; else, it may be overwritten.

  • thresholds (np.ndarray, shape (n_targets, n_features), optional (default None)) – Array of thresholds for each library function coefficient. Each row corresponds to a measurement variable and each column to a function from the feature library. Recall that SINDy seeks a matrix \(\Xi\) such that \(\dot{X} \approx \Theta(X)\Xi\). thresholds[i, j] should specify the threshold to be used for the (j + 1, i + 1) entry of \(\Xi\). That is to say it should give the threshold to be used for the (j + 1)st library function in the equation for the (i + 1)st measurement variable.

  • verbose (bool, optional (default False)) – If True, prints out the different error terms every max_iter / 10 iterations.

  • unbias (bool (default False)) – See base class for definition. Most options are incompatible with unbiasing.

Attributes:
  • coef_ (array, shape (n_features,) or (n_targets, n_features)) – Regularized weight vector(s). This is the v in the objective function.

  • coef_full_ (array, shape (n_features,) or (n_targets, n_features)) – Weight vector(s) that are not subjected to the regularization. This is the w in the objective function.

  • history_ (list) – History of sparse coefficients. history_[k] contains the sparse coefficients (v in the optimization objective function) at iteration k.

Examples

>>> import numpy as np
>>> from scipy.integrate import odeint
>>> from pysindy import SINDy
>>> from pysindy.optimizers import SR3
>>> lorenz = lambda z,t : [10 * (z[1] - z[0]),
>>>                        z[0] * (28 - z[2]) - z[1],
>>>                        z[0] * z[1] - 8 / 3 * z[2]]
>>> t = np.arange(0, 2, .002)
>>> x = odeint(lorenz, [-8, 8, 27], t)
>>> opt = SR3(threshold=0.1, nu=1)
>>> model = SINDy(optimizer=opt)
>>> model.fit(x, t=t[1] - t[0])
>>> model.print()
x0' = -10.004 1 + 10.004 x0
x1' = 27.994 1 + -0.993 x0 + -1.000 1 x1
x2' = -2.662 x1 + 1.000 1 x0
enable_trimming(trimming_fraction)[source]

Enable the trimming of potential outliers.

Parameters:

trimming_fraction (float) – The fraction of samples to be trimmed. Must be between 0 and 1.

disable_trimming()[source]

Disable trimming of potential outliers.

set_fit_request(*, sample_weight: bool | None | str = '$UNCHANGED$', x_: bool | None | str = '$UNCHANGED$') SR3

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

  • x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x_ parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') SR3

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

class pysindy.STLSQ(threshold=0.1, alpha=0.05, max_iter=20, ridge_kw=None, normalize_columns=False, copy_X=True, initial_guess=None, verbose=False, sparse_ind=None, unbias=True)[source]

Bases: BaseOptimizer

Sequentially thresholded least squares algorithm. Defaults to doing Sequentially thresholded Ridge regression.

Attempts to minimize the objective function \(\|y - Xw\|^2_2 + \alpha \|w\|^2_2\) by iteratively performing least squares and masking out elements of the weight array w that are below a given threshold.

See the following reference for more details:

Brunton, Steven L., Joshua L. Proctor, and J. Nathan Kutz. “Discovering governing equations from data by sparse identification of nonlinear dynamical systems.” Proceedings of the national academy of sciences 113.15 (2016): 3932-3937.

Parameters:
  • threshold (float, optional (default 0.1)) – Minimum magnitude for a coefficient in the weight vector. Coefficients with magnitude below the threshold are set to zero.

  • alpha (float, optional (default 0.05)) – Optional L2 (ridge) regularization on the weight vector.

  • max_iter (int, optional (default 20)) – Maximum iterations of the optimization algorithm.

  • ridge_kw (dict, optional (default None)) – Optional keyword arguments to pass to the ridge regression.

  • normalize_columns (boolean, optional (default False)) – Normalize the columns of x (the SINDy library terms) before regression by dividing by the L2-norm. Note that the ‘normalize’ option in sklearn is deprecated in sklearn versions >= 1.0 and will be removed.

  • copy_X (boolean, optional (default True)) – If True, X will be copied; else, it may be overwritten.

  • initial_guess (np.ndarray, shape (n_features) or (n_targets, n_features),) – optional (default None) Initial guess for coefficients coef_. If None, least-squares is used to obtain an initial guess.

  • verbose (bool, optional (default False)) – If True, prints out the different error terms every iteration.

  • sparse_ind (list, optional (default None)) – Indices to threshold and perform ridge regression upon. If None, sparse thresholding and ridge regression is applied to all indices.

Attributes:
  • coef_ (array, shape (n_features,) or (n_targets, n_features)) – Weight vector(s).

  • ind_ (array, shape (n_features,) or (n_targets, n_features)) – Array of 0s and 1s indicating which coefficients of the weight vector have not been masked out, i.e. the support of self.coef_.

  • history_ (list) – History of coef_. history_[k] contains the values of coef_ at iteration k of sequentially thresholded least-squares.

Examples

>>> import numpy as np
>>> from scipy.integrate import odeint
>>> from pysindy import SINDy
>>> from pysindy.optimizers import STLSQ
>>> lorenz = lambda z,t : [10*(z[1] - z[0]),
>>>                        z[0]*(28 - z[2]) - z[1],
>>>                        z[0]*z[1] - 8/3*z[2]]
>>> t = np.arange(0,2,.002)
>>> x = odeint(lorenz, [-8,8,27], t)
>>> opt = STLSQ(threshold=.1, alpha=.5)
>>> model = SINDy(optimizer=opt)
>>> model.fit(x, t=t[1]-t[0])
>>> model.print()
x0' = -9.999 1 + 9.999 x0
x1' = 27.984 1 + -0.996 x0 + -1.000 1 x1
x2' = -2.666 x1 + 1.000 1 x0
property complexity
set_fit_request(*, sample_weight: bool | None | str = '$UNCHANGED$', x_: bool | None | str = '$UNCHANGED$') STLSQ

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

  • x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x_ parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') STLSQ

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

class pysindy.ConstrainedSR3(threshold=0.1, nu=1.0, tol=1e-05, thresholder='l0', max_iter=30, trimming_fraction=0.0, trimming_step_size=1.0, constraint_lhs=None, constraint_rhs=None, constraint_order='target', normalize_columns=False, copy_X=True, initial_guess=None, thresholds=None, equality_constraints=False, inequality_constraints=False, constraint_separation_index=0, verbose=False, verbose_cvxpy=False, unbias=False)[source]

Bases: SR3

Sparse relaxed regularized regression with linear (in)equality constraints.

Attempts to minimize the objective function

\[0.5\|y-Xw\|^2_2 + \lambda R(u) + (0.5 / \nu)\|w-u\|^2_2\]
\[\text{subject to } Cw = d\]

over u and w, where \(R(u)\) is a regularization function, C is a constraint matrix, and d is a vector of values. See the following reference for more details:

Champion, Kathleen, et al. “A unified sparse optimization framework to learn parsimonious physics-informed models from data.” IEEE Access 8 (2020): 169259-169271.

Parameters:
  • threshold (float, optional (default 0.1)) – Determines the strength of the regularization. When the regularization function R is the l0 norm, the regularization is equivalent to performing hard thresholding, and lambda is chosen to threshold at the value given by this parameter. This is equivalent to choosing lambda = threshold^2 / (2 * nu).

  • nu (float, optional (default 1)) – Determines the level of relaxation. Decreasing nu encourages w and v to be close, whereas increasing nu allows the regularized coefficients v to be farther from w.

  • tol (float, optional (default 1e-5)) – Tolerance used for determining convergence of the optimization algorithm.

  • thresholder (string, optional (default 'l0')) – Regularization function to use. Currently implemented options are ‘l0’ (l0 norm), ‘l1’ (l1 norm), ‘l2’ (l2 norm), ‘cad’ (clipped absolute deviation), ‘weighted_l0’ (weighted l0 norm), ‘weighted_l1’ (weighted l1 norm), and ‘weighted_l2’ (weighted l2 norm).

  • max_iter (int, optional (default 30)) – Maximum iterations of the optimization algorithm.

  • constraint_lhs (numpy ndarray, optional (default None)) – Shape should be (n_constraints, n_features * n_targets), The left hand side matrix C of Cw <= d. There should be one row per constraint.

  • constraint_rhs (numpy ndarray, shape (n_constraints,), optional (default None)) – The right hand side vector d of Cw <= d.

  • constraint_order (string, optional (default "target")) – The format in which the constraints constraint_lhs were passed. Must be one of “target” or “feature”. “target” indicates that the constraints are grouped by target: i.e. the first n_features columns correspond to constraint coefficients on the library features for the first target (variable), the next n_features columns to the library features for the second target (variable), and so on. “feature” indicates that the constraints are grouped by library feature: the first n_targets columns correspond to the first library feature, the next n_targets columns to the second library feature, and so on.

  • normalize_columns (boolean, optional (default False)) – Normalize the columns of x (the SINDy library terms) before regression by dividing by the L2-norm. Note that the ‘normalize’ option in sklearn is deprecated in sklearn versions >= 1.0 and will be removed. Note that this parameter is incompatible with the constraints!

  • copy_X (boolean, optional (default True)) – If True, X will be copied; else, it may be overwritten.

  • initial_guess (np.ndarray, optional (default None)) – Shape should be (n_features) or (n_targets, n_features). Initial guess for coefficients coef_, (v in the mathematical equations) If None, least-squares is used to obtain an initial guess.

  • thresholds (np.ndarray, shape (n_targets, n_features), optional (default None)) – Array of thresholds for each library function coefficient. Each row corresponds to a measurement variable and each column to a function from the feature library. Recall that SINDy seeks a matrix \(\Xi\) such that \(\dot{X} \approx \Theta(X)\Xi\). thresholds[i, j] should specify the threshold to be used for the (j + 1, i + 1) entry of \(\Xi\). That is to say it should give the threshold to be used for the (j + 1)st library function in the equation for the (i + 1)st measurement variable.

  • inequality_constraints (bool, optional (default False)) – If True, CVXPY methods are used to solve the problem.

  • verbose (bool, optional (default False)) – If True, prints out the different error terms every max_iter / 10 iterations.

  • verbose_cvxpy (bool, optional (default False)) – Boolean flag which is passed to CVXPY solve function to indicate if output should be verbose or not. Only relevant for optimizers that use the CVXPY package in some capabity.

  • unbias (bool (default False)) – See base class for definition. Most options are incompatible with unbiasing.

Attributes:
  • coef_ (array, shape (n_features,) or (n_targets, n_features)) – Regularized weight vector(s). This is the v in the objective function.

  • coef_full_ (array, shape (n_features,) or (n_targets, n_features)) – Weight vector(s) that are not subjected to the regularization. This is the w in the objective function.

set_fit_request(*, sample_weight: bool | None | str = '$UNCHANGED$', x_: bool | None | str = '$UNCHANGED$') ConstrainedSR3

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

  • x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x_ parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') ConstrainedSR3

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

class pysindy.StableLinearSR3(threshold=0.1, nu=1.0, tol=1e-05, thresholder='l1', max_iter=30, trimming_fraction=0.0, trimming_step_size=1.0, constraint_lhs=None, constraint_rhs=None, constraint_order='target', normalize_columns=False, copy_X=True, initial_guess=None, thresholds=None, equality_constraints=False, inequality_constraints=False, constraint_separation_index=0, verbose=False, verbose_cvxpy=False, gamma=-1e-08, unbias=False)[source]

Bases: ConstrainedSR3

Sparse relaxed regularized regression for building a-priori stable linear models. This requires making a matrix negative definite, which can be challenging. Here we use a similar method to the TrappingOptimizer algorithm. Linear equality and linear inequality constraints are both allowed, as in the ConstrainedSR3 optimizer.

Attempts to minimize the objective function

\[0.5\|y-Xw\|^2_2 + \lambda R(u) + (0.5 / \nu)\|w-u\|^2_2\]
\[\text{subject to } Cu = d, Du = e, w negative definite\]

over u and w, where \(R(u)\) is a regularization function, C and D are constraint matrices, and d and e are vectors of values. NOTE: This optimizer is intended for building purely linear models that are guaranteed to be stable.

Parameters:
  • threshold (float, optional (default 0.1)) – Determines the strength of the regularization. When the regularization function R is the l0 norm, the regularization is equivalent to performing hard thresholding, and lambda is chosen to threshold at the value given by this parameter. This is equivalent to choosing lambda = threshold^2 / (2 * nu).

  • nu (float, optional (default 1)) – Determines the level of relaxation. Decreasing nu encourages w and v to be close, whereas increasing nu allows the regularized coefficients v to be farther from w.

  • tol (float, optional (default 1e-5)) – Tolerance used for determining convergence of the optimization algorithm.

  • thresholder (string, optional (default 'l1')) – Regularization function to use. Currently implemented options are ‘l1’ (l1 norm), ‘l2’ (l2 norm), ‘cad’ (clipped absolute deviation), ‘weighted_l1’ (weighted l1 norm), and ‘weighted_l2’ (weighted l2 norm). Note that the thresholder must be convex here.

  • max_iter (int, optional (default 30)) – Maximum iterations of the optimization algorithm.

  • constraint_lhs (numpy ndarray, optional (default None)) – Shape should be (n_constraints, n_features * n_targets), The left hand side matrix C of Cw <= d. There should be one row per constraint.

  • constraint_rhs (numpy ndarray, shape (n_constraints,), optional (default None)) – The right hand side vector d of Cw <= d.

  • constraint_order (string, optional (default "target")) – The format in which the constraints constraint_lhs were passed. Must be one of “target” or “feature”. “target” indicates that the constraints are grouped by target: i.e. the first n_features columns correspond to constraint coefficients on the library features for the first target (variable), the next n_features columns to the library features for the second target (variable), and so on. “feature” indicates that the constraints are grouped by library feature: the first n_targets columns correspond to the first library feature, the next n_targets columns to the second library feature, and so on.

  • normalize_columns (boolean, optional (default False)) – Normalize the columns of x (the SINDy library terms) before regression by dividing by the L2-norm. Note that the ‘normalize’ option in sklearn is deprecated in sklearn versions >= 1.0 and will be removed. Note that this parameter is incompatible with the constraints!

  • copy_X (boolean, optional (default True)) – If True, X will be copied; else, it may be overwritten.

  • initial_guess (np.ndarray, optional (default None)) – Shape should be (n_features) or (n_targets, n_features). Initial guess for coefficients coef_, (v in the mathematical equations) If None, least-squares is used to obtain an initial guess.

  • thresholds (np.ndarray, shape (n_targets, n_features), optional (default None)) – Array of thresholds for each library function coefficient. Each row corresponds to a measurement variable and each column to a function from the feature library. Recall that SINDy seeks a matrix \(\Xi\) such that \(\dot{X} \approx \Theta(X)\Xi\). thresholds[i, j] should specify the threshold to be used for the (j + 1, i + 1) entry of \(\Xi\). That is to say it should give the threshold to be used for the (j + 1)st library function in the equation for the (i + 1)st measurement variable.

  • inequality_constraints (bool, optional (default False)) – If True, CVXPY methods are used to solve the problem.

  • verbose (bool, optional (default False)) – If True, prints out the different error terms every max_iter / 10 iterations.

  • verbose_cvxpy (bool, optional (default False)) – Boolean flag which is passed to CVXPY solve function to indicate if output should be verbose or not. Only relevant for optimizers that use the CVXPY package in some capabity.

  • arguments (See base class for additional) –

Attributes:
  • coef_ (array, shape (n_features,) or (n_targets, n_features)) – Regularized weight vector(s). This is the v in the objective function.

  • coef_full_ (array, shape (n_features,) or (n_targets, n_features)) – Weight vector(s) that are not subjected to the regularization. This is the w in the objective function.

set_fit_request(*, sample_weight: bool | None | str = '$UNCHANGED$', x_: bool | None | str = '$UNCHANGED$') StableLinearSR3

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

  • x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x_ parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') StableLinearSR3

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

class pysindy.TrappingSR3(evolve_w=True, threshold=0.1, eps_solver=1e-07, relax_optim=True, inequality_constraints=False, eta=None, alpha_A=None, alpha_m=None, gamma=-0.1, tol=1e-05, tol_m=1e-05, thresholder='l1', thresholds=None, max_iter=30, accel=False, normalize_columns=False, copy_X=True, m0=None, A0=None, objective_history=None, constraint_lhs=None, constraint_rhs=None, constraint_order='target', verbose=False, verbose_cvxpy=False, unbias=False)[source]

Bases: SR3

Trapping variant of sparse relaxed regularized regression. This optimizer can be used to identify systems with globally stable (bounded) solutions.

Attempts to minimize one of two related objective functions

\[0.5\|y-Xw\|^2_2 + \lambda R(w) + 0.5\|Pw-A\|^2_2/\eta + \delta_0(Cw-d) + \delta_{\Lambda}(A)\]

or

\[0.5\|y-Xw\|^2_2 + \lambda R(w) + \delta_0(Cw-d) + 0.5 * maximumeigenvalue(A)/\eta\]

where \(R(w)\) is a regularization function, which must be convex, \(\delta_0\) is an indicator function that provides a hard constraint of CW = d, and :math:delta_{Lambda} is a term to project the \(A\) matrix onto the space of negative definite matrices. See the following references for more details:

Kaptanoglu, Alan A., et al. “Promoting global stability in data-driven models of quadratic nonlinear dynamics.” arXiv preprint arXiv:2105.01843 (2021).

Zheng, Peng, et al. “A unified framework for sparse relaxed regularized regression: Sr3.” IEEE Access 7 (2018): 1404-1423.

Champion, Kathleen, et al. “A unified sparse optimization framework to learn parsimonious physics-informed models from data.” IEEE Access 8 (2020): 169259-169271.

Parameters:
  • evolve_w (bool, optional (default True)) – If false, don’t update w and just minimize over (m, A)

  • threshold (float, optional (default 0.1)) – Determines the strength of the regularization. When the regularization function R is the L0 norm, the regularization is equivalent to performing hard thresholding, and lambda is chosen to threshold at the value given by this parameter. This is equivalent to choosing lambda = threshold^2 / (2 * nu).

  • eta (float, optional (default 1.0e20)) – Determines the strength of the stability term ||Pw-A||^2 in the optimization. The default value is very large so that the algorithm default is to ignore the stability term. In this limit, this should be approximately equivalent to the ConstrainedSR3 method.

  • alpha_m (float, optional (default eta * 0.1)) – Determines the step size in the prox-gradient descent over m. For convergence, need alpha_m <= eta / ||w^T * PQ^T * PQ * w||. Typically 0.01 * eta <= alpha_m <= 0.1 * eta.

  • alpha_A (float, optional (default eta)) – Determines the step size in the prox-gradient descent over A. For convergence, need alpha_A <= eta, so typically alpha_A = eta is used.

  • gamma (float, optional (default 0.1)) – Determines the negative interval that matrix A is projected onto. For most applications gamma = 0.1 - 1.0 works pretty well.

  • tol (float, optional (default 1e-5)) – Tolerance used for determining convergence of the optimization algorithm over w.

  • tol_m (float, optional (default 1e-5)) – Tolerance used for determining convergence of the optimization algorithm over m.

  • thresholder (string, optional (default 'L1')) – Regularization function to use. For current trapping SINDy, only the L1 and L2 norms are implemented. Note that other convex norms could be straightforwardly implemented, but L0 requires reformulation because of nonconvexity.

  • thresholds (np.ndarray, shape (n_targets, n_features), optional (default None)) – Array of thresholds for each library function coefficient. Each row corresponds to a measurement variable and each column to a function from the feature library. Recall that SINDy seeks a matrix \(\Xi\) such that \(\dot{X} \approx \Theta(X)\Xi\). thresholds[i, j] should specify the threshold to be used for the (j + 1, i + 1) entry of \(\Xi\). That is to say it should give the threshold to be used for the (j + 1)st library function in the equation for the (i + 1)st measurement variable.

  • eps_solver (float, optional (default 1.0e-7)) – If threshold != 0, this specifies the error tolerance in the CVXPY (OSQP) solve. Default is 1.0e-3 in OSQP.

  • relax_optim (bool, optional (default True)) – If relax_optim = True, use the relax-and-split method. If False, try a direct minimization on the largest eigenvalue.

  • inequality_constraints (bool, optional (default False)) – If True, relax_optim must be false or relax_optim = True AND threshold != 0, so that the CVXPY methods are used.

  • max_iter (int, optional (default 30)) – Maximum iterations of the optimization algorithm.

  • accel (bool, optional (default False)) – Whether or not to use accelerated prox-gradient descent for (m, A).

  • m0 (np.ndarray, shape (n_targets), optional (default None)) – Initial guess for vector m in the optimization. Otherwise each component of m is randomly initialized in [-1, 1].

  • A0 (np.ndarray, shape (n_targets, n_targets), optional (default None)) – Initial guess for vector A in the optimization. Otherwise A is initialized as A = diag(gamma).

  • copy_X (boolean, optional (default True)) – If True, X will be copied; else, it may be overwritten.

  • normalize_columns (boolean, optional (default False)) – Normalize the columns of x (the SINDy library terms) before regression by dividing by the L2-norm. Note that the ‘normalize’ option in sklearn is deprecated in sklearn versions >= 1.0 and will be removed.

  • verbose (bool, optional (default False)) – If True, prints out the different error terms every max_iter / 10 iterations.

  • verbose_cvxpy (bool, optional (default False)) – Boolean flag which is passed to CVXPY solve function to indicate if output should be verbose or not. Only relevant for optimizers that use the CVXPY package in some capabity.

  • unbias (bool (default False)) – See base class for definition. Most options are incompatible with unbiasing.

Attributes:
  • coef_ (array, shape (n_features,) or (n_targets, n_features)) – Regularized weight vector(s). This is the v in the objective function.

  • history_ (list) – History of sparse coefficients. history_[k] contains the sparse coefficients (v in the optimization objective function) at iteration k.

  • objective_history_ (list) – History of the value of the objective at each step. Note that the trapping SINDy problem is nonconvex, meaning that this value may increase and decrease as the algorithm works.

  • A_history_ (list) – History of the auxiliary variable A that approximates diag(PW).

  • m_history_ (list) – History of the shift vector m that determines the origin of the trapping region.

  • PW_history_ (list) – History of PW = A^S, the quantity we are attempting to make negative definite.

  • PWeigs_history_ (list) – History of diag(PW), a list of the eigenvalues of A^S at each iteration. Tracking this allows us to ascertain if A^S is indeed being pulled towards the space of negative definite matrices.

  • PL_unsym_ (np.ndarray, shape (n_targets, n_targets, n_targets, n_features)) – Unsymmetrized linear coefficient part of the P matrix in ||Pw - A||^2

  • PL_ (np.ndarray, shape (n_targets, n_targets, n_targets, n_features)) – Linear coefficient part of the P matrix in ||Pw - A||^2

  • PQ_ (np.ndarray, shape (n_targets, n_targets,) – n_targets, n_targets, n_features) Quadratic coefficient part of the P matrix in ||Pw - A||^2

Examples

>>> import numpy as np
>>> from scipy.integrate import odeint
>>> from pysindy import SINDy
>>> from pysindy.optimizers import TrappingSR3
>>> lorenz = lambda z,t : [10*(z[1] - z[0]),
>>>                        z[0]*(28 - z[2]) - z[1],
>>>                        z[0]*z[1] - 8/3*z[2]]
>>> t = np.arange(0,2,.002)
>>> x = odeint(lorenz, [-8,8,27], t)
>>> opt = TrappingSR3(threshold=0.1)
>>> model = SINDy(optimizer=opt)
>>> model.fit(x, t=t[1]-t[0])
>>> model.print()
x0' = -10.004 1 + 10.004 x0
x1' = 27.994 1 + -0.993 x0 + -1.000 1 x1
x2' = -2.662 x1 + 1.000 1 x0
set_fit_request(*, sample_weight: bool | None | str = '$UNCHANGED$', x_: bool | None | str = '$UNCHANGED$') TrappingSR3

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

  • x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x_ parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') TrappingSR3

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

class pysindy.SSR(alpha=0.05, max_iter=20, ridge_kw=None, normalize_columns=False, copy_X=True, criteria='coefficient_value', kappa=None, verbose=False, unbias=True)[source]

Bases: BaseOptimizer

Stepwise sparse regression (SSR) greedy algorithm.

Attempts to minimize the objective function \(\|y - Xw\|^2_2 + \alpha \|w\|^2_2\) by iteratively eliminating the smallest coefficient

See the following reference for more details:

Boninsegna, Lorenzo, Feliks Nüske, and Cecilia Clementi. “Sparse learning of stochastic dynamical equations.” The Journal of chemical physics 148.24 (2018): 241723.

Parameters:
  • max_iter (int, optional (default 20)) – Maximum iterations of the optimization algorithm.

  • normalize_columns (boolean, optional (default False)) – Normalize the columns of x (the SINDy library terms) before regression by dividing by the L2-norm. Note that the ‘normalize’ option in sklearn is deprecated in sklearn versions >= 1.0 and will be removed.

  • copy_X (boolean, optional (default True)) – If True, X will be copied; else, it may be overwritten.

  • kappa (float, optional (default None)) – If passed, compute the MSE errors with an extra L0 term with strength equal to kappa times the condition number of Theta.

  • criteria (string, optional (default "coefficient_value")) – The criteria to use for truncating a coefficient each iteration. Must be “coefficient_value” or “model_residual”. “coefficient_value”: zero out the smallest coefficient). “model_residual”: choose the N-1 term model with the smallest residual error.

  • alpha (float, optional (default 0.05)) – Optional L2 (ridge) regularization on the weight vector.

  • ridge_kw (dict, optional (default None)) – Optional keyword arguments to pass to the ridge regression.

  • verbose (bool, optional (default False)) – If True, prints out the different error terms every iteration.

Attributes:
  • coef_ (array, shape (n_features,) or (n_targets, n_features)) – Weight vector(s).

  • history_ (list) – History of coef_. history_[k] contains the values of coef_ at iteration k of SSR

  • err_history_ (list) – History of coef_. history_[k] contains the MSE of each coef_ at iteration k of SSR

Examples

>>> import numpy as np
>>> from scipy.integrate import odeint
>>> from pysindy import SINDy
>>> from pysindy.optimizers import SSR
>>> lorenz = lambda z,t : [10 * (z[1] - z[0]),
>>>                        z[0] * (28 - z[2]) - z[1],
>>>                        z[0] * z[1] - 8 / 3 * z[2]]
>>> t = np.arange(0, 2, .002)
>>> x = odeint(lorenz, [-8, 8, 27], t)
>>> opt = SSR(alpha=.5)
>>> model = SINDy(optimizer=opt)
>>> model.fit(x, t=t[1] - t[0])
>>> model.print()
x0' = -9.999 1 + 9.999 x0
x1' = 27.984 1 + -0.996 x0 + -1.000 1 x1
x2' = -2.666 x1 + 1.000 1 x0
set_fit_request(*, sample_weight: bool | None | str = '$UNCHANGED$', x_: bool | None | str = '$UNCHANGED$') SSR

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

  • x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x_ parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') SSR

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

class pysindy.FROLS(normalize_columns=False, copy_X=True, kappa=None, max_iter=10, alpha=0.05, ridge_kw=None, verbose=False, unbias=True)[source]

Bases: BaseOptimizer

Forward Regression Orthogonal Least-Squares (FROLS) optimizer.

Attempts to minimize the objective function \(\|y - Xw\|^2_2 + \alpha \|w\|^2_2\) by iteractively selecting the most correlated function in the library. This is a greedy algorithm.

See the following reference for more details:

Billings, Stephen A. Nonlinear system identification: NARMAX methods in the time, frequency, and spatio-temporal domains. John Wiley & Sons, 2013.

Parameters:
  • normalize_columns (boolean, optional (default False)) – Normalize the columns of x (the SINDy library terms) before regression by dividing by the L2-norm. Note that the ‘normalize’ option in sklearn is deprecated in sklearn versions >= 1.0 and will be removed.

  • copy_X (boolean, optional (default True)) – If True, X will be copied; else, it may be overwritten.

  • kappa (float, optional (default None)) – If passed, compute the MSE errors with an extra L0 term with strength equal to kappa times the condition number of Theta.

  • max_iter (int, optional (default 10)) – Maximum iterations of the optimization algorithm. This determines the number of nonzero terms chosen by the FROLS algorithm.

  • alpha (float, optional (default 0.05)) – Optional L2 (ridge) regularization on the weight vector.

  • ridge_kw (dict, optional (default None)) – Optional keyword arguments to pass to the ridge regression.

  • verbose (bool, optional (default False)) – If True, prints out the different error terms every iteration.

Attributes:
  • coef_ (array, shape (n_features,) or (n_targets, n_features)) – Weight vector(s).

  • history_ (list) – History of coef_. history_[k] contains the values of coef_ at iteration k of FROLS.

Examples

>>> import numpy as np
>>> from scipy.integrate import odeint
>>> from pysindy import SINDy
>>> from pysindy.optimizers import FROLS
>>> lorenz = lambda z,t : [10 * (z[1] - z[0]),
>>>                        z[0] * (28 - z[2]) - z[1],
>>>                        z[0] * z[1] - 8 / 3 * z[2]]
>>> t = np.arange(0, 2, .002)
>>> x = odeint(lorenz, [-8, 8, 27], t)
>>> opt = FROLS(threshold=.1, alpha=.5)
>>> model = SINDy(optimizer=opt)
>>> model.fit(x, t=t[1] - t[0])
>>> model.print()
x0' = -9.999 1 + 9.999 x0
x1' = 27.984 1 + -0.996 x0 + -1.000 1 x1
x2' = -2.666 x1 + 1.000 1 x0
set_fit_request(*, sample_weight: bool | None | str = '$UNCHANGED$', x_: bool | None | str = '$UNCHANGED$') FROLS

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

  • x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x_ parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') FROLS

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

class pysindy.SINDyPI(threshold=0.1, tol=1e-05, thresholder='l1', max_iter=10000, copy_X=True, thresholds=None, model_subset=None, normalize_columns=False, verbose_cvxpy=False, unbias=False)[source]

Bases: SR3

SINDy-PI optimizer

Attempts to minimize the objective function

\[0.5\|X-Xw\|^2_2 + \lambda R(w)\]

over w where \(R(v)\) is a regularization function. See the following reference for more details:

Kaheman, Kadierdan, J. Nathan Kutz, and Steven L. Brunton. SINDy-PI: a robust algorithm for parallel implicit sparse identification of nonlinear dynamics. Proceedings of the Royal Society A 476.2242 (2020): 20200279.

Parameters:
  • threshold (float, optional (default 0.1)) – Determines the strength of the regularization. When the regularization function R is the l0 norm, the regularization is equivalent to performing hard thresholding, and lambda is chosen to threshold at the value given by this parameter. This is equivalent to choosing lambda = threshold^2 / (2 * nu).

  • tol (float, optional (default 1e-5)) – Tolerance used for determining convergence of the optimization algorithm.

  • thresholder (string, optional (default 'l1')) – Regularization function to use. Currently implemented options are ‘l1’ (l1 norm), ‘weighted_l1’ (weighted l1 norm), l2, and ‘weighted_l2’ (weighted l2 norm)

  • max_iter (int, optional (default 10000)) – Maximum iterations of the optimization algorithm.

  • normalize_columns (boolean, optional (default False)) – This parameter normalizes the columns of Theta before the optimization is done. This tends to standardize the columns to similar magnitudes, often improving performance.

  • copy_X (boolean, optional (default True)) – If True, X will be copied; else, it may be overwritten.

  • thresholds (np.ndarray, shape (n_targets, n_features), optional (default None)) – Array of thresholds for each library function coefficient. Each row corresponds to a measurement variable and each column to a function from the feature library. Recall that SINDy seeks a matrix \(\Xi\) such that \(\dot{X} \approx \Theta(X)\Xi\). thresholds[i, j] should specify the threshold to be used for the (j + 1, i + 1) entry of \(\Xi\). That is to say it should give the threshold to be used for the (j + 1)st library function in the equation for the (i + 1)st measurement variable.

  • model_subset (np.ndarray, shape(n_models), optional (default None)) – List of indices to compute models for. If list is not provided, the default is to compute SINDy-PI models for all possible candidate functions. This can take a long time for 4D systems or larger.

  • verbose_cvxpy (bool, optional (default False)) – Boolean flag which is passed to CVXPY solve function to indicate if output should be verbose or not. Only relevant for optimizers that use the CVXPY package in some capabity.

Attributes:
  • coef_ (array, shape (n_features,) or (n_targets, n_features)) – Regularized weight vector(s). This is the v in the objective function.

  • unbias (bool) – Required to be false, maintained for supertype compatibility

set_fit_request(*, sample_weight: bool | None | str = '$UNCHANGED$', x_: bool | None | str = '$UNCHANGED$') SINDyPI

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

  • x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x_ parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') SINDyPI

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

class pysindy.MIOSR(target_sparsity=5, group_sparsity=None, alpha=0.01, regression_timeout=10, constraint_lhs=None, constraint_rhs=None, constraint_order='target', normalize_columns=False, copy_X=True, initial_guess=None, verbose=False, unbias=False)[source]

Bases: BaseOptimizer

Mixed-Integer Optimized Sparse Regression.

Solves the sparsity constrained regression problem to provable optimality .. math:

\|y-Xw\|^2_2 + \lambda R(u)
\[\text{subject to } \|w\|_0 \leq k\]

by using type-1 specially ordered sets (SOS1) to encode the support of the coefficients. Can optionally add additional constraints on the coefficients or access the gurobi model directly for advanced usage. See the following reference for additional details:

Bertsimas, D. and Gurnee, W., 2022. Learning Sparse Nonlinear Dynamics via Mixed-Integer Optimization. arXiv preprint arXiv:2206.00176.

Parameters:
  • target_sparsity (int, optional (default 5)) – The maximum number of nonzero coefficients across all dimensions. If set, the model will fit all dimensions jointly, potentially reducing statistical efficiency.

  • group_sparsity (int tuple, optional (default None)) – Tuple of length n_targets constraining the number of nonzero coefficients for each target dimension.

  • alpha (float, optional (default 0.01)) – Optional L2 (ridge) regularization on the weight vector.

  • regression_timeout (int, optional (default 10)) – The timeout (in seconds) of the gurobi optimizer to solve and prove optimality (either per dimension or jointly depending on the above sparsity settings).

  • constraint_lhs (numpy ndarray, optional (default None)) – Shape should be (n_constraints, n_features * n_targets), The left hand side matrix C of Cw <= d. There should be one row per constraint.

  • constraint_rhs (numpy ndarray, shape (n_constraints,), optional (default None)) – The right hand side vector d of Cw <= d.

  • constraint_order (string, optional (default "target")) – The format in which the constraints constraint_lhs were passed. Must be one of “target” or “feature”. “target” indicates that the constraints are grouped by target: i.e. the first n_features columns correspond to constraint coefficients on the library features for the first target (variable), the next n_features columns to the library features for the second target (variable), and so on. “feature” indicates that the constraints are grouped by library feature: the first n_targets columns correspond to the first library feature, the next n_targets columns to the second library feature, and so on.

  • normalize_columns (boolean, optional (default False)) – Normalize the columns of x (the SINDy library terms) before regression by dividing by the L2-norm. Note that the ‘normalize’ option in sklearn is deprecated in sklearn versions >= 1.0 and will be removed. Note that this parameter is incompatible with the constraints!

  • copy_X (boolean, optional (default True)) – If True, X will be copied; else, it may be overwritten.

  • initial_guess (np.ndarray, shape (n_features) or (n_targets, n_features), optional (default None)) – Initial guess for coefficients coef_ to warmstart the optimizer.

  • verbose (bool, optional (default False)) – If True, prints out the Gurobi solver log.

  • unbias (bool) – Required to be false, maintained for supertype compatibility

Attributes:
  • coef_ (array, shape (n_features,) or (n_targets, n_features)) – Weight vector(s).

  • ind_ (array, shape (n_features,) or (n_targets, n_features)) – Array of 0s and 1s indicating which coefficients of the weight vector have not been masked out, i.e. the support of self.coef_.

  • model (gurobipy.model) – The raw gurobi model being solved.

property complexity
set_fit_request(*, sample_weight: bool | None | str = '$UNCHANGED$', x_: bool | None | str = '$UNCHANGED$') MIOSR

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

  • x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x_ parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') MIOSR

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object