pysindy.optimizers.EnsembleOptimizer

class pysindy.optimizers.EnsembleOptimizer(opt: BaseOptimizer, bagging: bool = False, library_ensemble: bool = False, n_models: int = 20, n_subset: int = None, n_candidates_to_drop: int = 1, replace: bool = True, ensemble_aggregator: Callable = None)[source]

Wrapper class for ensembling methods.

Parameters:
  • opt (BaseOptimizer) – The underlying optimizer to run on each member of the ensemble.

  • bagging (boolean, optional (default False)) – Whether to use “ensembling”, i.e. to generate many SINDy models (n_models), each fit by sparse regression on a random temporal subset of the input data (n_subset time points). Aggregating the models by their average (bagging) or median (bragging) often improves robustness. The user can also treat the models as a “distribution” and calculate how often each library term is included in a model.

  • library_ensemble (boolean, optional (default False)) – Whether to use “library ensembling”, i.e. to generate many SINDy models (n_models), each fit by sparse regression on a “reduced” library from which a random subset of candidate terms (n_candidates_to_drop) has been removed. As with bagging, aggregating the models by their average or median often improves robustness. The user can also treat the models as a “distribution” and calculate how often each library term is included in a model.

  • n_models (int, optional (default 20)) – Number of models to generate via ensemble

  • n_subset (int, optional (default len(time base))) – Number of time points to use for ensemble. When bagging with replacement (bootstrap), a value equal to the original number of samples is standard. See: B. Efron (1979), “Bootstrap Methods: Another Look at the Jackknife”, The Annals of Statistics.

  • n_candidates_to_drop (int, optional (default 1)) – Number of candidate terms in the feature library to drop during library ensembling.

  • replace (boolean, optional (default True)) – If bagging is True, whether to sample time points with replacement.

  • ensemble_aggregator (callable, optional (default numpy.median)) – Method to aggregate model coefficients across different samples. This argument is only used if bagging or library_ensemble is True. The method should take in a list of 2D arrays and return a 2D array of the same shape as the arrays in the list. Example: lambda x: np.median(x, axis=0)
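The two ensembling modes and the aggregation step described above can be sketched with NumPy alone. This is a minimal illustration of the idea, not pysindy's implementation: ordinary least squares stands in for the wrapped optimizer, the data are synthetic, and the helper names (fit_lstsq, bagging_coefs, library_ensemble_coefs, median_aggregator) are hypothetical. Giving a dropped library term a zero coefficient is also an assumption of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic problem: 4 candidate library terms, 1 target, small noise.
n_samples, n_features = 200, 4
theta = rng.normal(size=(n_samples, n_features))
true_coef = np.array([[1.5, 0.0, -2.0, 0.0]])  # shape (n_targets, n_features)
y = theta @ true_coef.T + 0.01 * rng.normal(size=(n_samples, 1))


def fit_lstsq(theta, y):
    """Hypothetical stand-in for the wrapped optimizer: plain least squares."""
    coef, *_ = np.linalg.lstsq(theta, y, rcond=None)
    return coef.T  # shape (n_targets, n_features)


def bagging_coefs(theta, y, n_models=20, n_subset=None, replace=True):
    """Fit one model per random subset of the rows (time points)."""
    n_subset = theta.shape[0] if n_subset is None else n_subset
    coefs = []
    for _ in range(n_models):
        rows = rng.choice(theta.shape[0], size=n_subset, replace=replace)
        coefs.append(fit_lstsq(theta[rows], y[rows]))
    return coefs


def library_ensemble_coefs(theta, y, n_models=20, n_candidates_to_drop=1):
    """Fit one model per reduced library; dropped terms keep a zero coefficient."""
    n_features = theta.shape[1]
    coefs = []
    for _ in range(n_models):
        dropped = rng.choice(n_features, size=n_candidates_to_drop, replace=False)
        keep = np.setdiff1d(np.arange(n_features), dropped)
        full = np.zeros((y.shape[1], n_features))
        full[:, keep] = fit_lstsq(theta[:, keep], y)
        coefs.append(full)
    return coefs


def median_aggregator(coef_list):
    """Takes a list of 2D arrays, returns one 2D array of the same shape."""
    return np.median(coef_list, axis=0)


bag_models = bagging_coefs(theta, y)
lib_models = library_ensemble_coefs(theta, y, n_models=40)

bagged = median_aggregator(bag_models)
lib_aggregated = median_aggregator(lib_models)

# Fraction of models in which each library term has a nonzero coefficient.
inclusion = np.mean([np.abs(c) > 0 for c in lib_models], axis=0)
```

An aggregator written this way has the form the ensemble_aggregator parameter asks for: a list of (n_targets, n_features) arrays in, a single array of that shape out. The inclusion array is one way to read the models as a “distribution” over library terms.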

Attributes:
  • coef_ (array, shape (n_features,) or (n_targets, n_features)) – Regularized weight vector(s). This is the v in the objective function.

  • coef_full_ (array, shape (n_features,) or (n_targets, n_features)) – Weight vector(s) that are not subjected to the regularization. This is the w in the objective function.

Methods

  • set_fit_request – Configure whether metadata should be requested to be passed to the fit method.

  • set_score_request – Configure whether metadata should be requested to be passed to the score method.

Attributes

max_iter

normalize_columns

initial_guess

copy_X

unbias

coef_

intercept_

set_fit_request(*, sample_weight: bool | None | str = '$UNCHANGED$', x_: bool | None | str = '$UNCHANGED$') → EnsembleOptimizer

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

  • x_ (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x_ parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → EnsembleOptimizer

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object