Stability selection

Meinshausen–Bühlmann (2010) stability selection wraps any *PathRegressor in a bootstrap-driven feature-selection meta-procedure. For each bootstrap iteration:

  1. Subsample the data (default half-sample without replacement).

  2. Fit the underlying path estimator on the subsample.

  3. Record the active set at each λ.

After n_bootstraps iterations, average the per-(feature, λ) activity masks to get selection probabilities Π_j(λ_k). The stable set is {j : max_k Π_j(λ_k) threshold}.

Why

Stability selection sidesteps the brittle “pick a single λ” decision. The MB Theorem 1 result gives an explicit error-control bound on expected false positives in the stable set, depending only on the threshold and the typical active-set size — not on the chosen λ.

Skein’s stability-selection module is the headline M5.x differentiator: neither glmnet’s cv.glmnet nor skglm ships a clean implementation, and grpreg has nothing comparable. The bootstrap loop is embarrassingly parallel — pass n_jobs=-1 to use every core.

Compatibility

StabilitySelection accepts any skein path estimator as base_estimator:

  • Scalar LS / GLM / Cox (MCPPathRegressor, LogisticMCPPathRegressor, CoxMCPPathRegressor, etc.).

  • Group penalties (GroupLassoPathRegressor, GroupSCADPathRegressor, LogisticSparseGroupMCPPathRegressor, etc.). For grouped variants, “selected” is evaluated at the group level — features in the same group share the same selection probability.

  • Multi-task / multinomial (MultiTaskLassoPathRegressor, MultinomialMCPPathClassifier, etc.). The 3D coefs_ shape (n_lambdas, K, p) is collapsed across the K axis with an “any-class active” rule.

  • Cox is auto-detected via the ties attribute; pass outcomes as y=(time, event).

Tuning

  • n_bootstraps (default 100): MB’s bound improves asymptotically. 50–200 is typical; smaller for quick exploratory runs, larger for publication.

  • sample_fraction (default 0.5): MB recommend half-sample subsampling without replacement.

  • threshold (default 0.6): must be > 0.5 for the MB error-control bound. Higher → fewer false positives but tighter recall.

  • n_jobs (default None = serial): parallel bootstraps via joblib. -1 uses all cores. For a fixed random_state the bootstrap indices are pre-generated, so n_jobs > 1 produces identical selection probabilities.

λ-grid choice

The shared λ-grid for all bootstraps is drawn from a single full-data fit of the base estimator. Avoid very small lambda_min_ratio (e.g. 1e-4) — at near-OLS λs every bootstrap selects every feature, which inflates max_probabilities_ to 1 indiscriminately. Stick to lambda_min_ratio1e-2 and n_lambdas ≤ 50 for clean stability plots.

API

class skein_glm.StabilitySelection(base_estimator, *, n_bootstraps=100, sample_fraction=0.5, threshold=0.6, random_state=None, n_jobs=None, active_eps=1e-08)[source]

Bases: BaseEstimator, TransformerMixin

Meinshausen-Bühlmann (2010) stability selection.

Wraps any skein *PathRegressor (LS, group, GLM, multi-task, multinomial, Cox) and turns it into a feature-selection procedure via subsample-bootstrap.

Parameters:
  • base_estimator – A skein path estimator with a fit method, lambdas_ and coefs_ attributes after fitting, and (optionally) a groups attribute for grouped variants. Examples: MCPPathRegressor(), GroupLassoPathRegressor(groups=...), LogisticMCPPathRegressor(), CoxMCPPathRegressor().

  • n_bootstraps (int, default 100) – Number of subsample-bootstrap iterations. The MB error-control bound improves with B (asymptotically); 50–200 is typical.

  • sample_fraction (float, default 0.5) – Fraction of n to subsample without replacement at each iteration. MB recommend 0.5 (half-sample subsampling).

  • threshold (float, default 0.6) – Selection-probability cutoff for the stable set. Must be > 0.5 for the MB bound to apply. Higher → fewer false positives.

  • random_state (int or None, default None) – Seed for the bootstrap subsampling RNG.

  • n_jobs (int or None, default None) – Number of parallel workers via joblib. -1 uses all cores. None is serial.

  • active_eps (float, default 1e-8) – Threshold for declaring |β_j| “nonzero”. Looser than the select_by_ic default (1e-12) to absorb mild solver drift across bootstrap subsamples.

selection_probabilities_

Π_j(λ_k) — fraction of bootstraps in which feature j was active at lambdas_[k].

Type:

ndarray of shape (n_lambdas, n_features)

max_probabilities_

max_k Π_j(λ_k). The MB selection criterion.

Type:

ndarray of shape (n_features,)

stable_features_

Indices into [0, n_features) of features in the stable set (max_probabilities_ >= threshold).

Type:

ndarray of int64

lambdas_

The shared λ-grid used across all bootstraps. Drawn once from a full-data fit of base_estimator.

Type:

ndarray of shape (n_lambdas,)

n_features_in_
Type:

int

Notes

For grouped path estimators (GroupLassoPathRegressor etc.), “selected” is evaluated at the group level: a feature j is counted as selected at λ_k if any coefficient in its group is nonzero. The output selection_probabilities_ is still (n_lambdas, n_features) — features in the same group share the same probability.

For multi-task / multinomial path estimators with coefs_ shape (n_lambdas, K, p), “selected” means any of the K class/task coefficients is nonzero (the row-grouped active-feature rule).

Cox (fit(x, time, event)) is detected by whether the base estimator has a ties attribute; fit then expects stability.fit(x, (time, event)) — pass the outcomes as a tuple.

Examples

>>> from skein_glm import MCPPathRegressor, StabilitySelection
>>> ss = StabilitySelection(
...     base_estimator=MCPPathRegressor(gamma=3.0, n_lambdas=50),
...     n_bootstraps=100, threshold=0.6, n_jobs=-1, random_state=0,
... )
>>> ss.fit(X, y)
>>> stable_X = ss.transform(X)
>>> ss.stable_features_
array([0, 2, 4, 5])

References

Meinshausen, N. and Bühlmann, P. (2010). “Stability selection”. Journal of the Royal Statistical Society B 72(4), 417-473.

fit(x, y)[source]

Run the bootstrap loop and store selection probabilities.

For Cox base estimators, pass outcomes as y=(time, event).

Return type:

StabilitySelection

transform(x)[source]

Return x restricted to the stable feature set.

fit_transform(x, y, **fit_params)[source]

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Input samples.

  • y (array-like of shape (n_samples,) or (n_samples, n_outputs), default=None) – Target values (None for unsupervised transformations).

  • **fit_params (dict) – Additional fit parameters. Pass only if the estimator accepts additional params in its fit method.

Returns:

X_new – Transformed array.

Return type:

ndarray array of shape (n_samples, n_features_new)

set_fit_request(*, x='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
Returns:

self – The updated object.

Return type:

object

set_transform_request(*, x='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the transform method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to transform.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
Returns:

self – The updated object.

Return type:

object

Example

import numpy as np
from skein_glm import MCPPathRegressor, StabilitySelection

rng = np.random.default_rng(0)
X = rng.standard_normal((300, 50))
y = X[:, [0, 5, 12]] @ np.array([1.5, -1.0, 0.8]) + 0.3 * rng.standard_normal(300)

ss = StabilitySelection(
    base_estimator=MCPPathRegressor(gamma=3.0, n_lambdas=20, lambda_min_ratio=1e-2),
    n_bootstraps=100,
    threshold=0.9,
    n_jobs=-1,
    random_state=0,
).fit(X, y)

print(ss.stable_features_)
# array([ 0,  5, 12])

X_stable = ss.transform(X)
# X_stable.shape == (300, 3)

References

Meinshausen, N. and Bühlmann, P. (2010). Stability selection. Journal of the Royal Statistical Society B 72(4), 417-473.