Stability selection¶

Meinshausen–Bühlmann (2010) stability selection wraps any *PathRegressor in a bootstrap-driven feature-selection meta-procedure. For each bootstrap iteration:

Subsample the data (default half-sample without replacement).
Fit the underlying path estimator on the subsample.
Record the active set at each λ.

After n_bootstraps iterations, average the per-(feature, λ) activity masks to get selection probabilities Π_j(λ_k). The stable set is {j : max_k Π_j(λ_k) ≥ threshold}.

Why¶

Stability selection sidesteps the brittle “pick a single λ” decision. The MB Theorem 1 result gives an explicit error-control bound on expected false positives in the stable set, depending only on the threshold and the typical active-set size — not on the chosen λ.

Skein’s stability-selection module is the headline M5.x differentiator: neither glmnet’s cv.glmnet nor skglm ships a clean implementation, and grpreg has nothing comparable. The bootstrap loop is embarrassingly parallel — pass n_jobs=-1 to use every core.

Compatibility¶

StabilitySelection accepts any skein path estimator as base_estimator:

Scalar LS / GLM / Cox (MCPPathRegressor, LogisticMCPPathRegressor, CoxMCPPathRegressor, etc.).
Group penalties (GroupLassoPathRegressor, GroupSCADPathRegressor, LogisticSparseGroupMCPPathRegressor, etc.). For grouped variants, “selected” is evaluated at the group level — features in the same group share the same selection probability.
Multi-task / multinomial (MultiTaskLassoPathRegressor, MultinomialMCPPathClassifier, etc.). The 3D coefs_ shape (n_lambdas, K, p) is collapsed across the K axis with an “any-class active” rule.
Cox is auto-detected via the ties attribute; pass outcomes as y=(time, event).

Tuning¶

n_bootstraps (default 100): MB’s bound improves asymptotically. 50–200 is typical; smaller for quick exploratory runs, larger for publication.
sample_fraction (default 0.5): MB recommend half-sample subsampling without replacement.
threshold (default 0.6): must be > 0.5 for the MB error-control bound. Higher → fewer false positives but tighter recall.
n_jobs (default None = serial): parallel bootstraps via joblib. -1 uses all cores. For a fixed random_state the bootstrap indices are pre-generated, so n_jobs > 1 produces identical selection probabilities.

λ-grid choice¶

The shared λ-grid for all bootstraps is drawn from a single full-data fit of the base estimator. Avoid very small lambda_min_ratio (e.g. 1e-4) — at near-OLS λs every bootstrap selects every feature, which inflates max_probabilities_ to 1 indiscriminately. Stick to lambda_min_ratio ≥ 1e-2 and n_lambdas ≤ 50 for clean stability plots.

API¶

class skein_glm.StabilitySelection(base_estimator, *, n_bootstraps=100, sample_fraction=0.5, threshold=0.6, random_state=None, n_jobs=None, active_eps=1e-08)[source]¶

Bases: BaseEstimator, TransformerMixin

Meinshausen-Bühlmann (2010) stability selection.

Wraps any skein *PathRegressor (LS, group, GLM, multi-task, multinomial, Cox) and turns it into a feature-selection procedure via subsample-bootstrap.

Parameters:

base_estimator – A skein path estimator with a fit method, lambdas_ and coefs_ attributes after fitting, and (optionally) a groups attribute for grouped variants. Examples: MCPPathRegressor(), GroupLassoPathRegressor(groups=...), LogisticMCPPathRegressor(), CoxMCPPathRegressor().
n_bootstraps (int, default 100) – Number of subsample-bootstrap iterations. The MB error-control bound improves with B (asymptotically); 50–200 is typical.
sample_fraction (float, default 0.5) – Fraction of n to subsample without replacement at each iteration. MB recommend 0.5 (half-sample subsampling).
threshold (float, default 0.6) – Selection-probability cutoff for the stable set. Must be > 0.5 for the MB bound to apply. Higher → fewer false positives.
random_state (int or None, default None) – Seed for the bootstrap subsampling RNG.
n_jobs (int or None, default None) – Number of parallel workers via joblib. -1 uses all cores. None is serial.
active_eps (float, default 1e-8) – Threshold for declaring |β_j| “nonzero”. Looser than the select_by_ic default (1e-12) to absorb mild solver drift across bootstrap subsamples.

selection_probabilities_¶

Π_j(λ_k) — fraction of bootstraps in which feature j was active at lambdas_[k].

Type:: ndarray of shape (n_lambdas, n_features)

max_probabilities_¶

max_k Π_j(λ_k). The MB selection criterion.

Type:: ndarray of shape (n_features,)

stable_features_¶

Indices into [0, n_features) of features in the stable set (max_probabilities_ >= threshold).

Type:: ndarray of int64

lambdas_¶

The shared λ-grid used across all bootstraps. Drawn once from a full-data fit of base_estimator.

Type:: ndarray of shape (n_lambdas,)

n_features_in_¶

Type:: int

Notes

For grouped path estimators (GroupLassoPathRegressor etc.), “selected” is evaluated at the group level: a feature j is counted as selected at λ_k if any coefficient in its group is nonzero. The output selection_probabilities_ is still (n_lambdas, n_features) — features in the same group share the same probability.

For multi-task / multinomial path estimators with coefs_ shape (n_lambdas, K, p), “selected” means any of the K class/task coefficients is nonzero (the row-grouped active-feature rule).

Cox (fit(x, time, event)) is detected by whether the base estimator has a ties attribute; fit then expects stability.fit(x, (time, event)) — pass the outcomes as a tuple.

Examples

>>> from skein_glm import MCPPathRegressor, StabilitySelection
>>> ss = StabilitySelection(
...     base_estimator=MCPPathRegressor(gamma=3.0, n_lambdas=50),
...     n_bootstraps=100, threshold=0.6, n_jobs=-1, random_state=0,
... )
>>> ss.fit(X, y)
>>> stable_X = ss.transform(X)
>>> ss.stable_features_
array([0, 2, 4, 5])

References

Meinshausen, N. and Bühlmann, P. (2010). “Stability selection”. Journal of the Royal Statistical Society B 72(4), 417-473.

fit(x, y)[source]¶

Run the bootstrap loop and store selection probabilities.

For Cox base estimators, pass outcomes as y=(time, event).

Return type:: StabilitySelection

transform(x)[source]¶: Return x restricted to the stable feature set.

fit_transform(x, y, **fit_params)[source]¶

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters:

X (array-like of shape (n_samples, n_features)) – Input samples.
y (array-like of shape (n_samples,) or (n_samples, n_outputs), default=None) – Target values (None for unsupervised transformations).
**fit_params (dict) – Additional fit parameters. Pass only if the estimator accepts additional params in its fit method.

Returns:

X_new – Transformed array.

Return type:

ndarray array of shape (n_samples, n_features_new)

set_fit_request(*, x='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x parameter in fit.
self (StabilitySelection)

Returns:

self – The updated object.

Return type:

object

set_transform_request(*, x='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the transform method.

The options for each parameter are:

True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to transform.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

x (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for x parameter in transform.
self (StabilitySelection)

Returns:

self – The updated object.

Return type:

object

Example¶

import numpy as np
from skein_glm import MCPPathRegressor, StabilitySelection

rng = np.random.default_rng(0)
X = rng.standard_normal((300, 50))
y = X[:, [0, 5, 12]] @ np.array([1.5, -1.0, 0.8]) + 0.3 * rng.standard_normal(300)

ss = StabilitySelection(
    base_estimator=MCPPathRegressor(gamma=3.0, n_lambdas=20, lambda_min_ratio=1e-2),
    n_bootstraps=100,
    threshold=0.9,
    n_jobs=-1,
    random_state=0,
).fit(X, y)

print(ss.stable_features_)
# array([ 0,  5, 12])

X_stable = ss.transform(X)
# X_stable.shape == (300, 3)

References¶

Meinshausen, N. and Bühlmann, P. (2010). Stability selection. Journal of the Royal Statistical Society B 72(4), 417-473.