Stability selection¶
Meinshausen–Bühlmann (2010) stability selection wraps any *PathRegressor
in a bootstrap-driven feature-selection meta-procedure. For each
bootstrap iteration:
Subsample the data (default half-sample without replacement).
Fit the underlying path estimator on the subsample.
Record the active set at each λ.
After n_bootstraps iterations, average the per-(feature, λ) activity
masks to get selection probabilities Π_j(λ_k). The stable set
is {j : max_k Π_j(λ_k) ≥ threshold}.
Why¶
Stability selection sidesteps the brittle “pick a single λ” decision. The MB Theorem 1 result gives an explicit error-control bound on expected false positives in the stable set, depending only on the threshold and the typical active-set size — not on the chosen λ.
Skein’s stability-selection module is the headline M5.x
differentiator: neither glmnet’s cv.glmnet nor skglm ships a
clean implementation, and grpreg has nothing comparable. The
bootstrap loop is embarrassingly parallel — pass n_jobs=-1 to use
every core.
Compatibility¶
StabilitySelection accepts any skein path estimator as base_estimator:
Scalar LS / GLM / Cox (
MCPPathRegressor,LogisticMCPPathRegressor,CoxMCPPathRegressor, etc.).Group penalties (
GroupLassoPathRegressor,GroupSCADPathRegressor,LogisticSparseGroupMCPPathRegressor, etc.). For grouped variants, “selected” is evaluated at the group level — features in the same group share the same selection probability.Multi-task / multinomial (
MultiTaskLassoPathRegressor,MultinomialMCPPathClassifier, etc.). The 3Dcoefs_shape(n_lambdas, K, p)is collapsed across the K axis with an “any-class active” rule.Cox is auto-detected via the
tiesattribute; pass outcomes asy=(time, event).
Tuning¶
n_bootstraps(default 100): MB’s bound improves asymptotically. 50–200 is typical; smaller for quick exploratory runs, larger for publication.sample_fraction(default 0.5): MB recommend half-sample subsampling without replacement.threshold(default 0.6): must be > 0.5 for the MB error-control bound. Higher → fewer false positives but tighter recall.n_jobs(defaultNone= serial): parallel bootstraps viajoblib.-1uses all cores. For a fixedrandom_statethe bootstrap indices are pre-generated, son_jobs > 1produces identical selection probabilities.
λ-grid choice¶
The shared λ-grid for all bootstraps is drawn from a single full-data
fit of the base estimator. Avoid very small lambda_min_ratio
(e.g. 1e-4) — at near-OLS λs every bootstrap selects every feature,
which inflates max_probabilities_ to 1 indiscriminately. Stick to
lambda_min_ratio ≥ 1e-2 and n_lambdas ≤ 50 for clean stability
plots.
API¶
- class skein_glm.StabilitySelection(base_estimator, *, n_bootstraps=100, sample_fraction=0.5, threshold=0.6, random_state=None, n_jobs=None, active_eps=1e-08)[source]¶
Bases:
BaseEstimator,TransformerMixinMeinshausen-Bühlmann (2010) stability selection.
Wraps any skein *PathRegressor (LS, group, GLM, multi-task, multinomial, Cox) and turns it into a feature-selection procedure via subsample-bootstrap.
- Parameters:
base_estimator – A skein path estimator with a
fitmethod,lambdas_andcoefs_attributes after fitting, and (optionally) agroupsattribute for grouped variants. Examples:MCPPathRegressor(),GroupLassoPathRegressor(groups=...),LogisticMCPPathRegressor(),CoxMCPPathRegressor().n_bootstraps (
int, default100) – Number of subsample-bootstrap iterations. The MB error-control bound improves with B (asymptotically); 50–200 is typical.sample_fraction (
float, default0.5) – Fraction of n to subsample without replacement at each iteration. MB recommend 0.5 (half-sample subsampling).threshold (
float, default0.6) – Selection-probability cutoff for the stable set. Must be > 0.5 for the MB bound to apply. Higher → fewer false positives.random_state (
intorNone, defaultNone) – Seed for the bootstrap subsampling RNG.n_jobs (
intorNone, defaultNone) – Number of parallel workers viajoblib.-1uses all cores.Noneis serial.active_eps (
float, default1e-8) – Threshold for declaring |β_j| “nonzero”. Looser than theselect_by_icdefault (1e-12) to absorb mild solver drift across bootstrap subsamples.
- selection_probabilities_¶
Π_j(λ_k)— fraction of bootstraps in which featurejwas active atlambdas_[k].- Type:
ndarrayofshape (n_lambdas,n_features)
- max_probabilities_¶
max_k Π_j(λ_k). The MB selection criterion.- Type:
ndarrayofshape (n_features,)
- stable_features_¶
Indices into
[0, n_features)of features in the stable set (max_probabilities_ >= threshold).- Type:
ndarrayofint64
- lambdas_¶
The shared λ-grid used across all bootstraps. Drawn once from a full-data fit of
base_estimator.- Type:
ndarrayofshape (n_lambdas,)
Notes
For grouped path estimators (
GroupLassoPathRegressoretc.), “selected” is evaluated at the group level: a featurejis counted as selected atλ_kif any coefficient in its group is nonzero. The outputselection_probabilities_is still(n_lambdas, n_features)— features in the same group share the same probability.For multi-task / multinomial path estimators with
coefs_shape(n_lambdas, K, p), “selected” means any of the K class/task coefficients is nonzero (the row-grouped active-feature rule).Cox (
fit(x, time, event)) is detected by whether the base estimator has atiesattribute;fitthen expectsstability.fit(x, (time, event))— pass the outcomes as a tuple.Examples
>>> from skein_glm import MCPPathRegressor, StabilitySelection >>> ss = StabilitySelection( ... base_estimator=MCPPathRegressor(gamma=3.0, n_lambdas=50), ... n_bootstraps=100, threshold=0.6, n_jobs=-1, random_state=0, ... ) >>> ss.fit(X, y) >>> stable_X = ss.transform(X) >>> ss.stable_features_ array([0, 2, 4, 5])
References
Meinshausen, N. and Bühlmann, P. (2010). “Stability selection”. Journal of the Royal Statistical Society B 72(4), 417-473.
- fit(x, y)[source]¶
Run the bootstrap loop and store selection probabilities.
For Cox base estimators, pass outcomes as
y=(time, event).- Return type:
- fit_transform(x, y, **fit_params)[source]¶
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
- Parameters:
X (
array-likeofshape (n_samples,n_features)) – Input samples.y (
array-likeofshape (n_samples,)or(n_samples,n_outputs), default=None) – Target values (None for unsupervised transformations).**fit_params (
dict) – Additional fit parameters. Pass only if the estimator accepts additional params in its fit method.
- Returns:
X_new – Transformed array.
- Return type:
ndarray arrayofshape (n_samples,n_features_new)
- set_fit_request(*, x='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the
fitmethod.Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with
enable_metadata_routing=True(seesklearn.set_config()). Please check the User Guide on how the routing mechanism works.The options for each parameter are:
True: metadata is requested, and passed tofitif provided. The request is ignored if metadata is not provided.False: metadata is not requested and the meta-estimator will not pass it tofit.None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (
sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.Added in version 1.3.
- set_transform_request(*, x='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the
transformmethod.Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with
enable_metadata_routing=True(seesklearn.set_config()). Please check the User Guide on how the routing mechanism works.The options for each parameter are:
True: metadata is requested, and passed totransformif provided. The request is ignored if metadata is not provided.False: metadata is not requested and the meta-estimator will not pass it totransform.None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (
sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.Added in version 1.3.
Example¶
import numpy as np
from skein_glm import MCPPathRegressor, StabilitySelection
rng = np.random.default_rng(0)
X = rng.standard_normal((300, 50))
y = X[:, [0, 5, 12]] @ np.array([1.5, -1.0, 0.8]) + 0.3 * rng.standard_normal(300)
ss = StabilitySelection(
base_estimator=MCPPathRegressor(gamma=3.0, n_lambdas=20, lambda_min_ratio=1e-2),
n_bootstraps=100,
threshold=0.9,
n_jobs=-1,
random_state=0,
).fit(X, y)
print(ss.stable_features_)
# array([ 0, 5, 12])
X_stable = ss.transform(X)
# X_stable.shape == (300, 3)
References¶
Meinshausen, N. and Bühlmann, P. (2010). Stability selection. Journal of the Royal Statistical Society B 72(4), 417-473.