Information-criterion selection

Pick the best λ from a fitted path estimator by AIC, BIC, or EBIC. A single free function covers every supported estimator, so no per-estimator wrappers are needed.

The criteria use the negative log-likelihood at each λ on the training data plus a complexity penalty:

  • AIC = 2k + 2·NLL

  • BIC = log(n)·k + 2·NLL

  • EBIC = BIC + 2γ·log C(p, k), with γ = ebic_gamma ∈ [0, 1] (default 0.5; matches ncvreg::BIC’s high-dim recommendation).

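Here k is the effective degrees of freedom, taken as the number of nonzero coefficients at each λ (the Zou-Hastie-Tibshirani unbiased estimator and the standard ncvreg/glmnet convention), n is the number of training samples, p is the number of features, and C(p, k) is the binomial coefficient.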
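To make the three formulas concrete, here is a minimal standard-library sketch. It is not skein_glm's implementation: the names ic_scores and log_binom are hypothetical, and the coefficient layout (one intercept-free vector per λ) is an assumption made for illustration.

```python
import math


def log_binom(p, k):
    """log C(p, k) via log-gamma, so large p does not overflow."""
    return math.lgamma(p + 1) - math.lgamma(k + 1) - math.lgamma(p - k + 1)


def ic_scores(nll, coefs, n, criterion="bic", ebic_gamma=0.5, active_eps=1e-12):
    """Per-lambda information-criterion scores (lower is better).

    nll   : sequence of negative log-likelihoods, one per lambda
    coefs : sequence of coefficient vectors, one per lambda
            (intercept excluded; this layout is an assumption of the sketch)
    n     : number of training samples
    """
    if criterion not in ("aic", "bic", "ebic"):
        raise ValueError(f"unknown criterion: {criterion!r}")
    scores = []
    for nll_l, beta in zip(nll, coefs):
        p = len(beta)
        # Effective df: coefficients whose magnitude exceeds active_eps.
        k = sum(1 for b in beta if abs(b) > active_eps)
        if criterion == "aic":
            score = 2.0 * k + 2.0 * nll_l
        else:
            score = math.log(n) * k + 2.0 * nll_l  # BIC
            if criterion == "ebic":
                score += 2.0 * ebic_gamma * log_binom(p, k)
        scores.append(score)
    return scores


# Toy path: three lambdas, p = 4 features, n = 100 samples (made-up numbers).
nll = [60.0, 50.0, 49.5]
coefs = [[0.0, 0.0, 0.0, 0.0], [1.2, 0.0, 0.0, 0.0], [1.1, 0.3, 0.0, -0.2]]
scores = ic_scores(nll, coefs, n=100, criterion="bic")
best_idx = min(range(len(scores)), key=scores.__getitem__)
```

In this toy path the one-feature model wins under BIC: the drop in NLL from adding two more features (0.5) is far smaller than the extra 2·log(100) penalty they incur.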

select_by_ic dispatches the per-family NLL by sniffing the path estimator’s class name. The five families currently supported:

  • LS (MCPPathRegressor, SCADPathRegressor, *Group*PathRegressor): NLL = (n/2) · log(RSS/n).

  • Logistic (Logistic*PathRegressor): NLL = Σ [softplus(η) − y·η].

  • Poisson (Poisson*PathRegressor): NLL = Σ [exp(η) − y·η].

  • Cox PH (Cox*PathRegressor): Breslow per-sample partial NLL × n, read from the path’s info_["final_losses"].

  • Multinomial (Multinomial*PathClassifier): per-λ NLL = Σ_i [logsumexp(η_i) − η_{i, y_i}]. Effective df is the row-grouped active-feature count (a feature is “active” if any of its K class coefficients is nonzero), the analog of the Zou-Hastie-Tibshirani df for row-grouped lasso.
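The closed-form NLLs in the list above can be written out directly. The following is an illustrative standard-library sketch, not the library's code: every function name is hypothetical, η is assumed to be the linear predictor already evaluated at one λ, and the coefficient layout in multinomial_df (p rows of K class coefficients) is an assumption. Cox is omitted because, per the list above, its value is read from info_["final_losses"] rather than recomputed from a formula.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def nll_ls(y, eta):
    """Least squares: (n/2) * log(RSS / n). Undefined when RSS is zero."""
    n = len(y)
    rss = sum((yi - ei) ** 2 for yi, ei in zip(y, eta))
    return 0.5 * n * math.log(rss / n)


def nll_logistic(y, eta):
    """Logistic with y in {0, 1}: sum of softplus(eta) - y * eta."""
    return sum(softplus(ei) - yi * ei for yi, ei in zip(y, eta))


def nll_poisson(y, eta):
    """Poisson without the log(y!) constant: sum of exp(eta) - y * eta."""
    return sum(math.exp(ei) - yi * ei for yi, ei in zip(y, eta))


def nll_multinomial(y, eta):
    """Multinomial: sum over i of logsumexp(eta_i) - eta_i[y_i].

    eta is a list of length-K rows; y holds integer class labels.
    """
    total = 0.0
    for yi, row in zip(y, eta):
        m = max(row)  # shift by the row maximum for numerical stability
        lse = m + math.log(sum(math.exp(v - m) for v in row))
        total += lse - row[yi]
    return total


def multinomial_df(coef, active_eps=1e-12):
    """Row-grouped df: features with at least one active class coefficient.

    coef is a list of p rows, each with K class coefficients (assumed layout).
    """
    return sum(1 for row in coef if any(abs(b) > active_eps for b in row))
```

With η = 0 everywhere, these reduce to easy sanity checks: the logistic NLL is n·log 2 and the multinomial NLL is n·log K.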

skein_glm.ic.select_by_ic(path_model, x, *outcomes, criterion='bic', ebic_gamma=0.5, active_eps=1e-12)[source]

Pick the best λ from a fitted path estimator by AIC, BIC, or EBIC.

A single free function, with no per-estimator wrapper. It dispatches on the path estimator’s class name to compute the right negative log-likelihood (LS, logistic, Poisson, Cox, or multinomial), then adds a complexity penalty:

  • AIC = 2k + 2·NLL

  • BIC = log(n)·k + 2·NLL

  • EBIC = BIC + 2γ·log C(p, k), with γ ∈ [0, 1] (default 0.5; matches ncvreg::BIC’s high-dim recommendation)

where k is the number of nonzero coefficients per λ (the Zou-Hastie-Tibshirani unbiased df estimator, the standard ncvreg / glmnet convention).
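Class-name dispatch of the kind described here could look roughly like the sketch below. It illustrates the technique only and is not the library's actual code: family_of and _FAMILY_PATTERNS are hypothetical names, and the patterns are copied from the family list earlier on this page.

```python
from fnmatch import fnmatchcase

# (pattern, family) pairs, checked in order. Family-specific prefixes come
# first so that a name such as "LogisticGroupMCPPathRegressor" (a hypothetical
# class name) is not caught by the least-squares group pattern.
_FAMILY_PATTERNS = [
    ("Logistic*PathRegressor", "logistic"),
    ("Poisson*PathRegressor", "poisson"),
    ("Cox*PathRegressor", "cox"),
    ("Multinomial*PathClassifier", "multinomial"),
    ("MCPPathRegressor", "ls"),
    ("SCADPathRegressor", "ls"),
    ("*Group*PathRegressor", "ls"),
]


def family_of(path_model):
    """Return the likelihood family implied by the estimator's class name."""
    name = type(path_model).__name__
    for pattern, family in _FAMILY_PATTERNS:
        if fnmatchcase(name, pattern):
            return family
    raise TypeError(f"unsupported path estimator: {name}")
```

Matching on the name rather than with isinstance keeps the selector free of imports from each estimator module, at the cost of depending on the naming convention.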

Parameters:
  • path_model (*PathRegressor) – Any fitted path estimator (LS / logistic / Poisson / Cox / multinomial).

  • x (array-like) – The design matrix used in the fit. Used to recompute the per-λ negative log-likelihood.

  • *outcomes – For non-Cox estimators: a single y array. For Cox: time, event. Mirrors each estimator’s fit signature.

  • criterion ({"aic", "bic", "ebic"}, default "bic") – Which information criterion to use.

  • ebic_gamma (float, default 0.5) – EBIC penalty parameter γ ∈ [0, 1]. Ignored for AIC/BIC.

  • active_eps (float, default 1e-12) – Threshold for counting a coefficient as “active” (nonzero).

Returns:

  • best_idx (int) – Index into path_model.lambdas_ of the IC-minimizing λ.

  • scores (ndarray of shape (n_lambdas,)) – Per-λ score vector (lower-is-better). The fitted β is path_model.coefs_[best_idx].

Return type:

tuple[int, ndarray[tuple[Any, …], dtype[float64]]]

Examples

>>> import skein_glm
>>> path = skein_glm.MCPPathRegressor(gamma=3.0, n_lambdas=50).fit(X, y)
>>> best_idx, scores = skein_glm.select_by_ic(path, X, y, criterion="bic")
>>> beta_best = path.coefs_[best_idx]
>>> intercept_best = path.intercepts_[best_idx]

For Cox PH:

>>> cox_path = skein_glm.CoxMCPPathRegressor(gamma=3.0, n_lambdas=50).fit(
...     X, time, event)
>>> best_idx, _ = skein_glm.select_by_ic(cox_path, X, time, event, criterion="ebic")