6. Counts and rates¶

Poisson regression handles non-negative integer outcomes — clicks, events, infections, defects. This tutorial covers the standard count case and then the much more useful rate model with exposure offsets.

Plain Poisson regression¶

For counts where every observation has the same denominator (or you don’t care to model exposure):

import numpy as np
import skein_glm

rng = np.random.default_rng(0)
n, p = 300, 20
X = rng.standard_normal((n, p))
true_beta = np.zeros(p)
true_beta[[0, 2, 5]] = [0.5, -0.4, 0.3]

eta = X @ true_beta
y = rng.poisson(np.exp(np.clip(eta, -3, 3))).astype(np.float64)

m = skein_glm.PoissonMCPPathRegressor(
    gamma=3.0, n_lambdas=30, lambda_min_ratio=1e-3,
).fit(X, y)

m.predict(X).shape         # (n, n_lambdas) — predicted means μ = exp(η)
m.decision_function(X).shape  # (n, n_lambdas) — linear predictor η

predict returns the conditional mean μ = exp(η), matching sklearn’s PoissonRegressor. decision_function returns η directly.

Rate models with offsets¶

In practice, observations rarely have the same exposure. Counts of hospital admissions are higher in states with more people; click counts depend on impression volume; defect counts depend on production-run length. The fix is a log-exposure offset:

E[y_i] = exposure_i · exp(X_i β) ⇔ log E[y_i] = log(exposure_i) + X_i β

The offset enters η as a known per-sample term that doesn’t get penalized.

# Synthetic problem: each observation has a different exposure.
exposure = rng.uniform(0.5, 5.0, n)
true_rate = np.exp(np.clip(eta, -3, 3))   # rate per unit exposure
y = rng.poisson(exposure * true_rate).astype(np.float64)

offset = np.log(exposure)

m = skein_glm.PoissonMCPPathRegressor(
    gamma=3.0, offset=offset,
    n_lambdas=30, lambda_min_ratio=1e-3,
).fit(X, y)

The coefficients now have rate-ratio interpretation: a one-unit increase in feature j multiplies the rate by exp(coef_[j]), holding exposure constant. Without the offset, β would absorb random exposure variation and you’d get biased estimates.

# At the smallest λ, β should approximate true_beta.
last = m.coefs_[-1]
print("estimated β:", last[[0, 2, 5]].round(3))
print("true β:     ", true_beta[[0, 2, 5]])

Predicting expected counts¶

When you predict from a model fit with offset, you’re predicting exp(X β) — i.e., the rate per unit exposure, not absolute counts. To get expected counts, multiply by exposure:

predicted_rate = m.predict(X)              # exp(X β); shape (n, n_lambdas)
predicted_count = exposure[:, None] * predicted_rate

When to use offsets¶

Anywhere the natural quantity is a rate, not a raw count:

Domain	Outcome	Offset
Epidemiology	new cases per region	log(person-years)
Click-through	clicks per ad slot	log(impressions)
Insurance	claims per policy	log(policy-years)
Manufacturing	defects per run	log(units produced)
Astronomy	photons per source	log(exposure time)

Without the offset, your “feature effect” is confounded with exposure variation. A coefficient that should have rate-ratio meaning becomes uninterpretable.

Offsets work everywhere in the Poisson family¶

Every Poisson estimator accepts offset — single-λ, path, CV, scalar penalty, group penalty, sparse-group:

# CV with offset
cv = skein_glm.PoissonMCPPathCV(
    gamma=3.0, offset=offset, cv=5, random_state=0,
    n_lambdas=20, lambda_min_ratio=1e-2,
).fit(X, y)
cv.lambda_best_

# Group penalty with offset
groups = np.repeat(np.arange(4), 5).astype(np.int64)
m_group = skein_glm.PoissonGroupLassoPathRegressor(
    groups=groups, offset=offset,
    n_lambdas=20, lambda_min_ratio=1e-2,
).fit(X, y)

CV slices offset[train_idx] per fold automatically — you don’t have to think about per-fold subsetting.

A note on the validation rule¶

Poisson requires y >= 0. skein checks this on every fit and raises a clean error if you pass negative values. Floating-point counts (from rounding) work fine; pass y.astype(np.float64) if your raw counts are integers.

Recap¶

Concept	API
Plain count regression	`PoissonMCPPathRegressor(...).fit(X, y)`
Rate model	add `offset=np.log(exposure)`
CV with offset	works automatically; folds are sliced
Group penalty + offset	every group/sparse-group Poisson estimator accepts `offset`
`predict` returns	rate per unit exposure (`exp(X β)`) — multiply by exposure to get expected counts
Coefficient interpretation	rate ratio: `exp(coef_[j])` is the rate multiplier

Next¶

→ 7. Stability selection — bootstrap-based feature selection without picking λ.