
Ridge


🎯 Purpose

This QuickRef walks you through the core logic, fitting workflow, parameter tuning, and interpretation of Ridge Regression, the go-to model when you want to shrink coefficients without dropping features.


📦 1. When to Use Ridge

Scenario                  | Why Ridge Works
--------------------------|------------------------------------------------------
High multicollinearity    | Stabilizes estimates by shrinking correlated weights
Many weak predictors      | Reduces variance without removing them
Overfitting with OLS      | Adds a penalty to control model complexity
All features need to stay | Unlike Lasso, it keeps all coefficients

βš™οΈ 2. How It Works

  • Adds an L2 penalty to the loss function:

$$ \text{Loss} = \text{RSS} + \alpha \sum_i w_i^2 $$

  • Penalizes large coefficients, encouraging smoother models
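
For intuition, this penalized objective has a closed-form solution: w = (XᵀX + αI)⁻¹Xᵀy. A minimal NumPy sketch, using toy data invented here for illustration (the variable names and values are not from the original):

import numpy as np

# Toy data (illustrative only): 100 samples, 3 features, known true weights
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=100)

alpha = 1.0
# Closed-form ridge solution: w = (X^T X + alpha * I)^-1 X^T y
# (intercept ignored; assumes roughly centered data)
w_ridge = np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)
w_ols = np.linalg.solve(X.T @ X, X.T @ y)  # alpha = 0 recovers OLS

print("ridge:", w_ridge)  # pulled toward zero relative to OLS
print("OLS:  ", w_ols)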

πŸ› οΈ 3. Fitting Ridge in sklearn

from sklearn.linear_model import Ridge
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# Always scale features before fitting
model = make_pipeline(StandardScaler(), Ridge(alpha=1.0))
model.fit(X_train, y_train)
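
To inspect the shrunken weights, pull the Ridge step back out of the pipeline. A minimal sketch; the step key 'ridge' is assumed from make_pipeline's convention of naming steps after their lowercased class:

ridge = model.named_steps['ridge']
print(ridge.coef_)       # one shrunken weight per (scaled) feature
print(ridge.intercept_)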

βœ”οΈ Use RidgeCV() to cross-validate alpha


πŸ” 4. Tuning Alpha

from sklearn.linear_model import RidgeCV
alphas = [0.1, 1.0, 10.0]
model = RidgeCV(alphas=alphas, cv=5).fit(X, y)
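# The selected penalty is exposed as model.alpha_ after fitting
print(model.alpha_)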

Alpha | Effect
------|--------------------------------------
Low   | Behaves like OLS (minimal shrinkage)
High  | Stronger shrinkage, more bias
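
To see this table's effect directly, refit over a grid of alphas and watch the weight norm shrink. A minimal sketch, reusing X_train/y_train from section 3:

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

for alpha in [0.01, 1.0, 100.0]:
    pipe = make_pipeline(StandardScaler(), Ridge(alpha=alpha)).fit(X_train, y_train)
    # Higher alpha shrinks the weight vector toward zero
    print(f"alpha={alpha:>6}: ||w|| = {np.linalg.norm(pipe.named_steps['ridge'].coef_):.3f}")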

📊 5. Output Interpretation

Coefficients        | Meaning
--------------------|------------------------------------------------------------
Small but ≠ 0       | Feature retained but shrunken
Close to OLS        | When alpha is low
Still interpretable | Unlike PCA-style transforms, weights map to original features

βœ”οΈ Use RΒ² / Adjusted RΒ² / RMSE to evaluate fit


✅ Modeling Checklist

  • [ ] All features scaled (standardized)
  • [ ] Alpha selected via CV or tuning
  • [ ] Coefficients interpreted in context of shrinkage
  • [ ] Baseline OLS compared for performance gain (see the sketch below)
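
For that last item, a minimal baseline comparison, assuming X and y from section 4 are available:

from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

ols = make_pipeline(StandardScaler(), LinearRegression())
ridge = make_pipeline(StandardScaler(), Ridge(alpha=1.0))

# Mean cross-validated R^2; Ridge tends to win when predictors are correlated
print("OLS  :", cross_val_score(ols, X, y, cv=5).mean())
print("Ridge:", cross_val_score(ridge, X, y, cv=5).mean())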

💡 Tip

“Ridge regression won’t choose your features, but it’ll stop them from yelling over each other.”