# Lasso
## 🎯 Purpose
This QuickRef guides you through the logic, fitting process, and interpretation of Lasso Regression, a linear model with built-in feature selection via L1 regularization.
## 📦 1. When to Use Lasso
| Scenario | Why Lasso Works |
|---|---|
| Many weak or irrelevant predictors | Drives unimportant coefficients to zero |
| Need automated feature selection | Simplifies model by removing noise features |
| Overfitting in OLS | Regularizes with variable removal |
| High-dimensional (p > n) data | Useful when predictors > observations |
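For instance, here is a quick sketch of the p > n case, using synthetic data from `make_regression`; the sample sizes and the `alpha` value are arbitrary assumptions, not recommendations:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

# 50 observations, 200 predictors, only 10 of which are truly informative
X, y = make_regression(n_samples=50, n_features=200, n_informative=10,
                       noise=5.0, random_state=0)
X = StandardScaler().fit_transform(X)

lasso = Lasso(alpha=1.0, max_iter=10_000).fit(X, y)
print(f"Features kept: {np.sum(lasso.coef_ != 0)} of {X.shape[1]}")
```

Most coefficients come out exactly zero, which is what makes Lasso usable even when predictors outnumber observations.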
## ⚙️ 2. How It Works
- Adds an L1 penalty to the loss function:
$$ \text{Loss} = \text{RSS} + \alpha \sum_i |w_i| $$
- Forces some coefficients to exactly zero, so feature selection is built in
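To make the objective concrete, here is the loss computed by hand for a made-up candidate weight vector (all numbers are illustrative). Note that sklearn's `Lasso` actually minimizes a rescaled version, $\frac{1}{2n}\text{RSS} + \alpha \sum_i |w_i|$, so its `alpha` is not on the same scale as the formula above.

```python
import numpy as np

# Made-up design matrix, targets, and candidate weights
X = np.array([[1.0, 2.0], [2.0, 0.5], [3.0, 1.5]])
y = np.array([3.0, 2.5, 4.0])
w = np.array([0.8, 0.0])  # second coefficient already driven to exactly zero
alpha = 0.1

rss = np.sum((y - X @ w) ** 2)        # residual sum of squares
penalty = alpha * np.sum(np.abs(w))   # alpha * sum of |w_i|
print(f"RSS={rss:.3f}  L1 penalty={penalty:.3f}  Loss={rss + penalty:.3f}")
```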
## 🛠️ 3. Fitting Lasso in sklearn
```python
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# Standardize features, then fit Lasso with a fixed penalty strength
model = make_pipeline(StandardScaler(), Lasso(alpha=0.1))
model.fit(X_train, y_train)  # X_train, y_train: your training data
```
⚠️ Always scale features before fitting; the L1 penalty treats all coefficients alike, so unscaled features are shrunk unevenly.
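Once fitted, you can check how aggressive the selection was; a minimal sketch, assuming the `model` pipeline above (`'lasso'` is the step name `make_pipeline` derives from the class name):

```python
import numpy as np

# Count surviving features in the fitted pipeline from above
coefs = model.named_steps['lasso'].coef_
print(f"Kept {np.sum(coefs != 0)} of {coefs.size} features")
```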
## 📈 4. Tuning Alpha
```python
from sklearn.linear_model import LassoCV

# 5-fold cross-validation selects alpha from an automatic grid
model = LassoCV(cv=5).fit(X, y)
```
| Alpha | Effect |
|---|---|
| Low | Keeps more features (closer to OLS) |
| High | Drops more features, increases sparsity |
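A sketch of how you might inspect the chosen alpha and watch sparsity grow with the penalty; it reuses the standardized `X`, `y` from above, and the alpha grid is an arbitrary assumption:

```python
import numpy as np
from sklearn.linear_model import Lasso, LassoCV

cv_model = LassoCV(cv=5).fit(X, y)
print("Chosen alpha:", cv_model.alpha_)

# Higher alpha -> more coefficients driven to zero
for alpha in (0.01, 0.1, 1.0, 10.0):
    kept = np.sum(Lasso(alpha=alpha, max_iter=10_000).fit(X, y).coef_ != 0)
    print(f"alpha={alpha:<5} -> features kept: {kept}")
```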
## 📊 5. Output Interpretation
| Coefficients | Meaning |
|---|---|
| = 0 | Dropped by model (not predictive) |
| ≠ 0 | Kept in model, with a shrunk estimate |
| Sparse output | Makes downstream models simpler |
⚠️ Use with caution if interpretability or p-values are critical; shrunk coefficients are biased and sklearn reports no standard errors for them.
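A small sketch of reading the selection back out by name, assuming the fitted pipeline from section 3 and a hypothetical `feature_names` list matching the columns of `X_train`:

```python
# Map coefficients back to column names (feature_names is assumed)
coefs = model.named_steps['lasso'].coef_
dropped = [n for n, c in zip(feature_names, coefs) if c == 0]
kept = {n: round(float(c), 3) for n, c in zip(feature_names, coefs) if c != 0}
print("Dropped:", dropped)
print("Kept (shrunk estimates):", kept)
```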
## ✅ Modeling Checklist
- [ ] All features standardized
- [ ] `alpha` selected via cross-validation
- [ ] Zeroed features interpreted as "dropped"
- [ ] Model evaluated vs OLS or Ridge (see the sketch below)
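For the last item, a minimal comparison sketch using cross-validated MSE; it assumes the same `X`, `y` as above, and the fixed `alpha` values are placeholders, not tuned settings:

```python
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Same preprocessing for each model so the comparison is fair
for name, est in [("OLS", LinearRegression()),
                  ("Ridge", Ridge(alpha=1.0)),
                  ("Lasso", Lasso(alpha=0.1))]:
    pipe = make_pipeline(StandardScaler(), est)
    scores = cross_val_score(pipe, X, y, cv=5, scoring="neg_mean_squared_error")
    print(f"{name:>5}: mean CV MSE = {-scores.mean():.3f}")
```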
## 💡 Tip
“Lasso isn’t just about shrinkage; it’s your first line of defense against irrelevant features.”