# Lasso
## 🎯 Purpose

This QuickRef guides you through the logic, fitting process, and interpretation of Lasso Regression, a linear model with built-in feature selection via L1 regularization.
## 📦 1. When to Use Lasso

| Scenario | Why Lasso Works |
|---|---|
| Many weak or irrelevant predictors | Drives unimportant coefficients to zero |
| Need automated feature selection | Simplifies model by removing noise features |
| Overfitting in OLS | Regularizes with variable removal |
| High-dimensional (p > n) data | Useful when predictors outnumber observations |
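The first row of the table can be seen directly in code. A minimal sketch on synthetic data (the feature count, true coefficients, and noise level are all illustrative assumptions, not from a real dataset):

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# Hypothetical setup: only the first 3 of 20 predictors actually matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))
y = 3 * X[:, 0] - 2 * X[:, 1] + 1.5 * X[:, 2] + rng.normal(scale=0.5, size=200)

model = make_pipeline(StandardScaler(), Lasso(alpha=0.1))
model.fit(X, y)

# make_pipeline names the step after its class, lowercased: "lasso"
coefs = model.named_steps["lasso"].coef_
print("non-zero coefficients:", int(np.sum(coefs != 0)))
```

The irrelevant predictors should end up with coefficients of exactly zero, while the three informative ones survive (shrunk toward zero).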
## ⚙️ 2. How It Works

- Adds an L1 penalty to the loss function:

$$ \text{Loss} = \text{RSS} + \alpha \sum_i |w_i| $$

- Forces some coefficients to exactly zero, so feature selection is built in
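For intuition on why the L1 penalty produces exact zeros: in the special case of an orthonormal design, each Lasso coefficient is the soft-thresholded OLS coefficient, shrunk toward zero by α and clipped at exactly zero. A small sketch (function name and example values are illustrative):

```python
import numpy as np

def soft_threshold(w_ols, alpha):
    """Per-coordinate Lasso solution under an orthonormal design:
    shrink each OLS coefficient toward zero by alpha; anything with
    |w| <= alpha lands at exactly zero."""
    return np.sign(w_ols) * np.maximum(np.abs(w_ols) - alpha, 0.0)

# A strong, a weak, and a near-zero coefficient with alpha = 0.1
print(soft_threshold(np.array([2.5, 0.3, -0.05]), alpha=0.1))
```

This is why Lasso drops features outright where Ridge (an L2 penalty) only shrinks them: the penalty's kink at zero makes zero an exact solution.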
## 🛠️ 3. Fitting Lasso in sklearn

```python
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

model = make_pipeline(StandardScaler(), Lasso(alpha=0.1))
model.fit(X_train, y_train)
```

⚠️ Always scale features before fitting; the L1 penalty is not scale-invariant, so unscaled features are penalized unevenly.
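A self-contained version of the fit above, scored on held-out data. Here `make_regression` stands in for the `X_train`/`y_train` assumed in the snippet; all sizes and parameters are illustrative:

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# Hypothetical regression problem: 15 features, only 5 informative.
X, y = make_regression(n_samples=300, n_features=15, n_informative=5,
                       noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = make_pipeline(StandardScaler(), Lasso(alpha=0.1))
model.fit(X_train, y_train)

# Pipeline.score reports R^2 for regressors
print("R^2 on held-out data:", model.score(X_test, y_test))
```

Because the scaler lives inside the pipeline, cross-validation and test scoring never leak test-set statistics into the scaling step.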
## 📈 4. Tuning Alpha

```python
from sklearn.linear_model import LassoCV

model = LassoCV(cv=5).fit(X, y)
```
| Alpha | Effect |
|---|---|
| Low | Keeps more features (closer to OLS) |
| High | Drops more features, increases sparsity |
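The table's low-vs-high trade-off can be checked by counting surviving features at a few penalties, then letting `LassoCV` pick alpha by cross-validation. The data and alpha grid below are illustrative assumptions:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, LassoCV
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# Hypothetical data: 20 features, only 4 informative
X, y = make_regression(n_samples=200, n_features=20, n_informative=4,
                       noise=5.0, random_state=1)

# Sparsity grows with alpha: count features kept at each penalty
counts = []
for alpha in (0.01, 1.0, 100.0):
    pipe = make_pipeline(StandardScaler(), Lasso(alpha=alpha)).fit(X, y)
    n_kept = int(np.sum(pipe.named_steps["lasso"].coef_ != 0))
    counts.append(n_kept)
    print(f"alpha={alpha:>6}: {n_kept} features kept")

# LassoCV searches an alpha path via cross-validation (scale first)
scaled = StandardScaler().fit_transform(X)
cv_model = LassoCV(cv=5).fit(scaled, y)
print("chosen alpha:", cv_model.alpha_)
```

The kept-feature count should fall as alpha rises, and `cv_model.alpha_` exposes the cross-validated choice.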
## 📊 5. Output Interpretation

| Coefficient | Meaning |
|---|---|
| = 0 | Dropped by model (not predictive) |
| ≠ 0 | Kept in model (shrunk estimate) |
| Sparse output | Makes downstream models simpler |

⚠️ Use with caution if interpretability or p-values are critical
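Reading the table off a fitted model means mapping coefficients back to feature names. A sketch with hypothetical feature names and synthetic data (only `x2` and `x7` carry signal by construction):

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(42)
feature_names = [f"x{i}" for i in range(10)]  # hypothetical names
X = rng.normal(size=(150, 10))
y = 4 * X[:, 2] - 3 * X[:, 7] + rng.normal(scale=0.5, size=150)

pipe = make_pipeline(StandardScaler(), Lasso(alpha=0.1)).fit(X, y)
coefs = pipe.named_steps["lasso"].coef_

# Split features by the table's two cases: = 0 dropped, != 0 kept
kept = [name for name, c in zip(feature_names, coefs) if c != 0]
dropped = [name for name, c in zip(feature_names, coefs) if c == 0]
print("kept:", kept)
print("dropped:", dropped)
```

Note the kept coefficients are shrunk, so treat their magnitudes as biased estimates, not unbiased effect sizes.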
## ✅ Modeling Checklist

- [ ] All features standardized
- [ ] `alpha` selected via cross-validation
- [ ] Zeroed features interpreted as "dropped"
- [ ] Model evaluated vs OLS or Ridge
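The last checklist item can be run as a quick cross-validated comparison. The dataset below is an illustrative assumption, chosen as the setting where Lasso should shine (many irrelevant predictors); alpha values are not tuned:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# Hypothetical high-noise setting: 60 features, only 5 informative
X, y = make_regression(n_samples=120, n_features=60, n_informative=5,
                       noise=10.0, random_state=0)

scores = {}
for name, est in [("OLS", LinearRegression()),
                  ("Ridge", Ridge(alpha=1.0)),
                  ("Lasso", Lasso(alpha=1.0))]:
    pipe = make_pipeline(StandardScaler(), est)
    scores[name] = cross_val_score(pipe, X, y, cv=5).mean()
    print(f"{name:>5}: mean CV R^2 = {scores[name]:.3f}")
```

With many noise features, OLS pays a variance cost for fitting them all, while Lasso's zeroing keeps its out-of-sample score higher.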
## 💡 Tip

"Lasso isn't just about shrinkage: it's your first line of defense against irrelevant features."