Linear Feature Transformation Trigger Card
Purpose
Use this card to decide when to apply feature transformations (e.g. log, sqrt, power, binning) before fitting a linear regression model. It helps identify skewed or nonlinear predictors that may violate model assumptions or degrade performance.
1. When to Transform Features
Condition | Suggestion |
---|---|
Absolute skewness > 1 | Use log or sqrt transform |
Heteroskedasticity observed | Log/sqrt to stabilize variance |
Long-tailed distribution | Consider log or Box-Cox/Yeo-Johnson |
X vs Y non-linear | Try polynomial or log(X) variants |
Zero-inflated or boundary effects | Apply binning or thresholding |
from scipy.stats import skew
skew(df['income'])  # > 1 → log candidate
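The single-column check above can be extended to screen every numeric column at once. The following is a minimal sketch on synthetic data. The column names and sample sizes are made up for illustration, and it uses pandas' `DataFrame.skew`, which applies a small-sample bias correction, so its values differ slightly from `scipy.stats.skew`.

```python
import numpy as np
import pandas as pd

# Synthetic data for illustration only; the column names are hypothetical.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "income": rng.lognormal(mean=10.0, sigma=1.0, size=2_000),  # long right tail
    "age": rng.normal(loc=40.0, scale=10.0, size=2_000),        # roughly symmetric
})

# Sample skewness for each numeric column.
skewness = df.select_dtypes(include="number").skew()

# Flag columns whose absolute skewness exceeds the card's threshold of 1.
candidates = skewness[skewness.abs() > 1].index.tolist()

print(skewness.round(2))
print("Transform candidates:", candidates)
```

Treat the flagged columns as candidates to inspect, not as an automatic decision: the threshold of 1 is a rule of thumb.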
2. Visual Triggers
Visual | Transformation Cue |
---|---|
Histogram with long right tail | Log or sqrt |
Residual plot shows fan shape | Variance-stabilizing transform |
X vs Y: curvature in scatter | Poly features or log(X) |
Extreme outliers (absolute z-score > 3) | Consider robust scaling or winsorization |
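
Two of the visual cues above have simple numeric stand-ins that can back up what a plot suggests. The sketch below uses synthetic data and NumPy only. The helper `spread_ratio`, the half-split of fitted values, and the 1st/99th percentile winsorization limits are illustrative assumptions, not fixed rules.

```python
import numpy as np

rng = np.random.default_rng(1)

# --- Cue: fan-shaped residuals ---
# Synthetic data with multiplicative noise, so the spread of y grows with x.
x = rng.uniform(1.0, 10.0, size=1_000)
y = 2.0 * x * np.exp(rng.normal(0.0, 0.3, size=1_000))

def spread_ratio(x_vals, y_vals):
    """Residual std in the upper half of fitted values divided by the lower half."""
    slope, intercept = np.polyfit(x_vals, y_vals, deg=1)
    fitted = slope * x_vals + intercept
    resid = y_vals - fitted
    order = np.argsort(fitted)
    half = len(order) // 2
    return resid[order[half:]].std() / resid[order[:half]].std()

ratio_raw = spread_ratio(x, y)                  # well above 1 suggests a fan shape
ratio_log = spread_ratio(np.log(x), np.log(y))  # closer to 1 after log transforms

# --- Cue: extreme outliers (|z| > 3) ---
values = rng.normal(loc=50.0, scale=5.0, size=1_000)
values[:5] = [120.0, 130.0, 140.0, 150.0, 160.0]  # inject a few extreme points

z = (values - values.mean()) / values.std()
outliers = np.abs(z) > 3

# Winsorize by clipping to the 1st and 99th percentiles (assumed limits).
lower, upper = np.percentile(values, [1, 99])
winsorized = np.clip(values, lower, upper)

print(f"spread ratio raw: {ratio_raw:.2f}, after log: {ratio_log:.2f}")
print("flagged outliers:", int(outliers.sum()))
```

These numbers complement the plots and do not replace them: a histogram or residual plot still shows the shape of the problem more directly.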
3. Transformation Options
Method | Use When... |
---|---|
np.log1p(X) | Right-skewed + zero values present |
np.sqrt(X) | Moderate skew, non-negative |
power_transform() | General normalizer (Box-Cox or YJ) |
StandardScaler | When model is sensitive to scale (esp. Ridge/Lasso) |
Scale after transforming, especially if using regularized models.
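
One way to apply the options above in the recommended order (transform first, then scale) is a scikit-learn pipeline. This is a sketch under assumptions: the data is synthetic, and the choice of `np.log1p`, `Ridge`, and `alpha=1.0` is illustrative only.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler

rng = np.random.default_rng(2)

# Hypothetical right-skewed, non-negative predictor and a target
# that is linear in log1p(X) plus a little noise.
X = rng.lognormal(mean=3.0, sigma=1.0, size=(500, 1))
y = 1.5 * np.log1p(X[:, 0]) + rng.normal(0.0, 0.1, size=500)

model = make_pipeline(
    FunctionTransformer(np.log1p),  # 1) reduce right skew
    StandardScaler(),               # 2) scale the transformed feature
    Ridge(alpha=1.0),               # 3) regularized linear model
)
model.fit(X, y)

r2 = model.score(X, y)
print("R^2 with log1p then scaling:", round(r2, 3))
```

Wrapping the steps in a pipeline means the transform and the scaling parameters learned during fitting are reused at prediction time. If `PowerTransformer` is used in place of `np.log1p`, note that it standardizes its output by default (`standardize=True`), so a separate `StandardScaler` may be unnecessary.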
Transformation Checklist
- [ ] Absolute skew > 1 or kurtosis > 3 (Pearson definition, i.e. excess kurtosis > 0) reviewed
- [ ] Visual cues support transformation
- [ ] Chosen method appropriate for data range
- [ ] Post-transform distribution checked
- [ ] Feature scaling applied after transformation
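
The range and post-transform checks in the list above might be sketched as follows. `suggest_transform` is a hypothetical helper, not a library function, and the data is synthetic. The idea is only that a feature's minimum value limits which transforms are valid, and that skewness should be measured again afterwards.

```python
import numpy as np
from scipy.stats import skew

def suggest_transform(values):
    """Hypothetical helper: map a feature's value range to valid candidate transforms."""
    values = np.asarray(values, dtype=float)
    if values.min() > 0:
        # Strictly positive: log and Box-Cox are both defined.
        return ["log", "sqrt", "box-cox", "yeo-johnson"]
    if values.min() >= 0:
        # Zeros present: log(0) is undefined, so use log1p instead.
        return ["log1p", "sqrt", "yeo-johnson"]
    # Negative values present: Yeo-Johnson accepts any real input.
    return ["yeo-johnson"]

rng = np.random.default_rng(3)
x = rng.exponential(scale=2.0, size=2_000)
x[:50] = 0.0  # a few exact zeros, so a plain log is not valid

options = suggest_transform(x)
skew_before = skew(x)
skew_after = skew(np.log1p(x))  # re-check the distribution after transforming

print(options)
print(f"skew before: {skew_before:.2f}, after log1p: {skew_after:.2f}")
```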
Tip
"Linear models love smooth, symmetric input. If your predictors are shouting, transformation helps them whisper."