# Feature Transformation

## 🎯 Purpose

This guide outlines essential transformations to enhance model performance, enforce model assumptions, and improve feature interpretability. It covers normalization, encoding, binning, polynomial expansion, and nonlinear transformations, supporting regression, classification, clustering, and time series.
## ⚙️ 1. Scaling and Normalization
| Method | Use When |
|---|---|
| StandardScaler | Distance-based models (KMeans, SVM, PCA) |
| MinMaxScaler | A bounded output in [0, 1] is required |
| RobustScaler | Skewed data or outliers are present |
| MaxAbsScaler | Sparse data, or features already centered |
Snippet:

```python
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
```

⚠️ Always scale after the train/test split: fit on the training set only, then transform the test set
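To make the warning above concrete, here is a minimal sketch of fitting the scaler on the training split only (the toy `X` and `y` are illustrative):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Toy data; in practice X and y come from your dataset
X = np.arange(20, dtype=float).reshape(10, 2)
y = np.arange(10)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # fit statistics on train only
X_test_scaled = scaler.transform(X_test)        # reuse train mean/std on test
```

Fitting on the full dataset would leak test-set statistics into training.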
## 🧮 2. Encoding Categorical Features
| Type | Best For | Notes |
|---|---|---|
| One-Hot | Low-cardinality nominal variables | Avoid with many unique categories |
| Ordinal | Ordered categories (e.g., rating 1-5) | Use only when the order is meaningful |
| Target Encoding | High-cardinality labels | Risk of leakage without cross-validation |
| Binary Hashing | Textual categorical compression | Good for NLP or wide tables |
Snippet:

```python
import pandas as pd

pd.get_dummies(df['color'], drop_first=True)
```

⚠️ Always validate the shape and memory usage after encoding
## 📊 3. Binning and Grouping
| Method | Use Case |
|---|---|
| Equal-width bins | Compress continuous values |
| Quantile bins | Rank-normalized distribution |
| Custom bins | Domain-knowledge segmentation |
| Group rare levels | Categorical reduction before encoding |
Example:

```python
import pandas as pd

pd.qcut(df['age'], q=4, labels=False)
```

⚠️ Useful for decision trees and interpretable dashboards
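Custom bins from domain knowledge can be sketched with `pd.cut` (the edges and labels below are hypothetical):

```python
import pandas as pd

df = pd.DataFrame({'age': [5, 17, 25, 42, 68, 80]})  # toy data

# Hypothetical domain-driven segmentation; intervals are right-inclusive by default
bins = [0, 18, 35, 60, 120]
labels = ['minor', 'young_adult', 'adult', 'senior']
df['age_group'] = pd.cut(df['age'], bins=bins, labels=labels)
```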
## 📈 4. Polynomial & Interaction Features
| Transformation | Benefit |
|---|---|
| Polynomial terms | Capture curvature (e.g., x²) |
| Interactions | Capture synergy (x1 * x2) |
| Splines | Smooth piecewise linearity |
| Log transforms | Linearize exponential relationships |
Snippet:

```python
from sklearn.preprocessing import PolynomialFeatures

poly = PolynomialFeatures(degree=2, include_bias=False)
X_poly = poly.fit_transform(X)
```

⚠️ Beware of multicollinearity and feature explosion
## 🔁 5. Nonlinear & Power Transforms
| Method | Use Case |
|---|---|
| Log | Right-skewed, multiplicative features |
| Sqrt | Positive-only data with mild skew |
| Box-Cox | Normalize and stabilize variance (positive values only) |
| Yeo-Johnson | Like Box-Cox, but supports zeros and negatives |
Snippet:

```python
from sklearn.preprocessing import PowerTransformer

pt = PowerTransformer(method='yeo-johnson')
X_trans = pt.fit_transform(X)
```

⚠️ Review skew before and after applying transformations
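The before/after skew check suggested above can be sketched with `scipy.stats.skew` on toy right-skewed data:

```python
import numpy as np
from scipy.stats import skew
from sklearn.preprocessing import PowerTransformer

rng = np.random.default_rng(42)
X = rng.lognormal(mean=0.0, sigma=1.0, size=(500, 1))  # right-skewed toy data

pt = PowerTransformer(method='yeo-johnson')
X_trans = pt.fit_transform(X)

print(f"skew before: {skew(X[:, 0]):.2f}, after: {skew(X_trans[:, 0]):.2f}")
```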
## 🧪 6. Time Features (Temporal Modeling)
| Task | Transformation |
|---|---|
| Extract components | Hour, day, month, year |
| Cyclical encoding | sin/cos for hour, day of week |
| Rolling aggregates | Mean, std, min, max over time windows |
| Lag features | Previous value(s) of the same variable |
Example:

```python
import numpy as np

df['hour_sin'] = np.sin(2 * np.pi * df['hour'] / 24)
df['hour_cos'] = np.cos(2 * np.pi * df['hour'] / 24)
```

⚠️ Consider time zones when deriving temporal features
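The lag and rolling-aggregate rows from the table above can be sketched with pandas (the toy `df` and column names are illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'value': np.arange(10, dtype=float)})  # toy ordered series

df['lag_1'] = df['value'].shift(1)                        # previous observation
df['roll_mean_3'] = df['value'].rolling(window=3).mean()  # 3-step rolling mean
```

The first rows are NaN where no history exists; drop or impute them before modeling.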