
Advanced Visual Interpretation


🎯 Purpose

This guide deepens the visual analysis of linear regression models by incorporating diagnostics for assumption testing, robustness, and model complexity. It extends the standard visual evaluation companion and supports high-quality model QA and reporting.
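
The snippets below share a common setup. This is a minimal sketch, not part of the original workflow: X_train, X_test, y_train, and y_test are assumed to already exist from a train/test split, and the statsmodels OLS fit is one possible choice (it also enables the influence diagnostics in section 6).

import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import statsmodels.api as sm

# Fit an OLS model with statsmodels so influence diagnostics are available later
X_train_const = sm.add_constant(X_train)   # X_train / y_train assumed to exist
model = sm.OLS(y_train, X_train_const).fit()

# Held-out predictions and residuals used by the plots below
y_pred = model.predict(sm.add_constant(X_test))
residuals = y_test - y_pred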


📊 1. Actual vs Predicted (Model Fit Check)

Goal: Evaluate prediction accuracy and potential bias.

โœ”๏ธ Look for tight clustering around the 45ยฐ line. โš ๏ธ Curvature or separation suggests underfitting or omitted variables.

# Scatter of actual vs predicted values with a 45-degree reference line
sns.scatterplot(x=y_test, y=y_pred)
plt.plot([y_test.min(), y_test.max()], [y_test.min(), y_test.max()], 'r--')

📉 2. Residuals vs Fitted (Homoscedasticity)

Goal: Validate constant variance and check for patterning.

โœ”๏ธ Cloud-like spread = good. โš ๏ธ Funnel shape = heteroscedasticity. โš ๏ธ Curve = nonlinear trend not captured.

# Residuals against fitted values; the dashed line marks zero
sns.scatterplot(x=y_pred, y=residuals)
plt.axhline(0, color='red', linestyle='--')

๐Ÿ“ 3. Histogram of Residuals (Normality Check)

Goal: Check whether residuals are roughly bell-shaped (the normality assumption behind inference).

โœ”๏ธ Smooth bell curve = OK. โš ๏ธ Skew, multiple peaks = assumption violated.

# Histogram of residuals with a kernel density overlay
sns.histplot(residuals, kde=True)

📋 4. QQ Plot (Normality Diagnostic)

Goal: Assess deviation from the normal distribution, especially in the tails.

โœ”๏ธ Points on line = good. โš ๏ธ S-curve = skewed; tails = outliers or heavy-tailed errors.

# Q-Q plot of residuals against standard normal quantiles
sm.qqplot(residuals, line='45')

🧪 5. Scale-Location Plot

Goal: Detect non-constant variance (often more sensitive than the residuals-vs-fitted plot).

โœ”๏ธ Flat horizontal band = homoscedastic. โš ๏ธ Upward curve = residuals increasing with fitted value.

# Square root of absolute residuals vs fitted values (scale-location)
sns.scatterplot(x=y_pred, y=np.sqrt(np.abs(residuals)))
plt.axhline(y=np.mean(np.sqrt(np.abs(residuals))), color='red', linestyle='--')

🧭 6. Influence & Leverage Diagnostics

Goal: Identify influential points or high-leverage outliers.

| Plot                 | What to Look For            |
|----------------------|-----------------------------|
| Cook's Distance      | Large spikes = influence    |
| Leverage vs Residual | Far top right = danger zone |

# Cook's distance for each training observation; large spikes flag influential points
influence = model.get_influence()
cooks_d, p_values = influence.cooks_distance
plt.stem(cooks_d)
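
For the leverage-vs-residual view, statsmodels ships a ready-made influence plot (studentized residuals against leverage, with marker size scaled by Cook's distance). A minimal sketch, assuming model is the fitted statsmodels OLS results object from the setup above:

# Leverage vs studentized residuals; larger markers indicate higher Cook's distance
sm.graphics.influence_plot(model, criterion='cooks')
plt.show()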

๐Ÿ” 7. Visualizing Model Extensions

๐Ÿ“ Regularization (Ridge/Lasso)

  • Plot coefficient paths against alpha on a log scale
  • Use RidgeCV or LassoCV with a grid of alphas to select the penalty (see the sketch below)
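
A minimal sketch of a Ridge coefficient-path plot, assuming the same X_train / y_train used in the setup above; the alpha grid and variable names are illustrative:

from sklearn.linear_model import Ridge, RidgeCV

# Refit Ridge across a log-spaced grid of alphas and record the coefficients
alphas = np.logspace(-3, 3, 50)
coef_paths = np.array([Ridge(alpha=a).fit(X_train, y_train).coef_ for a in alphas])

# One line per coefficient; shrinkage shows up as paths collapsing toward zero
plt.plot(alphas, coef_paths)
plt.xscale('log')
plt.xlabel('alpha (log scale)')
plt.ylabel('coefficient value')

# RidgeCV can then pick the alpha with the best cross-validated score
best_alpha = RidgeCV(alphas=alphas).fit(X_train, y_train).alpha_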

📈 Polynomial Regression

  • Overlay predicted vs actual values with the fitted curve
  • Compare the residual pattern across polynomial degrees (see the sketch below)

from sklearn.preprocessing import PolynomialFeatures
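
A minimal sketch of the degree comparison, continuing from the import above and assuming the same X_train / X_test / y_train / y_test split; the degrees shown are illustrative:

from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

# Fit polynomial models of increasing degree and compare their residual patterns
fig, axes = plt.subplots(1, 3, figsize=(12, 4), sharey=True)
for ax, degree in zip(axes, [1, 2, 3]):
    poly_model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    poly_model.fit(X_train, y_train)
    poly_pred = poly_model.predict(X_test)
    ax.scatter(poly_pred, y_test - poly_pred)
    ax.axhline(0, color='red', linestyle='--')
    ax.set_title(f'degree = {degree}')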

🧪 8. Visual Summary Table

| Visual                 | Diagnosis Target            |
|------------------------|-----------------------------|
| Actual vs Predicted    | General fit & bias          |
| Residuals vs Fitted    | Homoscedasticity            |
| Histogram of Residuals | Normality                   |
| QQ Plot                | Normality (tail behavior)   |
| Scale-Location Plot    | Variance diagnostics        |
| Leverage vs Residual   | Influential obs / outliers  |

📋 Analyst Visual Review Checklist

  • [ ] Actual vs Predicted: Tight line fit?
  • [ ] Residuals: Random cloud?
  • [ ] Histogram: Bell-shaped?
  • [ ] QQ Plot: Aligned with diagonal?
  • [ ] Scale-location: Flat trend?
  • [ ] Influential point plots reviewed?
  • [ ] If using Ridge/Lasso, coefficient paths reviewed?

💡 Final Tip

Always blend residual visuals, fit diagnostics, and robustness checks for trustworthy regression results.
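
One way to blend these checks is a single diagnostic panel. A minimal sketch, reusing the y_pred and residuals defined in the setup above:

# Compact 2x2 panel combining the core residual visuals
fig, axes = plt.subplots(2, 2, figsize=(10, 8))

axes[0, 0].scatter(y_test, y_pred)
axes[0, 0].plot([y_test.min(), y_test.max()], [y_test.min(), y_test.max()], 'r--')
axes[0, 0].set_title('Actual vs Predicted')

axes[0, 1].scatter(y_pred, residuals)
axes[0, 1].axhline(0, color='red', linestyle='--')
axes[0, 1].set_title('Residuals vs Fitted')

sns.histplot(residuals, kde=True, ax=axes[1, 0])
axes[1, 0].set_title('Residual Distribution')

sm.qqplot(residuals, line='45', ax=axes[1, 1])
axes[1, 1].set_title('QQ Plot')

plt.tight_layout()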

Use this with: Advanced Linear Regression Guidebook, Statistical Summary Sheet, and Evaluation Checklist.