Advanced Visual Interpretation
Purpose
This guide provides advanced visual tools for analyzing the output of classification models. It complements the EDA and modeling guidebooks by focusing on post-model diagnostics, probability behavior, threshold analysis, and interpretability across binary and multiclass settings.
1. Confusion Matrix (with Heatmap)
Purpose: Show accuracy and misclassification rates per class.
from sklearn.metrics import ConfusionMatrixDisplay
ConfusionMatrixDisplay.from_predictions(y_true, y_pred, cmap="Blues")
✔️ Normalize by rows to visualize per-class recall
✔️ Annotate with % or counts for stakeholders
2. ROC Curve (Binary / Multiclass OvR)
Purpose: Measure the model's ranking ability across decision thresholds.
from sklearn.metrics import roc_curve, roc_auc_score
- Plot TPR vs FPR
- Add AUC to the legend for comparison
✔️ Use the OvR format for multiclass classifiers
✔️ Use to compare multiple models on one plot
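A binary sketch of the steps above, with the AUC embedded in the legend (synthetic data; swap in your own scores):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data
X, y = make_classification(n_samples=500, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
y_proba = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).predict_proba(X_te)[:, 1]

fpr, tpr, _ = roc_curve(y_te, y_proba)
auc = roc_auc_score(y_te, y_proba)

fig, ax = plt.subplots()
ax.plot(fpr, tpr, label=f"model (AUC = {auc:.3f})")  # AUC shown in the legend
ax.plot([0, 1], [0, 1], "k--", label="chance")       # diagonal no-skill baseline
ax.set_xlabel("False Positive Rate")
ax.set_ylabel("True Positive Rate")
ax.legend()
plt.close(fig)
```

To overlay several models, repeat `ax.plot(...)` with each model's `fpr`/`tpr` on the same axes.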
3. Precision-Recall Curve (Imbalanced Data)
Purpose: Reveal classifier performance on minority class.
from sklearn.metrics import precision_recall_curve
✔️ A steep drop-off signals sensitivity to the threshold
✔️ Use the area under the PR curve (average precision) as a stability measure
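A sketch on a deliberately imbalanced synthetic set (roughly 10% positives), with average precision as the area summary:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, precision_recall_curve
from sklearn.model_selection import train_test_split

# Imbalanced synthetic data: weights=[0.9] makes ~90% of samples class 0
X, y = make_classification(n_samples=1000, weights=[0.9], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
y_proba = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).predict_proba(X_te)[:, 1]

precision, recall, _ = precision_recall_curve(y_te, y_proba)
ap = average_precision_score(y_te, y_proba)  # area under the PR curve

fig, ax = plt.subplots()
ax.plot(recall, precision, label=f"model (AP = {ap:.3f})")
# The no-skill baseline for PR curves is the positive-class prevalence, not 0.5
ax.axhline(y_te.mean(), ls="--", c="k", label="no-skill baseline")
ax.set_xlabel("Recall")
ax.set_ylabel("Precision")
ax.legend()
plt.close(fig)
```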
4. Predicted Probability Histogram
Purpose: Show model confidence across samples.
import matplotlib.pyplot as plt
plt.hist(y_proba, bins=20)
✔️ Sharp peaks near 0 or 1 = confident classifier
⚠️ Flat or centered = underfit or poorly calibrated
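One useful variant, sketched here on synthetic data: overlay a histogram per true class, so you can see where the two classes' scores separate (or fail to):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data
X, y = make_classification(n_samples=500, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
y_proba = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).predict_proba(X_te)[:, 1]

fig, ax = plt.subplots()
# One semi-transparent histogram per true class over the same [0, 1] range
for label in (0, 1):
    ax.hist(y_proba[y_te == label], bins=20, range=(0, 1),
            alpha=0.6, label=f"true class {label}")
ax.set_xlabel("Predicted probability of class 1")
ax.set_ylabel("Count")
ax.legend()
plt.close(fig)
```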
5. Calibration Curve (Reliability Plot)
Purpose: Compare predicted probabilities to observed outcome frequencies.
from sklearn.calibration import calibration_curve
✔️ Curve ≈ diagonal = well-calibrated
⚠️ Over- or under-confident curves inform post-hoc scaling (e.g. Platt scaling or isotonic regression)
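A minimal reliability plot, again on synthetic stand-in data:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend
import matplotlib.pyplot as plt
from sklearn.calibration import calibration_curve
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data
X, y = make_classification(n_samples=1000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
y_proba = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).predict_proba(X_te)[:, 1]

# Bin predictions and compare mean predicted proba to observed positive rate per bin
prob_true, prob_pred = calibration_curve(y_te, y_proba, n_bins=10)

fig, ax = plt.subplots()
ax.plot(prob_pred, prob_true, marker="o", label="model")
ax.plot([0, 1], [0, 1], "k--", label="perfectly calibrated")  # the diagonal
ax.set_xlabel("Mean predicted probability")
ax.set_ylabel("Observed positive fraction")
ax.legend()
plt.close(fig)
```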
6. Threshold Tuning Plot
Purpose: Visualize how precision, recall, and F1 change across thresholds.
from sklearn.metrics import precision_recall_curve
✔️ Identify the optimal threshold (not always 0.5)
✔️ Pair with a business cost matrix if available
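A sketch of the sweep: `precision_recall_curve` already returns one precision/recall pair per candidate threshold, so F1 can be computed directly (synthetic data stands in for your own scores):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve
from sklearn.model_selection import train_test_split

# Synthetic stand-in data
X, y = make_classification(n_samples=1000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
y_proba = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).predict_proba(X_te)[:, 1]

precision, recall, thresholds = precision_recall_curve(y_te, y_proba)
# The final precision/recall pair has no matching threshold, so drop it
f1 = 2 * precision[:-1] * recall[:-1] / (precision[:-1] + recall[:-1] + 1e-12)
best_threshold = thresholds[f1.argmax()]  # often not 0.5

fig, ax = plt.subplots()
ax.plot(thresholds, precision[:-1], label="precision")
ax.plot(thresholds, recall[:-1], label="recall")
ax.plot(thresholds, f1, label="F1")
ax.axvline(best_threshold, ls="--", c="k", label=f"best F1 @ {best_threshold:.2f}")
ax.set_xlabel("Decision threshold")
ax.legend()
plt.close(fig)
```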
7. SHAP / Feature Importance Plots
For Tree Models:
import shap
shap.summary_plot(shap_values, X_test)
For Linear Models:
import seaborn as sns
sns.barplot(x=coefficients, y=feature_names)
✔️ Use to explain model behavior by input
✔️ Export visuals to stakeholder decks
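For linear models, a dependency-light sketch of the coefficient plot using only scikit-learn and matplotlib (the `feature_i` names are hypothetical placeholders; sorting by absolute magnitude puts the strongest effects at the top):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data with hypothetical feature names
X, y = make_classification(n_samples=500, n_features=8, random_state=0)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]
clf = LogisticRegression(max_iter=1000).fit(X, y)

coefficients = clf.coef_.ravel()
order = np.argsort(np.abs(coefficients))  # weakest first, so barh puts strongest on top

fig, ax = plt.subplots()
ax.barh([feature_names[i] for i in order], coefficients[order])
ax.set_xlabel("Coefficient (signed effect on log-odds)")
plt.close(fig)
```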
8. Per-Class Breakdown (Multiclass Models)
Purpose: Visually diagnose class-level performance.
- Confusion matrix heatmap (normalized)
- One-vs-rest ROC/PR curves
- Class-specific confidence histograms
✔️ Flag underperforming classes by color threshold
✔️ Use radar plots for class summary profiles
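The diagnostics above can be sketched numerically on a synthetic 3-class problem: per-class recall from the normalized confusion-matrix diagonal, one-vs-rest AUC via binarized labels, and a (hypothetical) recall floor of 0.7 for flagging weak classes:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import label_binarize

# Synthetic 3-class stand-in data
X, y = make_classification(n_samples=900, n_classes=3, n_informative=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
proba = clf.predict_proba(X_te)

# Per-class recall = diagonal of the row-normalized confusion matrix
cm = confusion_matrix(y_te, clf.predict(X_te), normalize="true")
per_class_recall = cm.diagonal()

# One-vs-rest AUC per class via binarized labels
Y = label_binarize(y_te, classes=clf.classes_)
per_class_auc = [roc_auc_score(Y[:, k], proba[:, k]) for k in range(len(clf.classes_))]

# Flag classes below a hypothetical 0.7 recall floor for follow-up
weak_classes = [c for c, r in zip(clf.classes_, per_class_recall) if r < 0.7]

fig, ax = plt.subplots()
ax.bar([str(c) for c in clf.classes_], per_class_recall)
ax.axhline(0.7, ls="--", c="k")  # flagging threshold
ax.set_ylabel("Per-class recall")
plt.close(fig)
```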
Analyst Visual Review Checklist
- [ ] Confusion matrix plotted and interpreted
- [ ] ROC or PR curve plotted by class
- [ ] Probability distribution reviewed
- [ ] Calibration plot created
- [ ] Threshold vs F1/Recall chart analyzed
- [ ] SHAP or feature impact plot exported
- [ ] Stakeholder visuals saved
Final Tip
"Visuals are your interface between model truth and stakeholder understanding. Always tune thresholds and validate confidence."
Use with: Classifier Statistical Summary Sheet, Evaluation Checklist, and Modeling Guidebook.