Multiclass & Ordinal Guidebook
๐ฏ Purpose¶
This guidebook expands traditional logistic regression to cover multiclass (nominal) and ordinal classification tasks using extensions of the logit framework.
1๏ธโฃ Multiclass (Multinomial) Logistic Regression¶
๐ Overview¶
Multinomial logistic regression models outcomes with 3+ unordered categories, using one-vs-rest or full softmax-style probabilities.
- Target example:
{'low', 'medium', 'high'}
(treated as nominal) - Implemented in
sklearn
,statsmodels
, andR
(multinom
,mlogit
)
๐งฎ Model Form (Softmax)¶
$$ P(Y = k \mid X) = \frac{\exp(X \cdot \beta_k)}{\sum_{j=1}^{K} \exp(X \cdot \beta_j)} $$
Each class gets its own set of coefficients (compared to a reference class).
๐ฆ Tools & Syntax¶
from sklearn.linear_model import LogisticRegression
clf = LogisticRegression(multi_class='multinomial', solver='lbfgs')
clf.fit(X, y)
import statsmodels.api as sm
model = sm.MNLogit(y, sm.add_constant(X)).fit()
โ๏ธ Use predict_proba()
to return class probabilities
๐ Output Interpretation¶
- Coefficients represent log-odds of class k vs reference
np.exp(coef_)
returns relative odds ratios- Watch for sign reversals when comparing class coefficients
2๏ธโฃ Ordinal Logistic Regression¶
๐ Overview¶
Ordinal logistic regression models outcomes with a natural order, using a single coefficient vector but multiple thresholds.
- Target example:
{'low' < 'medium' < 'high'}
- Often referred to as Proportional Odds Model
๐งฎ Model Form¶
$$ \log \left( \frac{P(Y \leq j)}{P(Y > j)} \right) = \theta_j - X \cdot \beta $$
- Each cutoff (j) has its own intercept
ฮธ_j
- The slope
ฮฒ
is shared across classes
โ๏ธ Tooling¶
# In Python (via mord package)
from mord import LogisticIT # or LogisticAT for adjacent categories
model = LogisticIT().fit(X, y)
# In R
polr(y ~ x1 + x2, data = df, method = "logistic")
โ๏ธ Check proportional odds assumption before trusting output
๐ Output Interpretation¶
- Coefficients
ฮฒ
โ effect across all splits (assumes consistency) - Intercepts
ฮธ_j
โ logit cutoff for class boundaries - Interpretation is: "โX increases odds of being in a higher category"
๐ Assumptions Summary¶
Model | Key Assumption |
---|---|
Multinomial | Independence of Irrelevant Alternatives (IIA) |
Ordinal | Proportional Odds โ same slope across thresholds |
โ Guide Checklist¶
- [ ] Target reviewed for type (nominal vs ordinal)
- [ ] Model syntax matched to structure
- [ ] Assumptions checked (IIA or PO)
- [ ] Probabilities or odds correctly interpreted
- [ ] Evaluation metrics chosen based on task (macro-F1, accuracy, etc.)
๐ก Tip¶
โMultinomial predicts which. Ordinal predicts how high.โ