Cox partial likelihood re-derived as normal equation of GLM
Bypassing risk-set construction and event conditioning arguments
The Cox proportional hazards model is a popular semiparametric regression model in survival analysis. Its partial likelihood offers a clever way to eliminate the nonparametric baseline hazard function while keeping the focus on the regression coefficients (log-hazard ratios).
The partial likelihood was originally derived through careful construction of risk sets and conditioning arguments specific to the survival context. However, with a bit of handwaving, I’ll show that it can be re-derived, mechanically, as the “normal equation” of a generalized linear model (GLM).1
Normal equations of GLMs
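Consider a GLM for response $Y_i$ against covariates $X_i$ ($i = 1, \dots, n$) with mean model $E(Y_i \mid X_i) = \mu(X_i; \beta)$, where $\mu$ applies an inverse link function to the linear predictor $\beta^\top X_i$. Its normal equation sets the covariate-weighted residuals to zero:

$$\sum_{i=1}^n X_i \{Y_i - \mu(X_i; \beta)\} = 0.$$

- Linear regression: $\mu(X_i; \beta) = \beta^\top X_i$, with $\sum_{i=1}^n X_i (Y_i - \beta^\top X_i) = 0$ leading to standard least squares.
- Logistic regression: $\mu(X_i; \beta) = \exp(\beta^\top X_i) / \{1 + \exp(\beta^\top X_i)\}$, with $\sum_{i=1}^n X_i \left[Y_i - \exp(\beta^\top X_i) / \{1 + \exp(\beta^\top X_i)\}\right] = 0$ corresponding to the score equation for the MLE.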
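As a quick numerical sanity check, the following minimal sketch fits both models on made-up toy data and evaluates the covariate-weighted residual sum at the fitted coefficients. The data, the variable names, and the choice of a one-parameter logistic model fitted by Newton-Raphson are assumptions made purely for illustration.

```python
import math

# Toy data, invented for illustration only.
x = [0.5, 1.0, 1.5, 2.0, 3.0]
y = [1.1, 1.9, 3.2, 3.8, 6.1]
n = len(x)

# --- Linear regression: closed-form least squares for y = b0 + b1 * x ---
x_bar = sum(x) / n
y_bar = sum(y) / n
b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
      / sum((xi - x_bar) ** 2 for xi in x))
b0 = y_bar - b1 * x_bar

# Normal equation: sum_i (1, x_i) * (y_i - mu_i) with mu_i = b0 + b1 * x_i.
resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
lin_eq_intercept = sum(resid)
lin_eq_slope = sum(xi * ri for xi, ri in zip(x, resid))

# --- Logistic regression (single coefficient, no intercept) ---
z = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
b = [0, 0, 1, 0, 1, 1]

def expit(v):
    return 1.0 / (1.0 + math.exp(-v))

def logistic_normal_equation(beta):
    """sum_i z_i * (b_i - expit(beta * z_i)), i.e. the logistic score."""
    return sum(zi * (bi - expit(beta * zi)) for zi, bi in zip(z, b))

# Newton-Raphson on the score; the information is sum_i z_i^2 * mu_i * (1 - mu_i).
beta = 0.0
for _ in range(50):
    info = sum(zi ** 2 * expit(beta * zi) * (1 - expit(beta * zi)) for zi in z)
    beta += logistic_normal_equation(beta) / info

print(lin_eq_intercept, lin_eq_slope)   # both zero up to floating-point error
print(logistic_normal_equation(beta))   # zero up to floating-point error
```

In both cases the fitted coefficients are exactly the values at which the weighted residual sum vanishes, which is all the normal equation asks for.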
Cox model as a GLM
Model specification
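Let $T_i$ denote the event time, $C_i$ the censoring time, and $X_i$ the covariates for subject $i = 1, \dots, n$. We observe $U_i = \min(T_i, C_i)$ and $\delta_i = I(T_i \le C_i)$. Write $N_i(t) = I(U_i \le t, \delta_i = 1)$ for the event counting process and $R_i(t) = I(U_i \ge t)$ for the at-risk indicator. The Cox model specifies the conditional hazard of $T_i$ as

$$\lambda(t \mid X_i) = \lambda_0(t) \exp(\beta^\top X_i),$$

where $\lambda_0(t)$ is an unspecified baseline hazard function.
Reformulation as a mean model
Let’s find a mean model implied by the Cox model.
Consider the increment $dN_i(t)$, a binary indicator of whether subject $i$ experiences an observed event in $[t, t + dt)$. Assuming censoring is independent of the event time given $X_i$, the definition of the hazard gives, among subjects still at risk,

$$E\{dN_i(t) \mid R_i(t) = 1, X_i\} = \lambda_0(t)\,dt\,\exp(\beta^\top X_i) = \exp\{\alpha(t) + \beta^\top X_i\}, \qquad \alpha(t) = \log\{\lambda_0(t)\,dt\}.$$

Now, we have a formulation similar to the GLM mean model: a binary response $dN_i(t)$, an exponential inverse link, and a linear predictor with a time-dependent intercept $\alpha(t)$.
Normal equation with time-varying intercept
Therefore, at each time $t$, the normal equation for $(\alpha(t), \beta)$ among the subjects at risk is

$$\sum_{i=1}^n R_i(t) \begin{pmatrix} 1 \\ X_i \end{pmatrix} \left[ dN_i(t) - \exp\{\alpha(t) + \beta^\top X_i\} \right] = 0.$$

Use the first-line equation in this system to solve for the intercept, noting that $R_i(t)\,dN_i(t) = dN_i(t)$:

$$\exp\{\alpha(t)\} = \frac{\sum_{i=1}^n dN_i(t)}{\sum_{j=1}^n R_j(t) \exp(\beta^\top X_j)}.$$

Substituting this into the second line and, because $\beta$ is shared across time, summing the result over $t$ gives

$$\sum_{i=1}^n \int_0^\infty \left\{ X_i - \frac{\sum_{j=1}^n R_j(t) X_j \exp(\beta^\top X_j)}{\sum_{j=1}^n R_j(t) \exp(\beta^\top X_j)} \right\} dN_i(t) = 0,$$

which is the partial likelihood score equation.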
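The equivalence can be checked numerically. The sketch below is a rough illustration under stated assumptions: a hypothetical toy dataset with a scalar covariate and no tied event times, and function names invented for this example. It evaluates the GLM normal equation for the coefficient after profiling out the time-specific intercept at each event time, and compares it with a finite-difference derivative of the log partial likelihood.

```python
import math

# Hypothetical toy data: (observed time, event indicator, scalar covariate).
data = [
    (2.0, 1, 0.3),
    (3.5, 0, -1.2),
    (4.1, 1, 0.8),
    (5.0, 1, -0.5),
    (6.3, 0, 1.5),
    (7.7, 1, 0.1),
]

def log_partial_likelihood(beta):
    """Cox log partial likelihood (no tied event times in the toy data)."""
    total = 0.0
    for t, d, x in data:
        if d == 1:
            # Sum of exp(beta * x_j) over subjects still at risk at time t.
            s0 = sum(math.exp(beta * xj) for tj, _, xj in data if tj >= t)
            total += beta * x - math.log(s0)
    return total

def glm_normal_equation(beta):
    """Second-line normal equation, summed over event times, with the
    time-specific intercept solved from the first-line equation."""
    score = 0.0
    event_times = sorted({t for t, d, _ in data if d == 1})
    for t in event_times:
        # (dN_j(t), x_j) for every subject at risk at time t.
        at_risk = [(dj if tj == t else 0, xj) for tj, dj, xj in data if tj >= t]
        s0 = sum(math.exp(beta * xj) for _, xj in at_risk)
        # First line: sum_j {dN_j - exp(alpha) * exp(beta * x_j)} = 0.
        exp_alpha = sum(dn for dn, _ in at_risk) / s0
        # Second line: sum_j x_j * {dN_j - exp(alpha) * exp(beta * x_j)}.
        score += sum(xj * (dn - exp_alpha * math.exp(beta * xj))
                     for dn, xj in at_risk)
    return score

beta = 0.4
h = 1e-6
numeric_score = (log_partial_likelihood(beta + h)
                 - log_partial_likelihood(beta - h)) / (2 * h)

# The two numbers agree up to finite-difference error.
print(glm_normal_equation(beta), numeric_score)
```

The agreement holds at any value of the coefficient, not only at the estimate, since the two expressions are the same function once the intercept has been profiled out.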
Conclusion
The Cox model can be reframed as a GLM for a binary event-indicator against an exponentially linked linear predictor with a time-dependent intercept. In this view, the partial likelihood score function aligns exactly with the normal equation of the GLM. This connection offers a new perspective on the Cox model and its estimation.
Footnotes
Whitehead (1980) presented a similar derivation: Whitehead, J. (1980). “Fitting Cox’s Regression Model to Survival Data Using GLIM.” Journal of the Royal Statistical Society Series C: Applied Statistics 29 (3): 268. https://doi.org/10.2307/2346901.