Applied Survival Analysis

Chapter 10 - Competing/Semi-Competing Risks

Lu Mao

Department of Biostatistics & Medical Informatics

University of Wisconsin-Madison

Outline

  1. Cause-specific hazard and cumulative incidence

  2. Non- and semi-parametric methods

  3. Analysis of bone marrow transplantation study

  4. Semi-competing risks and examples

\[\newcommand{\d}{{\rm d}}\] \[\newcommand{\T}{{\rm T}}\] \[\newcommand{\dd}{{\rm d}}\] \[\newcommand{\cc}{{\rm c}}\] \[\newcommand{\pr}{{\rm pr}}\] \[\newcommand{\var}{{\rm var}}\] \[\newcommand{\se}{{\rm se}}\] \[\newcommand{\indep}{\perp \!\!\! \perp}\] \[\newcommand{\Pn}{n^{-1}\sum_{i=1}^n}\]

Basic Quantities

Definition and Examples

  • Competing risks
    • Definition: multiple types of failures, occurrence of one precludes others
      • Multiple latent risks competing with each other for first occurrence
    • Examples: different causes of death
  • Defining feature
    • One failure per subject
    • Less info than multivariate failure times
    • Implications for analysis and interpretation

Outcome Data & Identifiability

  • Target of inference: \((T, \Delta)\)
    • \(T\): time to failure
    • \(\Delta \in \{1, \ldots, K\}\): type/cause of failure (categorical)
  • Multivariate perspective
    • A conceptual framework \[\begin{equation}\label{eq:cmpr:mult} T=\min(T_1,\ldots, T_K) \hspace{1mm}\mbox{ and } \hspace{1mm}\Delta=\arg\min_{k=1,\ldots, K} T_k \end{equation}\]
      • \(T_k\): time to \(k\)th latent risk \((k=1,\ldots, K)\) in absence of other risks
      • Competing risks \(=\) partially observed multivariate failure times
    • Identifiability: net distribution of latent \(T_k\) not identifiable from \((T, \Delta)\)
      • unless under unrealistic assumption of mutual independence of the \(T_k\)
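
The latent-time construction above can be sketched in R. This is an illustration only: the exponential distributions, their rates, and the independence of the two latent times are assumptions made for the simulation, and the variable names are hypothetical.

```r
# Sketch: latent failure times and the observable pair (T, Delta).
# Exponential rates and independence are illustrative assumptions.
set.seed(1)
n  <- 1000
T1 <- rexp(n, rate = 2)   # latent time to risk 1
T2 <- rexp(n, rate = 1)   # latent time to risk 2

Tmin  <- pmin(T1, T2)               # T = min(T_1, T_2): time to first failure
Delta <- ifelse(T1 <= T2, 1L, 2L)   # Delta = arg min: which risk came first

# Only (Tmin, Delta) would be available for analysis;
# the larger latent time of each subject is never observed.
head(data.frame(Tmin, Delta))
table(Delta) / n   # proportion of first failures by risk
```

Because only the minimum and its label are kept, different joint distributions of the latent times can produce the same distribution of the observable pair, which is the identifiability problem noted above.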

Two Approaches

  • Two ways to characterize \((T, \Delta)\)
    • Cause-specific hazard
    • Cumulative incidence (Sub-distribution)
  • Cause-specific hazard (CSH)
    • Definition: instantaneous incidence of \(k\)th risk given overall “survival” \[\begin{equation}\label{eq:cmpr:cs_hazard} \dd\Lambda_k^\cc(t)=\pr(t\leq T<t+\dd t, \Delta=k\mid T\geq t) \end{equation}\]
    • Example: \(\dd\Lambda_1^\cc(t)=\) incidence rate for CV death among survivors at \(t\)
      • \(k =1\): CV death; \(k =2\): other causes of death
    • Overall hazard: partitioned into CSH \[\begin{equation}\label{eq:cmpr:overall_hazard} \dd\Lambda(t)=\pr(t\leq T<t+\dd t \mid T\geq t) = \sum_{k=1}^K \dd\Lambda_k^\cc(t) \end{equation}\]

Cause-Specific Hazard

  • Limitations
    • The cumulative CSH \(\Lambda_k^\cc(t)\) not a meaningful quantity
    • \(\exp\{-\Lambda_k^\cc(t)\}\) not a survival function of any kind
  • Correspondence with “net hazard”
    • Under (unrealistic) mutual independence of the latent \(T_k\)
    • Cumulative hazard of \(T_k\) identifiable and equal to \(\Lambda_k^\cc(t)\)
    • Not recommended

Cumulative Incidence

  • Cumulative incidence function (CIF)
    • Definition: probability of failure from \(k\)th risk in presence of other risks \[ F_k(t)=\pr(T\leq t, \Delta=k) \]
    • Marginal quantity, easy to interpret (a real-world probability)
    • a.k.a. Sub-distribution function
    • \(F(t)=\pr(T\leq t) = \sum_{k=1}^K F_k(t)\)
  • Relationship
    • CSH in terms of CIF (for a particular risk, no one-to-one correspondence) \[\begin{equation}\label{eq:cmpr:correspond} \dd\Lambda_k^\cc(t)=\frac{\dd F_k(t)}{1-\sum_{l=1}^K F_l(t-)} \end{equation}\]

CSH vs CIF

  • Example: effect on CSH may not align with CIF
    • Treatment: \(\Lambda_1^\cc(t)=\Lambda_2^\cc(t)=3t\)
    • Control: \(\Lambda_1^\cc(t)=2t\) and \(\Lambda_2^\cc(t)=t\)
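
A worked version of this example, comparing risk 1 between groups and assuming the failure time is continuous so that \(S(t)=\exp\{-\Lambda_1^\cc(t)-\Lambda_2^\cc(t)\}\). With constant CSHs \(\lambda_1\) and \(\lambda_2\), integrating \(\dd F_1(t)=S(t)\lambda_1\dd t\) gives \[ F_1(t)=\frac{\lambda_1}{\lambda_1+\lambda_2}\left[1-\exp\{-(\lambda_1+\lambda_2)t\}\right], \] so that \[\begin{aligned} \mbox{Treatment } (\lambda_1=\lambda_2=3): &\quad F_1(t)=\tfrac{1}{2}\left(1-e^{-6t}\right),\\ \mbox{Control } (\lambda_1=2,\ \lambda_2=1): &\quad F_1(t)=\tfrac{2}{3}\left(1-e^{-3t}\right). \end{aligned}\] The CSH ratio for risk 1 is \(3/2\) at every \(t\), yet the two CIFs cross at \(t=\log(3)/3\approx 0.37\): treatment has the higher CIF before that point and the lower one afterwards (limits \(1/2\) vs \(2/3\)). The reversal arises because treatment also triples the CSH of risk 2, which removes subjects before they can fail from risk 1.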

Comparison of Functions

  • CSH vs net hazard vs CIF

Methods for Competing Risks

Observed Data

  • A random \(n\)-sample \[(X_i, \delta_i, Z_i), \,\,\, i=1,\ldots, n\]
    • \(X=T\wedge C\)
    • \(\delta=\Delta I(T\leq C)\) (0: censored; \(k\): observed \(k\)th risk, \(k=1,\ldots, K\))
    • \(C\): independent censoring time
    • \(Z\): covariates
    • Observed counting process for \(k\)th risk \[ N_{ki}(t)=I(X_i\leq t, \delta_i=k) \]
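
As a small illustration of the notation above, the following sketch builds \((X, \delta)\) and the counting process \(N_k(t)\) for four subjects. The data values and the helper name N_k are made up for the example.

```r
# Hypothetical failure times, failure types, and censoring times (4 subjects)
T_fail <- c(2, 4, 3, 5)
Delta  <- c(1, 2, 1, 2)
C      <- c(3, 1, 6, 5)

X     <- pmin(T_fail, C)         # X = min(T, C)
delta <- Delta * (T_fail <= C)   # 0 if censored, otherwise the failure type

# Counting process N_k(t) = I(X <= t, delta = k), one entry per subject
N_k <- function(t, k) as.integer(X <= t & delta == k)

data.frame(X, delta)
N_k(3, k = 1)   # subjects 1 and 3 have an observed risk-1 failure by t = 3
```

Subject 2 is censored at time 1 before failing, so its failure type is never recorded and it contributes \(\delta = 0\).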

Cause-Specific Hazard Models (I)

  • Proportional cause-specific hazards \[\begin{equation}\label{eq:cmpr:ph_csh} \pr(t\leq T<t+\dd t, \Delta=k\mid T\geq t, Z)=\exp(\beta_k^\T Z)\dd\Lambda_{k0}(t) \end{equation}\]
    • A model on “survivors” (those who have not failed from any risk)
    • \(\beta_k\): log-hazard ratios for \(k\)th risk in survivors
    • Estimation \[\begin{equation}\label{eq:cmpr:pls_csh} U_{nk}(\beta_k)=\Pn\int_0^\infty\left\{Z_i-\frac{\sum_{j=1}^n I(X_j\geq t)Z_j\exp(\beta_k^\T Z_j)}{\sum_{j=1}^n I(X_j\geq t)\exp(\beta_k^\T Z_j)}\right\}\dd N_{ki}(t) \end{equation}\]
      • Essentially treating non-\(k\) risks as censoring
      • Easy to implement in survival::coxph()
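
A minimal sketch of fitting a proportional cause-specific hazards model by treating the other risk as censoring. The data are simulated under assumed settings (independent exponential latent times, a single binary covariate z, true log-hazard ratio 0.5 for risk 1); none of these come from the lecture.

```r
library(survival)

# Simulated data: all distributions and parameter values are assumptions
set.seed(2)
n  <- 500
z  <- rbinom(n, 1, 0.5)
T1 <- rexp(n, rate = exp(0.5 * z))   # risk 1: log-hazard ratio 0.5 for z
T2 <- rexp(n, rate = 0.5)            # risk 2: unaffected by z
C  <- runif(n, 0, 3)                 # independent censoring

X     <- pmin(T1, T2, C)
delta <- ifelse(X == C, 0L, ifelse(T1 <= T2, 1L, 2L))

# Cause-specific hazard model for risk 1:
# the event indicator is delta == 1, so risk-2 failures count as censored
fit1 <- coxph(Surv(X, delta == 1) ~ z)
coef(fit1)   # estimate of beta_1
```

In this simulation the latent times are independent, so the cause-specific and net hazards coincide and the estimate should be close to 0.5. With dependent risks the same call still estimates the cause-specific log-hazard ratio among survivors, not a net effect.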

Cause-Specific Hazard Models (II)

  • Under (unrealistic) independence of the latent \(T_k\)
    • Proportional cause-specific hazards \(=\) proportional net hazards
    • Frailty to relax independence
    • Not recommended (non-identifiability)
  • Log-rank test
    • Risk-specific log-rank statistic with other risks treated as censoring
    • “Kaplan-Meier” doesn’t work \(\to\) estimand \(\exp\{-\Lambda_k^\cc(t)\}\) is not a meaningful quantity
    • Competing-risks equivalent of KM is Gray (1988) estimator of \(F_k(t)\)

CIF: Nonparametric Estimation

  • How to estimate \(F_k(t)\)
    • Recall relationship \[\begin{equation*} \dd F_k(t)=S(t-)\dd\Lambda_k^\cc(t) \end{equation*}\]

    • \(S(t-)=\pr(T\geq t)\): By KM estimator \(\hat S(t-)\) for overall failure

    • \(\dd\Lambda_k^\cc(t)\): By Nelson-Aalen estimator (non-\(k\) risks as censoring) \[ \dd\hat\Lambda_k^\cc(t)=\frac{\sum_{i=1}^n\dd N_{ki}(t)}{\sum_{i=1}^n I(X_i\geq t)} \]

    • Gray estimator \[\hat F_k(t)=\int_0^t \hat S(u-)\dd\hat \Lambda_k^\cc(u)\]
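
The three steps above can be written out in base R. This is a sketch on a made-up data set of six subjects without ties, not the implementation used by any package.

```r
# Toy data (hypothetical): delta = 0 censored, 1 or 2 = failure type
X     <- c(1, 2, 3, 4, 5, 6)
delta <- c(1, 2, 0, 1, 2, 1)
k     <- 1

times  <- sort(unique(X[delta > 0]))                           # failure times
atrisk <- sapply(times, function(t) sum(X >= t))               # sum_i I(X_i >= t)
d_all  <- sapply(times, function(t) sum(X == t & delta > 0))   # failures, any type
d_k    <- sapply(times, function(t) sum(X == t & delta == k))  # failures, type k

S       <- cumprod(1 - d_all / atrisk)     # KM estimate of overall survival S(t)
S_minus <- c(1, head(S, -1))               # left limit S(t-)
F_k     <- cumsum(S_minus * d_k / atrisk)  # sum of S(u-) dLambda_k^c(u)

cbind(time = times, cif = F_k)
```

For these data the risk-1 estimate jumps by 1/6 at \(t=1\) and by 2/9 at \(t=4\) and \(t=6\), ending at 11/18. Repeating the calculation with k = 2 ends at 7/18, so the two estimated CIFs sum to \(1-\hat S(6)=1\).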

CIF: Sub-Distribution Hazard

  • Sub-distribution hazard (SDH)
    • Easier to work with hazard-like functions (unbounded)
    • Time to \(k\)th risk in presence of other risks \[ T_k^*=TI(\Delta=k)+\infty I(\Delta\neq k) \]
      • \(F_k(t)=\pr(T_k^*\leq t)\)
    • SDH: incidence of \(k\)th risk given not failing from it \[\begin{equation}\label{eq:cmpr:sdh} \dd\Lambda_k(t)=\pr(t\leq T_k^*<t+\dd t\mid T_k^*\geq t). \end{equation}\]
    • Correspondence with CIF \[\begin{equation}\label{eq:cmpr:subdist} F_k(t)=1-\exp\{-\Lambda_k(t)\}. \end{equation}\]
    • Model/test on SDH \(\Leftrightarrow\) Model/test on CIF
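
A short derivation of the correspondence, assuming \(F_k\) is continuous. Since \(\pr(T_k^*\geq t)=1-F_k(t-)\), the definition of the SDH gives \[ \dd\Lambda_k(t)=\frac{\pr(t\leq T_k^*<t+\dd t)}{\pr(T_k^*\geq t)}=\frac{\dd F_k(t)}{1-F_k(t-)}, \] and integrating from \(0\) to \(t\) yields \[ \Lambda_k(t)=\int_0^t\frac{\dd F_k(u)}{1-F_k(u)}=-\log\{1-F_k(t)\}, \] which rearranges to \(F_k(t)=1-\exp\{-\Lambda_k(t)\}\). Because \(F_k(\infty)=\pr(\Delta=k)<1\) when other risks are present, \(\Lambda_k(\infty)\) stays finite, unlike an ordinary cumulative hazard.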

CIF: Gray’s Test

  • Discrete version (\(k\)th risk) \[ \dd\Lambda_k(t)=\frac{\dd F_k(t)}{1-F_k(t-)} \]
  • Weighted log-rank-type test
    • Testing \(H_0:\) equality of CIF between two groups \[\begin{equation}\label{eq:cmpr:gray_tests} \int_0^\infty W(t)\big\{\dd\hat\Lambda_{k1}(t)-\dd\hat\Lambda_{k0}(t)\big\} \end{equation}\]
    • \(\hat\Lambda_{ka}(t)\): Gray’s CIF plug-in estimator for SDH in group \(a\) \((a=1, 0)\)
    • \(W(t)\): Weight function
    • Extensions: multi-group, stratification, etc.

CIF: Fine-Gray Regression

  • Proportional sub-distribution hazards (Fine and Gray, 1999) \[\begin{equation}\label{eq:cmpr:fg} \pr(t\leq T_k^*<t+\dd t\mid T_k^*\geq t, Z)=\exp(\beta_k^\T Z)\dd\Lambda_{k0}(t) \end{equation}\]
    • \(\beta_k\): log sub-distribution hazard ratios for \(k\)th risk in entire population (in presence of other risks)
    • Target population
      • Cause-specific hazard: survivors
      • Sub-distribution hazard: all, including those who have failed from other causes
    • Estimation and inference: partial-likelihood score with IPCW to address dependent censoring by other risks
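
Combining the proportional SDH model with the SDH-CIF correspondence (continuous case) gives the covariate-specific CIF implied by the model, \[ F_k(t\mid Z)=1-\exp\left\{-\exp(\beta_k^\T Z)\Lambda_{k0}(t)\right\}, \] or equivalently, on the complementary log-log scale, \[ \log\left[-\log\{1-F_k(t\mid Z)\}\right]=\log\Lambda_{k0}(t)+\beta_k^\T Z. \] A positive component of \(\beta_k\) therefore shifts the whole CIF of the \(k\)th risk upward, which is why effects on the SDH translate directly into effects on the CIF.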

Software: cmprsk::cuminc()

  • Basic syntax for Gray’s estimator & test
obj <- cuminc(ftime, fstatus, group, strata, rho = 0)
  • Input
    • (ftime, fstatus): \((X, \delta)\)
    • group: group variable (optional); strata: strata variable (optional)
    • rho: \(\rho\) in weight \(W(t)=\hat S(t-)^\rho\) (Harrington-Fleming \(G^\rho\) family)
  • Output: a list
    • obj$Tests: tests results on each risk
    • obj$"a k": CIF estimates for \(k\)th risk in group \(a\)
      • time: \(t\); est: \(\hat F_k(t)\); var: \(\hat\var\{\hat F_k(t)\}\)

Software: cmprsk::crr() (I)

  • Basic syntax for Fine-Gray model
obj <- crr(ftime, fstatus, cov1, failcode = k)
  • Input
    • (ftime, fstatus): \((X, \delta)\); cov1: \(Z\)
    • failcode = k: models \(k\)th risk
  • Output: a list of class crr
    • obj$coef: \(\hat\beta_k\); obj$var: \(\hat\var(\hat\beta_k)\)
    • obj$uftime: \(t\); obj$bfitj: \(\dd\hat\Lambda_{k0}(t)\)

Software: cmprsk::crr() (II)

  • Prediction of CIF by Fine-Gray model
    • obj: a fitted crr object
    • z: new covariate data
# --- Method 1: use predict.crr() 
obj_pred <- predict(obj, z)
# --- Method 2: manual calculation
beta <- obj$coef
Lambda <- cumsum(obj$bfitj)
time <- obj$uftime
## calculate CIF based on FG model
cif <- 1- exp(- exp(sum(beta * z)) * Lambda)
## Same as obj_pred from Method 1
obj_pred <- cbind(time, cif)

Bone Marrow Transplant Study

Study Background

  • Study information
    • Population: 864 multiple-myeloma leukemia patients undergoing allogeneic hematopoietic cell transplantation (allo-HCT)
    • Endpoints: Time from transplantation to treatment-related mortality (TRM; death in remission) or relapse of leukemia
    • Risk factors
      • Cohort indicator (years 1995–2000 or 2001–2005)
      • Type of donor (unrelated or HLA-identical sibling)
      • History of a prior transplant
      • Time from diagnosis to transplantation (<24 months, or ≥ 24 months)

Study Data

  • Data format
head(cibmtr)
#    time status cohort donor hist wait
# 1 0.010      2      1     1    1    1
# 2 0.066      2      1     1    1    0
# 3 0.099      1      1     1    1    1
#...

CIF by Donor Type

  • Gray’s nonparametric estimates/tests of CIFs by donor type
obj <- cuminc(cibmtr$time, cibmtr$status, cibmtr$donor, rho = 0)

Regression Analysis

  • Semiparametric regression
    • Proportional cause-specific hazard
    • Proportional sub-distribution hazard (FG)
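library(survival)  # coxph(), Surv()
library(cmprsk)    # crr()
# risk of interest (change k to model the other risk)
k <- 1
#--- proportional cause-specific hazards -----------------------------
# failures from risks other than k are treated as censored
obj.cs <- coxph(Surv(time, status == k) ~ cohort + donor + hist + wait,
                data = cibmtr)
#--- Fine and Gray --------------------------------------------------
# columns 3:6 of cibmtr hold the covariates cohort, donor, hist, wait
obj.fg <- crr(cibmtr$time, cibmtr$status, cibmtr[, 3:6], failcode = k)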

Regression Results

  • Differential effects of prior transplant
    • Relapse: increases risk by \(42\%\)
    • TRM: decreases risk by \(1-0.63 = 37\%\)
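
The percentages above read as hazard ratios converted to a percent change, \(100\times(\mbox{HR}-1)\). A small sketch of that arithmetic follows. The helper name pct_change is hypothetical, and the slide does not state which model (cause-specific or Fine-Gray) the ratios come from.

```r
# Convert a hazard ratio into a percent change in hazard
pct_change <- function(hr) round(100 * (hr - 1))

# 0.63 is the ratio quoted above for TRM; 1.42 is back-calculated from the
# quoted 42% increase for relapse (an assumption, not a reported estimate)
pct_change(c(relapse = 1.42, TRM = 0.63))
```

A positive value is an increase in the hazard and a negative value a decrease, so a ratio of 0.63 corresponds to a 37% reduction.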

Semi-Competing Risks

Definition and Examples

  • Semi-competing risks
    • Terminal event (competing): death
    • Non-terminal events (non-competing): hospitalization, relapse, etc.
  • Examples
    • Death + relapse of cancer (German breast cancer study)
    • Death terminates nonfatal event but not vice versa
  • Methods
    • Marginal: cumulative incidence/frequency
    • Frailty (Ch. 8); Multistate (Ch. 12); Composite (Ch. 13)

Data and Identifiability

  • Target of inference: \((T, D)\)
    • \(T\): time to nonfatal event
    • \(D\): time to death
    • Joint distribution \[H(s, t)=\pr(D>s, T>t)\]
    • Identifiable region: \(\{(s, t):0\leq t\leq s<\infty\}\)
      • Possibly nonfatal \(\to\) death, not vice versa
    • Marginal models
      • Death: univariate event (KM, log-rank, Cox)
      • Nonfatal event: cumulative incidence (Gray, FG)
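
The asymmetry between the two events can be illustrated with simulated data. The distributions, rates, and variable names below are assumptions chosen for the sketch.

```r
# Sketch: observed data under semi-competing risks (assumed distributions)
set.seed(3)
n    <- 1000
T_nf <- rexp(n, rate = 1)     # latent time to nonfatal event
D    <- rexp(n, rate = 0.5)   # time to death
C    <- runif(n, 0, 4)        # independent censoring

# Death is observed whenever it precedes censoring,
# regardless of whether the nonfatal event has occurred
X_D     <- pmin(D, C)
delta_D <- as.integer(D <= C)

# The nonfatal event is observed only if it precedes both death and censoring
X_T     <- pmin(T_nf, D, C)
delta_T <- as.integer(T_nf <= pmin(D, C))

mean(delta_T == 1 & delta_D == 1)   # nonfatal event followed by observed death
```

Follow-up for the nonfatal event never extends beyond follow-up for death (X_T <= X_D for every subject), which is why only the region \(t\leq s\) of the joint distribution is identifiable.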

Recurrent Events

  • Outcome data: \(\{N^*(\cdot), D\}\)
    • \(N^*(t)\): number of recurrent events in presence of death
      • \(\dd N^*(t)\equiv 0\) for \(t>D\)
      • Repeated tumor occurrences/hospitalizations before death
    • Treating \(D\) as censoring (AG/LWYY) \(\to\) cause-specific event rate \[E\{\dd N^*(t)\mid D\geq t\}\]
    • Incidence rate in survivors
  • Cumulative frequency
    • Mean function in overall population, dead or alive \[ \mu(t)=E\{N^*(t)\} \]
    • Extension of CIF
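
The link between the two quantities above mirrors \(\dd F_k(t)=S(t-)\dd\Lambda_k^\cc(t)\) for competing risks. Writing \(S_D(t-)=\pr(D\geq t)\) and \(\dd R(t)=E\{\dd N^*(t)\mid D\geq t\}\) (notation introduced here for illustration), and using \(\dd N^*(t)=0\) once \(D<t\), \[ \dd\mu(t)=E\{\dd N^*(t)\}=\pr(D\geq t)\,E\{\dd N^*(t)\mid D\geq t\}=S_D(t-)\,\dd R(t), \] so that \[ \mu(t)=\int_0^t S_D(u-)\,\dd R(u). \] The cumulative frequency is thus the event rate among survivors weighted by the probability of still being alive, which is how a higher death rate can lower \(\mu(t)\) even when the rate among survivors is unchanged.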

Marginal Approaches

  • Methods for cumulative frequency
    • Gray-type nonparametric estimator/test (Ghosh and Lin, 2000)
    • Proportional CF model (Fine-Gray-type) (Ghosh and Lin, 2002; 2003) \[\begin{equation}\label{eq:cmpr:pcf} E\{N^*(t)\mid Z\}=\exp(\beta^\T Z)\mu_0(t) \end{equation}\]
    • Higher death rate can reduce CF
  • Joint analysis with mortality
    • Ghosh-Lin for recurrent events \(+\) Cox/log-rank for death
    • \(\chi^2\) test with 2 d.f.

Example: Bladder Tumor Study

  • Bladder cancer trial: thiotepa (\(n=38\)) vs placebo (\(n=48\))
    • Endpoints: tumor recurrences and death (\(\chi_2^2\) test \(p\)-value 0.184)

Conclusion

Notes

  • Gray and FG
    • Recommended by the European Group for Blood and Marrow Transplantation (EBMT) Statistical Committee
    • Extensions
      • Proportional sub-distribution odds model (Eriksson et al., 2015)
      • Sub-distribution transformation models (Mao and Lin, 2017)
      • Regression of interval-censored competing risks data (Mao et al., 2018)
  • Semi-competing risks
    • Fine, J. P., Jiang, H., & Chappell, R. (2001). On semi-competing risks data. Biometrika, 88(4), 907-919.

Summary

  • Competing risks
    • Cause-specific hazard: conditional failure rate on survivors
      • coxph(Surv(time, status == k) ~ covariates)
    • Cumulative incidence: marginal failure probability under other risks
      • Gray’s estimator/test cmprsk::cuminc(ftime, fstatus, group, strata, rho = 0)
      • Fine-Gray model cmprsk::crr(ftime, fstatus, cov1, failcode = k)
  • Recurrent events in presence of death
    • Ghosh-Lin methods (R packages mets, reReg, etc.)