# Install if needed
# install.packages("npsurvSS")
library(npsurvSS)

# 1. Define arms (control vs. treatment) for a two-arm study
control <- create_arm(size = 500,
                      accr_time = 2,       # uniform accrual over 2 years
                      follow_time = 3.5,   # then followed 3.5 years
                      surv_scale = 0.174,  # baseline hazard for event
                      loss_scale = 0.01)   # hazard of loss to follow-up

treatment <- create_arm(size = 500,
                        accr_time = 2,
                        follow_time = 3.5,
                        surv_scale = 0.174 * 0.8, # hazard ratio = 0.8
                        loss_scale = 0.01)

# 2. Compute required sample size for different tests
res_size <- size_two_arm(
  control,
  treatment,
  test = list(
    list(test = "weighted logrank"),               # log-rank
    list(test = "rmst difference", milestone = 3), # RMST at 3 years
    list(test = "rmst difference", milestone = 5)  # RMST at 5 years
  )
)
print(res_size)
Chapter 6 - Sample Size Determination and Study Design
Slides
Lecture slides here. (To convert the HTML slides to PDF, press E \(\to\) Print \(\to\) Destination: Save as PDF.)
Chapter Summary
Determining an appropriate sample size is a crucial step in designing a time-to-event study. An inadequate sample size can lead to insufficient power, while an overpowered study may waste resources. As the two most important methods in survival analysis, the log-rank test (or, equivalently, a Cox model with a binary predictor) and restricted mean survival time (RMST) analysis deserve special attention. In their sample size calculations, study design choices—such as accrual patterns, length of follow-up, and loss-to-follow-up assumptions—play an important role.
General approach to sample size calculation
A common starting point is to consider a test statistic \(S_n=\sqrt n T_n /\hat\sigma_n\) that is approximately standard normal under the null hypothesis. Define a minimal clinically important effect size \(\theta\) (e.g., a targeted log-hazard ratio or a difference in RMST) and specify the desired power \(\gamma\). The general form for the required sample size, under asymptotic normality, often reduces to
\[ n \;=\; \frac{\sigma_0^2\,\bigl(z_{1-\alpha/2} \,+\, z_\gamma\bigr)^2}{f(\theta)^2}, \]
where \(f(\theta) = E(T_n)\) is the “signal” contributed by the effect size, \(\sigma_0^2\) is the variance term under the null, and \(z_{p}\) is the \(p\)-quantile of the standard normal distribution. Intuitively, stronger signals \(\bigl|f(\theta)\bigr|\) reduce the required \(n\), while higher variance \(\sigma_0^2\) or more stringent significance/power criteria increase \(n\).
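As a quick sanity check of this inversion, here is a minimal Python sketch (the helper names `n_required` and `power`, and the values \(f = 0.1\), \(\sigma_0 = 1\), are illustrative assumptions, not from the text): it computes \(n\) from the displayed formula and verifies that plugging \(n\) back into the normal-approximation power function recovers the target power.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal: cdf and inverse cdf

def n_required(f, sigma0, alpha=0.05, gamma=0.80):
    """n = sigma0^2 (z_{1-alpha/2} + z_gamma)^2 / f^2."""
    return sigma0**2 * (Z.inv_cdf(1 - alpha / 2) + Z.inv_cdf(gamma))**2 / f**2

def power(n, f, sigma0, alpha=0.05):
    """Normal-approximation power: Phi(sqrt(n)|f|/sigma0 - z_{1-alpha/2})."""
    return Z.cdf(sqrt(n) * abs(f) / sigma0 - Z.inv_cdf(1 - alpha / 2))

n = n_required(f=0.1, sigma0=1.0, gamma=0.80)
print(n, power(n, f=0.1, sigma0=1.0))  # power at the computed n is 0.80 by construction
```

Doubling the signal \(f\) cuts the required \(n\) by a factor of four, matching the \(f(\theta)^{-2}\) dependence in the formula.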
Schoenfeld’s formula for the Cox (log-rank) test
Under a proportional hazards model with a binary treatment: \[ \lambda(t \mid Z=1) \;=\; \lambda_0(t)\,\exp(\theta), \] the log-rank test statistic has a well-known large-sample result leading to Schoenfeld’s formula:
\[\begin{equation}\label{eq:cox:sample_size} n \;=\; \frac{\bigl(z_{1-\alpha/2} + z_\gamma\bigr)^2}{q(1-q)\,\psi\,\theta^2}, \tag{1} \end{equation}\]
where:
- \(q\) is the proportion allocated to treatment,
- \(\theta\) is the log-hazard ratio (\(\theta=0\) under the null),
- \(\psi\) is the expected proportion of events (failures).
Alternatively, we can express \(n\psi\), the total number of events, as the right-hand side of \(\eqref{eq:cox:sample_size}\) without \(\psi\) in the denominator. You often hear this referred to as an “event-driven” calculation—knowing the total number of events required can be more important than total enrollment.
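To illustrate the event-driven form, the short Python sketch below (the function name and the design values HR = 0.8, 1:1 allocation, two-sided \(\alpha = 0.05\), power 0.90, and \(\psi = 0.6\) are all illustrative assumptions, not values from the text) computes the required number of events and the enrollment it implies:

```python
import math
from statistics import NormalDist

def schoenfeld_events(hr, q=0.5, alpha=0.05, power=0.90):
    """Total events d = (z_{1-alpha/2} + z_gamma)^2 / (q(1-q) theta^2), theta = log HR."""
    theta = math.log(hr)
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # z_{1-alpha/2}
    z_g = NormalDist().inv_cdf(power)          # z_gamma
    return (z_a + z_g) ** 2 / (q * (1 - q) * theta ** 2)

d = schoenfeld_events(hr=0.8)  # roughly 844 events
n = math.ceil(d / 0.6)         # enrollment if a fraction psi = 0.6 have events
print(round(d), n)
```

Note that \(d\) depends only on the targeted hazard ratio, allocation, and error rates; the event proportion \(\psi\) converts events into enrollment.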
Sample size for RMST
The restricted mean survival time (RMST) over \([0, \tau]\) captures the average time lived up to \(\tau\). For a two-sample comparison, the difference \(\theta(\tau) = \mu_1(\tau) - \mu_0(\tau)\) can be tested using its estimator \(\hat\theta(\tau)\). The corresponding sample size formula is:
\[\begin{equation}\label{eq:rmst:sample_size} n \;=\; \frac{\zeta(\tau)\,\bigl(z_{1-\alpha/2} + z_\gamma\bigr)^2}{ q(1-q)\,\theta(\tau)^2}, \tag{2} \end{equation}\]
where \(\zeta(\tau)\) is a variance-related term (depending on survival and censoring distributions). While slightly less efficient under strict proportional hazards, RMST-based tests offer a direct interpretation of how many additional months or years are gained under one treatment arm compared to another.
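To make that interpretation concrete, here is a small Python sketch under an assumed exponential model, for which the RMST has a closed form; the control hazard \(\lambda_0 = 0.174\) per year and HR = 0.8 are illustrative values anticipating the pilot-data example later in the chapter:

```python
import math

def rmst_exp(lam, tau):
    """RMST over [0, tau] for exponential survival: integral of exp(-lam t) dt."""
    return (1 - math.exp(-lam * tau)) / lam

lam0 = 0.174        # assumed control hazard (per year)
lam1 = 0.8 * lam0   # treatment hazard under HR = 0.8
theta3 = rmst_exp(lam1, 3) - rmst_exp(lam0, 3)  # about 0.12 years gained by tau = 3
theta5 = rmst_exp(lam1, 5) - rmst_exp(lam0, 5)  # about 0.26 years gained by tau = 5
print(theta3, theta5)
```

Here \(\theta(5)\) of roughly a quarter of a year means patients on treatment live about three more months, on average, within the first five years.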
Study design and its impact
Study design considerations—uniform accrual over \(b\) years, additional follow-up \(c\) years after accrual closes, and potential exponential loss to follow-up—shape the computations of \(\psi\) (for log-rank) and \(\zeta(\tau)\) (for RMST) through the censoring distribution.
Under this set-up:
- Administrative censoring: Uniform\([c, b + c]\)
- Loss to follow-up: Assume exponential dropout with rate \(\lambda_L\).
These assumptions feed into analytic or numerical formulas to calculate:
- Censoring survival function \[\begin{equation}\label{eq:design:censoring} G(t;\lambda_L, b, c):=\mathrm{pr}(C>t)=\left\{\begin{array}{ll} \exp(-\lambda_L t)& 0\leq t\leq c\\ b^{-1}(c+b-t)\exp(-\lambda_L t) & c<t<c+b \\ 0& t\geq c+b. \end{array}\right. \end{equation}\]
- \(\psi\) for log-rank (closed-form) \[\begin{equation}\label{eq:cox:psi} \psi=\frac{\lambda_0}{\lambda_0+\lambda_L}\left[1-\exp\{-(\lambda_0+\lambda_L)c\} \frac{1-\exp\{-(\lambda_0+\lambda_L)b\}}{(\lambda_0+\lambda_L)b}\right] \end{equation}\]
- \(\zeta(\tau)\) for RMST (needs numerical integration) \[\begin{equation}\label{eq:design:rmst_zeta} \zeta(\tau)=\lambda_0^{-1}\int_0^\tau\{\exp(-\lambda_0 t)-\exp(-\lambda_0 \tau)\}^2\exp(\lambda_0 t)G(t;\lambda_L, b, c)^{-1}\mathrm{d} t \end{equation}\]
- Plug \(\psi\) or \(\zeta(\tau)\) back into sample size formulas \(\eqref{eq:cox:sample_size}\) or \(\eqref{eq:rmst:sample_size}\), respectively.
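Putting the pieces together, the following self-contained Python sketch translates these formulas directly (the chapter's own implementation, in R, appears in the code appendix at the end of this page); the design values \(\lambda_0 = 0.174\), \(\lambda_L = 0.01\), \(b = 2\), \(c = 3.5\), HR = 0.8, \(\tau = 5\) are illustrative assumptions:

```python
import math
from statistics import NormalDist

def G(t, lamL, b, c):
    """Censoring survival function pr(C > t) from the display above."""
    if t <= c:
        return math.exp(-lamL * t)
    if t < c + b:
        return math.exp(-lamL * t) * (b + c - t) / b
    return 0.0

def psi(lam0, lamL, b, c):
    """Closed-form expected event proportion for the log-rank calculation."""
    lam = lam0 + lamL
    return lam0 / lam * (
        1 - math.exp(-lam * c) * (1 - math.exp(-lam * b)) / (lam * b)
    )

def zeta(tau, lam0, lamL, b, c, m=4000):
    """zeta(tau) by the trapezoidal rule (the integrand vanishes at t = tau)."""
    def f(t):
        return ((math.exp(-lam0 * t) - math.exp(-lam0 * tau)) ** 2
                * math.exp(lam0 * t) / (lam0 * G(t, lamL, b, c)))
    h = tau / m
    return h * (f(0.0) / 2 + sum(f(k * h) for k in range(1, m)) + f(tau) / 2)

lam0, lamL, b, c, q, tau = 0.174, 0.01, 2.0, 3.5, 0.5, 5.0
za = NormalDist().inv_cdf(0.975)
zg = NormalDist().inv_cdf(0.90)
theta_lr = math.log(0.8)                               # log-hazard ratio
theta_rmst = ((1 - math.exp(-0.8 * lam0 * tau)) / (0.8 * lam0)
              - (1 - math.exp(-lam0 * tau)) / lam0)    # RMST difference at tau

# Plug psi into (1) and zeta(tau) into (2)
n_cox = (za + zg) ** 2 / (q * (1 - q) * psi(lam0, lamL, b, c) * theta_lr ** 2)
n_rmst = zeta(tau, lam0, lamL, b, c) * (za + zg) ** 2 / (q * (1 - q) * theta_rmst ** 2)
print(round(n_cox), round(n_rmst))  # the RMST design needs more patients here
```

Under these proportional-hazards assumptions the log-rank design requires fewer patients than the 5-year RMST design, consistent with the efficiency remark above.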
A general implementation
The R package npsurvSS offers more general routines for sample size and power calculations under the log-rank and RMST approaches. Suppose we hypothesize an exponential event rate \(\lambda_0\) in the control group and a hazard ratio \(\mathrm{HR} = \exp(\theta)\) in the treatment group, as in the code listing at the top of this page.
You can customize the accrual pattern (piecewise uniform or otherwise), use Weibull rather than exponential survival times, or request sample size for alternative tests (e.g., Gehan–Wilcoxon, RMST ratio, survival probability difference).
Conclusion
Planning a time-to-event study involves balancing the feasibility of enrolling and following patients against the power to detect a meaningful effect. Schoenfeld’s formula under the proportional hazards framework yields intuitive “event-driven” calculations, while RMST-based approaches similarly ensure adequate power for differences in average time lived within a specified window. Because design factors—accrual, follow-up length, and dropout processes—directly enter these formulas, it is vital to specify reasonable assumptions and possibly refine them via pilot data.
R code
###############################################################################
# Chapter 6 R Code
#
# This script reproduces all major numerical results in Chapter 6, including:
# 1. Functions for sample size calculations (psi_fun, zeta_fun)
# 2. Numerical examples from Section 6.2.2
# 3. Sample size computations for the GBC pilot data
###############################################################################
#==============================================================================
# (A) Functions Needed for Sample Size Calculation
#==============================================================================
#------------------------------------------------------------------------------
# 1. psi_fun(lambda0, lambdaL, b, c)
# Computes the event proportion psi for Cox model sample size
#
# Inputs:
# lambda0 = hazard rate for T
# lambdaL = hazard rate for LTFU
# b = length of accrual (years)
# c = additional length of follow-up (years)
#------------------------------------------------------------------------------
psi_fun <- function(lambda0, lambdaL, b, c) {
  lambda <- lambda0 + lambdaL
  psi <- lambda0 / lambda * (
    1 - exp(-lambda * c) * (1 - exp(-lambda * b)) / (lambda * b)
  )
  return(psi)
}
#------------------------------------------------------------------------------
# 2. zeta_fun(tau, lambda0, lambdaL, b, c)
# Computes the variance component for RMST sample size via numerical integral
#
# Inputs:
# tau = restricting time
# lambda0 = hazard rate for T
# lambdaL = hazard rate for LTFU
# b = length of accrual (years)
# c = additional follow-up (years)
#------------------------------------------------------------------------------
# Survival function for censoring
Gfun <- function(t, lambdaL, b, c) {
  Gt <- ifelse(
    t <= c,
    exp(-lambdaL * t),
    ifelse(
      t < (c + b),
      exp(-lambdaL * t) * (b + c - t) / b,
      0
    )
  )
  return(Gt)
}

# The integrand for zeta
zeta_integrand <- function(t, tau, lambda0, lambdaL, b, c) {
  integrand <- (exp(-lambda0 * t) - exp(-lambda0 * tau))^2 *
    exp(lambda0 * t) / (Gfun(t, lambdaL, b, c) * lambda0)
  return(integrand)
}

# Main function using integrate()
zeta_fun <- function(tau, lambda0, lambdaL, b, c) {
  f <- function(t) {
    zeta_integrand(t, tau, lambda0, lambdaL, b, c)
  }
  zeta <- integrate(f, lower = 0, upper = tau)
  return(zeta$value)
}
#==============================================================================
# (B) Numerical Results in Section 6.2.2
#==============================================================================
# Evaluate zeta_fun at specific values
zeta_fun(tau = 5, lambda0 = 0.2, lambdaL = 0.01, b = 2, c = 4)
#==============================================================================
# (C) Sample Size Calculation Example with GBC Data
#==============================================================================
library(survival)
#------------------------------------------------------------------------------
# 1. Read GBC Data
#------------------------------------------------------------------------------
gbc <- read.table("Data//German Breast Cancer Study//gbc.txt")

# Sort the data by time within each id
o <- order(gbc$id, gbc$time)
gbc <- gbc[o, ]

# Keep the first row for each id
df <- gbc[!duplicated(gbc$id), ]

# Set status = 1 if status == 1 or 2
df$status <- (df$status > 0) + 0

# Subset: post-menopausal (meno==2) & no hormonal treatment (hormone==1)
pilot <- df[df$meno == 2 & df$hormone == 1, ]
n <- nrow(pilot) # n = 209, matches Table 1.12 (Chapter 1)
#------------------------------------------------------------------------------
# 2. Estimate Baseline Hazard and Setup
#------------------------------------------------------------------------------
pilot$time <- pilot$time / 12 # convert months to years
lambda0 <- sum(pilot$status > 0) / sum(pilot$time) # ~0.174

# Accrual (b = 2 years), follow-up (c = 3.5 years)
lambdaL <- 0.01 # LTFU rate
b <- 2
c <- 3.5

# Compute psi and zeta at tau = 3, 5
psi <- psi_fun(lambda0, lambdaL, b, c)
zeta3 <- zeta_fun(tau = 3, lambda0, lambdaL, b, c)
zeta5 <- zeta_fun(tau = 5, lambda0, lambdaL, b, c)
#------------------------------------------------------------------------------
# 3. Sample Size Computations for Various Hazard Ratios
#------------------------------------------------------------------------------
HR <- seq(0.6, 0.9, by = 0.01) # hazard ratio range
lambda1 <- lambda0 * HR

# Function for difference in RMST (exponential assumption)
RMST_diff <- function(tau, lam0, lam1) {
  val <- (1 - exp(-lam1 * tau)) / lam1 - (1 - exp(-lam0 * tau)) / lam0
  return(val)
}

# For tau = 3 and tau = 5
theta3 <- RMST_diff(3, lambda0, lambda1)
theta5 <- RMST_diff(5, lambda0, lambda1)

# Setup for alpha = 0.05 (two-sided), power = 0.8 or 0.9
q <- 0.5
za <- qnorm(0.975)
gamma_list <- c(0.80, 0.90)
par(mfrow = c(1, 2))
for (i in seq_along(gamma_list)) {
  gamma <- gamma_list[i]
  zg <- qnorm(gamma)

  # Sample sizes for log-rank, 3-RMST, and 5-RMST
  ncox <- (za + zg)^2 / (q * (1 - q) * psi * (log(HR))^2)
  nRMST3 <- zeta3 * (za + zg)^2 / (q * (1 - q) * theta3^2)
  nRMST5 <- zeta5 * (za + zg)^2 / (q * (1 - q) * theta5^2)

  plot(
    HR, ncox,
    type = "l",
    lwd = 2,
    ylim = c(0, 7000),
    xlab = "Hazard ratio",
    ylab = "Sample size",
    main = paste0("Power = ", gamma),
    cex.lab = 1.2,
    cex.axis = 1.2
  )
  lines(HR, nRMST5, lty = 2, lwd = 2)
  lines(HR, nRMST3, lty = 3, lwd = 2)
  legend(
    "topleft",
    lty = 1:3,
    c("Log-rank", "5-RMST", "3-RMST"),
    lwd = 2,
    cex = 1.2
  )
}
#==============================================================================
# (D) Demonstration with npsurvSS Package (Yung and Liu, 2020)
#==============================================================================
library(npsurvSS)
#------------------------------------------------------------------------------
# 1. Configure Two Arms (Control vs. Treatment)
#------------------------------------------------------------------------------
lambda0 <- 0.173 # baseline event hazard
lambdaL <- 0.01  # hazard rate for LTFU

control <- create_arm(
  size = 500,
  accr_time = 2,
  follow_time = 3.5,
  surv_scale = lambda0,
  surv_shape = 1, # default
  loss_scale = lambdaL,
  loss_shape = 1
)

lambda1 <- lambda0 * 0.8

treatment <- create_arm(
  size = 500,
  accr_time = 2,
  follow_time = 3.5,
  surv_scale = lambda1,
  loss_scale = lambdaL
)
#------------------------------------------------------------------------------
# 2. Compare Required Sample Sizes for Several Tests
#------------------------------------------------------------------------------
size_two_arm(
  control,
  treatment,
  test = list(
list(test = "weighted logrank"), # Log-rank test
list(test = "rmst difference", milestone = 3), # RMST at 3 years
list(test = "rmst difference", milestone = 5) # RMST at 5 years
)
)
# Alternatively, power calculations could be done via power_two_arm()
# power_two_arm(control, treatment)