library(condsurv)
library(dplyr)
library(survival)

If \(S(t)\) represents the survival function at time \(t\), then conditional survival is defined as

\[S(y|x) = \frac{S(x + y)}{S(x)}\]

where \(y\) is the number of additional survival years of interest and \(x\) is the number of years a subject has already survived.
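As a minimal sketch of this definition, the ratio can be computed directly from a Kaplan-Meier fit using only the survival package. The names `lung_yrs`, `os_yrs` and `km_fit` are illustrative, and converting days to years by dividing by 365.25 is an assumption, not something stated above.

```r
library(survival)

# Assumption: convert follow-up time from days to years
lung_yrs <- transform(lung, os_yrs = time / 365.25)

# Kaplan-Meier estimate of the overall survival function S(t)
km_fit <- survfit(Surv(os_yrs, status) ~ 1, data = lung_yrs)

x <- 0.5  # years already survived
y <- 0.5  # additional years of interest

# S(x) and S(x + y) read off the fitted curve
s <- summary(km_fit, times = c(x, x + y))$surv

# Conditional survival S(y | x) = S(x + y) / S(x)
cs <- s[2] / s[1]
cs
```

This gives only the point estimate; it does not produce a confidence interval.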

Generating conditional survival estimates

The conditional_surv_est function generates this estimate along with a 95% confidence interval.

The lung dataset from the survival package is used for illustration.

First, generate a single conditional survival estimate: the probability of surviving to 1 year given that the subject has already survived 6 months (\(0.5\) years). The function returns a list, where cs_est is the conditional survival estimate, cs_lci is the lower bound of the 95% confidence interval and cs_uci is the upper bound of the 95% confidence interval.
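A call along the following lines would produce such an estimate. This is a sketch under assumptions: the argument names `basekm`, `t1` and `t2` are not given in the text above and should be checked against the package documentation, and the data frame `lung_yrs`, the column `os_yrs` and the days-to-years conversion are illustrative choices.

```r
library(condsurv)
library(dplyr)
library(survival)

# Assumption: rescale survival time from days to years
lung_yrs <- mutate(lung, os_yrs = time / 365.25)

# Kaplan-Meier fit that serves as the basis for the conditional estimate
km_fit <- survfit(Surv(os_yrs, status) ~ 1, data = lung_yrs)

# Survival to 1 year, conditional on having survived 0.5 years
# (argument names basekm, t1, t2 are assumed, see the note above)
est <- conditional_surv_est(basekm = km_fit, t1 = 0.5, t2 = 1)

est$cs_est  # point estimate
est$cs_lci  # lower 95% confidence bound
est$cs_uci  # upper 95% confidence bound
```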

You can use purrr::map_df to get a table of estimates for multiple time points. For example, we could estimate the conditional probability of surviving to each of several time points, given that the subject has already survived 6 months (0.5 years).
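One way this could look is sketched here. The vector name `prob_times`, the fit object `km_fit`, and the argument names `basekm`, `t1` and `t2` are assumptions introduced for illustration; the time points match the `years` column of the table.

```r
library(condsurv)
library(dplyr)
library(survival)

# Assumption: rescale survival time from days to years
lung_yrs <- mutate(lung, os_yrs = time / 365.25)
km_fit <- survfit(Surv(os_yrs, status) ~ 1, data = lung_yrs)

# Target time points (in years) at which to evaluate conditional survival
prob_times <- seq(1, 2.5, by = 0.5)

# Each call returns a list; map_df row-binds the lists into one data frame
cs_table <- purrr::map_df(
  prob_times,
  ~ conditional_surv_est(basekm = km_fit, t1 = 0.5, t2 = .x)
) %>%
  mutate(years = prob_times) %>%
  select(years, everything())

cs_table
```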

years cs_est cs_lci cs_uci
1.0 0.58 0.49 0.66
1.5 0.36 0.27 0.45
2.0 0.16 0.10 0.25
2.5 0.07 0.02 0.15

A note on confidence interval estimation

The confidence intervals are based on a variation of the log-log transformation, also known as the “exponential” Greenwood formula, where the conditional survival estimate is substituted in for the traditional survival estimate in constructing the confidence interval.

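If \(\hat{S}(y|x)\) is the estimated conditional survival to time \(y\) given having already survived to time \(x\) (so here \(y\) denotes the target time itself rather than the additional time beyond \(x\)), then the 95% confidence interval is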

\[\hat{S}(y|x)^{\exp\left(\pm1.96\sqrt{\hat{L}(y|x)}\right)}\]

where

\[\hat{L}(y|x)=\frac{1}{\left[\log \hat{S}(y|x)\right]^2}\sum_{j:x \leq \tau_j \leq y}\frac{d_j}{(r_j-d_j)r_j}\]

and

\(\tau_j\) = distinct death time \(j\)

\(d_j\) = number of failures at death time \(j\)

\(r_j\) = number at risk at death time \(j\)
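The formulas above can be sketched in R using only the survival package. The helper `cs_ci_sketch` is hypothetical and written for illustration; it is not the package's implementation, and the days-to-years conversion and object names are assumptions.

```r
library(survival)

# Hypothetical helper (not part of condsurv): conditional survival to time y
# given survival to time x, with the log-log style interval described above
cs_ci_sketch <- function(fit, x, y, z = 1.96) {
  # Kaplan-Meier estimates S(x) and S(y), and their ratio
  s <- summary(fit, times = c(x, y))$surv
  cs <- s[2] / s[1]

  # Distinct death times tau_j with x <= tau_j <= y
  keep <- fit$time >= x & fit$time <= y & fit$n.event > 0
  d <- fit$n.event[keep]  # d_j: failures at tau_j
  r <- fit$n.risk[keep]   # r_j: number at risk at tau_j

  # L-hat(y | x) as defined above
  L_hat <- sum(d / ((r - d) * r)) / log(cs)^2

  # Because 0 < cs < 1, the larger exponent gives the lower bound
  ci <- cs^exp(c(1, -1) * z * sqrt(L_hat))

  list(cs_est = cs, cs_lci = ci[1], cs_uci = ci[2])
}

# Assumption: rescale survival time from days to years
lung_yrs <- transform(lung, os_yrs = time / 365.25)
km_fit <- survfit(Surv(os_yrs, status) ~ 1, data = lung_yrs)

res <- cs_ci_sketch(km_fit, x = 0.5, y = 1)
res
```

The sketch does not round its results and does not handle edge cases, such as no deaths in the interval or \(r_j = d_j\), where the sum is zero or undefined.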