This Picostat.com statistics page describes the kidney data set, which contains kidney catheter data. The kidney data set is part of the survival R package. To load it in R, first attach the package with library(survival) and then issue data("kidney") at the console. This loads the data into a variable called kidney. If R reports that the package or the kidney data set is not found, install the package with install.packages("survival") and then try loading the data again. If you need to download R, you can go to the R project website. A CSV (comma-separated values) version of the kidney R data set is also available for download. The size of this file is about 1,798 bytes.
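The loading steps above can be sketched as follows. This is a minimal example that assumes the survival package is already installed and that its data sets are lazy-loaded once the package is attached, so kidney can be referenced directly. The column names in the comment are the ones used by the model formulas further down this page.

```r
# install.packages("survival")  # uncomment only if the package is missing
library(survival)

# Assumption: kidney is visible after attaching survival (lazy loading)
str(kidney)    # structure: one row per observation
head(kidney)   # first rows, with columns such as id, time, status, age, sex
```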

Data on the recurrence times to infection, at the point of insertion of the catheter, for kidney patients using portable dialysis equipment. Catheters may be removed for reasons other than infection, in which case the observation is censored. Each patient has exactly 2 observations.
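As a quick check of the description above, the following sketch counts the observations per patient and tabulates the event indicator. It assumes the patient identifier is stored in the id column (the name used in the model formulas on this page) and that status follows the usual Surv() convention of 1 for an observed event and 0 for a censored one.

```r
library(survival)

# Number of rows contributed by each patient
obs_per_patient <- table(kidney$id)
all(obs_per_patient == 2)   # TRUE when every patient has exactly 2 rows

# Censored vs. observed infections (assumed coding: 0 = censored, 1 = infection)
table(kidney$status)
```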

This data set has often been used to illustrate the use of random effects (frailty) in a survival model. However, one of the males (id 21) is a large outlier, with much longer survival than his peers. If his observations are removed, no evidence remains for a random subject effect.
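One way to look at the outlier claim is to fit the same frailty model with and without patient 21 and compare the printed variance of the random effect. This is only a sketch: the object names fit_all, kidney_no21 and fit_no21 are illustrative, and the model formula is the one used in the example section of this page.

```r
library(survival)

# Gamma frailty model on all patients
fit_all <- coxph(Surv(time, status) ~ age + sex + disease + frailty(id),
                 data = kidney)

# Same model after dropping both observations of patient 21
kidney_no21 <- subset(kidney, id != 21)
fit_no21 <- coxph(Surv(time, status) ~ age + sex + disease + frailty(id),
                  data = kidney_no21)

# The printed output includes a "Variance of random effect" line for each fit
print(fit_all)
print(fit_no21)
```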


id: patient id
time: time to infection or censoring
status: event status (1=infection, 0=censored)
age: in years
sex: 1=male, 2=female
disease: disease type (0=GN, 1=AN, 2=PKD, 3=Other)
frail: frailty estimate from original paper
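To show how these variables are typically used together, the sketch below attaches readable labels to sex and builds the survival response from time and status. The names kd, sex_label and surv_obj are illustrative, and the code assumes sex is stored with the numeric codes 1 and 2 as listed above.

```r
library(survival)

kd <- kidney  # work on a copy so the original data set is left untouched

# Label the sex codes (assumed: 1 = male, 2 = female)
kd$sex_label <- factor(kd$sex, levels = c(1, 2), labels = c("male", "female"))
table(kd$sex_label)

# Survival response: censored times are printed with a trailing "+"
surv_obj <- with(kd, Surv(time, status))
head(surv_obj)
```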


The original paper ignored the issue of tied times and so is not exactly reproduced by the survival package.
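The effect of tie handling can be explored with the ties argument of coxph. The sketch below first checks that repeated times exist and then compares the Efron and Breslow approximations on the model without frailty. It is not meant to reproduce the paper: the source only says that ties were ignored there, so the two methods are shown purely to illustrate how much the estimates depend on the choice.

```r
library(survival)

# Are there repeated values among the recorded times?
any(duplicated(kidney$time))

# Same model under two approximations for tied event times
fit_efron   <- coxph(Surv(time, status) ~ age + sex + disease,
                     data = kidney, ties = "efron")
fit_breslow <- coxph(Surv(time, status) ~ age + sex + disease,
                     data = kidney, ties = "breslow")

# Coefficients side by side
cbind(efron = coef(fit_efron), breslow = coef(fit_breslow))
```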


CA McGilchrist, CW Aisbett (1991), Regression with frailty in survival analysis. Biometrics 47, 461–66.


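library(survival)
# Cox model with a per-patient frailty term (gamma distribution by default)
kfit <- coxph(Surv(time, status) ~ age + sex + disease + frailty(id), data = kidney)
# Same model without the frailty term
kfit0 <- coxph(Surv(time, status) ~ age + sex + disease, data = kidney)
# Gaussian frailty instead of gamma
kfitm1 <- coxph(Surv(time, status) ~ age + sex + disease +
                  frailty(id, dist = 'gauss'), data = kidney)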

Dataset imported from https://www.r-project.org.

