This R-data statistics page describes the faithful data set, which contains the Old Faithful Geyser Data. The data set is part of the datasets R package, which ships with R and is attached by default, so faithful is normally available as soon as R starts. To load it explicitly, run data("faithful") at the console; this creates a data frame called faithful. If R reports that faithful cannot be found, check your R installation: datasets is a base package bundled with R, so install.packages("datasets") will not fix the problem. If you need to download R, go to the R project website. A CSV (comma-separated values) version of the faithful data set is also available for download; the file is about 2,275 bytes.
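As a minimal sketch of the loading step and of working with a CSV copy, the snippet below loads the data set, writes it to a CSV file, and reads it back. The file location is an assumption for illustration only: tempfile() stands in for wherever a downloaded CSV copy would be saved, and the name faithful_csv is hypothetical.

```r
# Load the data set from the datasets package (attached by default in R).
data("faithful", package = "datasets")

# Hypothetical location for a CSV copy; tempfile() is used only for illustration.
csv_path <- tempfile(fileext = ".csv")
write.csv(faithful, csv_path, row.names = FALSE)

# Read the CSV back and compare it with the packaged data frame.
faithful_csv <- read.csv(csv_path)
all.equal(faithful$eruptions, faithful_csv$eruptions)
all.equal(faithful$waiting, faithful_csv$waiting)
```

Passing row.names = FALSE keeps the CSV to the two data columns, so the file read back has the same column layout as the original data frame.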
Old Faithful Geyser Data
Waiting time between eruptions and the duration of the eruption for the Old Faithful geyser in Yellowstone National Park, Wyoming, USA.
A data frame with 272 observations on 2 variables.
R project statistics dataset table
eruptions || Eruption time in mins
waiting || Waiting time to next eruption (in mins)
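The format described above can be checked directly in R. This is a small sketch using only base functions; it assumes faithful has not been modified in the current session.

```r
# Inspect the structure of the data frame: column names, types, first values.
str(faithful)

# Number of rows (observations) and columns (variables).
dim(faithful)

# Per-column summary statistics, in minutes.
summary(faithful)
```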
A closer look at faithful$eruptions reveals that these are heavily rounded times originally in seconds, where multiples of 5 are more frequent than expected under non-human measurement. For a better version of the eruption times, see the example below.
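One rough way to look at the rounding is to convert the durations to whole seconds and measure how often they land on a multiple of 5. The benchmark of about one in five is an assumption made here for illustration (it supposes the final digit would otherwise be spread evenly), and the names secs and observed_share are hypothetical.

```r
# Convert eruption durations from minutes to whole seconds.
secs <- round(60 * faithful$eruptions)

# Share of durations that are exact multiples of 5 seconds.
# Under the evenly-spread-digit assumption, roughly 0.2 would be expected.
observed_share <- mean(secs %% 5 == 0)
observed_share

# Tabulate the final digit of each duration to see which digits dominate.
table(secs %% 10)
```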
There are many versions of this dataset around: Azzalini and Bowman (1990) use a more complete version.
Härdle, W. (1991). Smoothing Techniques with Implementation in S. New York: Springer.
Azzalini, A. and Bowman, A. W. (1990). A look at some data on the Old Faithful geyser. Applied Statistics 39, 357–365.
See also: geyser in package MASS for the Azzalini–Bowman version.
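A short sketch of loading the MASS version alongside faithful for comparison. MASS is a recommended package that usually ships with R, but that is not guaranteed on every installation, so the code checks for it first. The column names and definitions in geyser may differ from those in faithful, so inspect them before comparing values.

```r
# Guard against installations where MASS is not available.
if (requireNamespace("MASS", quietly = TRUE)) {
  data("geyser", package = "MASS")

  # Compare the size and column names of the two versions.
  print(dim(faithful))
  print(dim(geyser))
  print(names(geyser))
}
```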
f.tit <- "faithful data: Eruptions of Old Faithful"

# Convert eruption times to seconds and round to whole seconds
ne60 <- round(e60 <- 60 * faithful$eruptions)
all.equal(e60, ne60)             # relative diff. ~ 1/10000
table(zapsmall(abs(e60 - ne60))) # 0, 0.02 or 0.04
# Store the improved eruption times (in minutes) as a third column
faithful$better.eruptions <- ne60 / 60
te <- table(ne60)
te[te >= 4]                      # (too) many multiples of 5 !
plot(names(te), te, type = "h", main = f.tit, xlab = "Eruption time (sec)")

# Scatter plot of the two original columns with a lowess smoother
plot(faithful[, -3], main = f.tit,
     xlab = "Eruption time (min)",
     ylab = "Waiting time to next eruption (min)")
lines(lowess(faithful$eruptions, faithful$waiting, f = 2/3, iter = 3),
      col = "red")
Dataset imported from https://www.r-project.org.