R Dataset / Package datasets / anscombe


This page describes the anscombe data set, Anscombe's Quartet of ‘Identical’ Simple Linear Regressions. The data set is part of the datasets package, which ships with every standard R installation and is normally attached at startup. To load the data explicitly, run data("anscombe") at the console; this creates a data frame called anscombe in your workspace. Because datasets is a base package, it cannot be installed separately with install.packages("datasets"); if R reports that anscombe is not found, try library(datasets) first and, failing that, check your R installation. R itself can be downloaded from the R project website. A CSV (comma-separated values) version of the anscombe data set is also available for download; the file is about 364 bytes.
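As a minimal sketch of the loading step described above (the package argument is optional and is shown only to make the source of the data explicit):

```r
# Load the data set from the datasets package that ships with R
data("anscombe", package = "datasets")

# Inspect the structure: a data frame with columns x1..x4 and y1..y4
str(anscombe)
head(anscombe, 3)
```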

Anscombe's Quartet of ‘Identical’ Simple Linear Regressions


Four x-y datasets which have the same traditional statistical properties (mean, variance, correlation, regression line, etc.), yet are quite different.
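The shared summary statistics can be checked directly. The sketch below computes the mean, variance and correlation for each x-y pair; pair_stats and stats_tbl are illustrative names, not part of any package.

```r
data("anscombe", package = "datasets")

# pair_stats is an illustrative helper: summary statistics for pair i
pair_stats <- function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  c(mean_x = mean(x), var_x = var(x),
    mean_y = mean(y), var_y = var(y),
    cor_xy = cor(x, y))
}

# One column per x-y pair, one row per statistic
stats_tbl <- sapply(1:4, pair_stats)
colnames(stats_tbl) <- paste0("pair", 1:4)
round(stats_tbl, 3)
```

The columns of the resulting table should agree closely with one another, even though the four scatterplots look nothing alike.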




A data frame with 11 observations on 8 variables.

Variable            Description
x1 == x2 == x3      the integers 4:14, specially arranged
x4                  values 8 and 19
y1, y2, y3, y4      numbers in (3, 12.5) with mean 7.5 and sdev 2.03
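The column descriptions above can be inspected with a few base R calls. This is a sketch only; the variable names (same_x, is_4_to_14, x4_counts, y_summary) are chosen here for illustration.

```r
data("anscombe", package = "datasets")

# x1, x2 and x3 hold the same values in the same order
same_x <- identical(anscombe$x1, anscombe$x2) &&
  identical(anscombe$x2, anscombe$x3)

# Sorted, those values are the integers 4 to 14
is_4_to_14 <- all(sort(anscombe$x1) == 4:14)

# x4 takes only two distinct values
x4_counts <- table(anscombe$x4)

# Mean, standard deviation and range of each y column
y_summary <- sapply(anscombe[paste0("y", 1:4)], function(y) {
  c(mean = mean(y), sd = sd(y), min = min(y), max = max(y))
})

print(same_x)
print(is_4_to_14)
print(x4_counts)
print(round(y_summary, 2))
```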


Tufte, Edward R. (1989) The Visual Display of Quantitative Information, 13–14. Graphics Press.


Anscombe, Francis J. (1973) Graphs in statistical analysis. American Statistician, 27, 17–21.


require(stats); require(graphics)
summary(anscombe)

##-- now some "magic" to do the 4 regressions in a loop:
ff <- y ~ x
mods <- setNames(as.list(1:4), paste0("lm", 1:4))
for(i in 1:4) {
  ff[2:3] <- lapply(paste0(c("y","x"), i), as.name)
  ## or   ff[[2]] <- as.name(paste0("y", i))
  ##      ff[[3]] <- as.name(paste0("x", i))
  mods[[i]] <- lmi <- lm(ff, data = anscombe)
}

## See how close they are (numerically!)
sapply(mods, coef)
lapply(mods, function(fm) coef(summary(fm)))

## Now, do what you should have done in the first place: PLOTS
op <- par(mfrow = c(2, 2), mar = 0.1+c(4,4,1,1), oma =  c(0, 0, 2, 0))
for(i in 1:4) {
  ff[2:3] <- lapply(paste0(c("y","x"), i), as.name)
  plot(ff, data = anscombe, col = "red", pch = 21, bg = "orange", cex = 1.2,
       xlim = c(3, 19), ylim = c(3, 13))
  abline(mods[[i]], col = "blue")
}
mtext("Anscombe's 4 Regression data sets", outer = TRUE, cex = 1.5)
par(op)  # restore the previous graphics settings
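The example above builds each model formula by overwriting parts of an existing formula object. As an alternative sketch, not taken from the original example, the same four fits can be produced by building each formula from strings with reformulate() from the stats package. The names fits and fit_tbl are illustrative.

```r
data("anscombe", package = "datasets")

# Build y_i ~ x_i from column names instead of editing a formula in place
fits <- lapply(1:4, function(i) {
  lm(reformulate(paste0("x", i), response = paste0("y", i)),
     data = anscombe)
})
names(fits) <- paste0("lm", 1:4)

# Intercept, slope and R-squared of the four fits, side by side
fit_tbl <- rbind(
  sapply(fits, coef),
  sapply(fits, function(fm) summary(fm)$r.squared)
)
rownames(fit_tbl) <- c("intercept", "slope", "r_squared")
round(fit_tbl, 3)
```

Either approach should give near-identical coefficients across the four data sets, which is why the plots are needed to see how the data sets differ.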

Dataset imported from https://www.r-project.org.

Attachment: dataset-84646.csv (364 bytes)
Dataset License: GNU General Public License v2.0
Documentation License: GNU General Public License v2.0

This documentation is licensed under GPLv3 or later.