R Dataset / Package robustbase / hbk

Documentation

This Picostat.com statistics page describes the hbk data set, Hawkins, Bradu and Kass's Artificial Data, which is found in the robustbase R package. To load it in R, issue the command data("hbk", package = "robustbase") at the console, or attach the package first with library(robustbase) and then run data("hbk"). Either way, the data is loaded into a variable called hbk. If R reports that the package or the data set cannot be found, install the package with install.packages("robustbase") and then try loading the data again. If you need to download R, you can go to the R project website. A CSV (comma-separated values) version of the hbk data set is also available for download; the file is about 1,202 bytes.
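The loading steps described above can be sketched as follows. The requireNamespace() guard is an addition for illustration and assumes a CRAN mirror is configured in case the package has to be installed.

```r
# Install robustbase only if it is not already available
if (!requireNamespace("robustbase", quietly = TRUE)) {
  install.packages("robustbase")
}

# Load the data set without attaching the whole package
data("hbk", package = "robustbase")

# Quick sanity check of what was loaded
print(dim(hbk))      # number of rows and columns
print(head(hbk, 3))  # first three observations
```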


Hawkins, Bradu, Kass's Artificial Data

Description

Artificial data set generated by Hawkins, Bradu, and Kass (1984). It consists of 75 observations in four dimensions (one response and three explanatory variables) and provides a good example of the masking effect. The first 14 observations are outliers, created in two groups: 1–10 and 11–14. With classical methods, only observations 12, 13 and 14 appear as outliers; the remaining outliers are masked, but can easily be unmasked using robust distances computed by, e.g., the MCD estimator, covMcd().
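The masking effect can be illustrated by comparing classical Mahalanobis distances with robust distances based on the MCD estimate. This is a minimal sketch, not part of the original documentation: the 97.5% chi-squared cutoff is a common convention chosen here for illustration, and the seed is set only because covMcd() uses random subsampling.

```r
library(robustbase)
data(hbk, package = "robustbase")

X <- data.matrix(hbk[, 1:3])          # the three explanatory variables
cutoff <- qchisq(0.975, df = ncol(X)) # assumed cutoff for squared distances

# Classical squared Mahalanobis distances (sample mean and covariance)
d2_classical <- mahalanobis(X, colMeans(X), cov(X))

# Robust squared distances based on the MCD location and scatter
set.seed(1)
mcd <- covMcd(X)
d2_robust <- mahalanobis(X, mcd$center, mcd$cov)

flag_classical <- which(d2_classical > cutoff)
flag_robust    <- which(d2_robust > cutoff)

print(unname(flag_classical))  # observations flagged by the classical distances
print(unname(flag_robust))     # observations flagged by the robust distances
```

Because the outliers inflate the classical mean and covariance, their classical distances stay small, whereas the robust distances are computed from estimates that are largely unaffected by them.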

Usage

data(hbk)

Format

A data frame with 75 observations on 4 variables, where the last variable is the dependent one.

X1: x[,1]
X2: x[,2]
X3: x[,3]
Y: y
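Given this layout, the data frame can be split into a matrix of explanatory variables and a response vector. The names x and y below simply mirror the notation in the format description and are chosen for illustration.

```r
data(hbk, package = "robustbase")

# Explanatory variables as a numeric matrix (columns X1, X2, X3)
x <- data.matrix(hbk[, c("X1", "X2", "X3")])

# Response: the last column of the data frame
y <- hbk$Y

print(dim(x))     # rows and columns of the design matrix
print(length(y))  # number of responses
```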

Note

This data set is also available in package wle as artificial.

Source

Hawkins, D.M., Bradu, D., and Kass, G.V. (1984) Location of several outliers in multiple regression data using elemental sets. Technometrics 26, 197–208.

P. J. Rousseeuw and A. M. Leroy (1987) Robust Regression and Outlier Detection; Wiley, p.94.

Examples

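library(robustbase)  # provides hbk and covMcd()
data(hbk)
plot(hbk)            # scatterplot matrix of all four variables
summary(lm.hbk <- lm(Y ~ ., data = hbk))  # classical least-squares fit
hbk.x <- data.matrix(hbk[, 1:3])          # explanatory variables only
(cHBK <- covMcd(hbk.x))                   # robust MCD location and scatter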
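As a possible follow-up to the example above, the least-squares fit can be contrasted with a robust regression fit. The sketch below uses lmrob() from robustbase, which is not part of the original example; the seed and the 0.1 weight threshold are arbitrary choices for illustration.

```r
library(robustbase)
data(hbk, package = "robustbase")

ls_fit <- lm(Y ~ ., data = hbk)      # classical least squares
set.seed(1)                          # lmrob() uses random subsampling internally
rob_fit <- lmrob(Y ~ ., data = hbk)  # robust MM-type regression

# Robustness weights close to 0 mark observations the robust fit downweights
rob_w <- weights(rob_fit, type = "robustness")
downweighted <- which(rob_w < 0.1)   # 0.1 is an arbitrary illustrative threshold
print(unname(downweighted))

# Compare the coefficient estimates side by side
print(cbind(LS = coef(ls_fit), robust = coef(rob_fit)))
```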

Dataset imported from https://www.r-project.org.

Dataset License: GNU General Public License v2.0
Documentation License: GNU General Public License v2.0