OpenIntro Statistics Dataset - photo_classify

Documentation

This dataset was taken from the list of OpenIntro dataset files found at https://www.openintro.org/data/.

OpenIntro features a number of free books that can be used in high school and AP statistics courses. The license on these datasets is currently unknown. You can find out more about OpenIntro at https://www.openintro.org.

photo_classify

Photo classifications: fashion or not

This is a simulated data set that compares photo classifications made by a machine learning algorithm with the true classifications of those photos. While the data are not real, they resemble performance that would be reasonable to expect in a well-built classifier.

Variables

  • mach_learn - The prediction by the machine learning system as to whether the photo is about fashion or not.
  • truth - The actual classification of the photo by a team of humans.

Details

The hypothetical ML algorithm has a precision of 90%, meaning that of those photos it claims are fashion, about 90% of them are actually about fashion. The recall of the ML algorithm is about 64%, meaning that of the photos that are about fashion, it correctly predicts that they are about fashion about 64% of the time.
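
As a rough illustration of the two definitions above, the following Python sketch computes precision and recall from paired prediction and truth labels. The label strings ("pred_fashion", "pred_not", "fashion", "not"), the function name precision_recall, and the toy counts are assumptions made for this example; they are not taken from the dataset file, so check the actual values in the CSV before reusing it.

```python
from collections import Counter


def precision_recall(predicted, actual,
                     positive_pred="pred_fashion", positive_true="fashion"):
    """Return (precision, recall) for paired prediction / truth labels.

    The default label strings are assumptions for illustration only.
    """
    counts = Counter(zip(predicted, actual))
    # True positives: predicted fashion and actually fashion.
    tp = sum(n for (p, t), n in counts.items()
             if p == positive_pred and t == positive_true)
    # False positives: predicted fashion but actually not fashion.
    fp = sum(n for (p, t), n in counts.items()
             if p == positive_pred and t != positive_true)
    # False negatives: actually fashion but predicted not fashion.
    fn = sum(n for (p, t), n in counts.items()
             if p != positive_pred and t == positive_true)
    precision = tp / (tp + fp)  # share of "fashion" predictions that are right
    recall = tp / (tp + fn)     # share of true fashion photos that are found
    return precision, recall


# Toy labels invented for illustration (not rows from the dataset):
# 10 photos predicted as fashion (9 correct), 90 predicted as not (5 missed).
predicted = ["pred_fashion"] * 10 + ["pred_not"] * 90
actual = ["fashion"] * 9 + ["not"] * 1 + ["fashion"] * 5 + ["not"] * 85

precision, recall = precision_recall(predicted, actual)
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
```

With these toy counts, precision is 9/10 and recall is 9/14, which round to the 90% and 64% figures quoted above. The sketch does not guard against a zero denominator, which would occur if no photo were predicted as fashion or none were truly fashion.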

Source

The data are simulated / hypothetical.

Taken from: https://www.openintro.org/data/index.php?data=photo_classify.


Attachment: dataset-151428313.csv (32.33 KB)
Dataset License: Unknown
Documentation License: No license (All rights reserved)