On this Picostat.com statistics page, you will find information about the student data set, which contains hypothetical student-level data. The student data set is part of the mediation R package. To load it, first attach the package with library(mediation) and then run data("student") at the R console, or run data("student", package = "mediation") directly. Either command loads the data into a variable called student. If R reports that the package or the data set cannot be found, install the package with install.packages("mediation") and then reload the data. If you need to download R, you can get it from the R project website. A CSV (comma-separated values) version of the student data set is also available for download; the file is about 283,952 bytes.
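The loading steps can be collected into a short script. This is a minimal sketch: load_student is a hypothetical helper name introduced here for illustration, not a function from the mediation package. It returns NULL instead of stopping when the package is not installed.

```r
# Hypothetical helper (not part of mediation): load the student data
# without attaching the package. Returns the data, or NULL with a
# message when the mediation package is not installed.
load_student <- function() {
  if (!requireNamespace("mediation", quietly = TRUE)) {
    message('Package "mediation" not found; try install.packages("mediation").')
    return(NULL)
  }
  env <- new.env()
  # Load the data set into a private environment rather than the workspace.
  utils::data("student", package = "mediation", envir = env)
  env$student
}

student <- load_student()
if (!is.null(student)) {
  str(student)  # overview of the columns and their types
}
```

Loading into a separate environment keeps the helper from overwriting an existing object named student in the workspace until you assign the result yourself.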
Hypothetical student-level data
The original data source is the Education Longitudinal Study of 2002. To avoid disclosing individually identifiable information, we generated hypothetical student-level data using a multiple imputation method. The Education Longitudinal Study of 2002 used a two-stage sample selection process. In the first stage, a national sample of schools was selected using stratified probability proportional to size (PPS) sampling, and contacting the schools yielded 1,221 eligible public, Catholic, and other private schools from a population of approximately 27,000 schools containing 10th grade students. Of the eligible schools, 752 participated in the study. In the second stage, a sample of approximately 26 sophomores was selected from within each of the participating public and private schools. Each school was asked to provide a list of 10th grade students, and quality assurance (QA) checks were performed on each list that was received.
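To make the two-stage design concrete, here is a rough sketch of PPS selection followed by within-school sampling. It is not the procedure used by the Education Longitudinal Study: stratification is omitted, systematic PPS is just one common way to draw such a sample, and pps_systematic and the enrollment figures are hypothetical names and toy numbers invented for illustration.

```r
# Systematic PPS sketch: pick n units with probability proportional to size.
# A unit larger than the sampling interval could be picked more than once.
pps_systematic <- function(size, n) {
  step   <- sum(size) / n                  # sampling interval
  start  <- runif(1, 0, step)              # random start within the first interval
  points <- start + step * (0:(n - 1))     # equally spaced selection points
  # Map each point to the unit whose cumulative size range contains it.
  findInterval(points, c(0, cumsum(size)))
}

set.seed(2002)
# Toy population: 200 schools with made-up 10th grade enrollments.
enrollment <- sample(50:400, 200, replace = TRUE)

# Stage 1: select 20 schools with probability proportional to enrollment.
schools <- pps_systematic(enrollment, n = 20)

# Stage 2: draw up to 26 students (by list position) within each selected school.
students <- lapply(schools, function(s) {
  sample(seq_len(enrollment[s]), min(26, enrollment[s]))
})
```

Larger schools cover a wider stretch of the cumulative enrollment scale, so a selection point is more likely to land on them; that is what makes the first stage proportional to size.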
A data matrix with 9,679 rows and 17 columns, containing no missing values. The data are provided only for illustrative purposes and not for inference about education effectiveness, for which the original data source should be consulted.
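The stated shape and completeness can be verified after loading. The following is a small sketch; check_complete is a hypothetical helper name, and the call on the student data is shown as a comment because it assumes the mediation package is installed.

```r
# Hypothetical helper: TRUE when a data frame has the expected number of
# rows and columns and contains no missing values.
check_complete <- function(df, n_rows, n_cols) {
  nrow(df) == n_rows && ncol(df) == n_cols && !any(is.na(df))
}

# Small self-contained demonstration on toy data.
toy <- data.frame(a = 1:3, b = c(0, 1, 0))
check_complete(toy, 3, 2)

# Applied to the student data (requires the mediation package):
# data("student", package = "mediation")
# check_complete(student, 9679, 17)
```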
Indicator variable for fight at school. 1 = fight.
Indicator variable for attachment to school. 1 = like.
Indicator variable for part-time job. 1 = work.
Measure of math score.
Frequency with which the student was late for school. 5 levels.
Indicator variable for coeducation. 1 = coeducation.
Measure of student morale in the school. 4 levels.
Indicator variable for gender. 1 = female.
Total family income. 13 levels.
Percent of 10th grade students receiving free lunch. 1 to 7 levels.
Parents' highest level of education. 8 levels.
Indicator variable for Catholic school. 1 = Catholic school.
The complete student-level data is available from the data archives at www.icpsr.umich.edu/
United States Department of Education. National Center for Education Statistics
Dataset imported from https://www.r-project.org.