Technical Notes for Racial/Ethnic Score Gap Tool
The Racial/Ethnic Score Gap Tool lets users explore how various factors, drawn from NAEP survey question variables, relate to the score gaps of selected racial/ethnic student groups. At present, this tool uses 2019 NAEP reading and mathematics data at grade 12 and 2022 NAEP reading and mathematics data at grades 4 and 8, allowing users to see how average score gaps between White–Black, White–Hispanic, and Asian–White students change when additional factors that may relate to student performance are taken into account.
The analyses shown in the tool are exploratory. The tool allows users to examine whether there are associations between select factors or variables collected by NAEP and the White–Black, White–Hispanic, and Asian–White average student score gaps. The tool does not assess whether the relationship between factors and score gaps is causal (i.e., this analysis does not and cannot test whether the select factor causes differential achievement among the student racial/ethnic groups explored here).
NAEP does not provide scores for individual students or schools; instead, it offers results for populations of students (e.g., fourth-graders) and subgroups of those populations (e.g., female students, Hispanic students), as well as results regarding subject-matter achievement, instructional experiences, and school environment. NAEP results are based on school and student samples that are carefully designed to accurately represent student populations of interest.
NAEP Survey Question Variable Information
Variable selection
Exploratory analyses using 2019 reading data at grade 12 were conducted prior to the final variable selection for the models in the regression analyses. We began with a larger list of potential predictors that could fit into the categories of interest: socioeconomic status (individual); socioeconomic status (school environment); student postsecondary plans and academic behaviors; and student noncognitive factors. Variables that had small associations when controlling for other variables in the model were excluded. These variables were:
 PARENTS (Parents in the home)
 UTOL4 (School location, i.e., city, suburb, town, or rural)
 PCTBLKC (Percentage of Black students in school)
 PCTHSPC (Percentage of Hispanic students in school)
 C087302 (Percentage of parents who attend parent-teacher conferences in school)
 R850901 (How often for English/language arts class discuss something the whole class has read)
 R850902 (How often for English/language arts class work in pairs/small groups to discuss something we read)
 R850903 (How often for English/language arts class discuss different interpretations of what we have read)
 R851003 (How often teachers ask students to critique the author's craft or technique in English/language arts class)
 R850202 (How often teachers ask students to interpret the meaning of the passage)
 RDACT12 (How often teachers ask students to evaluate/analyze/critique when reading)
A handful of interaction terms were of interest from the list of remaining variables. The following interactions were evaluated for potential inclusion in the models:
 SQCATR7 (Index of confidence in reading knowledge and skills) x B013801 (Number of books in home)
 SQCATR7 (Index of confidence in reading knowledge and skills) x C087602 (Proportion of last year's graduates attending a four-year college)
 B013801 (Number of books in home) x C087602 (Proportion of last year's graduates attending a four-year college)
 APIBELA (Taking/took an Advanced Placement (AP) English or language arts course and/or International Baccalaureate Language A1 course) x C087602 (Proportion of last year's graduates attending a four-year college)
 SRACE10 (Race/ethnicity) x B013801 (Number of books in home)
 SRACE10 (Race/ethnicity) x SQCATR7 (Index of confidence in reading knowledge and skills)
 SRACE10 (Race/ethnicity) x C087602 (Proportion of last year's graduates attending a four-year college)
 SRACE10 (Race/ethnicity) x APIBELA (Taking/took an Advanced Placement (AP) English or language arts course and/or International Baccalaureate Language A1 course)
 SRACE10 (Race/ethnicity) x B018101 (Days absent from school in the last month)
 SRACE10 (Race/ethnicity) x PARED (Parental education level)
No interactions showed a meaningful increase in explained variance, so none were retained. The exploratory analysis was used to guide variable selection for other grade/subject combinations.
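The variance-explained screening described above can be illustrated with a toy ordinary least squares comparison (made-up data; this is not the NAEP analysis itself): fit a model with and without an interaction term and compare R².

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least squares fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

# Simulated data with no true interaction between x1 and x2
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
y = 2 * x1 + x2 + rng.normal(size=200)

base = np.column_stack([np.ones(200), x1, x2])  # main effects only
with_int = np.column_stack([base, x1 * x2])     # add the interaction term
# When the interaction adds essentially no explained variance,
# it would be dropped from the model.
```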
Tables 1 through 6 below show the variables included in the Racial/Ethnic Score Gap Tool. Tables 1 and 2 include variables for fourth- and eighth-grade students in the 2022 NAEP reading assessment, and table 3 includes variables for twelfth-grade students in the 2019 NAEP reading assessment. Tables 4 and 5 include variables for fourth- and eighth-grade students in the 2022 mathematics assessment, and table 6 includes variables for twelfth-grade students in the 2019 mathematics assessment. Click the links to see detailed results for each variable in the NAEP Data Explorer, including percentage distributions and average scores.
Please note that while all categories for each variable are displayed in the data explorer, some categories were combined for purposes of running the regression models. Two or more categories were collapsed in the following predictor variables included in the regression: COMPINT, APIBELA (grade 12 only), and SCHNSLP (grade 12 only). Categories were collapsed to help with interpretation and because some categories had very small percentages of students.
 COMPINT: This predictor variable was collapsed from four categories to two categories, where "both computer/tablet and Internet" was one category, and all other responses were combined in a second category.
 APIBELA (grade 12 only): This predictor variable was collapsed from four categories to two categories, where "Neither" was one category, and all other responses were combined in a second category.
 SCHNSLP (grade 12 only): This five-category derived variable was constructed specifically for this secondary report and was based on responses to NAEP variables C038301, C051401, and C051651 as follows:
 If all responses to above variables are missing, then "Missing"; otherwise
 If C038301 = No, then 1 = "School does not participate"; otherwise
 If C051401 = All students, then 6 = "All students"; otherwise
 C051651 is categorized into:
 If omit, then "Missing".
 2 = "0–25%", 3 = "26–50%", 4 = "51–75%", 5 = "76–100%".
 "All students" from C051401 and "76–100%" from C051651 were further combined into a single category.
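The derivation rules above can be sketched as a small function. The response strings and the function itself are illustrative only, not NAEP's actual processing code:

```python
def derive_schnslp(c038301, c051401, c051651):
    """Sketch of the SCHNSLP derivation rules described above.
    None marks a missing/omitted response; strings are hypothetical labels."""
    if c038301 is None and c051401 is None and c051651 is None:
        return "Missing"
    if c038301 == "No":
        return "School does not participate"
    if c051401 == "All students":
        return "All students/76-100%"   # combined top category
    if c051651 is None:
        return "Missing"
    ranges = {"0-25%": "0-25%", "26-50%": "26-50%", "51-75%": "51-75%",
              "76-100%": "All students/76-100%"}
    return ranges.get(c051651, "Missing")
```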
SCHNSLP is not included in the 2022 grades 4 and 8 regressions because a skip pattern was introduced in 2019 for item VH240216: if a school indicates that all students receive free lunch because of a special program, then VH240218 (i.e., the item asking what percentage of students in the school receive free lunch) is automatically skipped. Because the majority of schools offered such a program in 2022 due to the pandemic, the meaning of the variable changed dramatically between 2019 and 2022. This change also limits the utility of the variable in 2022 because nearly three-quarters of students fall into the "All students" category.
Additional details about the survey questionnaire variables that were used to construct the derived variable SCHNSLP are available in the NAEP Data Explorer by using the NAEP ID to locate the variable of interest for grade 12 reading and mathematics.
TABLE 1 Selected variables included in the regression model for fourth-grade students in NAEP reading: 2022

NAEP Variable ID | Variable Description
GENDER (DSEX) | Gender
SRACE10 | Race/ethnicity
SLUNCH3 | National School Lunch Program eligibility (student level)
COMPINT | Have internet access and/or computer/tablet at home
B013801 | Number of books in home
SQCATR5 | Index of academic self-discipline
SQCATR7 | Index of confidence in reading knowledge and skills
SQCTR10 | Index of mastery goals in reading
TABLE 2 Selected variables included in the regression model for eighth-grade students in NAEP reading: 2022

NAEP Variable ID | Variable Description
GENDER (DSEX) | Gender
SRACE10 | Race/ethnicity
SLUNCH3 | National School Lunch Program eligibility (student level)
COMPINT | Have internet access and/or computer/tablet at home
B013801 | Number of books in home
PARED | Highest level of parental education
SQCATR4 | Index of persistence in learning
SQCATR5 | Index of academic self-discipline
SQCATR7 | Index of confidence in reading knowledge and skills
SQCTR10 | Index of mastery goals in reading
TABLE 3 Selected variables included in the regression model for twelfth-grade students in NAEP reading: 2019

NAEP Variable ID | Variable Description
GENDER (DSEX) | Gender
SRACE10 | Race/ethnicity
SCHNSLP^{1} | Percentage of students eligible for the National School Lunch Program (NSLP) in school
COMPINT | Have internet access and/or computer/tablet at home
B013801 | Number of books in home
PARED | Highest level of parental education
C087602 | Percentage of last year's graduates enrolled in a four-year college
B035705 | Student applied to a four-year college
B035702 | Student submitted the Free Application for Federal Student Aid (FAFSA)
B018101 | Number of days absent from school in the last month
APIBELA | Student is taking or took an Advanced Placement English or language arts course and/or an International Baccalaureate Language A1 course
SQCATR4 | Index of persistence in learning
SQCATR5 | Index of academic self-discipline
SQCATR7 | Index of confidence in reading knowledge and skills
SQCATR9 | Index of performance goals in reading
SQCTR10 | Index of mastery goals in reading

^{1} The derived variable SCHNSLP, constructed specifically for this report, is not available in the NAEP Data Explorer. The following individual variables used to construct SCHNSLP are available in the data explorer: C038301, C051401, and C051651. Please note that NAEP variable ID C051651 shown in the data explorer is the nine-category, noncollapsed version.

TABLE 4 Selected variables included in the regression model for fourth-grade students in NAEP mathematics: 2022

NAEP Variable ID | Variable Description
GENDER (DSEX) | Gender
SRACE10 | Race/ethnicity
SLUNCH3 | National School Lunch Program eligibility (student level)
COMPINT | Have internet access and/or computer/tablet at home
B013801 | Number of books in home
SQCATM5 | Index of academic self-discipline
SQCATM7 | Index of confidence in mathematics knowledge and skills
SQCTM10 | Index of mastery goals in mathematics
TABLE 5 Selected variables included in the regression model for eighth-grade students in NAEP mathematics: 2022

NAEP Variable ID | Variable Description
GENDER (DSEX) | Gender
SRACE10 | Race/ethnicity
SLUNCH3 | National School Lunch Program eligibility (student level)
COMPINT | Have internet access and/or computer/tablet at home
B013801 | Number of books in home
PARED | Highest level of parental education
MATCRS8 | Math class taken at eighth grade
SQCATM4 | Index of persistence in learning
SQCATM5 | Index of academic self-discipline
SQCATM7 | Index of confidence in mathematics knowledge and skills
SQCTM10 | Index of mastery goals in mathematics
TABLE 6 Selected variables included in the regression model for twelfth-grade students in NAEP mathematics: 2019

NAEP Variable ID | Variable Description
GENDER (DSEX) | Gender
SRACE10 | Race/ethnicity
SCHNSLP^{1} | Percentage of students eligible for the National School Lunch Program (NSLP) in school
COMPINT | Have internet access and/or computer/tablet at home
B013801 | Number of books in home
PARED | Highest level of parental education
C087602 | Percentage of last year's graduates enrolled in a four-year college
B035705 | Student applied to a four-year college
B035702 | Student submitted the Free Application for Federal Student Aid (FAFSA)
B018101 | Number of days absent from school in the last month
APIBMAT | Student is taking or took an Advanced Placement calculus or statistics course and/or an International Baccalaureate Mathematics course
MATCRST | Highest math course taken
SQCATM4 | Index of persistence in learning
SQCATM5 | Index of academic self-discipline
SQCATM7 | Index of confidence in mathematics knowledge and skills

^{1} The derived variable SCHNSLP, constructed specifically for this report, is not available in the NAEP Data Explorer. The following individual variables used to construct SCHNSLP are available in the data explorer: C038301, C051401, and C051651. Please note that NAEP variable ID C051651 shown in the data explorer is the nine-category, noncollapsed version.

Indices related to students' attitudes toward learning
While some survey questions are analyzed and reported individually (for example, number of books in students' homes), several questions on the same topic are combined into an index measuring a single underlying construct or concept. More information about the 2019 (grade 12) and 2022 (grades 4 and 8) NAEP reading and mathematics indices and the individual survey questions they comprise can be found in the NAEP Data Explorer by clicking the links in tables 1 through 6 above.
The creation of 2019 and 2022 indices involved the following four main steps:
 Selection of constructs of interest. The selection of constructs of interest to be measured through the survey questionnaires was guided in part by the National Assessment Governing Board framework for collection and reporting of contextual information. In addition, NCES reviewed relevant literature on key contextual factors linked to student achievement in reading to identify the types of survey questions and constructs needed to examine these factors in the NAEP assessment.
 Question development. Survey questions were drafted, reviewed, and revised. Throughout the development process, the survey questions were reviewed by external advisory groups that included survey experts, subjectarea experts, teachers, educational researchers, and statisticians. As noted above, some questions were drafted and revised with the intent of analyzing and reporting them individually; others were drafted and revised with the intent of combining them into indices measuring constructs of interest.
 Evaluation of questions. New and revised survey questions underwent pretesting whereby a small sample of participants (students, teachers, and school administrators) were interviewed to identify potential issues with their understanding of the questions and their ability to provide reliable and valid responses. Some questions were dropped or further revised based on the pretesting results. The questions were then further pretested among a larger group of participants and responses were analyzed. The overall distribution of responses was examined to evaluate whether participants were answering the questions as expected. Relationships between survey responses and student performance were also examined. A method known as factor analysis was used to examine the empirical relationships among questions to be included in the indices measuring constructs of interest. Factor analysis can show, based on relationships among responses to the questions, how strongly the questions "group together" as a measure of the same construct. Convergent and discriminant validity of the construct with respect to other constructs of interest were also examined. If the construct of interest had the expected pattern of relationships and nonrelationships, the construct validity of the factor as representing the intended index was supported.
 Index scoring. Using the item response theory (IRT) partial credit scaling model, index scores were estimated from students' responses and transformed onto a scale which ranged from 0 to 20. As a reporting aid, each index scale was divided into low, moderate, and high index score categories. The cut points for the index score categories were determined based on the average response to the set of survey questions in each index. In general, high average responses to individual questions correspond to high index score values, and low average responses to individual questions correspond to low index score values. As an example, for a set of index survey questions with five response categories (such as not at all, a little bit, somewhat, quite a bit, and very much), students with an average response of less than 3 (somewhat) would be classified as low on the index. Students with an average response greater than or equal to 3 (somewhat) to less than 4 (quite a bit) would be classified as moderate on the index. Finally, students with an average response of greater than or equal to 4 (quite a bit) would be classified as high on the index.
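The cut-point rule for a five-category response scale described above can be sketched as follows. This is a simplified illustration of the classification logic; the actual NAEP categories are applied to the IRT-scaled index:

```python
def index_category(responses):
    """Classify a student as low/moderate/high on an index from the
    average response to its questions (1 = not at all ... 5 = very much)."""
    avg = sum(responses) / len(responses)
    if avg < 3:        # below "somewhat"
        return "low"
    if avg < 4:        # "somewhat" up to (but not including) "quite a bit"
        return "moderate"
    return "high"      # "quite a bit" or above
```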
Items in the grade 12 mathematics student questionnaire comprising the indices SQCATM9 and SQCTM10 were not included in the regression model because their missing rates were above 15 percent, which contributed to a high overall missing rate for regression analyses that included these two indices. For example, only 56 percent of grade 12 students were retained in the regression that included SQCATM9 and SQCTM10. Therefore, only the indices SQCATM4, SQCATM5, and SQCATM7 were included as part of the Racial/Ethnic Score Gap Tool.
Regression Models
Regression models were used to enable comparisons between unadjusted score gaps (i.e., regression coefficients from the model that includes only dummy variables for the racial/ethnic categories, with White students as the reference group) and adjusted score gaps (i.e., regression coefficients from a model that includes dummy variables for some combination of predictors in addition to race/ethnicity). Unadjusted score gaps represent the difference between the average scores of two groups of students, and adjusted score gaps represent the estimated score gap once the regression analysis controls for the selected variable(s). For example, the White–Black score gap in grade 4 mathematics is 28.95; it is the regression coefficient for the dummy variable associated with Black students when the regression model includes only dummy variables for SRACE10 as predictors (with White as the reference group). The gap decreased to 17.68 when the SES variables were added as predictors to the model (i.e., when the gap was adjusted for the cluster of SES variables). Unadjusted score gaps are shown as statistically significant if the regression coefficient for the dummy variable associated with a particular race/ethnicity category is significant at the α = 0.05 level.
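In a race-only dummy-variable regression of this kind, the coefficient on a group's dummy equals that group's mean score minus the reference (White) group's mean, i.e., the unadjusted gap with the sign reversed. A toy illustration with made-up scores:

```python
import numpy as np

# Four hypothetical students: two White (dummy = 0), two Black (dummy = 1).
scores = np.array([250., 260., 220., 230.])
black = np.array([0., 0., 1., 1.])

# OLS with an intercept and the Black dummy (White is the reference group).
X = np.column_stack([np.ones(4), black])
beta, *_ = np.linalg.lstsq(X, scores, rcond=None)

# beta[0] is the White group mean; beta[1] is the Black-minus-White
# difference, i.e., the negative of the White-Black gap as reported.
```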
The regression analyses were run using the 2022 NAEP reading and mathematics grades 4 and 8, and the 2019 NAEP reading and mathematics grade 12 national reporting sample which included about 108,200 fourthgrade, 111,300 eighthgrade, and 26,700 twelfthgrade participating students in reading and about 116,200 fourthgrade, 111,000 eighthgrade, and 25,400 twelfthgrade participating students in mathematics. All analyses used students' scale score as the outcome variable and utilized sampling weights. Coefficient estimates were calculated using 20 plausible values, and standard errors were calculated using 20 plausible values and 62 replicate weights. Missing data was handled by listwise deletion. All predictors included in the model are categorical and were included as dummy variables identifying each group of students for each variable, except for the one group chosen as the omitted category that serves as the reference group.
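The estimation scheme described above (averaging over plausible values, with replicate weights for sampling variance) can be sketched as follows. This is the usual jackknife/plausible-value combination, not NAEP's production code; the function names are illustrative, and one common convention (used here) computes the sampling variance from the first plausible value:

```python
import numpy as np

def naep_se(stat, y_pvs, w_full, w_reps):
    """Sketch of NAEP-style estimation with plausible values (PVs) and
    jackknife replicate weights.

    stat   : function (values, weights) -> scalar, e.g., a weighted mean
    y_pvs  : array of shape (n_pvs, n_students), one row per PV
    w_full : full-sample weights, shape (n_students,)
    w_reps : replicate weights, shape (n_reps, n_students)
    """
    # Point estimate: average the statistic over the plausible values.
    per_pv = np.array([stat(y, w_full) for y in y_pvs])
    estimate = per_pv.mean()
    # Sampling variance: squared deviations of replicate-weight estimates
    # (computed on the first PV) from the full-weight estimate.
    reps = np.array([stat(y_pvs[0], wr) for wr in w_reps])
    v_samp = np.sum((reps - stat(y_pvs[0], w_full)) ** 2)
    # Imputation variance: spread of the per-PV estimates.
    m = len(per_pv)
    v_imp = per_pv.var(ddof=1)
    return estimate, np.sqrt(v_samp + (1 + 1 / m) * v_imp)

# Weighted mean as the statistic of interest.
wmean = lambda y, w: np.average(y, weights=w)
```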
Comparing Adjusted to Unadjusted Score Gaps
Determining whether score gaps are significantly different between two regression models is not straightforward because the regression coefficients in each model are estimated from the same data, so the errors on the regression coefficients for the same predictor may be expected to be positively correlated. With positively correlated or uncorrelated errors, determining statistical significance based on nonoverlapping confidence intervals is conservative (Cumming, 2009): if two n percent confidence intervals do not overlap, the difference is statistically significant at the α = 1 - n/100 level; if the confidence intervals do overlap, the difference may or may not be statistically significant (overlapping confidence intervals do not imply nonsignificance). The regression tool compares unadjusted score gaps (i.e., regression coefficients from the model that includes only dummy variables for the racial/ethnic categories, with White students as the reference group) to adjusted score gaps (i.e., regression coefficients from a model that includes dummy variables for some combination of predictors in addition to race/ethnicity) using 95 percent confidence intervals. Nonoverlapping confidence intervals imply statistical significance at the α = 0.05 level.
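The overlap rule can be stated as a one-line check. Intervals are given as (lower, upper) pairs; this is an illustrative helper, not part of the tool:

```python
def ci_overlap(ci_a, ci_b):
    """True if two confidence intervals overlap.

    Per the conservative rule described above: if two 95 percent CIs do
    NOT overlap, the difference is significant at alpha = 0.05; if they
    do overlap, the difference may or may not be significant.
    """
    (lo_a, hi_a), (lo_b, hi_b) = ci_a, ci_b
    return lo_a <= hi_b and lo_b <= hi_a
```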
No multiple comparison adjustments were used in calculating the confidence intervals. Standard NAEP methodology utilizes the Benjamini-Hochberg false discovery rate (FDR) procedure to adjust for multiple comparisons in a single analysis (e.g., analyzing White student performance versus the performance of Black, Hispanic, and Asian students); however, the purpose of the confidence intervals in the regression tool is to compare adjusted versus nonadjusted score gaps within a racial/ethnic group rather than to compare unadjusted gaps across groups. Therefore, adjustments for multiple comparisons are not appropriate in this context.
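For reference, the Benjamini-Hochberg procedure mentioned above (used in standard NAEP comparisons, though not applied in this tool) can be sketched as:

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Sketch of the Benjamini-Hochberg FDR procedure: reject the k
    smallest p-values, where k is the largest rank r (1-based, over the
    sorted p-values) with p_(r) <= (r / m) * alpha.
    Returns a list of booleans marking which hypotheses are rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * alpha:
            k = rank
    reject = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= k:
            reject[i] = True
    return reject
```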
References
Cumming, G. (2009). Inference by eye: Reading the overlap of independent confidence intervals. Statistics in Medicine, 28, 205–220.
SOURCE: U.S. Department of Education, Institute of Education Sciences, National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 2019 and 2022 Reading and Mathematics Assessments.