Technical Notes for Racial/Ethnic Score Gap Tool
The Racial/Ethnic Score Gap Tool lets users explore how various factors, drawn from NAEP survey question variables, relate to the score gaps of selected racial/ethnic student groups. At present, this tool uses 2019 NAEP reading and mathematics data at grade 12 and 2022 NAEP reading and mathematics data at grades 4 and 8, allowing users to see how average score gaps between White–Black, White–Hispanic, and Asian–White students change when additional factors that may relate to student performance are taken into account.
The analyses shown in the tool are exploratory. The tool allows users to examine whether there are associations between select factors or variables collected by NAEP and the White–Black, White–Hispanic, and Asian–White average student score gaps. The tool does not assess whether the relationship between factors and score gaps is causal (i.e., this analysis does not and cannot test whether the select factor causes differential achievement among the student racial/ethnic groups explored here).
NAEP does not provide scores for individual students or schools; instead, it offers results for populations of students (e.g., fourth-graders) and subgroups of those populations (e.g., female students, Hispanic students), as well as results regarding subject-matter achievement, instructional experiences, and school environment. NAEP results are based on school and student samples that are carefully designed to accurately represent student populations of interest.
NAEP Survey Question Variable Information
Variable selection
Exploratory analyses using 2019 reading data at grade 12 were conducted prior to the final variable selection for the models in the regression analyses. We began with a larger list of potential predictors that could fit into the categories of interest: socioeconomic status (individual); socioeconomic status (school environment); student postsecondary plans and academic behaviors; and student noncognitive factors. Variables that had small associations when controlling for other variables in the model were excluded. These variables were:
 PARENTS (Parents in the home)
 UTOL4 (School location, i.e., city, suburb, town, or rural)
 PCTBLKC (Percentage of Black students in school)
 PCTHSPC (Percentage of Hispanic students in school)
 C087302 (Percentage of parents who attend parent-teacher conferences in school)
 R850901 (How often for English/language arts class discuss something the whole class has read)
 R850902 (How often for English/language arts class work in pairs/small groups to discuss something we read)
 R850903 (How often for English/language arts class discuss different interpretations of what we have read)
 R851003 (How often teachers ask students to critique the author's craft or technique in English/language arts class)
 R850202 (How often teachers ask students to interpret the meaning of the passage)
 RDACT12 (How often teachers ask students to evaluate/analyze/critique when reading)
A handful of interaction terms were of interest from the list of remaining variables. The following interactions were evaluated for potential inclusion in the models:
 SQCATR7 (Index of confidence in reading knowledge and skills) x B013801 (Number of books in home)
 SQCATR7 (Index of confidence in reading knowledge and skills) x C087602 (Proportion of last year's graduates attending a four-year college)
 B013801 (Number of books in home) x C087602 (Proportion of last year's graduates attending a four-year college)
 APIBELA (Taking/took an Advanced Placement (AP) English or language arts course and/or International Baccalaureate Language A1 course) x C087602 (Proportion of last year's graduates attending a four-year college)
 SRACE10 (Race/ethnicity) x B013801 (Number of books in home)
 SRACE10 (Race/ethnicity) x SQCATR7 (Index of confidence in reading knowledge and skills)
 SRACE10 (Race/ethnicity) x C087602 (Proportion of last year's graduates attending a four-year college)
 SRACE10 (Race/ethnicity) x APIBELA (Taking/took an Advanced Placement (AP) English or language arts course and/or International Baccalaureate Language A1 course)
 SRACE10 (Race/ethnicity) x B018101 (Days absent from school in the last month)
 SRACE10 (Race/ethnicity) x PARED (Parental education level)
No interactions showed a meaningful increase in explained variance, so none were retained. The exploratory analysis was used to guide variable selection for other grade/subject combinations.
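The variance-explained screening described above can be illustrated with a toy ordinary least squares comparison (made-up data; this is not the NAEP analysis itself): fit a model with and without an interaction term and compare R².

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least squares fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

# Simulated data with no true interaction between x1 and x2
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
y = 2 * x1 + x2 + rng.normal(size=200)

base = np.column_stack([np.ones(200), x1, x2])  # main effects only
with_int = np.column_stack([base, x1 * x2])     # add the interaction term
# When the interaction adds essentially no explained variance,
# it would be dropped from the model.
```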
Tables 1 through 6 below show the variables included in the Racial/Ethnic Score Gap Tool. Tables 1 and 2 include variables for fourth- and eighth-grade students in the 2022 NAEP reading assessment, and table 3 includes variables for twelfth-grade students in the 2019 NAEP reading assessment. Tables 4 and 5 include variables for fourth- and eighth-grade students in the 2022 mathematics assessment, and table 6 includes variables for twelfth-grade students in the 2019 mathematics assessment. Click the links to see detailed results for each variable in the NAEP Data Explorer, including percentage distributions and average scores.
Please note that while all categories for each variable are displayed in the data explorer, some categories were combined for purposes of running the regression models. Two or more categories were collapsed in the following predictor variables included in the regression: COMPINT, APIBELA (grade 12 only), and SCHNSLP (grade 12 only). Categories were collapsed to help with interpretation and because some categories had very small percentages of students.
 COMPINT: This predictor variable was collapsed from four categories to two categories, where "both computer/tablet and Internet" was one category, and all other responses were combined in a second category.
 APIBELA (grade 12 only): This predictor variable was collapsed from four categories to two categories, where "Neither" was one category, and all other responses were combined in a second category.
 SCHNSLP (grade 12 only): This five-category derived variable was constructed specifically for this secondary report and was based on responses to NAEP variables C038301, C051401, and C051651 as follows:
 If all responses to above variables are missing, then "Missing"; otherwise
 If C038301 = No, then 1 = "School does not participate"; otherwise
 If C051401 = All students, then 6 = "All students"; otherwise
 C051651 is categorized into:
 If omit, then "Missing".
 2 = "0–25%", 3 = "26–50%", 4 = "51–75%", 5 = "76–100%".
 "All students" from C051401 and "76–100%" from C051651 were further combined into a single category.
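The derivation rules above can be sketched as a small function. The response strings and the function itself are illustrative only, not NAEP's actual processing code:

```python
def derive_schnslp(c038301, c051401, c051651):
    """Sketch of the SCHNSLP derivation rules described above.
    None marks a missing/omitted response; strings are hypothetical labels."""
    if c038301 is None and c051401 is None and c051651 is None:
        return "Missing"
    if c038301 == "No":
        return "School does not participate"
    if c051401 == "All students":
        return "All students/76-100%"   # combined top category
    if c051651 is None:
        return "Missing"
    ranges = {"0-25%": "0-25%", "26-50%": "26-50%", "51-75%": "51-75%",
              "76-100%": "All students/76-100%"}
    return ranges.get(c051651, "Missing")
```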
SCHNSLP is not included in the 2022 grades 4 and 8 regressions because a skip pattern was introduced in 2019 for item VH240216: if a school indicates that all students receive free lunch because of a special program, then VH240218 (i.e., the item asking what percentage of students in the school receive free lunch) is automatically skipped. Because the majority of schools offered such a program in 2022 due to the pandemic, the meaning of the variable changed dramatically between 2019 and 2022. This change also limits the utility of the variable in 2022 because nearly three-quarters of students fall into the "All students" category.
Additional details about the survey questionnaire variables that were used to construct the derived variable SCHNSLP are available in the NAEP Data Explorer by using the NAEP ID to locate the variable of interest for grade 12 reading and mathematics.
TABLE 1 Selected variables included in the regression model for fourth-grade students in NAEP reading: 2022

NAEP Variable ID | Variable Description
GENDER (DSEX) | Gender
SRACE10 | Race/ethnicity
SLUNCH3 | National School Lunch Program eligibility (student level)
COMPINT | Have internet access and/or computer/tablet at home
B013801 | Number of books in home
SQCATR5 | Index of academic self-discipline
SQCATR7 | Index of confidence in reading knowledge and skills
SQCTR10 | Index of mastery goals in reading
TABLE 2 Selected variables included in the regression model for eighth-grade students in NAEP reading: 2022

NAEP Variable ID | Variable Description
GENDER (DSEX) | Gender
SRACE10 | Race/ethnicity
SLUNCH3 | National School Lunch Program eligibility (student level)
COMPINT | Have internet access and/or computer/tablet at home
B013801 | Number of books in home
PARED | Highest level of parental education
SQCATR4 | Index of persistence in learning
SQCATR5 | Index of academic self-discipline
SQCATR7 | Index of confidence in reading knowledge and skills
SQCTR10 | Index of mastery goals in reading
TABLE 3 Selected variables included in the regression model for twelfth-grade students in NAEP reading: 2019

NAEP Variable ID | Variable Description
GENDER (DSEX) | Gender
SRACE10 | Race/ethnicity
SCHNSLP^{1} | Percentage of students eligible for the National School Lunch Program (NSLP) in school
COMPINT | Have internet access and/or computer/tablet at home
B013801 | Number of books in home
PARED | Highest level of parental education
C087602 | Percentage of last year's graduates enrolled in a four-year college
B035705 | Student applied to a four-year college
B035702 | Student submitted the Free Application for Federal Student Aid (FAFSA)
B018101 | Number of days absent from school in the last month
APIBELA | Student is taking or took an Advanced Placement English or language arts course and/or an International Baccalaureate Language A1 course
SQCATR4 | Index of persistence in learning
SQCATR5 | Index of academic self-discipline
SQCATR7 | Index of confidence in reading knowledge and skills
SQCATR9 | Index of performance goals in reading
SQCTR10 | Index of mastery goals in reading

^{1} The derived variable SCHNSLP, constructed specifically for this report, is not available in the NAEP Data Explorer. The following individual variables used to construct SCHNSLP are available in the data explorer: C038301, C051401, and C051651. Please note that NAEP variable ID C051651 shown in the data explorer is the nine-category, noncollapsed version.

TABLE 4 Selected variables included in the regression model for fourth-grade students in NAEP mathematics: 2022

NAEP Variable ID | Variable Description
GENDER (DSEX) | Gender
SRACE10 | Race/ethnicity
SLUNCH3 | National School Lunch Program eligibility (student level)
COMPINT | Have internet access and/or computer/tablet at home
B013801 | Number of books in home
SQCATM5 | Index of academic self-discipline
SQCATM7 | Index of confidence in mathematics knowledge and skills
SQCTM10 | Index of mastery goals in mathematics
TABLE 5 Selected variables included in the regression model for eighth-grade students in NAEP mathematics: 2022

NAEP Variable ID | Variable Description
GENDER (DSEX) | Gender
SRACE10 | Race/ethnicity
SLUNCH3 | National School Lunch Program eligibility (student level)
COMPINT | Have internet access and/or computer/tablet at home
B013801 | Number of books in home
PARED | Highest level of parental education
MATCRS8 | Math class taken at eighth grade
SQCATM4 | Index of persistence in learning
SQCATM5 | Index of academic self-discipline
SQCATM7 | Index of confidence in mathematics knowledge and skills
SQCTM10 | Index of mastery goals in mathematics
TABLE 6 Selected variables included in the regression model for twelfth-grade students in NAEP mathematics: 2019

NAEP Variable ID | Variable Description
GENDER (DSEX) | Gender
SRACE10 | Race/ethnicity
SCHNSLP^{1} | Percentage of students eligible for the National School Lunch Program (NSLP) in school
COMPINT | Have internet access and/or computer/tablet at home
B013801 | Number of books in home
PARED | Highest level of parental education
C087602 | Percentage of last year's graduates enrolled in a four-year college
B035705 | Student applied to a four-year college
B035702 | Student submitted the Free Application for Federal Student Aid (FAFSA)
B018101 | Number of days absent from school in the last month
APIBMAT | Student is taking or took an Advanced Placement calculus or statistics course and/or an International Baccalaureate Mathematics course
MATCRST | Highest math course taken
SQCATM4 | Index of persistence in learning
SQCATM5 | Index of academic self-discipline
SQCATM7 | Index of confidence in mathematics knowledge and skills

^{1} The derived variable SCHNSLP, constructed specifically for this report, is not available in the NAEP Data Explorer. The following individual variables used to construct SCHNSLP are available in the data explorer: C038301, C051401, and C051651. Please note that NAEP variable ID C051651 shown in the data explorer is the nine-category, noncollapsed version.

Indices related to students' attitudes toward learning
While some survey questions are analyzed and reported individually (for example, number of books in students' homes), several questions on the same topic are combined into an index measuring a single underlying construct or concept. More information about the 2019 (grade 12) and 2022 (grades 4 and 8) NAEP reading and mathematics indices and the individual survey questions they comprise can be found in the NAEP Data Explorer by clicking the links in tables 1 through 6 above.
The creation of 2019 and 2022 indices involved the following four main steps:
 Selection of constructs of interest. The selection of constructs of interest to be measured through the survey questionnaires was guided in part by the National Assessment Governing Board framework for collection and reporting of contextual information. In addition, NCES reviewed relevant literature on key contextual factors linked to student achievement in reading to identify the types of survey questions and constructs needed to examine these factors in the NAEP assessment.
 Question development. Survey questions were drafted, reviewed, and revised. Throughout the development process, the survey questions were reviewed by external advisory groups that included survey experts, subjectarea experts, teachers, educational researchers, and statisticians. As noted above, some questions were drafted and revised with the intent of analyzing and reporting them individually; others were drafted and revised with the intent of combining them into indices measuring constructs of interest.
 Evaluation of questions. New and revised survey questions underwent pretesting whereby a small sample of participants (students, teachers, and school administrators) were interviewed to identify potential issues with their understanding of the questions and their ability to provide reliable and valid responses. Some questions were dropped or further revised based on the pretesting results. The questions were then further pretested among a larger group of participants and responses were analyzed. The overall distribution of responses was examined to evaluate whether participants were answering the questions as expected. Relationships between survey responses and student performance were also examined. A method known as factor analysis was used to examine the empirical relationships among questions to be included in the indices measuring constructs of interest. Factor analysis can show, based on relationships among responses to the questions, how strongly the questions "group together" as a measure of the same construct. Convergent and discriminant validity of the construct with respect to other constructs of interest were also examined. If the construct of interest had the expected pattern of relationships and nonrelationships, the construct validity of the factor as representing the intended index was supported.
 Index scoring. Using the item response theory (IRT) partial credit scaling model, index scores were estimated from students' responses and transformed onto a scale which ranged from 0 to 20. As a reporting aid, each index scale was divided into low, moderate, and high index score categories. The cut points for the index score categories were determined based on the average response to the set of survey questions in each index. In general, high average responses to individual questions correspond to high index score values, and low average responses to individual questions correspond to low index score values. As an example, for a set of index survey questions with five response categories (such as not at all, a little bit, somewhat, quite a bit, and very much), students with an average response of less than 3 (somewhat) would be classified as low on the index. Students with an average response greater than or equal to 3 (somewhat) to less than 4 (quite a bit) would be classified as moderate on the index. Finally, students with an average response of greater than or equal to 4 (quite a bit) would be classified as high on the index.
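The cut-point rule for a five-category response scale described above can be sketched as follows. This is a simplified illustration of the classification logic; the actual NAEP categories are applied to the IRT-scaled index:

```python
def index_category(responses):
    """Classify a student as low/moderate/high on an index from the
    average response to its questions (1 = not at all ... 5 = very much)."""
    avg = sum(responses) / len(responses)
    if avg < 3:        # below "somewhat"
        return "low"
    if avg < 4:        # "somewhat" up to (but not including) "quite a bit"
        return "moderate"
    return "high"      # "quite a bit" or above
```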
Items in the grade 12 mathematics student questionnaire comprising the indices SQCATM9 and SQCTM10 were not included in the regression model because their missing rates were above 15 percent, which contributed to a high overall missing rate for regression analyses that included these two indices. For example, only 56 percent of grade 12 students were retained in the regression that included SQCATM9 and SQCTM10. Therefore, only the indices SQCATM4, SQCATM5, and SQCATM7 were included as part of the Racial/Ethnic Score Gap Tool.
Regression Models
Regression models were used to enable comparisons between unadjusted score gaps (i.e., regression coefficients from the model that includes only dummy variables for the racial/ethnic categories, with White students as the reference group) and adjusted score gaps (i.e., regression coefficients from a model that includes dummy variables for some combination of predictors in addition to race/ethnicity). Unadjusted score gaps represent the difference between the average scores of two groups of students, and adjusted score gaps represent the estimated score gap once the regression analysis controls for the selected variable(s). For example, the White–Black score gap in grade 4 mathematics is 28.95; it is the regression coefficient for the dummy variable associated with Black students when the regression model includes only dummy variables for SRACE10 as predictors (with White as the reference group). The gap decreased to 17.68 when the SES variables were added as predictors to the model (i.e., when the gap was adjusted for the cluster of SES variables). Unadjusted score gaps are shown as statistically significant if the regression coefficient for the dummy variable associated with a particular race/ethnicity category is significant at the α = 0.05 level.
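In a race-only dummy-variable regression of this kind, the coefficient on a group's dummy equals that group's mean score minus the reference (White) group's mean, i.e., the unadjusted gap with the sign reversed. A toy illustration with made-up scores:

```python
import numpy as np

# Four hypothetical students: two White (dummy = 0), two Black (dummy = 1).
scores = np.array([250., 260., 220., 230.])
black = np.array([0., 0., 1., 1.])

# OLS with an intercept and the Black dummy (White is the reference group).
X = np.column_stack([np.ones(4), black])
beta, *_ = np.linalg.lstsq(X, scores, rcond=None)

# beta[0] is the White group mean; beta[1] is the Black-minus-White
# difference, i.e., the negative of the White-Black gap as reported.
```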
The regression analyses were run using the 2022 NAEP reading and mathematics grades 4 and 8, and the 2019 NAEP reading and mathematics grade 12 national reporting sample which included about 108,200 fourthgrade, 111,300 eighthgrade, and 26,700 twelfthgrade participating students in reading and about 116,200 fourthgrade, 111,000 eighthgrade, and 25,400 twelfthgrade participating students in mathematics. All analyses used students' scale score as the outcome variable and utilized sampling weights. Coefficient estimates were calculated using 20 plausible values, and standard errors were calculated using 20 plausible values and 62 replicate weights. Missing data was handled by listwise deletion. All predictors included in the model are categorical and were included as dummy variables identifying each group of students for each variable, except for the one group chosen as the omitted category that serves as the reference group.
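The estimation scheme described above (averaging over plausible values, with replicate weights for sampling variance) can be sketched as follows. This is the usual jackknife/plausible-value combination, not NAEP's production code; the function names are illustrative, and one common convention (used here) computes the sampling variance from the first plausible value:

```python
import numpy as np

def naep_se(stat, y_pvs, w_full, w_reps):
    """Sketch of NAEP-style estimation with plausible values (PVs) and
    jackknife replicate weights.

    stat   : function (values, weights) -> scalar, e.g., a weighted mean
    y_pvs  : array of shape (n_pvs, n_students), one row per PV
    w_full : full-sample weights, shape (n_students,)
    w_reps : replicate weights, shape (n_reps, n_students)
    """
    # Point estimate: average the statistic over the plausible values.
    per_pv = np.array([stat(y, w_full) for y in y_pvs])
    estimate = per_pv.mean()
    # Sampling variance: squared deviations of replicate-weight estimates
    # (computed on the first PV) from the full-weight estimate.
    reps = np.array([stat(y_pvs[0], wr) for wr in w_reps])
    v_samp = np.sum((reps - stat(y_pvs[0], w_full)) ** 2)
    # Imputation variance: spread of the per-PV estimates.
    m = len(per_pv)
    v_imp = per_pv.var(ddof=1)
    return estimate, np.sqrt(v_samp + (1 + 1 / m) * v_imp)

# Weighted mean as the statistic of interest.
wmean = lambda y, w: np.average(y, weights=w)
```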
Comparing Adjusted to Unadjusted Score Gaps
Determining whether score gaps are significantly different between two regression models is not straightforward because the regression coefficients in each model are estimated from the same data, so the errors on the regression coefficients for the same predictor may be expected to be positively correlated. With positively correlated or uncorrelated errors, determining statistical significance based on nonoverlapping confidence intervals is conservative (Cumming, 2009): if two n percent confidence intervals do not overlap, the difference is statistically significant at the α = 1 - n/100 level; if the confidence intervals do overlap, the difference may or may not be statistically significant (overlapping confidence intervals do not imply nonsignificance). The regression tool compares unadjusted score gaps (i.e., regression coefficients from the model that includes only dummy variables for the racial/ethnic categories, with White students as the reference group) to adjusted score gaps (i.e., regression coefficients from a model that includes dummy variables for some combination of predictors in addition to race/ethnicity) using 95 percent confidence intervals. Nonoverlapping confidence intervals imply statistical significance at the α = 0.05 level.
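The overlap rule can be stated as a one-line check. Intervals are given as (lower, upper) pairs; this is an illustrative helper, not part of the tool:

```python
def ci_overlap(ci_a, ci_b):
    """True if two confidence intervals overlap.

    Per the conservative rule described above: if two 95 percent CIs do
    NOT overlap, the difference is significant at alpha = 0.05; if they
    do overlap, the difference may or may not be significant.
    """
    (lo_a, hi_a), (lo_b, hi_b) = ci_a, ci_b
    return lo_a <= hi_b and lo_b <= hi_a
```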
No multiple comparison adjustments were used in calculating the confidence intervals. Standard NAEP methodology utilizes the Benjamini-Hochberg false discovery rate (FDR) procedure to adjust for multiple comparisons in a single analysis (e.g., analyzing White student performance versus the performance of Black, Hispanic, and Asian students); however, the purpose of the confidence intervals in the regression tool is to compare adjusted versus nonadjusted score gaps within a racial/ethnic group rather than to compare unadjusted gaps across groups. Therefore, adjustments for multiple comparisons are not appropriate in this context.
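For reference, the Benjamini-Hochberg procedure mentioned above (used in standard NAEP comparisons, though not applied in this tool) can be sketched as:

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Sketch of the Benjamini-Hochberg FDR procedure: reject the k
    smallest p-values, where k is the largest rank r (1-based, over the
    sorted p-values) with p_(r) <= (r / m) * alpha.
    Returns a list of booleans marking which hypotheses are rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * alpha:
            k = rank
    reject = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= k:
            reject[i] = True
    return reject
```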
References
Cumming, G. (2009). Inference by eye: Reading the overlap of independent confidence intervals. Statistics in Medicine, 28, 205–220.
SOURCE: U.S. Department of Education, Institute of Education Sciences, National Center for Education Statistics, National Assessment of Educational Progress (NAEP), 2019 and 2022 Reading and Mathematics Assessments.