About the NAEP U.S. History Assessment
The National Assessment of Educational Progress (NAEP) in U.S. history is designed to measure students' knowledge of American history in the context of change and continuity in democracy, culture and society, technological and economic changes, and America's changing world role. Students answer a series of selected-response and open-ended questions based on these areas (or themes) in American history. Performance results are reported for students in the nation and disaggregated by various student characteristics.
In 2018, the NAEP U.S. history assessment transitioned from a paper-based assessment (PBA) to a digitally based assessment (DBA) at grade 8. A multistep process was used for the transition from PBA to DBA, which involved administering the assessment in both formats to randomly equivalent groups of students in 2018. The transition was designed and implemented with careful intent to preserve trend lines that show student performance over time. Thus, the results from the 2018 U.S. history assessment can be compared to results from previous years.
Reporting the Results
NAEP began administering U.S. history assessments periodically in the 1990s and has administered the grade 8 assessment approximately every four years since 2001. (Note: the grades 4 and 12 U.S. history assessments were administered in 1994, 2001, 2006, and 2010.) Results are reported as average scores on a scale of 0 to 500 (although student performance typically ranges from 220 to 300, representing the 10th and 90th percentiles, respectively). In 1994, the mean of the U.S. history scale was set at 250 and the standard deviation at 50. The U.S. history composite scale was formed by taking a weighted sum of the subscales for each of the themes specified in the framework. See the distribution of assessment questions by theme. Results are also reported as percentages of students performing at or above the NAEP achievement levels: NAEP Basic, NAEP Proficient, and NAEP Advanced.
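As a rough illustration of how a composite scale score can be formed as a weighted sum of subscale scores, the sketch below uses hypothetical subscale values and hypothetical weights; NAEP's actual theme weights are set by the assessment framework and are not reproduced here.

```python
# Hypothetical subscale scores for the four U.S. history themes.
# These values and weights are illustrative only, not NAEP's actual figures.
subscale_scores = {
    "democracy": 262.0,
    "culture_and_society": 259.0,
    "technology_and_economy": 265.0,
    "world_role": 258.0,
}
# Hypothetical theme weights; they must sum to 1.0.
weights = {
    "democracy": 0.30,
    "culture_and_society": 0.30,
    "technology_and_economy": 0.20,
    "world_role": 0.20,
}

# Composite scale score = weighted sum of the theme subscale scores.
composite = sum(weights[t] * subscale_scores[t] for t in subscale_scores)
print(round(composite, 1))  # → 260.9
```

The composite falls within the range of the subscale scores, weighted toward the more heavily weighted themes.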
Because NAEP scores are developed independently for each subject, subscale, and grade, results cannot be compared across subjects, subscales, or grade levels. Read more about the NAEP scaling process in the Technical Documentation.
Results are reported for students overall and for selected demographic groups, such as race/ethnicity, gender, and students' eligibility for the National School Lunch Program (NSLP). Results for the NSLP have been reported since 2003, when the quality of the data on students' eligibility for the program improved. As a result of the passage of the Healthy, Hunger-Free Kids Act of 2010, schools can use a new universal meal service option, the "Community Eligibility Provision" (CEP). Through CEP, eligible schools can provide meal service to all students at no charge, regardless of economic status and without the need to collect eligibility data through household applications. CEP became available nationwide in the 2014–2015 school year; as a result, the percentage of students categorized as eligible for NSLP has increased in comparison to 2013. Readers should therefore interpret NSLP trend results with caution.
Read more about how student groups are defined and how to interpret NAEP results from the U.S. history assessment.
NAEP reports results using widely accepted statistical standards; findings are reported based on a statistical significance level set at .05, with appropriate adjustments for multiple comparisons. Only those differences that are found to be statistically significant are referred to as "higher" or "lower."
Comparisons of scores and percentages over time, and comparisons between groups within a year, are based on statistical tests that consider both the size of the difference and the standard errors of the two statistics being compared. Standard errors quantify the margin of error around an estimate; estimates based on smaller groups tend to have larger standard errors. For example, a 2-point change in the average score for the nation may be statistically significant, while a 2-point change for a student group is not, because of the size of the standard errors for the group's score estimate. The size of the standard errors may also be influenced by other factors, such as the degree to which the assessed students are representative of the entire population. Standard errors for the estimates presented in this report are available in the NAEP Data Explorer (NDE). For the 2018 analyses, an additional component was included in the standard error calculation to account for linking scores across the two delivery modes (PBA and DBA).
Average scores and percentages of students are presented as whole numbers in the report; however, the statistical comparison tests are based on unrounded numbers. In some cases, scores or percentages that share the same whole-number value are nonetheless statistically different from each other. The "Customize data tables" link at the bottom of the page provides data tables from the NAEP Data Explorer (NDE); these tables give more precise values for the scores and percentages and show how two compared estimates may differ from each other.
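A small sketch with hypothetical numbers shows how two estimates can display as the same whole number yet differ significantly when compared on their unrounded values (again using a simple z-test at the .05 level, not NAEP's exact procedure).

```python
import math

# Hypothetical unrounded estimates and standard errors.
score_a, se_a = 262.6, 0.2
score_b, se_b = 263.4, 0.2

# Both estimates display as the same whole number in a report table.
print(round(score_a), round(score_b))  # → 263 263

# Yet the test on the unrounded values finds a significant difference.
z = (score_b - score_a) / math.sqrt(se_a**2 + se_b**2)
print(abs(z) > 1.96)  # → True
```

This is why the NDE tables, which carry the unrounded values, can show a significant difference between two scores that look identical in the rounded report.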
A scale score that is significantly higher or lower in comparison to an earlier assessment year is reliable evidence that student performance has changed. NAEP is not, however, designed to identify the causes of change in student performance. Although comparisons are made in students' performance based on demographic characteristics and educational experiences, the comparisons cannot be used to establish a cause-and-effect relationship between the characteristic or experience and achievement. Many factors may influence student achievement, including educational policies and practices, available resources, and the demographic characteristics of the student body. Such factors may change over time and vary among student groups.