The persistence of reading and math proficiency: the benefits of Alabama’s pre-kindergarten program endure in elementary and middle school

Preschool programs provide opportunities to improve early childhood educational outcomes as well as long-term outcomes, such as improved educational attainment, improved socioeconomic status, and improved health in adulthood. However, recent studies of long-term impacts have shown equivocal results, with some educational gains occurring immediately following participation in preschool that diminish or “fadeout” over time. The purpose of this study was to use multivariable linear regression and school fixed effects to determine the impact of Alabama’s First Class Pre-K (FCPK) program on reading and math proficiency. In an effort to test for fadeout, a second multivariable linear regression was used with an additional interaction term of FCPK receipt with time since receipt, to assess for changes in the impact of FCPK as children progress from 3rd grade to 7th grade. Results indicate that children who received FCPK were statistically significantly more likely to be proficient in both math and reading compared to students who did not receive FCPK. Further, there was no statistical evidence of fadeout of the benefits of FCPK through the 7th grade, indicating the persistence of the benefits of FCPK into middle school.

More recently, large-scale preschool programs have yielded equivocal results over the long term. Though it is generally accepted that high-quality preschool programs yield significant early gains, especially for more disadvantaged children, evaluation of some programs has suggested "fadeout," or diminished benefits over time (Barnett 1995, 1998; Dodge et al. 2016; Hill et al. 2015; Huang et al. 2012; Jenkins et al. 2018; Lipsey et al. 2018; McCoy et al. 2017; Muschkin et al. 2015; Phillips et al. 2017; Puma et al. 2012; Watts et al. 2018). For example, the Head Start Impact Study showed that random selection/assignment to the program was associated with short-term gains in cognitive and social skills, but these gains diminished by the time children entered elementary school (U.S. Department of Health and Human Services 2010). In addition, a study of Tennessee's Pre-K program showed that although the program was associated with an initial positive impact on cognitive skills, there was a fadeout of effects 1 year later, and even an apparent adverse impact after several years (Lipsey et al. 2018).
Conversely, Michigan's Great Start School Readiness program documented that although initial test score gains faded out in middle school, benefits became evident again in high school exit examinations (Schweinhart et al. 2012). Several other studies have shown that initial gains fade to about half of the initial effect, but fadeout varies based on pre-k program quality and/or the quality of subsequent learning environments (Camilli et al. 2010; Yoshikawa et al. 2013).
One possible explanation for fadeout is the sustaining environments hypothesis. Long-term success of a preschool program is assumed to relate to the quality of later learning environments, which has led to widespread interest in aligning Kindergarten-3rd grade classrooms with pre-k experiences (Stipek et al. 2017). Other hypothesized reasons include other children catching up as Kindergarten teachers cover content taught in pre-k, low-quality schools stagnating all children's learning (Jenkins et al. 2018), and children not attending pre-k being retained (Barnett 2014a). Conclusions from recent large-scale studies (Bassok et al. 2018) and a meta-analysis (Bailey et al. 2020) do not falsify the sustaining environments hypothesis. Rather, Bassok et al. (2018) found that initial gains shrink as children pass through school, with literacy gains maintained through 1st grade and math gains through 3rd grade. The authors recommend that future research focus on large sample sizes, sustained attention to preschool curriculum, and professional development of teachers. In support of this, Jenkins et al. (2018) found that pre-k coupled with professional supports for teachers in Kindergarten and 1st grade all but eliminated math fadeout.
The question of pre-k efficacy has been mired in debates about research methodology and about the failure to consider the long-term positive effects of pre-k reported in international studies (Barnett 2014b), though a more recent systematic review of the literature indicates lasting benefits for programs of higher quality (Meloy et al. 2019). A meta-analysis by Yoshikawa et al. (2013) provided evidence of fading over time, but with long-term gains still evident into adulthood. More recently, a study of a large and mature pre-k program by Gormley et al. (2017) found effects on math and honors course placement lasting into middle school. In addition, studies outside of the US find impacts that last into adolescence (Bassok et al. 2018), with fadeout mediated by earlier health (Rossin-Slater and Wüst 2017).

Alabama's First Class Pre-K
The Alabama Department of Early Childhood Education (ADECE) is a state agency charged with supporting coordination of early childhood systems and assuring their alignment with the K-12 systems in the state. The mission of the ADECE is to inspire, support, and deliver cohesive, comprehensive systems of high-quality education and care so that all Alabama children thrive and learn. First Class Pre-K (FCPK) is Alabama's diverse-delivery, voluntary, high-quality pre-K program for 4-year-olds. Administered by the Office of School Readiness, FCPK is one of several early childhood programs housed within ADECE, including the Head Start State Collaboration Office, the Maternal and Infant Early Childhood Home Visiting (MIECHV) Program, and the State Children's Policy Council (CPC). Programs are state and federally funded, with strong support from the Alabama Governor, State Legislature, and community.
FCPK classrooms are established through a competitive grant process in which interested sites respond to a request for proposals, available to all 67 Alabama counties and to diverse delivery models including public schools, Head Start centers, private child care, community programs, faith-based centers, college/university programs, and military child development centers. Applications are evaluated and scored by independent grant readers using a 10-point scoring rubric with points granted for poverty percentage based on free or reduced lunch status of nearest public school, creating new access in underserved areas, and being in a high-need county and/or high-need county with one or more failing schools. Further, all applicant sites are considered in the context of the under-5 population and existing early childhood resources including FCPK programs, Head Start, MIECHV programs, and licensed and licensed-exempt child care settings. These comprehensive reviews support final decisions for classroom awards that reduce crowd-out or duplication of services in areas where adequate opportunities exist for access to high-quality pre-K, while identifying opportunities for greatest impact based on underserved areas or areas without access to high-quality pre-K programs. Awarded sites must meet specific quality assurances and abide by rigorous operating guidelines. For example, lead and auxiliary teachers have strict degree, credentialing, and professional development requirements; receive pay that is comparable to their K-12 counterparts; and must implement evidence-based, developmentally appropriate approaches to learning for young children. Further, beginning with some of the later cohorts included in these analyses, all classrooms now receive individualized, onsite professional development through reflective coaching. 
Research-based observational assessments such as the Classroom Assessment Scoring System (CLASS)™ and Teaching Strategies GOLD (TS GOLD)™ are now incorporated at regular intervals to support quality adult-child interactions and child-centered instructional practices.
ADECE has increased access for children to attend FCPK, growing from just 57 classrooms serving 1.7% of 4-year-olds in 2005-2006 to 1042 classrooms serving 32% of 4-year-olds in 2018-2019. Yet, as Alabama has expanded access to preschool programs, the state has kept high quality at the center of all its efforts. Alabama's FCPK program has been awarded the highest quality rating by the National Institute for Early Education Research (NIEER) for the past 13 years.
Each year, ADECE engages in extensive public outreach to increase awareness of the availability of FCPK slots in local communities. This includes website announcements, social media, press releases, and posted notices for recruitment in locations such as child care centers, pediatrician offices, and health departments. FCPK is open to all Alabama resident 4-year-old children (must be 4 years of age by September 1st of the school year). Families interested in having their children participate in FCPK register them through a centralized, online process. Programs then select children to participate through a local lottery; that is, children are selected for available classroom slots through a random drawing held at each local program.

Purpose
The purpose of this paper is to examine the evidence for two questions:
1. Are children who participate in Alabama's FCPK program more likely to be proficient in reading and math skills compared to their non-FCPK peers?
2. If performance is better, do these differences persist as children progress into later grades?

Cohort
All students who attended Kindergarten in Alabama public schools were classified into cohorts based on the year they entered Kindergarten. Children who received FCPK the previous year were identified. Analyses are based on the population of children in each cohort that remains observable in the school year data (i.e., did not leave Alabama public schools). Individual-level, de-identified data for five separate cohorts of children were examined. Data were structured to evaluate aggregate performance by grade over time, thereby allowing analyses for three groups of 3rd, 4th, and 5th graders; two groups of 6th graders; and one group of 7th graders across 3 years of testing in all Alabama public schools that offered those grades during the testing period (n = 1138). Alabama began using the ACT-Aspire assessment in the 2014-2015 school year; therefore, these analyses include all available test data.
See Table 1 for outline of data structure.
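The cohort-by-grade structure described above can be sketched in a few lines. This is only an illustration under stated assumptions: the Kindergarten entry years (fall 2009 through fall 2013) are inferred so that the grade counts match the text and are not taken from Table 1, and the function name is hypothetical. The first test year (2014-2015), the three test years, and the 3rd-7th grade range come from the text.

```python
from collections import Counter

# From the text: ACT Aspire testing began in 2014-2015, three test years, grades 3-7.
TEST_YEARS = range(2014, 2017)      # fall year of each school year
TESTED_GRADES = range(3, 8)

# ASSUMPTION: five cohorts labeled by fall year of Kindergarten entry.
# These years are inferred, not reported in the text.
COHORT_ENTRY_YEARS = range(2009, 2014)


def observed_cells(cohorts=COHORT_ENTRY_YEARS, test_years=TEST_YEARS,
                   grades=TESTED_GRADES):
    """Return (cohort, test_year, expected_grade) for every tested combination.

    A never-retained child who entered Kindergarten in fall `cohort` is
    expected to be in grade (test_year - cohort) in fall `test_year`.
    """
    cells = []
    for cohort in cohorts:
        for year in test_years:
            grade = year - cohort
            if grade in grades:
                cells.append((cohort, year, grade))
    return cells


def groups_per_grade(cells):
    """Count how many cohort groups are observed at each grade."""
    return dict(Counter(grade for _, _, grade in cells))


cells = observed_cells()
print(groups_per_grade(cells))
```

Under these assumptions the grid contains three groups each of 3rd, 4th, and 5th graders, two groups of 6th graders, and one group of 7th graders, which agrees with the description above.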

ACT Aspire Assessment System
During the study period, Alabama used the standardized ACT Aspire Assessment System© to measure reading and math proficiency beginning in 3rd grade. The ACT Aspire is a vertically articulated, standards-based system of assessments linked to ACT College Readiness Benchmarks and aligned with the Common Core State Standards (ACT Aspire 2020a). ACT Aspire scale scores differ based on grade and range from 400 to 442 for reading and 400 to 460 for math (ACT Aspire 2020b). The Alabama State Department of Education determined four proficiency levels based on benchmark scores for each grade and subject and equal to the ACT Readiness Levels reported by ACT Aspire. Level 3 proficiency is set according to ACT's College Readiness Benchmarks (Barnett 2014a, 2014b).

Proficiency definition
Proficiency in reading and/or math was defined according to a two-prong approach based on both ACT Aspire performance and grade retention status. Observations were classified as proficient only if they met both prongs of the definition:
1. The student scored level 3 or 4 on the ACT Aspire test (the Alabama State Department of Education standard for proficiency), AND
2. The student was in the expected grade based on when they entered Kindergarten, i.e., had never been retained.
That is, students classified as proficient in these analyses scored proficient on the test and were in the correct grade for their age.
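As a minimal sketch, the two-prong definition can be expressed as a single boolean rule. The function and argument names are hypothetical and do not reflect the actual data fields. The sketch also assumes that the expected grade equals the number of school years elapsed since Kindergarten entry.

```python
def expected_grade(k_entry_year, school_year_start):
    """Grade a never-retained child should be in (ASSUMED: one grade per year)."""
    return school_year_start - k_entry_year


def is_proficient(aspire_level, current_grade, k_entry_year, school_year_start):
    """Apply both prongs of the proficiency definition.

    Prong 1: ACT Aspire readiness level is 3 or 4.
    Prong 2: the student is in the expected grade for their Kindergarten
             entry year (i.e., has not been retained).
    """
    scored_proficient = aspire_level in (3, 4)
    in_expected_grade = current_grade == expected_grade(k_entry_year,
                                                        school_year_start)
    return scored_proficient and in_expected_grade


# A student who entered Kindergarten in fall 2011 and is tested in fall 2014:
print(is_proficient(3, 3, 2011, 2014))  # level 3, in 3rd grade as expected
print(is_proficient(4, 2, 2011, 2014))  # level 4, but retained in 2nd grade
print(is_proficient(2, 3, 2011, 2014))  # on grade, but below level 3
```

Only the first case counts as proficient. Because a retained student is coded as not proficient regardless of test score, this outcome is stricter than the test-based standard alone.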

Analyses
Analyses were completed on behalf of ADECE by the multi-disciplinary Pre-K Research and Evaluation Team. Student demographic data were obtained from the Alabama State Department of Education database (iNow) and included gender, race/ethnicity, poverty status, and school. Gender categories included "male" and "female." Race/ethnicity categories were coded as "Asian," "Black," "White," "Hispanic," and "Other-multi." Poverty status was determined based on receipt of free or reduced-price lunch during the school year of the test. School was determined as the name of the school where the student took the standardized test or was enrolled at the beginning of the school year.
Multivariable linear probability models were estimated to investigate the association between receiving FCPK and reading and math proficiency, while controlling for the student demographic characteristics described above (Model A). Multivariable analytical models allow for the determination of independent effects of predictor variables on the dependent variable (i.e., holding all other variables constant or assuming that the only difference between observations is the single variable) (Wooldridge 2013). The models also used "school fixed effects" to account for unobservable time-invariant school-level factors (such as school culture, community supports, and the average socioeconomic status of families in neighborhoods zoned for the school). This was done by including a vector of binary indicators, one for each school in the dataset. This is a critical component of the model, since factors like school culture and average socioeconomic status in the neighborhood zones are potentially associated with both proficiency and the likelihood of receiving FCPK and may confound the results if unaccounted for in the regression model. In general, fixed effects models are an often-used econometric method to allow for unobserved effects that do not change over time to be correlated with explanatory variables (Wooldridge 2013). Lai et al. (2011) also applied school-level fixed effects in their analyses of school quality and teacher qualifications on student performance. Altonji (1998) included high school fixed effects in a study of personal and school characteristics on estimates of financial return for postsecondary education. Finally, Caldas and Bankston (1997) used school fixed effects to control for the socioeconomic status of the entire student body in their analyses of individual academic achievement.
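To illustrate why the school indicators matter, the following sketch compares a naive pooled difference in proficiency with a within-school estimate on a small, entirely made-up dataset. It is not the study's model: it uses a single regressor (FCPK receipt) and omits the demographic controls. For one regressor, demeaning the outcome and the regressor within each school gives the same coefficient as including a binary indicator for every school.

```python
from collections import defaultdict

# Hypothetical toy data: (school, fcpk, proficient), all values invented.
# School A: high baseline proficiency, few FCPK recipients.
# School B: low baseline proficiency, many FCPK recipients.
rows = (
    [("A", 0, 1)] * 6 + [("A", 0, 0)] * 2 + [("A", 1, 1)] * 2 +
    [("B", 0, 0)] * 2 + [("B", 1, 1)] * 2 + [("B", 1, 0)] * 6
)


def pooled_difference(rows):
    """Proficiency rate of FCPK recipients minus non-recipients, ignoring school."""
    fcpk = [y for _, x, y in rows if x == 1]
    other = [y for _, x, y in rows if x == 0]
    return sum(fcpk) / len(fcpk) - sum(other) / len(other)


def within_school_estimate(rows):
    """Slope of proficiency on FCPK after removing school means from both.

    Equivalent to a linear probability model with one indicator per school.
    Requires at least one school with both recipients and non-recipients.
    """
    by_school = defaultdict(list)
    for school, x, y in rows:
        by_school[school].append((x, y))
    numerator = denominator = 0.0
    for obs in by_school.values():
        x_bar = sum(x for x, _ in obs) / len(obs)
        y_bar = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            numerator += (x - x_bar) * (y - y_bar)
            denominator += (x - x_bar) ** 2
    return numerator / denominator


print(round(pooled_difference(rows), 3))
print(round(within_school_estimate(rows), 3))
```

In this toy example, FCPK recipients are 25 percentage points more likely to be proficient than their peers within each school, yet the pooled comparison is negative because recipients are concentrated in the lower-performing school. Comparing students only with peers in the same school removes that between-school confounding.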
To test for fadeout, linear probability models were estimated for each outcome. These models included a variable representing the interaction between the receipt of FCPK and "time," along with all other student demographics mentioned above (Model B). "Time" is defined as students aging over each subsequent test/school year. The interaction term between FCPK and time indicates whether the impact of FCPK on proficiency changes as students age or, alternatively, as more years pass since they received FCPK. Cluster-robust standard errors were applied for both models. Analyses were conducted in Stata (v. 16.0) using the areg procedure. The University of Alabama at Birmingham Institutional Review Board approved the study.
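One way to write the fadeout specification is sketched below. The notation is ours rather than the authors', and it assumes that time enters as a main effect as well as in the interaction. The text does not state this explicitly.

```latex
% Assumed notation (not from the original text):
%   P_{ist}     = 1 if observation i in school s at time t is proficient, else 0
%   FCPK_i      = 1 if the student received First Class Pre-K
%   Time_t      = time index for each subsequent test/school year
%   X_i         = student demographics (gender, race/ethnicity, poverty status)
%   \alpha_s    = school fixed effect (one indicator per school)
\begin{align}
P_{ist} &= \beta_1\,\mathrm{FCPK}_i
         + \beta_2\,\mathrm{Time}_t
         + \beta_3\,(\mathrm{FCPK}_i \times \mathrm{Time}_t)
         + \mathbf{X}_i'\boldsymbol{\gamma}
         + \alpha_s + \varepsilon_{ist} \\
\frac{\partial\,\mathrm{E}[P_{ist}]}{\partial\,\mathrm{FCPK}_i}
        &= \beta_1 + \beta_3\,\mathrm{Time}_t
\end{align}
```

In this form, Model A corresponds to the same equation without the interaction term. Fadeout would appear as a negative interaction coefficient, meaning that the FCPK difference shrinks with each additional year. An interaction coefficient that cannot be distinguished from zero gives no evidence that the difference changes over time.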

Sample demographics
Overall during the study period, 6.4% of students received FCPK. Recipients of FCPK were statistically significantly more likely to be Black and live in poverty and less likely to be White, Asian, and Hispanic compared with children who did not receive FCPK (see Table 2). Demographics are based on the number and percentage of datapoints in the

Statewide proficiency
Across the three school years and five cohorts, 35.4% of students were proficient in reading and 47.0% were proficient in math based on test score and placement in expected grade/lack of retention. Students in poverty, males, and students of the minority race/ethnicity groups of Black, Hispanic, or Other-multiple race/ethnicities were statistically significantly less likely to be proficient in either skill (Table 3).

Multivariable analysis
For each outcome, two models were developed: Model A, a basic model including each variable independently, and Model B, the basic model plus an interaction between FCPK and time to assess changes in the impact of FCPK over each subsequent test/school year (Tables 4 and 5).
With all other factors held constant, students in poverty, males, and students of the minority race/ethnicity groups of Black, Hispanic, or Other-Multiple race/ethnicities were less likely to be proficient in either skill. Further, controlling for these demographics and school attended, students who received FCPK were on average 1.7 percentage points (coeff. = 0.017; t = 1.60; p < 0.05) more likely to be proficient in reading and on average 2.6 percentage points (coeff. = 0.026; t = 6.18; p < 0.05) more likely to be proficient in math compared to students who did not receive FCPK.
The FCPK/Time interaction variable was not statistically significant for either outcome; hence, there is no evidence that the statistically significant differences in performance of FCPK students were lost over time (i.e., as children age and progress to later grades).

Discussion
This is the first study to assess whether children in the Alabama public school system who receive FCPK have greater reading and math proficiency than their peers, and whether such effects fade over time. The first important finding from our analyses is that among our sample of students in 3rd-7th grades over 3 years of standardized testing, children who received FCPK were statistically significantly more likely to be proficient in reading and math compared to children who did not receive FCPK. These results persist after controlling for factors that have been shown to influence academic performance, including poverty, gender, race/ethnicity, classroom/school factors, and time. Further, effects are identified from within-school variation in the receipt of FCPK, eliminating the potential for confounding from between-school differences in neighborhood socioeconomic status. An additional critical finding is that our analyses indicate no evidence of fadeout of the statistically significant benefits of FCPK. These findings are especially meaningful considering that observations were for students in 3rd-7th grades, representing persistence of the benefits of FCPK well beyond the end of the program and into later grades where some other programs have shown diminished impact. Though differences are somewhat modest (1.7 and 2.6 percentage points for reading and math, respectively), the important point is that the differences persist regardless of demographics or school attended, are associated with a 9-month intervention at age 4, and are observed among students who enter the same schools as their peers. Given the variation in historical performance by demographic characteristic and school, we would not otherwise expect to find these differences.
We suggest that these benefits are related to the high-quality standards and rigorous implementation of Alabama's FCPK program. These elements that support quality are further detailed in the white paper "Supporting Quality, Accountability, and Student Outcomes in the Alabama First Class Pre-K Program" (Preskitt 2018) and include key leadership; implementation of foundational guidance; coaching, monitoring, and professional development; ongoing assessment; and teacher implementation of high-impact practices that are consistent with excellent pre-K classrooms. This is consistent with findings discussed previously (Bassok et al. 2018; Gormley et al. 2017; Rossin-Slater and Wüst 2017; Meloy et al. 2019; Yoshikawa et al. 2013).

Limitations
Data used for these analyses are individual observations of children's test performance and grade level across three school years; however, we did not have access to markers to link individual children's performance longitudinally. Depending upon the expected grade and test participation, each individual child could have been represented up to three times in the aggregate dataset (i.e., once for each subsequent school year and test). Therefore, we could observe overall trends in scores over time as children progressed through elementary school, but we could not observe how an individual child performed over time. For this reason, it was not possible to control for individual characteristics using child-level fixed effects. However, the use of school fixed effects (specifically, the vector with a binary indicator for each school) allows us to account for neighborhood-level socioeconomic factors that remain time-invariant. Thus, as long as the distribution of family socioeconomic factors within each school's zone does not change substantially within our study period, we are able to largely account for these confounders. Nevertheless, we recognize that it is possible that FCPK families were inherently different from non-FCPK families in other unmeasurable ways. Also, the interaction term for FCPK and Time addressed potential analytical issues with aggregate by year versus individual child longitudinal data.
Another limitation is that we have no way of determining which children in the no-FCPK comparison group may have gone to some other type of pre-K program. However, even if this could be determined, we would be unable to draw any conclusions related to the components or quality of these programs. Further, at the time when children included in these analyses could have been in FCPK, there was no way to identify children whose families put their names into the lottery yet did not receive a slot in an FCPK classroom. However, the program has recently added an electronic method to track the wait list and subsequent placement if slots become available. Future studies can include these comparisons, and access to this information will enable low-cost, rigorous evaluations.
In addition, we observed children in 3rd through 7th grade to determine if the positive impacts of FCPK dissipated. Because standardized testing of reading and math skills does not occur in Alabama prior to the 3rd grade, our analyses were limited to what happened at 3rd grade and beyond. We are aware that the observed differences in performance for students who received FCPK may have been larger at Kindergarten entry and could potentially have been reduced prior to the 3rd grade assessment. Still, the observed differences remain statistically significant at these later time points.
Finally, we believe that the quality supports present in FCPK classrooms impact proficiency in later grades; however, the elements or combinations of elements that best support the persistence of benefits are unknown. Future research should include longer-term follow-up studies to further examine the impact of FCPK on students' academic careers (through high school graduation).

Conclusions
These analyses incorporate advanced empirical methods to control for student demographics, school and neighborhood characteristics, and time to isolate potential differences in reading and math proficiency in elementary and middle school based on receipt of Alabama's First Class Pre-K program. The findings represent a higher bar of proficiency, including both test score and lack of retention (i.e., scored proficient and were in the expected grade based upon Kindergarten entry). The persistence of statistically significant benefits for students who received FCPK into the elementary and middle school years, well beyond the program's end, is reassuring and supports accountability for continued investments and expansion.