Introduction
Attempts to define early identification procedures have a long history in educational and psychological measurement. The task, of course, is to develop measures and procedures that successfully identify young children who will later experience reading problems. Despite the amount of effort devoted to this goal, a satisfactory solution has yet to emerge. This situation is particularly ironic in that we have available validated instructional practices that can assist many young children in their efforts to enter the world of literacy (e.g., Foorman & Torgesen, 2001; Fuchs et al., 2001; Torgesen et al., 1999). However, the instructional practices that are likely to help those who struggle the most are intensive and expensive (e.g., Torgesen et al., 2001; Vellutino et al., 1996). Thus, it behooves us to develop screening procedures that will allow scarce instructional resources to be targeted at those children who will otherwise experience reading failure.
The challenges of early identification are well documented (e.g., Fletcher & Satz, 1984; Jenkins & O'Connor, 2002; O'Connor & Jenkins, 1999; Scarborough, 1998). Measures that enjoy strong correlations with reading often fail as classification/screening tasks, yielding too many classification errors. There are two types of screening errors: over- and under-identification. Over-identification means the screening procedure identified too many children as at risk for poor outcomes--children were identified who did not experience reading problems. Under-identification is just the opposite--the screen missed children who later experienced poor outcomes. Which error is more egregious depends upon one's perspective. Over-identification means children who do not need intervention will receive it; under-identification means children who need intervention will not receive it.
Ideally we could devise a system that minimizes each error type, recognizing their reciprocal relationship--as one increases the other decreases. Some investigators focus efforts on minimizing under-identification and set screening criteria so that no child with a poor outcome will be missed. Scanlon and Vellutino (1996) adjusted their kindergarten screening criterion for letter names from 10 to 20 and the number of children misidentified as poor readers in first grade more than tripled. O'Connor and Jenkins (1999) similarly set their first grade criteria so that no child was missed but the over-identification rate (i.e., false positives as a proportion of all children flagged; FP/(FP + TP)) ranged from 47 to 70%. Essentially, one must pick her poison.
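The two error rates can be made concrete with a small sketch. The over-identification rate follows the FP/(FP + TP) definition given above; the under-identification rate, FN/(FN + TP), is an assumed definition (the proportion of children with poor outcomes whom the screen missed), and the function name and the counts are hypothetical, chosen only for illustration.

```python
def screening_error_rates(tp, fp, fn):
    """Error rates for a screening decision cross-tabulated with outcome.

    tp: flagged by the screen, poor reading outcome
    fp: flagged by the screen, adequate reading outcome
    fn: not flagged by the screen, poor reading outcome
    """
    flagged = tp + fp        # every child the screen called at risk
    poor_outcome = tp + fn   # every child who later read poorly
    # Over-identification: FP / (FP + TP), as defined in the text.
    over = fp / flagged if flagged else 0.0
    # Under-identification: FN / (FN + TP); this definition is an assumption.
    under = fn / poor_outcome if poor_outcome else 0.0
    return {"over_identification": over, "under_identification": under}


# Hypothetical counts: a criterion set so that no child is missed (fn = 0).
rates = screening_error_rates(tp=15, fp=35, fn=0)
print(rates)
```

With these invented counts, driving misses to zero leaves most of the flagged children as false positives, which is the trade-off described above.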
What we are faced with in screening is developing methods that will hit a moving target. That is, children continue to develop on the very skills we use as screens but our methods rarely take this development into account. Scarborough (1998) illustrated this phenomenon by showing that kindergarten children who have poor phonological awareness skills may or may not have difficulty getting into reading because phonological skills are quite learnable. Some children with poor initial skills will "get it", others won't, and we don't know how to discriminate these two groups of children using the extant one-step screening procedures. Phillips et al. (2002) provided another example of how slippery development can be. These investigators recently rebutted Juel's (1988) findings of reading status immutability by showing that the probability of being a poor reader in both first and sixth grades is no more than .50 as compared to .88 between first and third grades reported by Juel. Almost half the children who were below average in first grade were in the average achieving group by sixth grade, presumably due to usual exposure to the curriculum and instruction.
The point to be made is that children are not passive members of inert classrooms waiting for the next measurement occasion. This is so despite the fact that most attempts at screening/early identification assume this to be the case either implicitly or explicitly. One reason that screening efforts have not achieved an acceptable degree of accuracy may be the failure to attend to growth. There is accumulating evidence that measures of learning (i.e., growth) may be key to early identification efforts (Byrne, Fielding-Barnsley, & Ashley, 2000; Deno, Fuchs, Marston, & Shin, 2001; O'Connor & Jenkins, 1999; Speece & Case, 2001; Speece & Cooper, 1990). Byrne et al. (2000) reported that the number of phonological awareness training sessions needed by preschool children to demonstrate perfect performance differentiated disabled and nondisabled readers in elementary school and contributed significant unique variance (8% to 21%) to fifth grade literacy performance beyond the contribution of phonological awareness. Deno et al. (2001) found that first grade students in general education demonstrated over twice the growth in oral reading fluency compared to their counterparts in special education and that this discrepancy held when beginning reading levels (intercepts) were controlled. This evidence suggests that, in addition to level of performance, measurement should recognize growth.
The dual focus on level and growth is the cornerstone of RTI conceptualizations proposed by Fuchs and Fuchs and their colleagues (e.g., Fuchs, 1995; Fuchs & Fuchs, 1998; Fuchs, 2003). In their model, level and growth measures derived from weekly assessments on Curriculum-Based Measures (CBM) provide both screening and diagnostic information. Children who are below classmates or grade mates on both level and slope (Dually Discrepant, or DD) are initially considered at risk for poor academic outcomes. Following intervention phases in general education, children who continue to be nonresponsive are candidates for more intensive interventions (e.g., secondary interventions). Each of the screening and intervention phases typically lasts 6 to 8 weeks.
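The level-and-slope logic described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the procedure used in the studies cited: level is taken as a child's mean probe score, slope as the ordinary least-squares slope across weeks, and "below classmates" as more than one standard deviation below the class mean on each. The one-SD cutoff, the function names, and the scores are all hypothetical.

```python
from statistics import mean, pstdev


def ols_slope(scores):
    """Least-squares slope of probe scores regressed on week index (0, 1, 2, ...)."""
    weeks = list(range(len(scores)))
    w_bar, s_bar = mean(weeks), mean(scores)
    num = sum((w - w_bar) * (s - s_bar) for w, s in zip(weeks, scores))
    den = sum((w - w_bar) ** 2 for w in weeks)
    return num / den


def dually_discrepant(class_scores, cutoff_sd=1.0):
    """Return the children who fall below the class on BOTH level and slope.

    cutoff_sd is an assumed criterion: the number of standard deviations
    below the class mean that counts as discrepant.
    """
    levels = {child: mean(s) for child, s in class_scores.items()}
    slopes = {child: ols_slope(s) for child, s in class_scores.items()}
    level_vals, slope_vals = list(levels.values()), list(slopes.values())
    level_cut = mean(level_vals) - cutoff_sd * pstdev(level_vals)
    slope_cut = mean(slope_vals) - cutoff_sd * pstdev(slope_vals)
    return {
        child for child in class_scores
        if levels[child] < level_cut and slopes[child] < slope_cut
    }


# Hypothetical class: eight weekly CBM probe scores per child.
class_scores = {
    "A": [20, 22, 24, 26, 28, 30, 32, 34],
    "B": [18, 20, 22, 24, 26, 28, 30, 32],
    "C": [22, 24, 26, 28, 30, 32, 34, 36],
    "D": [19, 21, 23, 25, 27, 29, 31, 33],
    "E": [5, 5, 6, 6, 6, 7, 7, 7],
}
flagged = dually_discrepant(class_scores)
print(flagged)
```

In this invented class only child E is low on both level and slope, so only E is flagged. A child who starts low but grows at the class rate would not be flagged under this rule.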
The work I have conducted with this model has focused primarily on the validation of the dual discrepancy criteria as a diagnostic (outcome) classification. In the domain of reading we found the classification to be valid with respect to construct and social consequential validity (Speece & Case, 2001). An important finding was that the DD classification was not biased on gender or ethnicity. We also found that children who were frequently identified as DD across three years, compared to other at-risk children, exhibited poorer performance on reading and reading-related measures, were rated lower by their teachers on academic competence, problem behaviors, and social skills, and were identified more frequently by school personnel as requiring assistance or attention beyond that provided in the general education classroom (Case, Speece, & Molloy, 2003).
These findings and others (Fuchs & Fuchs, 1998; Fuchs, 2003) provide support for the use of the dual discrepancy procedure as a diagnostic category. We have not investigated the accuracy of the initial DD classification as a screen. Placed in the context of one-shot screening approaches, the DD approach to screening may be viewed as a daunting undertaking: All children would be administered a CBM probe once a week for 8 weeks so that slope and level performances could be assigned to each child. These data are used to make a screening designation as at risk or not at risk. Although this screening procedure satisfies the criterion of acknowledging development by assessing children's responsiveness to the general education curriculum, a reasonable question is whether there is a more efficient method. For example, perhaps monthly screening across three or four months would produce either gain scores or slope estimates that are sensitive to responsiveness. The screen may take a month longer but require fewer resources to implement.
Perhaps efficiency is not the right question at the present time. Perhaps the right question is whether this screen yields accurate classification. Then the "right" question becomes: How do we implement? Nonetheless, the amount of philosophical, conceptual, and structural change required to implement weekly measurement in general education classrooms requires some recognition. In noting the difficulty of achieving school change, Erickson (1996) quipped "It reminds us of the joke about how many psychiatrists it takes to change a light bulb: Only one, but it takes a long time and the light bulb really has to want to change" (p. 91).
From this perspective, I used an existing database of two cohorts of first grade children to examine the utility of CBM measures collected in September and January to identify children who in May met various criteria for reading disability. To be sure, this was not the optimal data set to examine questions concerning growth. Monthly or weekly measures in the fall were not collected and Letter Sound Fluency (LSF) was the only CBM measure available on all children. Ideally, the questions would be examined with a variety of CBM and published measures varying the timing of measurement. Oral Reading Fluency (ORF) data were collected beginning in January because of expected floor effects earlier in the year. Both LSF and ORF data were collected either weekly or monthly beginning in January. The technical characteristics of these measures are strong and well established (e.g., Deno, 1985; Fuchs & Fuchs, 1998, 1999; Speece & Case, 2001). Although limited for investigating the issue of efficiency, the data are useful for illustrating several points and raising relevant research questions.

