Frank M. Gresham, University of California-Riverside
Learning Disabilities Summit: Building a Foundation for the Future White Papers
This paper argues that a child's inadequate responsiveness to an empirically validated intervention can be taken as evidence of LD and should be used to classify children as such. Some might argue that diagnoses in medicine, for example, are not confirmed or disconfirmed on the basis of whether a patient responds to treatment. However, one should always keep in mind that medical diagnoses often have direct treatment implications and that the causes of many physical diseases (unlike mild disabilities such as LD) are known. Moreover, treatment intensity in medicine is typically matched to the nature and severity of whatever physical malady is present. Obviously, a physician's first choice of treatment for most medical problems is not hospitalization. The point here is that not all children will require the most intense form of treatment of academic difficulties, and treatment intensity, strength, and/or duration should increase only after the child fails to show an adequate response to intervention.
In the current paper, I argue that children who fail to respond to empirically validated treatments implemented with integrity might be identified as LD. The concept of responsiveness to intervention appears to be a viable alternative approach to defining LD, particularly in light of the myriad difficulties with discrepancy-based models. This paper defines responsiveness to intervention as a change in academic performance as a function of an intervention. In order to employ treatment responsiveness as a criterion for identifying students as LD, assessment procedures should have treatment validity; that is, the assessment should contribute to the planning and implementation of more effective treatments to remediate academic deficits (Fuchs & Fuchs, 1998; Gresham & Witt, 1997; Nelson et al., 1987). Several issues in adopting the responsiveness-to-intervention approach appear to have been resolved, including (a) modeling academic growth, (b) sensitivity of measures to reflect growth, and (c) validated treatment protocols. These were discussed at length in this paper and will not be reiterated here except to say that the validated treatment protocols represent different intensities and durations of treatment. Depending on a student's response to treatment, these treatments may have to be titrated until an acceptable level of academic functioning is achieved. More important, several unresolved issues await further investigation and deliberation before the field can adopt responsiveness to intervention in eligibility determination.
Four issues appear to be most important at this time in adopting responsiveness to intervention as the criterion for LD eligibility determination: (a) selecting the "best" intervention available, (b) determining the optimal length and intensity of the intervention, (c) ensuring the integrity of interventions, and (d) conducting cost-benefit analyses. These issues are discussed in the following sections.
Selecting the "best" intervention available. According to available research, there appears to be a consensus on the core components a reading intervention should address for students with reading disabilities. Reading research over the past 20 years indicates that the reading difficulties of these students are caused by weaknesses in the ability to process the phonological aspects of language (Liberman, Shankweiler, & Liberman, 1989; Stanovich & Siegel, 1994; Torgesen, 1996). In fact, reading growth is best predicted by initial levels of phonological skill rather than verbal ability or discrepancy between IQ and reading achievement (Torgesen et al., 2001; Vellutino et al., 1996, 1998). Torgesen et al. (2001) suggested that these phonological weaknesses require reading instruction that is more phonemically explicit and systematic than that provided to other children and there are many ways in which this might be accomplished in designing instructional activities.
Given the above consensus regarding the most important skills to target in intervention, what is the "best" intervention to accomplish this end? The meta-analysis by Swanson and Hoskyn (1999) suggested that interventions using a combination of direct instruction and strategy instruction produced the largest effect sizes, with 80% of the treatment groups having mean reading scores equal to or greater than those of control group students. Recall that the typical intervention in this meta-analysis was 13.3 hours over 10 weeks. Vellutino et al.'s (1996) intervention provided 35-40 hours of instruction over 15 weeks whereas the recent study by Torgesen et al. (2001) involved 67.5 hours over 8-9 weeks.
Comparisons among these studies are difficult given the large variability in the intensity and length of interventions (to be discussed below). Interventions based on applied behavior analysis, while effective, typically are of shorter duration, and outcome measures typically are more narrowly defined (Daly et al., 1996; Daly & Martens, 1994; Haring et al., 1978). Given the various effective intervention options available, practitioners must determine what "best practices" will be at the local level in terms of selecting and implementing a given strategy.
Determining the optimal length and intensity of intervention. Determining the length and intensity of intervention that is implemented is a crucial decision when using responsiveness to intervention as the criterion for identifying LD. Keep in mind a fundamental principle: The length and intensity of intervention will depend entirely on a student's responsiveness to it, which is individually based. Fuchs and Fuchs (1997, 1998) indicated that a general educator should attempt two interventions lasting no longer than 6 weeks before placing the student in a special education trial period. This special education trial period should last no longer than 8 weeks, after which time the assessment team reconvenes to continue and/or enhance the intervention program. Fuchs and Fuchs (1997) suggested that any assessment method must provide adequate data for evaluating treatment effectiveness and should answer the following questions. Is the nonadapted regular education classroom producing adequate academic growth? Have adaptations to the general education classroom produced improved growth? Has the provision of special education interventions improved student learning?
Another insight into this issue of length and intensity of interventions can be found in the meta-analysis of Swanson and Hoskyn (1999). As stated earlier, the typical intervention consisted of 22.47 minutes of daily instruction delivered 3.58 times per week for 35.72 sessions. Thus, the prototypical intervention consisted of about 13.3 hours of instruction distributed over approximately 10 weeks. It should be noted, however, that there was a huge degree of variability in terms of minutes of daily instruction (SD = 29.71 minutes), times per week (SD = 1.58), and number of sessions (SD = 21.72 sessions). Moreover, the samples used in these studies varied greatly regarding criteria used for participant selection, thereby introducing a confounding factor when evaluating responsiveness to intervention.
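The arithmetic behind the "prototypical" intervention can be checked with a short sketch. The function name and parameters below are illustrative assumptions, not part of the meta-analysis; only the three means (minutes per session, sessions per week, number of sessions) come from the text.

```python
def prototypical_dose(minutes_per_session, sessions_per_week, total_sessions):
    """Return (total hours of instruction, approximate duration in weeks)."""
    total_hours = minutes_per_session * total_sessions / 60
    weeks = total_sessions / sessions_per_week
    return total_hours, weeks


# Means reported for the Swanson and Hoskyn (1999) meta-analysis.
hours, weeks = prototypical_dose(22.47, 3.58, 35.72)
print(f"{hours:.1f} hours over about {weeks:.0f} weeks")
```

The product of the means comes out slightly above 13.3 hours, which is in line with the "about 13.3 hours" and "approximately 10 weeks" cited above. Given the large standard deviations, this figure describes an average study rather than a recommended dose.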
The prototypical study using (a) direct instruction, (b) strategy training, and (c) combined direct instruction + strategy training produced effect sizes of 0.77, 0.67, and 0.81, respectively. Also, students having the most severe reading deficits (<85) responded better to treatments (M = 0.71) than students with less severe reading difficulties (>84 and <91; M = 0.51). If one were to use the length and intensity of the prototypical reading study in this meta-analysis with a combination of direct instruction and strategy training, one could expect to produce a standard score point difference of 12 (M = 100, SD = 15) between pretest and posttest scores. For example, a student entering the intervention with a standard score of 78 could be expected to improve to a score of 90 at posttest, thereby indicating near-normal performance.
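The conversion from an effect size to standard score points in the example above can be sketched as follows. This is a minimal illustration that follows the paper's simplification of treating the meta-analytic effect size as an expected pretest-to-posttest gain; the function name is hypothetical.

```python
def expected_posttest(pretest_score, effect_size, sd=15.0):
    """Project a posttest standard score from an effect size.

    Assumes a standard score scale (M = 100, SD = 15) and that the
    effect size translates directly into a gain of effect_size * sd.
    """
    return pretest_score + effect_size * sd


# Combined direct instruction + strategy training: effect size 0.81.
gain = 0.81 * 15
posttest = expected_posttest(78, 0.81)
print(f"gain of about {gain:.0f} points, posttest of about {posttest:.0f}")
```

The same function can be used with the other reported effect sizes (0.77 for direct instruction, 0.67 for strategy training) to compare projected gains.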
Another approach to determining optimal length and intensity of intervention can be found in the Vellutino et al. (1996) investigation. Recall that this study selected children who scored at or below the 15th percentile in reading (Word Identification and Word Attack) and were given 35-40 hours of intensive one-to-one tutoring in reading. Each session lasted for 30 minutes, and 80 sessions were spread over 15 weeks for a total of 35-40 hours of reading instruction. At posttest, about half of the children showed either Good Growth or Very Good Growth in reading with posttest percentile ranks in the 44th and 64th percentiles, respectively, by the end of second grade. This study suggested that an intensive one-to-one reading intervention could be used to normalize reading performances of poor readers selected in the first grade. It is unknown at this time, however, how much one might change or otherwise deviate from this effective treatment protocol and produce similar results.
Finally, the study by Torgesen et al. (2001) compared two interventions with fourth graders implemented in two 50-minute daily sessions, 5 days per week over 8-9 weeks (67.5 hours of intervention). The 19 children who were returned to general education subsequent to intervention moved from pretest scores of about 70 (average of Word Attack and Word Identification) to 2-year follow-up scores of approximately 95. In contrast, the students remaining in special education moved from pretest scores of about 67 to posttest scores of 83. Relative to growth made in the regular resource room, the average effect size was approximately 4.15 for the two treatment groups (difference between pretreatment and posttreatment slopes divided by pooled variability of pretreatment slopes). As with the Vellutino et al. (1996) study, we do not know how much this intervention can be modified or diluted and still obtain relatively large treatment effects.
One means of determining the optimal length and intensity of interventions based on the extant literature is to employ a multiple gating procedure similar to that used in the Heartland Area Education Agency (AEA) in Iowa to make special education entitlement decisions (Reschly & Tilly, 1999; Reschly & Ysseldyke, 1995). Figure 1 shows the problem-solving model used in the Heartland AEA for making special education eligibility determinations. Note that I have superimposed examples of interventions varying in intensity (that were reviewed in the current paper) within the Heartland AEA model. The responsiveness-to-intervention approach in this model makes the following assumptions:
Figure 1. Degree of unresponsiveness and intensity of treatment.

Ensuring the integrity of interventions. Treatment integrity (sometimes called treatment fidelity or procedural reliability) refers to the degree to which a treatment is implemented as intended (Gresham, 1989; Yeaton & Sechrest, 1981). Establishing and maintaining the integrity of treatments is one of the most important aspects of both the scientific and practical application of instructional procedures. It is likely that the ineffectiveness of many instructional interventions can be attributed, in part, to the poor integrity with which these procedures were implemented (i.e., deviations from an established treatment protocol). Adopting a responsiveness-to-intervention approach to identifying LD makes treatment integrity (the reliability of treatment implementation) a central feature of the entire process. In contrast, the entire practice of determining the most appropriate IQ-achievement discrepancy model is based on the reliability of difference scores (e.g., simple difference, predicted difference). In order to determine the degree of responsiveness to intervention, a treatment must be reliably and accurately implemented.
Recently, Gresham, MacMillan, Beebe-Frankenberger, and Bocian (2000) sought to determine the extent to which integrity was assessed in the LD intervention literature by analyzing articles in the three major LD journals from January 1995 to August 1999 (Journal of Learning Disabilities, Learning Disability Quarterly, and Learning Disabilities: Research & Practice). Of the 479 articles published in these journals, 65 articles (13.6%) were intervention articles. Of these 65 articles, only 12 articles (18.5%) actually measured and reported data on treatment integrity. In their synthesis of the LD intervention literature, Swanson, Carson, and Sachse-Lee (1996) reported that less than 2% of the studies provided any information about treatment integrity. In spite of the methodological and statistical rigor used in this and other meta-analyses of the LD literature, none of these methodological considerations can answer two fundamental questions: (a) How are treatments implemented, and (b) What is the relation between treatment integrity and treatment outcomes in LD intervention research?
Swanson and Sachse-Lee (2000), in their review of the single-case intervention research with LD, found that only 28% of the studies (N = 24 studies) provided any measure of treatment integrity. Of these 24 studies, only 8 studies specified steps used to measure the integrity of the intervention. There appears to be a curious double standard in the LD intervention literature with respect to the measurement and reporting of reliability for the independent and dependent variables. That is, it is almost always the case that reliability data for the dependent variable are presented in published treatment-outcome research. In contrast, this same type of information rarely is required for the independent (treatment) variable.
Given the central importance of assessing treatment integrity in the responsiveness-to-intervention model of LD identification, the following recommendations are offered concerning how researchers and practitioners might conduct integrity assessments:
Specific components of an intervention should be operationally defined and measured much like the operational definition and measurement of dependent measures.
Each component of a treatment should be measured by either direct observation or videotaping using an occurrence-nonoccurrence method. Levels of treatment integrity should be obtained by summing the number of components correctly implemented and dividing this number by the total number of components to yield percentage integrity.
Two estimates of treatment integrity should be calculated. One, the integrity of each component across days or sessions of treatment should be computed to yield component integrity. Two, the integrity of all treatment components within days or sessions of treatment should be calculated to yield daily or session integrity. Given these two estimates of integrity, failure to find significant treatment effects might be explained by poor component integrity over time, by poor daily or session integrity, or both.
Indirect methods of assessing treatment integrity such as instructional manuals, permanent products, self-reports, interviews, and behavior rating scales should be used to supplement direct measures of integrity, but they must be interpreted cautiously. There is often low agreement between direct and indirect methods of integrity assessment (Gresham, 1997; Noell & Witt, 1999; Wickstrom, Jones, LaFleur, & Witt, 1998).
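The two integrity estimates recommended above (component integrity across sessions and session integrity across components) can be sketched as follows. This is a minimal illustration: the observation records are hypothetical, invented only to show the calculation, and the function names are assumptions rather than an established scoring tool.

```python
def session_integrity(records):
    """Percentage of treatment components implemented within each session."""
    return [100.0 * sum(session) / len(session) for session in records]


def component_integrity(records):
    """Percentage of sessions in which each component was implemented."""
    n_sessions = len(records)
    # zip(*records) regroups the session-by-component grid by component.
    return [100.0 * sum(column) / n_sessions for column in zip(*records)]


# Hypothetical occurrence-nonoccurrence records: one row per observed
# session, one column per operationally defined treatment component.
# True = component implemented as intended, False = omitted or altered.
records = [
    [True, True, True, False],   # session 1
    [True, False, True, False],  # session 2
    [True, True, True, True],    # session 3
]

print("Session integrity (%):", session_integrity(records))
print("Component integrity (%):",
      [round(value, 1) for value in component_integrity(records)])
```

In this made-up example the fourth component is implemented in only one of three sessions, so a weak treatment effect could reflect poor integrity of that component rather than an unresponsive student.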
Cost-benefit analysis. An important aspect of using the responsiveness-to-intervention approach to LD identification is determining the financial costs to school districts. As mentioned earlier, the average cost of a traditional eligibility determination for a student with a mild disability is around $2,500 per case (Reschly, personal communication, 2001). What costs are incurred by using the CBM-dual-discrepancy model in which local normative data are collected over 20 weeks? What costs are associated with adopting any of the functional assessment models? Currently, we have no published data to assist us in calculating these costs.
Torgesen (personal communication, 2001), however, provided some data regarding the costs of his intensive intervention program described earlier (Torgesen et al., 2001). Torgesen states that a teacher who was doing this kind of intervention with children (two 50-minute sessions per day) could probably work with two children at a time for 8 weeks and the rest of the time could be spent following up on children taught earlier, or working as a teacher consultant, or planning. Given the normal interruptions in schools (assemblies, absences) it takes about 10 weeks of teacher time to deliver the full 80 sessions.
A teacher could work with about six severely LD children a day for 10 weeks. On the basis of a 37-week school year, a teacher could probably go through about three treatment cycles with six students per cycle and thus provide intensive reading intervention services to approximately 18 children per year (6 students × 3 treatment cycles). Remember, however, that Torgesen et al.'s (2001) data suggest that about half of these children will no longer need special education after the intervention. One way Torgesen calculates the cost is to take the cost per session at $50 (more or less depending on local costs for private tutoring) and multiply this figure by 80 sessions of instruction; the cost per student is approximately $2,000. Thus, for a teacher working with 18 students per year, the total cost of an intensive, treatment-oriented approach to LD would be about $36,000 per year. The mere cost of simply identifying, but not treating, 18 LD students using traditional IQ-achievement discrepancies is estimated to be $45,000 (18 × $2,500).
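The cost comparison above can be laid out as a small calculation. This sketch takes the approximate per-student figures as stated in the text ($2,000 for the intensive intervention, $2,500 for a traditional eligibility determination) rather than deriving them; the function and variable names are illustrative assumptions.

```python
def yearly_caseload(school_weeks, weeks_per_cycle, students_per_cycle):
    """Return (full treatment cycles per year, students served per year)."""
    cycles = school_weeks // weeks_per_cycle
    return cycles, cycles * students_per_cycle


COST_PER_STUDENT_TREATED = 2_000     # intensive intervention, as stated
COST_PER_STUDENT_IDENTIFIED = 2_500  # traditional eligibility determination

cycles, students = yearly_caseload(school_weeks=37, weeks_per_cycle=10,
                                   students_per_cycle=6)
treatment_cost = students * COST_PER_STUDENT_TREATED
identification_cost = students * COST_PER_STUDENT_IDENTIFIED

print(f"{cycles} cycles, {students} students per year")
print(f"Treatment: ${treatment_cost:,}")
print(f"Identification only: ${identification_cost:,}")
```

Local tutoring rates and caseloads will differ, so the per-student constants should be replaced with district figures before drawing conclusions.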
One should consider these costs in light of the fact that the cost of educating a student in a resource room placement is 1.7 to 2.0 times the cost of educating a general education student in a regular classroom. In addition, remember that in the Torgesen et al. (2001) study, 40% of the students in the study no longer needed special education. Moreover, one should also note that the efficacy of traditional special-education-delivered interventions, according to meta-analyses, has been somewhat less than impressive (Kavale & Forness, 1999).
Another consideration in calculating these cost-benefits is the cost of LD eligibility determination using the traditional competing paradigm model described in concert with special education costs. Assuming the cost of a typical eligibility process is approximately $2,500 and also remembering that all LD students must undergo 3-year reevaluations, the cost of identifying and providing special education for LD students is almost twice that of educating general education students. As such, there may be long-term cost-benefits in adopting the responsiveness-to-intervention model, particularly in light of the following: (a) the average effect size of special education placement for LD students is about 0.30 (Kavale & Forness, 1999), (b) relatively few students get decertified as LD during their school careers, (c) early intensive reading interventions for poor readers (kindergarten-first grade) lead to Good Growth or Very Good Growth in reading for about 50% of this population, and (d) intensive intervention may lead to a decertification of about 40% of children receiving this type of intervention.
The question for the LD field remains: How long do we implement an intervention before we determine that a child is an inadequate responder and thus eligible for more intensive special education services? Further, what is the cost of this intervention-based model relative to the traditional eligibility approach? Is the responsiveness-to-intervention approach more expedient in identifying students as LD so that intervention takes place earlier? How intense should this intervention be and how long should it last? Who should implement the intervention (teachers, paraprofessionals, reading specialists)? These questions must be addressed first when adopting a responsiveness-to-intervention approach to the identification of LD.
One must realize that some individuals have political, personal, financial, and/or other reasons in wanting to maintain the status quo in the classification of students as LD. This position is indefensible in light of the overwhelming evidence in the field that the IQ-discrepancy approach to LD identification is simply not valid and, most important, does not inform treatment decisions. These individuals may argue that a treatment-responsiveness model is analogous to confirming the accuracy of a cancer diagnosis by determining whether or not a treatment regimen of chemotherapy and radiation leads to remission. They might also argue that this approach does not improve the identification of students as LD, that it has some insurmountable measurement problems, that it leads to late identifications, and that it will be extremely expensive. However, it always should be remembered that these arguments are simply red herrings in the sea of abyss of what we now call LD.
It is incumbent upon the LD field to focus on answering the critical questions using empirical findings for assessment and interventions provided in this paper as a foundation. Establishing an effective method for determining eligibility for LD that can be linked to intervention can go a long way toward decreasing, if not eliminating, the probability that learning disabilities will continue to be the sociological sponge that wipes up the spills of general education.