Kenneth A. Kavale, University of Iowa
Learning Disabilities Summit: Building a Foundation for the Future White Papers
When LD determination is not based on the application of strict criteria, the diagnostic process may be likened to the U.S. Supreme Court's definition of pornography: "I know it when I see it." The lack of rigor in the diagnostic process has led to an accelerated rate of LD identification and LD becoming, by a wide margin, the largest category in special education. Presently, LD accounts for more than 50% of all students with disabilities and more than 5% of all students in school (U.S. Department of Education, 1999). In commenting on the magnitude of the increase in LD prevalence, MacMillan, Gresham, Siperstein, and Bocian (1996) suggested that "Were these epidemic-like figures interpreted by the Center for Disease Control, one might reasonably expect to find a quarantine imposed on the public schools of America" (p. 169). There is little justification for such numbers, and the problem is compounded by the lack of consistency in the way the LD population is distributed across settings (Kavale & Forness, 1995). Clearly, fewer students are identified as LD when a strict discrepancy criterion is implemented rigorously (e.g., Finlan, 1992), but external factors (e.g., financial resources) may significantly influence (and increase) the number of students identified as LD (Noel & Fuller, 1985). Forness (1985) showed how state special education policy changes in California significantly affected the number of students identified in the high-incidence mild disability categories. LD saw a 156% gain compared to the 104% gain nationally, and a comparison with concomitant losses for MR and BD led Forness to the conclusion "that California's relatively dramatic increase in children identified as learning disabled may be at the expense of two other related categories" (p. 41). 
Such state disparities were not uncommon and led to the conclusion that "Our results suggest that variation in LD diagnostic levels across states is significantly related to distinctions in diagnostic practice as well as or instead of actual disease prevalence" (Lester & Kelman, 1997, p. 605). In contrast, far greater consistency in classification rates has been found for hearing impairment and physical/multiple disability compared to LD (Singer, Palfrey, Butter, & Walker, 1989).
The confounding among high-incidence mild disabilities appears to be primarily between LD and MR. MacMillan et al. (1996) found that 43 of 150 referred students had IQ levels at 75 or below. Of the 43, only 6 were classified MR even though they met the requisite eligibility cut-off score, while 18 were classified LD primarily because the LD label was viewed as a more acceptable designation. Similarly, Gottlieb, Alter, Gottlieb, and Wishner (1994) found that an urban LD sample possessed a mean IQ level that was 1 SD lower than a suburban comparison sample. They concluded that "These children today are classified as learning disabled when in fact most are not" (p. 463). This view was affirmed by MacMillan, Gresham, and Bocian (1998), who found that out of 61 students classified LD by schools, only 29 met the required discrepancy criterion. In analyzing the results, they remarked that "We did not anticipate the extent to which the process would yield children certified as LD who failed to meet the discrepancy required by the education code" (p. 322). Thus, even though discrepancy remains the primary (and sometimes sole) criterion for LD identification, it was often ignored in actual practice. Gottlieb et al. (1994) suggested "the discrepancy that should be studied most intensively is between the definition of learning disability mandated by regulation and the definition employed on a day-to-day basis in urban schools" (p. 455).
Because "public school practices for diagnosing children with LD bear little resemblance to what is prescribed in federal and state regulations (i.e., administrative definitions) defining LD" (MacMillan et al., 1998, p. 323), the LD population has become increasingly heterogeneous and the longstanding "problem of heterogeneity" firmly entrenched (Gallagher, 1986). For example, Gordon, Lewandowski, and Keiser (1999) analyzed the problems associated with the LD label for "relatively well functioning" students. When an SDL criterion is not rigorously applied, students identified as LD may not demonstrate underachievement, a primary LD feature (Algozzine, Ysseldyke, & Shinn, 1982), which in turn opens the utility of the LD category to question (Epps et al., 1984).
The vagaries of LD classification, especially the inability to differentiate LD and low achievement (LA), have been demonstrated in studies conducted by the University of Minnesota Institute for Research on Learning Disabilities (Minnesota studies). Ysseldyke, Algozzine, and Epps (1983) analyzed psychometric data obtained from students without LD using 17 operational definitions of LD. For 248 cases, 85% met the requirements for one operational definition of LD, while 68% qualified with two or more operational definitions. Only 37% of the non-LD sample did not meet the criteria specified in any of the 17 operational definitions of LD. A second analysis examined data for students with LD and students with LA to determine how many would qualify with each of the 17 operational definitions of LD used earlier. For the LD group, 1% to 78% were classified LD with each definition while the LA group was also classified LD from 0% to 71% of the time using each operational definition. Further analysis showed that 4% of the LD group was not classified by any of the 17 operational definitions while 88% of the LA group qualified as LD by using at least one operational definition. In a similar investigation, Epps, Ysseldyke, and Algozzine (1983) examined the number of students identified as LD with each of 14 operational definitions that emphasized the discrepancy criterion. The definitions classified from 7% to 81% of students as LD, whereas 5% to 70% of a non-LD group were also classified LD using at least one of these 14 operational definitions. To determine the congruence among the 14 operational definitions, Epps, Ysseldyke, and Algozzine (1985) performed a factor analysis and found two factors. The first factor (I) emphasized LA whereas the second factor (II) was represented by discrepancy. In terms of their respective weights, Factor I accounted for 70% of the variance compared to 16% for Factor II. 
The difference in explained variance led to the conclusion that LD might be properly conceptualized as a category reflecting LA, rather than discrepancy.
Epps et al. (1985) also found that knowing how many LD definitions qualified a student provided little assistance in correctly predicting group membership (LD vs. LA). Algozzine and Ysseldyke (1983) also found considerable inaccuracy in decisions about group membership (LD vs. LA) and concluded that "To make classification dependent on these discrepancies seems somehow arbitrary and capricious" (p. 245). Consequently, discrepancy appeared to possess limited value, and suggestions about its worth as a criterion for LD identification possessed little merit because "there may be an equally large number of children exhibiting similar degrees of school achievement not commensurate with their measured ability who are not categorized and therefore are not receiving special education services even though they are eligible for them under the current conceptual scheme represented by the category of learning disabilities" (p. 246). Thus, the failure to make LD a classification predicated on discrepancy suggests that it has not been possible to unequivocally define a category different from LA, and it might be more appropriate to recognize LA as the major problem.
The Minnesota studies appeared to support the view that reliance upon a discrepancy criterion for LD identification may not be defensible because it does not provide a clear distinction between LD and LA. L. R. Wilson (1985), however, challenged the idea that the LD category should be eliminated in favor of a more general classification like LA because a more general category will do little to eliminate the ambiguities and inconsistencies associated with LD. In fact, the Minnesota studies may themselves possess ambiguities and inconsistencies that limit the findings. For example, the Minnesota studies used only a discrepancy criterion for LD identification, and failed to include other components of the federal definition such as the exclusion which "states that the academic deficit cannot be the result of other possible causes such as emotional and personality factors, cultural deprivation, impaired sensory acuity, or educational deprivation" (p. 45). Since this aspect of the federal definition was not applied, the identification process was necessarily incomplete and restricted.
The other major problem area was related to sampling, specifically the possibility of bias in the Minnesota samples. The final sample used in the Minnesota studies was selected from a much larger population, which raised the question, "Is there evidence to suggest that the selection was random or is there reason to believe that bias may have distorted the findings?" (L. R. Wilson, 1985, p. 45). With respect to the LA group, L. R. Wilson suggested that "there is good reason to suspect that selection factors may have produced a disproportionately large number of discrepant achievers in the group of low achievers who were not formally labeled as learning disabled" (p. 46). Finally, the restricted nature of the selected samples raised questions about the generalizability of the Minnesota findings.
In an analysis of a large-scale Iowa sample, L. R. Wilson (1985) demonstrated "that the federal definition of learning disabilities can be successfully used, that it can be consistently applied by a large group of special education professionals, that the various components of currently accepted learning disability definitions can provide the basis for discriminating a reasonably unique group of children, and that the exceptions found in this study, and other similar ones, do not automatically invalidate the previous conclusions" (pp. 49-50). The application of both a discrepancy and exclusion criterion resulted in a sound foundation for LD classification. As a result, the LD concept was quite defensible, and it would be "premature to eliminate it in favor of other concepts that probably have the very same weaknesses" (p. 51). In response, Algozzine (1985) suggested that there was really no reprieve for the LD concept and again LD was suggested to be a less than viable special education category because "creating the new concept of learning disabilities has not reduced the ambiguities, inconsistencies, and inadequacies that existed when low achievement was not a separate diagnostic category" (p. 75).
The continuing debate about the LD-LA distinction began to erode the integrity of LD. Longstanding critiques of the LD definition (e.g., Reger, 1979; Senf, 1977; E. Siegel, 1968) evolved into suggestions that LD really did not exist as an independent entity as well as its depiction as myth (McKnight, 1982), questionable construct (Klatt, 1991), or imaginary disease (Finlan, 1994). The assumption that LD and LA could not be reliably distinguished became conventional wisdom. The primary evidence came from a study by Ysseldyke, Algozzine, Shinn, and McGue (1982) showing a substantial degree of overlap between the test scores of LD and LA groups and a conclusion raising "serious concerns regarding the differential classification of poorly achieving students as either LD or non-LD" (p. 82). Further confirmation was found in a study by B. A. Shaywitz, Fletcher, Holahan, and Shaywitz (1992) who concluded that "Our findings suggest more similarities than differences between the reading disabled groups" (p. 646). Group membership in this case was defined with a discrepancy criterion (LD) or low achievement (LA) criterion (scoring below 25th percentile in reading). When the LD and LA groups were compared across a number of child-, teacher-, and parent-based measures, few differences were found, with the major exception being in the ability (i.e., IQ) area. Nearly all the variance between groups was accounted for by IQ, but this may only be a reflection of the way groups were defined.
The findings from these studies have had significant impact and have been reported with remarkable consistency. For example, the Ysseldyke, Algozzine, Shinn, & McGue (1982) study has been used to conclude that limited LD-LA differences existed as exemplified in the following statements gleaned from the literature:
a. Certain researchers have suggested that LD is largely a category for low-achieving children.
b. [Ysseldyke et al.] found few psychometric differences between groups of students identified as learning disabled (LD) and low achievers who did not carry the label.
c. Recent studies of children diagnosed as learning disabled have shown that many such children...are virtually indistinguishable from low-achieving non-handicapped peers.
The difficulties in differentiating LD and LA groups were based on the Ysseldyke, Algozzine, Shinn, & McGue (1982) findings of a large number of identical scores between LD and LA subjects as well as a high percentage of overlap between scores. For example, on the Woodcock-Johnson Psychoeducational Battery, LD and LA groups showed identical scores 33 out of 49 times and an average overlap of 95%. On five other psychoeducational measures, there were identical scores in better than half the cases and a 96% overlap. These metrics appeared, however, to be at variance with the reported statistical analyses. A comparison of Woodcock-Johnson scores revealed "that on average the LD group performed significantly poorer on 10 of the subtests" (p. 98), while statistical comparison of the five other psychoeducational measures showed "that the mean level performance of the LD children was lower on many of the measures, particularly the PIAT [Peabody Individual Achievement Test], and at times was significantly less than the mean level of their low-achieving peers" (p. 79).
Kavale, Fuchs, and Scruggs (1994) reexamined the Minnesota studies using quantitative synthesis methods (meta-analysis) and demonstrated how the percentage of overlap metric used by Ysseldyke, Algozzine, Shinn, & McGue (1982) may have masked real performance differences. The overlap metric used in the Minnesota studies was calculated by using the range of scores found for one group and then comparing how many cases from the second group fell within that same range, but with such a methodology, "[t]he potential bias toward overlap is high because the comparison is based on the variability demonstrated by only one group with the other being forced into that distribution without regard to the characteristics of its own variability" (Kavale et al., 1994, p. 74). The effect size (ES) statistic used in meta-analysis, because it is a standard score (z-score), eliminates potential bias by representing the extent to which groups can be differentiated, or, conversely, the degree of group overlap. For example, an ES of 1.00 indicates that the two compared groups differed by 1 SD and that 84% of one group can be clearly differentiated from the other group with a 16% group overlap.
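The contrast between the two metrics can be illustrated with a minimal Python sketch. It is not the procedure used in either study: the scores are synthetic (evenly spaced normal quantiles), the group means and SDs are invented for illustration, and the function names `range_overlap`, `effect_size`, and `synthetic_scores` are hypothetical. The ES is computed as a standardized mean difference with a pooled SD, which is one common choice and an assumption here.

```python
from statistics import NormalDist, mean, stdev


def range_overlap(reference, other):
    """Share of `other` scores that fall inside the min-max range of `reference`."""
    lo, hi = min(reference), max(reference)
    return sum(lo <= x <= hi for x in other) / len(other)


def effect_size(group_a, group_b):
    """Standardized mean difference (group_a minus group_b) using a pooled SD."""
    na, nb = len(group_a), len(group_b)
    pooled_var = (
        (na - 1) * stdev(group_a) ** 2 + (nb - 1) * stdev(group_b) ** 2
    ) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / pooled_var ** 0.5


def synthetic_scores(mu, sigma, n):
    """Deterministic, evenly spaced normal quantiles (illustrative data only)."""
    dist = NormalDist(mu, sigma)
    return [dist.inv_cdf((i + 0.5) / n) for i in range(n)]


# Hypothetical groups: same spread, LD mean set 1 SD below the LA mean.
la = synthetic_scores(100, 15, 50)
ld = synthetic_scores(85, 15, 50)

overlap = range_overlap(la, ld)          # range-based metric
es = effect_size(la, ld)                 # standardized mean difference
differentiated = NormalDist().cdf(es)    # share differentiated under normality

print(f"range-based overlap: {overlap:.0%}")
print(f"effect size: {es:.2f}, differentiated: {differentiated:.0%}")
```

With these made-up inputs, the range-based figure stays near 90% even though the groups differ by about 1 SD, a difference that the ES translates into roughly 84% differentiation. The sketch only shows why a range taken from one group tends to absorb most of the other group; it says nothing about the actual Minnesota data.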
Using the data from the Ysseldyke, Algozzine, Shinn, & McGue (1982) study, Kavale et al. (1994) calculated ESs for 44 comparisons and found an average ES of 0.338. This means that, on average, it would be possible to reliably differentiate 63% of the LD group. Conversely, 37% could not be differentiated, a degree of overlap substantially less than the average 95% reported by Ysseldyke, Algozzine, Shinn, & McGue (1982). For the Woodcock-Johnson Cognitive Ability subtests, an average ES of 0.304 was found, while the Achievement subtests provided an average ES of 0.763. With little reason to expect cognitive (IQ) differences between LD and LA groups, the modest group differentiation was not surprising. On the other hand, almost 8 out of 10 members of the LD group scored at a level that made it possible to discern clear achievement differences when compared with the LA group members. Similar findings emerged with other cognitive and achievement tests. For example, Wechsler Intelligence Scale for Children-Revised (WISC-R) comparisons revealed an average ES of 0.141 (56% level of group differentiation), while PIAT comparisons showed an average ES of 1.14, indicating that in almost 9 out of 10 cases (87%), the LD group performance was substantially below that of the LA group. Consequently, "it appears that the lower achievement scores of the LD group are of a magnitude that distinguishes them from their LA counterparts" (Kavale et al., 1994, pp. 74-75).
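The percentages attached to each ES are consistent with reading the ES as a z-score and taking the area under the standard normal curve below it. The short Python sketch below checks that arithmetic; it assumes normally distributed scores, and the function name `percent_differentiated` and the dictionary labels are illustrative rather than taken from the cited studies.

```python
from statistics import NormalDist


def percent_differentiated(es):
    """Percentage of one group differentiated from the other, assuming
    normal distributions and treating the effect size as a z-score."""
    return 100 * NormalDist().cdf(es)


# Average effect sizes as reported in the text above.
reported = {
    "All 44 comparisons": 0.338,
    "Woodcock-Johnson Cognitive Ability": 0.304,
    "Woodcock-Johnson Achievement": 0.763,
    "WISC-R": 0.141,
    "PIAT": 1.14,
}

for label, es in reported.items():
    pct = percent_differentiated(es)
    print(f"{label}: ES = {es:.3f}, differentiated = {pct:.0f}%, overlap = {100 - pct:.0f}%")
```

Under this reading, an ES of 0.338 corresponds to about 63% differentiation and 37% overlap, matching the figures quoted from Kavale et al. (1994).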
Algozzine, Ysseldyke, and McGue (1995) contested the meta-analytic findings but agreed that students with LD may be the lowest of the low achievers. They suggested that the difficulty was in interpreting the meaning of that status: "Where we part company is in the inference that because students with LD may be the lowest of a school's low achievers, they necessarily represent a group of people with qualitatively different needs who require qualitatively different instruction" (pp. 143-144). What Algozzine et al. (1995) failed to consider, however, were the findings showing minimal group differentiation in the cognitive domain. With essentially no difference in ability but large differences in achievement, the LD group demonstrated "significant discrepancy" that was not shown by the LA group. Consequently, Kavale (1995) suggested that the LD and LA groups "represent two distinct populations. Because the LD group are lower on achievement dimensions but not on ability, they are, in addition to being the lowest of the low achievers, a different population defined by an ability-achievement distinction represented in a different achievement distribution but not in a different ability distribution" (p. 146).