Classification of Learning Disabilities: An Evidence-Based Evaluation

Jack M. Fletcher, University of Texas; G. Reid Lyon, National Institutes of Health; Marcia Barnes, University of Toronto; Karla K. Stuebing, University of Texas; David J. Francis, University of Houston; Richard K. Olson, University of Colorado; Sally E. Shaywitz, Bennett A. Shaywitz, Yale University
Learning Disabilities Summit: Building a Foundation for the Future White Papers

DISCREPANCY HYPOTHESIS

The IQ-achievement discrepancy criterion is the most controversial and best-studied component of the federal definition of LD. From a classification perspective, it is a hypothesis that children with poor achievement below a level predicted by an IQ score (IQ discrepant) are different from children with poor achievement consistent with their IQ score (low achievement). IQ-discrepant children with LD have been proposed to differ from low achievers who are not IQ discrepant on several dimensions, including neurological integrity, cognitive characteristics, response to intervention, prognosis, gender, and the heritability of LD (Fletcher et al., 1998; Rutter, 1989; Siegel, 1992; Stanovich, 1991). There is an extensive body of research that can be used to evaluate this hypothesis. Although virtually all of the published studies involve reading disabilities (RD), we address LD in other domains later in this paper.

Isle of Wight Studies

The IQ-discrepancy classification hypothesis is not without support. The earliest empirical evidence validating IQ discrepancy came from the Isle of Wight studies in the early 1970s (Rutter & Yule, 1975). In this epidemiological study of RD, Rutter and Yule (1975) administered the Performance IQ Scale of the Wechsler Intelligence Scale for Children (WISC) and measures of reading. They defined two groups using a regression-adjusted definition: specific reading retardation, representing children with reading scores two standard errors below IQ, and general reading backwardness, representing children with reading scores that were deficient, but within two standard errors of IQ. In examining the distribution of residualized scores, they found an over-representation of children with general reading backwardness in the lower tail of the distribution of reading scores, representing a "hump." They also found evidence suggesting that the two groups of poor readers could be differentiated, thus accepting the existence of a group of children with specific RD:

Reading retardation is shown to differ significantly from reading backwardness in terms of sex ratio, neurological disorder, pattern of neurodevelopmental deficits and educational prognosis. It is concluded that the concept of specific reading retardation is valid. (p. 195)

Is There A Bimodal Distribution?

The Isle of Wight studies were widely accepted because they seemed to support the IQ-discrepancy hypothesis. Since that time, more critical evaluation of this support has become necessary. Although methodological factors involving inadequate ceilings on the reading measures have been cited (van der Wissel & Zegers, 1985), the critical issue centers around the interest of Rutter and Yule (1975) in the question of whether specific forms of RD could be distinguished from reading failure attributable to all other causes. Given this hypothesis, no exclusionary criteria were applied and approximately 36% of the children in the group defined as backwards readers had known or suspected evidence of a neurological disorder; many also had IQ scores in the ranges associated with mental deficiency. At the time, Rutter and Yule (1975) wrote that "it could be argued that the association with general reading backwardness was to be expected on the grounds of the below average intelligence of that group of children" (p. 189). It is well known that the distribution of IQ scores in a population is bimodal when individuals are included who have sustained injury to the central nervous system (Robinson, Zigler, & Gallagher, 2000). Not surprisingly, epidemiological studies in Australia (Jorm, Share, Matthews, & Matthews, 1986), New Zealand (Silva, McGee, & Williams, 1985), Great Britain (Rodgers, 1983; Stevenson, 1988), and the United States (Shaywitz, Escobar, Shaywitz, Fletcher, & Makuch, 1992) that either excluded or had fewer children with brain injury have largely failed to replicate the Rutter and Yule (1975) finding of a bimodal distribution. The original finding can be attributed to the prevalence of neurologically impaired children on the Isle of Wight, many with mental deficiency (Fletcher et al., 1998).

Can IQ-Discrepant and Low Achieving Poor Readers Be Differentiated?

Rutter (1989) observed that the critical test of the classification hypothesis does not depend on the presence of a bimodal distribution. Rather, the question is whether differences can be found that meaningfully differentiate IQ-discrepant and low achieving groups, which is a classification hypothesis. More recent studies of the validity of this hypothetical two-group classification, reviewed by Aaron (1997), Fletcher et al. (1993), Fletcher et al. (1998), Siegel (1992), and Stanovich (1991), have provided mixed evidence for the validity of the two-group classification. Many comparisons yielded null results, whereas others demonstrated small but statistically significant differences between the two groups.

When the studies are examined, they can be broken into domains involving prognosis, response to intervention, neurobiological factors, behavioral characteristics, achievement, and cognitive correlates. The bulk of the studies involve the behavioral, achievement, and cognitive domains, which are addressed in three meta-analyses summarized below. There is also research examining prognosis, response to intervention, and neurobiological factors. All six domains can be examined as evaluations of the validity of a two-group classification of poor readers based on presence or absence of IQ discrepancy.

Prognosis

Rutter and Yule (1975) reported that children who were backwards readers (i.e., low achieving) actually showed more rapid development of academic skills than children who were reading retarded (i.e., IQ discrepant). As the reading and spelling skills of the backwards readers were lower at baseline, and children were not randomly assigned to the two groups, the greater advances may reflect regression to the mean. Francis, Shaywitz, et al. (1996) examined this question using data from the Grade 9 follow-up of children in the epidemiological, population-based Connecticut Longitudinal Project. In this project, reading skills were assessed yearly beginning in Grade 1. The population is now being followed as adults.

Francis, Shaywitz, et al. (1996) composed three groups of children based on Grade 3 WISC-R full scale IQ and reading tests: not reading impaired (NRI), IQ discrepant using a 1.5 standard error regression-based criterion, and low achieving (not discrepant, but reading below the 25th percentile). Comparisons of the reading development of the three groups on the composite score from the Woodcock-Johnson Psycho-Educational Test Battery (Woodcock & Johnson, 1979) showed no differences between the two groups with RD in the rate of growth over time or the level of reading ability at any age despite the fact that about half the children in the IQ-discrepant group received special education services. As expected, both groups of poor readers differed significantly from the NRI group in growth rate and reading ability at all ages.
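The three-group rule described above can be sketched in a few lines. This is a minimal illustration, not the procedure used in the Connecticut analyses: it assumes both IQ and reading are standard scores (mean 100, SD 15), assumes an IQ-reading correlation of 0.6, and uses a hypothetical `classify` helper.

```python
from statistics import NormalDist

MEAN, SD = 100.0, 15.0   # assumed standard-score scale for both tests
R_IQ_READ = 0.6          # assumed IQ-reading correlation
SE_CUTOFF = 1.5          # discrepancy criterion, in standard errors of estimate
LOW_ACH_PCTL = 0.25      # low-achievement cutoff (25th percentile)


def classify(iq: float, reading: float) -> str:
    """Return 'IQ discrepant', 'low achieving', or 'NRI' for one child."""
    # Reading score predicted from IQ by the regression line.
    predicted = MEAN + R_IQ_READ * (iq - MEAN)
    # Standard error of estimate around that line.
    se_estimate = SD * (1.0 - R_IQ_READ ** 2) ** 0.5
    # Reading score that marks the 25th percentile of the population.
    low_cut = NormalDist(MEAN, SD).inv_cdf(LOW_ACH_PCTL)

    if reading < predicted - SE_CUTOFF * se_estimate:
        return "IQ discrepant"
    if reading < low_cut:
        return "low achieving"   # poor reader, but not discrepant
    return "NRI"                 # not reading impaired


if __name__ == "__main__":
    # Hypothetical (IQ, reading) pairs, chosen only to exercise each branch.
    for iq, reading in [(115, 88), (85, 84), (100, 98)]:
        print(iq, reading, classify(iq, reading))
```

Note that the discrepancy check comes first, so a child who is both discrepant and below the 25th percentile is counted as IQ discrepant, matching the definition of the low achieving group as "not discrepant".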

In Figure 1, these comparisons are carried through Grade 12. Again, there are clearly no differences in growth rates or level of reading ability at any age despite an 18-point difference in IQ between the two groups of poor readers. There was also no evidence that the poor readers narrowed the gap. More than 70% of those who read poorly in Grade 3 read poorly in Grade 12, showing that without intervention, LD in reading is a chronic, lifelong condition. These findings parallel those of Share, McGee, and Silva (1989), who reported results from another large longitudinal study in New Zealand. They found that IQ was not related to reading achievement within age bands (7, 9, 11, 13 years), nor did IQ predict change over time. Share et al. (1989) concluded, "It might be timely to formulate a concept of reading disability that is independent of IQ. Unless it can be shown to have some predictive value for the nature of treatment or treatment outcomes, considerations of IQ should be discarded in discussions of reading difficulties" (p. 99).

Figure 1. Growth in reading skills by children from 6-18 years of age (Grades 1-12) in the Connecticut Longitudinal Study based on the reading cluster of the Woodcock-Johnson Psycho-Educational Test Battery. The children were identified at 8 years of age (Grade 3) as not reading impaired (NRI), reading disabled according to a 1.5 standard error discrepancy between IQ and reading achievement (RDD), or low reading achievement with no discrepancy (25th percentile; low achieving). The figure shows that growth in the two groups with reading disability is similar (the growth curves are indistinguishable); that neither catches up to the NRI group; and that the differences between the NRI group and the two groups with reading disability are apparent well before Grade 3.

Figure 1

Response to intervention

In turning to treatment, several studies examined outcomes in relationship to different indices of IQ or IQ discrepancy. Aaron (1997) reviewed earlier studies that sometimes included comparisons of groups defined as LD and low achieving, observing that both groups made little progress in their reading development, even with remedial placements. More recent studies explicitly examine this hypothesis in remedial or prevention efforts. In a remedial study of children with poor reading skills in Grades 2-5, Wise, Ring, and Olson (1999) assessed the relationship of full scale IQ to response to different approaches to intervention. They found that full scale IQ predicted about 5% of the variance in word reading outcomes on one measure of word reading, but that this effect was not apparent on other measures of word reading or assessments of phonological processing ability at the end of intervention. Similarly, Hatcher and Hulme (1999) found no relationship between IQ and reading outcomes involving word recognition.

Studies that have attempted to prevent RD in kindergarten and Grade 1 have also found no relationships of reading outcomes with full scale IQ or verbal IQ (Foorman, Francis, Beeler, Winikates, & Fletcher, 1997; Foorman, Francis, Fletcher, Schatschneider, & Mehta, 1998; Torgesen et al., 1999; Vellutino, Scanlon, & Lyon, 2000). Foorman et al. (Foorman, Francis, Beeler, et al., 1997; Foorman, Francis, Fletcher, et al., 1998) and Torgesen et al. (1999) examined relationships of reading intervention outcomes and general verbal ability, while Vellutino et al. (2000) looked both at levels of IQ and IQ discrepancy based on full scale IQ. In Vellutino et al. (2000), IQ-discrepancy scores were computed and compared among a variety of subgroups formed on the basis of reading gains, response to intervention, and other indices. They concluded that "...the IQ-achievement discrepancy does not reliably distinguish between disabled and non-disabled readers ... Neither does it distinguish between children who were found to be difficult to remediate and those who are readily remediated, prior to initiation of remediation, and it does not predict response to remediation" (p. 235). These findings are especially important in showing that IQ discrepancy is not specifically associated with those who respond to intervention.

In all the above studies, measures of phonological awareness skills were robust predictors of response to intervention. Some of these studies found that levels of IQ predicted growth in reading comprehension ability (Hatcher & Hulme, 1999; Torgesen et al., 1999; Wise et al., 1999), but consider what IQ tests actually assess. The subtests that make up a verbal IQ scale are commonly found to represent a general verbal comprehension skill closely related to vocabulary (Fletcher et al., 1996; Sattler, 1993; Share et al., 1989, 1991). As such, it is not surprising that IQ would predict reading comprehension as vocabulary is an essential part of IQ and a strong predictor of reading comprehension skills (Adams, 1990). Indeed, if IQ tests included measures of phonological awareness, it is likely that such measures would predict response to intervention. Inclusion of such subtests would also virtually eliminate the possibility that children with RD could ever be IQ discrepant given the close linking of phonological awareness skills and RD. Altogether, the results do not provide much support for differences in response to intervention between children defined as IQ-discrepant and low achieving poor readers.

Neurobiological factors

A series of studies from a group of researchers at the University of Colorado has been completed on the heritability of RD that addresses the validity of the IQ-discrepancy hypothesis. Pennington, Gilger, Olson, and DeFries (1992) classified a large population of monozygotic and dizygotic twins in which at least one member was classified with RD and a set of control twins in which neither was RD into one of four groups: RD based on IQ discrepancy, RD based on low achievement, RD based on both IQ discrepancy and low achievement, and those not classified as RD. Comparisons were made in three domains involving (a) genetic etiology, (b) gender ratios and clinical correlates, and (c) neuropsychological profiles. The researchers reported no evidence for differential genetic etiology based on type of definition. They also did not find evidence for significant differences in gender ratios, clinical correlates, and neuropsychological profiles.

More recent studies from this group have specifically tested the hypothesis that the genetic etiology of RD may vary by virtue of either IQ discrepancy or level of IQ. In a series of studies summarized by Wadsworth, Olson, Pennington, and DeFries (2000), genetic factors were more related to RD in children who have higher IQ scores than those with lower IQ scores. In Wadsworth et al. (2000), the overall heritability of reading disability was 0.58. Separating children defined as RD with full scale IQ scores above or below 100 resulted in heritability estimates of 0.43 for the lower IQ group and 0.72 for the higher IQ group, a statistically significant difference. These results indicate that environmental influences are particularly salient as a cause of reading difficulties in children with lower IQ scores.

These differences in heritability, while statistically and practically significant, are relatively small. Several earlier studies of the cohort with smaller samples yielded differences that did not reach statistical significance. Wadsworth et al. (2000) required almost 400 pairs of twins in order to detect the difference. It is not accurate to suggest that, because of these differences, classifications based on IQ discrepancy have value for components of LD other than the etiology of RD. As the researchers noted, the relatively high IQ of children with RD could be related to a more intractable genetically-based reading failure despite strong environmental support for IQ and for learning to read, whereas those children with RD who have relatively lower IQ scores may have more pervasive deficiencies in cognitive development and reading that reflect broader environmental disadvantages. For example, children in the lower IQ group in Wadsworth et al. (2000) had homes where there were fewer books and where mothers had fewer years of education. The researchers argued against excluding lower IQ children from intervention or remediation because they did not meet an IQ-discrepant definition, noting that the greater impact of environmental influences on RD in this group indicates the need for emphasizing environmental intervention. Unfortunately, the traditional use of IQ and achievement criteria for LD in determining access to services has exactly the opposite effect.

There are also studies of children with RD that use functional imaging methods, such as functional magnetic resonance imaging (fMRI), which are reviewed in detail in the section on constitutional factors. While no study has a sample that is sufficiently large to actually compare IQ-discrepant and low achieving poor readers, it is noteworthy that no studies include only those children with IQ discrepancy. There is no evidence from these studies that children who meet IQ-discrepancy and low achieving definitions of RD have different neuroimaging profiles.

Meta-analyses of behavior, achievement, and cognitive ability domains

There are three meta-analyses that address the validity of IQ-discrepancy classifications for children with RD in the behavior, achievement, and cognitive ability domains and that constitute the bulk of studies of the IQ-discrepancy classification (Fuchs, Fuchs, Mathes, & Lipsey, 2000a; Hoskyn & Swanson, 2000; Stuebing et al., in press). The three studies were completely independent, but addressed slightly different questions. Fuchs et al. focused on the question of whether "the reading performance of underachieving children with and without the learning disabilities label is the same or different" (Fuchs, Fuchs, Mathes, Lipsey, & Eaton, 2000b, p. 2). To address this question, they identified and coded 76 studies that evaluated reading skills in children who were poor readers with and without the LD label. Fuchs et al. (2000a, b) reported a large effect size (0.76) showing poorer reading by groups with the label of LD in reading (presumably IQ-discrepant) relative to groups presumed to be poor readers without the LD label.

Hoskyn and Swanson (2000) coded 19 studies that met stringent IQ and achievement criteria. They focused specifically on studies where cognitive skills were compared in groups formed of those with higher IQ and poor reading achievement (IQ-discrepant) versus those with both lower IQ and poor reading achievement. They found negligible to small differences on several measures of reading and phonological processing (range = -0.02 to 0.29), but larger differences on measures of vocabulary (0.55) and syntax (0.87). The groups were more similar than different, leading them to conclude that "... our synthesis concurs with several individual studies indicating that the discrepancy ... is not an important predictor of cognitive differences between low achieving children and children with RD" (p. 117).

Stuebing et al. (in press) explicitly addressed the validity of the IQ-discrepancy classification hypothesis for RD in behavior, achievement, and cognitive domains. They reported on 46 studies that compared groups composed of poor readers who met explicit criteria for IQ discrepancy and low achievement. In the latter study, simply possessing the label of LD was not adequate, but some specification of the criteria used to designate children as IQ discrepant or low achieving was required. Fuchs et al. required the label of LD, with a presumption of IQ discrepancy, and some type of often unevaluated comparison group that presumably represented non-LD low achievers (e.g., placement in compensatory education). In contrast, Stuebing et al. required discrepancy criteria for the LD group and an indication that the low achieving group did not include individuals who might be IQ discrepant or typically achieving readers. These criteria were more liberal than Hoskyn and Swanson, but captured most of the 19 studies included in their meta-analysis.

Stuebing et al. (in press) found negligible aggregated effects for behavior (-0.05) and achievement (-0.12). A small effect size was found for cognitive ability (0.30). The effect sizes for the behavioral domain were homogeneous, but heterogeneity was apparent for the achievement and cognitive ability domains. When the heterogeneity was evaluated by examining the specific tasks within the achievement domain, those that involved word recognition, oral reading, and spelling showed small effect sizes indicating poorer performance by the IQ-discrepant groups. Tasks involving reading comprehension, math, and writing yielded negligible effect sizes. The small effect sizes for the former measures may reflect their similarity to the types of tasks used to measure poor reading in many studies. Similarly, constructs under cognitive ability closely related to reading yielded negligible effect sizes: phonological awareness (-0.13), rapid naming (-0.12), memory (0.10), and vocabulary (0.10). Not surprisingly, measures of IQ not used to define the groups yielded large effect size differences, while measures of cognitive skills like those measured by IQ tests (spatial cognition, concept formation) yielded small to medium effect sizes, the direction of both showing better performance by the IQ-discrepant group. Even with the inclusion of these measures of cognitive ability, the difference was only about three tenths of a standard deviation. Other analyses demonstrated (a) substantial overlap between the groups, and (b) that the size of the effects in different studies could be predicted by knowing the scores on the IQ and reading tasks used to define the groups (i.e., sampling variation across studies) and the correlation of these variables with the tasks used to compare the two groups. Stuebing et al. concluded that classifications of LD based on IQ discrepancy had at best weak validity.

The results of these three studies are quite consistent despite the differences in the research questions and the criteria for selecting studies. The most important difference was that unlike Stuebing et al. (in press), the other two meta-analyses did not differentiate IQ and achievement variables used to form the groups from those that served as dependent variables. It would be expected that variables used to define the groups would generate large effect sizes as IQ-discrepancy definitions select the poorest readers at each level of IQ (see Psychometric Issues below). To illustrate, Fuchs et al. (2000a, b) evaluated two constructs outside the reading domain that were not incorporated in the aggregated effect size estimate. The constructs yielded effect sizes consistent with Hoskyn and Swanson (2000) and Stuebing et al. (in press): 0.10 for phonological awareness and 0.26 for rapid naming. When measures of reading used to form groups were examined in Stuebing et al., a moderate effect size in reading showing poorer performance in children with IQ discrepancy was apparent. Altogether, these meta-analyses do not provide strong support for the validity of classifications based on IQ discrepancy.

Other Forms of LD and the IQ-Discrepancy Hypothesis

Discrepancy hypotheses have not received strong support in studies of RD, but LD is more than just RD. In this section, we review research on math disabilities, speech and language disorders, and psychometric issues relevant to any formulation of LD.

Specific math disability

As part of the Yale Center for Learning and Attention Disorders, Shaywitz (1996) evaluated the two-group classification hypothesis for computational disorders in math. The nature of these types of math disabilities (MD) is discussed below in the section on the heterogeneity hypothesis. Here we simply compare children who meet a 1.5 standard error IQ-discrepancy definition of MD with those who achieve below the 25th percentile, but whose math score on the Woodcock-Johnson Calculations subtest (Woodcock & Johnson, 1979) is within 1.5 standard errors of what would be predicted based on their full scale WISC-R score (Wechsler, 1974). These children do not meet criteria for RD using either IQ-discrepancy or low achieving criteria. They differ in full scale IQ (IQ-discrepant M = 107, SD = 12; low achieving M = 96, SD = 9) and in math calculations (IQ-discrepant M = 78, SD = 10; low achieving M = 85, SD = 4). The nature and direction of the differences are exactly what would be expected given the properties of IQ-discrepancy definitions, where at each level of IQ the lowest performing children are identified into the IQ-discrepant group. Note also the reduction in the standard deviation relative to the population SD of 15, which is a product of subdividing a continuous distribution (Cohen, 1983).
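The shrinkage in standard deviation that follows from subdividing a continuous distribution can be illustrated with simulated scores. The sketch below is not based on the Yale data: it assumes a normal population with mean 100 and SD 15, selects the cases below the 25th percentile, and reports the SD within that slice. The function name `restricted_sd` is hypothetical.

```python
import random
from statistics import NormalDist, pstdev

MEAN, SD = 100.0, 15.0   # assumed population scale


def restricted_sd(n: int = 50_000, seed: int = 0) -> float:
    """SD of simulated scores that fall below the 25th percentile."""
    rng = random.Random(seed)
    cut = NormalDist(MEAN, SD).inv_cdf(0.25)           # 25th percentile score
    scores = [rng.gauss(MEAN, SD) for _ in range(n)]   # simulated population
    selected = [s for s in scores if s < cut]          # the "low" subgroup
    return pstdev(selected)


if __name__ == "__main__":
    print("Population SD:", SD)
    print("SD within the selected subgroup:", round(restricted_sd(), 2))
```

Under these assumptions the subgroup SD comes out well below 15 even though nothing distinguishes the selected cases except the cutoff itself.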

Figure 2 shows a comparison of these two groups of children on a set of cognitive variables involving attention, language, problem solving, concept formation, and visual-motor skills. As Figure 2 shows, the IQ-discrepant group has higher performance levels on all variables. Note that neither group shows the severe impairment in phonological awareness associated with RD (see Figure 3 below). The group that is low achieving in math is noticeably poorer in vocabulary despite average reading skills. The critical issue, as for RD, is not that the groups differ; such differences in level of performance are expected because IQ tests are used to define the groups, and IQ is moderately to highly correlated with each of the measures (e.g., vocabulary) used to evaluate the children. Rather, the question is whether the pattern of differences separates the groups, implying that the correlates of math achievement differentiate the groups. Testing the profiles for differences in shape did not yield a statistically significant difference and the effect size was negligible (0.06). As we have shown in the reading area (Fletcher et al., 1998), eliminating variability due to the difference in vocabulary eliminates the differences in level of performance apparent in Figure 2. The differences in Figure 2 are a product of the definitions, and the correlates of poor math achievement do not appear to differ once the differences induced by the definition are taken into account.

Figure 2

Comorbid reading and math disability

Figure 3 compares IQ-discrepant and low achieving children with RD and MD on the same variables as Figure 2. In the upper panel, children with RD and no MD are depicted for contrast purposes, while the lower panel shows children with both RD and MD. In both panels, the striking impairment in phonological awareness is apparent. Note also the dip in vocabulary skills that characterizes both the low achieving groups. In the low achieving group that has only RD, the performance level in vocabulary is comparable with that of the low achieving MD group in Figure 2. Vocabulary is lowest in the low achieving RD-MD group. Again, these patterns reflect in part the relationship of IQ and vocabulary as opposed to specific associations with either RD or MD. The comorbid RD-MD group is more impaired in language skills, but also shows impairment on some of the same measures as the group that is only MD.

Figure 3

Speech and language disorders

Disorders of oral expression and listening comprehension are included under the LD category, though speech and language disorders are also a separate category in special education under IDEA. Epidemiological studies directed by Bruce Tomblin have explored the validity of IQ-discrepancy definitions in children who have disorders of expressive and receptive language. These comparisons have not supported the validity of IQ-discrepancy hypotheses for children with oral language disorders.

To illustrate, Tomblin and Zhang (1999) used measures of nonverbal IQ and oral language ability to create three groups of children from their large epidemiological study: not impaired, specific language impairment (IQ > 87 and composite language skills < 1.25 standard deviations below age), and general delay (IQ ≤ 87 and composite language skills < 1.25 standard deviations below age). Comparisons of the three groups on a variety of expressive and receptive language measures showed that the two language-impaired groups differed on multiple dimensions from the non-impaired group. Differences between the two language-impaired groups were less robust: "children with general delay closely parallel the specifically language-impaired group except that the children with general delay were more impaired and noticeably poorer on the test involving comprehension of sentences (grammatical understanding)" (p. 367). The investigators go on to question whether even this latter difference in grammatical understanding is specific to either group, noting, "current diagnostic methods and standards for specific language impairment do not result in a group of children whose profiles of language achievement are unique." A consensus group convened by the National Institute on Deafness and Other Communication Disorders reached a similar conclusion (Tager-Flusberg & Cooper, 1999).

Psychometric Issues

Although we could continue a research program to evaluate the IQ-discrepancy hypothesis across multiple permutations, psychometric factors make it unlikely that any form of discrepancy can be effectively used. These factors raise questions about the viability of any approach to LD identification based solely on the use of test scores and cut-off points. Whereas to this point we have addressed the validity of LD classification, psychometric factors raise questions about the reliability of LD classifications.

Figure 4. Bivariate distribution of simulated IQ and achievement measures with a mean of 100, standard deviation of 15, and correlation of 0.6. Cutoffs depicting a 1.5 standard error discrepancy and low achievement (< 25th percentile) are drawn. Four segments are apparent: not reading impaired, only low achievement, only IQ discrepant, and both low achievement and IQ discrepancy.

Figure 4

In Figures 4-6, we examine what happens when groups are formed using IQ-discrepancy definitions in simulated data constructed to follow the bivariate normal distribution with no true group structure. It is apparent that the instability in these "simulated groups" parallels the instability seen in true groups (Shaywitz et al., 1992), raising doubts about the validity of the "true groups" formed by IQ-discrepancy rules. Consider Figure 4, which plots the bivariate distribution of simulated ability and achievement measures with a mean of 100, standard deviation of 15, and correlation of 0.6, consistent with population estimates (Sattler, 1993). Figure 4 also shows the groups that emerge when a 1.5 standard error regression definition like that employed in Connecticut (see Figure 1) is used, along with an arbitrary cutoff for low achieving at the 25th percentile. In Figure 4, the groups are sharply demarcated, with no overlap in group membership. Note that many data points are below the low achieving cutoff, but are not IQ discrepant. Another subgroup is below both the IQ-discrepancy and low achieving cutoffs. A few children are above the low achieving cutoff but below the IQ-discrepancy cutoff.
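A simulation of this kind can be sketched as follows. The code is an illustration under the stated parameters (mean 100, SD 15, correlation 0.6, 1.5 standard error regression cutoff, 25th percentile low achievement cutoff), not the authors' original program; the function names, sample size, and random seed are assumptions.

```python
import random
from collections import Counter
from statistics import NormalDist

MEAN, SD, R = 100.0, 15.0, 0.6
SE_EST = SD * (1 - R ** 2) ** 0.5                  # standard error of estimate
LOW_CUT = NormalDist(MEAN, SD).inv_cdf(0.25)       # 25th percentile score


def simulate(n: int = 10_000, seed: int = 1):
    """Draw (IQ, achievement) pairs from one bivariate normal population."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        iq = MEAN + SD * z1
        # Mixing z1 and z2 this way gives a correlation of R with IQ.
        ach = MEAN + SD * (R * z1 + (1 - R ** 2) ** 0.5 * z2)
        pairs.append((iq, ach))
    return pairs


def segment(iq: float, ach: float) -> str:
    """Assign a simulated case to one of the four segments."""
    discrepant = ach < MEAN + R * (iq - MEAN) - 1.5 * SE_EST
    low = ach < LOW_CUT
    if discrepant and low:
        return "both"
    if discrepant:
        return "IQ discrepant only"
    if low:
        return "low achieving only"
    return "NRI"


if __name__ == "__main__":
    counts = Counter(segment(iq, ach) for iq, ach in simulate())
    total = sum(counts.values())
    for name, count in counts.most_common():
        print(f"{name:20s} {count / total:.3f}")
```

Every simulated case comes from a single continuous population, so the four segments are produced entirely by where the two cutoffs are drawn.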

Figure 5. Bivariate distribution of simulated IQ and achievement measures with a mean of 100, standard deviation of 15, and correlation of 0.6. Cutoffs depicting a 1 standard deviation discrepancy and low achievement (< 25th percentile) are drawn. The subject designations are from Figure 4 and show how simulated cases shift across the four segments by virtue of the change in the definition of discrepancy.

Figure 5

Figure 5 shows what happens when a different definition of discrepancy (one standard deviation) is used, analogous to how discrepancy is defined in many states, i.e., discrepancy without adjustment for the correlation of IQ and achievement. The symbols for the group represent their original locations in Figure 4. Note that the IQ-discrepancy cutoff is much steeper; the regression line in Figure 4 is actually slightly curved so that it is steeper at lower levels of IQ and flatter at higher levels of IQ. As a consequence, the unadjusted discrepancy definition identifies fewer children with lower IQs as discrepant (14% become low achieving) and identifies more children with higher IQs as "disabled" (6% of a large NRI group). The arbitrariness of the two discrepancy cutoffs is illustrated by asking what important differences could possibly exist between the 14% of children who change from IQ discrepant to low achieving at lower levels of IQ and the other children who stay in these segments. Similarly, are the 6% of those who become "disabled" in Figure 5 truly impaired in reading? Fletcher et al. (1998) found no evidence supporting this hypothesis.
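The shift between the two definitions can be reproduced in outline with simulated data. The sketch below assumes a straight regression line, a simple-difference rule of IQ minus achievement greater than one standard deviation (15 points), and hypothetical function names; it is not the authors' simulation and will not reproduce their exact percentages.

```python
import random

MEAN, SD, R = 100.0, 15.0, 0.6
SE_EST = SD * (1 - R ** 2) ** 0.5   # standard error of estimate


def simulate(n: int = 20_000, seed: int = 2):
    """Draw (IQ, achievement) pairs with correlation R."""
    rng = random.Random(seed)
    cases = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        iq = MEAN + SD * z1
        ach = MEAN + SD * (R * z1 + (1 - R ** 2) ** 0.5 * z2)
        cases.append((iq, ach))
    return cases


def regression_discrepant(iq: float, ach: float) -> bool:
    """Achievement more than 1.5 standard errors below the IQ-predicted score."""
    return ach < MEAN + R * (iq - MEAN) - 1.5 * SE_EST


def simple_discrepant(iq: float, ach: float) -> bool:
    """Achievement more than one SD below IQ, ignoring the correlation."""
    return iq - ach > SD


def compare(cases):
    """Return the IQs of cases flagged by only one of the two rules."""
    regression_only = [iq for iq, ach in cases
                       if regression_discrepant(iq, ach)
                       and not simple_discrepant(iq, ach)]
    simple_only = [iq for iq, ach in cases
                   if simple_discrepant(iq, ach)
                   and not regression_discrepant(iq, ach)]
    return regression_only, simple_only


if __name__ == "__main__":
    regression_only, simple_only = compare(simulate())
    print("Flagged only by regression rule:", len(regression_only),
          "mean IQ", round(sum(regression_only) / len(regression_only), 1))
    print("Flagged only by simple difference:", len(simple_only),
          "mean IQ", round(sum(simple_only) / len(simple_only), 1))
```

With these particular cutoffs the two boundary lines cross at an IQ of 92.5, so cases flagged only by the regression rule always lie below that IQ and cases flagged only by the simple difference always lie above it. The same children are relabeled purely as a function of their IQ.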

Figure 6. Simulated stability of group designations over time based on high stability (0.9) and reliability (0.8) for IQ and achievement measures. The subject designations are from Figure 4 and demonstrate the high instability associated with psychometric decision rules for identifying LD.

Figure 6

Figure 6 uses simulated data to show what happens to group membership over time. This figure was generated assuming high stability (0.9) in the traits measured and high reliability (0.8) for the measures of both traits. These assumptions mean that the traits vary little from person to person over time (i.e., individual differences are stable), and the traits are well-measured by the specific instruments. Thus, although there may be growth in the traits, growth does not differ much from one person to the other. These conditions should lead to a high degree of stability in classifications. Heterogeneity in growth would lead to instability in both individual differences over time and the classifications.

Figure 6 shows that classifications are not stable over time, despite the generally favorable conditions for stability. The instability is apparent in all four segments of Figure 6. In the group that is both IQ discrepant and low achieving, 38% move to the low achieving segment and another 38% move to the NRI segment. For the segment that is low achieving at Time 1, 14% move to the both IQ-discrepant and low achieving segment and 36% move to the NRI segment. In the Time 1 NRI segment, 3% move to the both IQ-discrepant and low achieving segment, 7% to the low achieving segment, and 1% to the IQ-discrepant-only segment. Finally, 67% of the only-discrepant segment move to the NRI segment.
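A simulation of this kind can be sketched as follows. The generating model is not specified in the text, so the sketch makes several assumptions that should be read as illustrative only: stability (0.9) is treated as the Time 1 to Time 2 correlation of true scores, reliability (0.8) as the proportion of observed variance due to true scores, the true-score correlation between IQ and achievement is set to 0.6 / 0.8 = 0.75 so that observed scores correlate 0.6, and the regression-based cutoffs from Figure 4 are applied at both time points. All names are hypothetical, and the transition percentages it yields need not match those reported for Figure 6.

```python
import math
import random
from collections import Counter
from statistics import NormalDist

MEAN, SD, R_OBS = 100.0, 15.0, 0.6
STABILITY, RELIABILITY = 0.9, 0.8          # values given in the text
R_TRUE = R_OBS / RELIABILITY               # assumed true-score correlation (0.75)
SEE = SD * math.sqrt(1 - R_OBS ** 2)
LOW_CUT = NormalDist(MEAN, SD).inv_cdf(0.25)


def correlated_pair(rng, rho):
    """Two standard normal draws with correlation rho."""
    a = rng.gauss(0, 1)
    return a, rho * a + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)


def observe(rng, true_z):
    """Add measurement error so the observed score has the assumed reliability."""
    z = math.sqrt(RELIABILITY) * true_z + math.sqrt(1 - RELIABILITY) * rng.gauss(0, 1)
    return MEAN + SD * z


def classify(iq, ach, k=1.5):
    """Same four segments as the regression-based rule in Figure 4."""
    discrepant = ach < MEAN + R_OBS * (iq - MEAN) - k * SEE
    low = ach < LOW_CUT
    if discrepant and low:
        return "both"
    if discrepant:
        return "discrepant_only"
    return "low_only" if low else "nri"


def simulate_transitions(n=10_000, seed=0):
    """Count (Time 1 segment, Time 2 segment) pairs across simulated cases."""
    rng = random.Random(seed)
    change = math.sqrt(1 - STABILITY ** 2)  # weight of new true-score variation
    transitions = Counter()
    for _ in range(n):
        iq1, ach1 = correlated_pair(rng, R_TRUE)    # Time 1 true scores
        d_iq, d_ach = correlated_pair(rng, R_TRUE)  # correlated change (assumption)
        iq2 = STABILITY * iq1 + change * d_iq       # Time 2 true scores
        ach2 = STABILITY * ach1 + change * d_ach
        t1 = classify(observe(rng, iq1), observe(rng, ach1))
        t2 = classify(observe(rng, iq2), observe(rng, ach2))
        transitions[(t1, t2)] += 1
    return transitions


if __name__ == "__main__":
    table = simulate_transitions()
    totals = Counter()
    for (t1, _), count in table.items():
        totals[t1] += count
    for (t1, t2), count in sorted(table.items()):
        print(f"{t1:16s}-> {t2:16s}{count / totals[t1]:6.1%}")
```

Each printed row gives the share of a Time 1 segment that lands in a given Time 2 segment, which is the same kind of summary reported in the paragraph above.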

The lack of stability is also apparent when IQ and achievement scores are modeled from the Connecticut Longitudinal Study (Shaywitz et al., 1992). If IQ discrepancy and low achievement formed distinct and valid groupings, then one would expect stability in classifications over time, or at least instability that does not parallel the instability found in arbitrary classifications in simulated data. Instead, the IQ-discrepancy and low achieving classifications show instability that parallels the instability of arbitrary classifications in simulated data. This suggests that the IQ-discrepancy and low achieving distinctions are similarly arbitrary classifications, formed within a bivariate normal space, whose properties are largely driven by the psychometric characteristics of this space rather than by any inherent characteristics of the groups being formed.

Conclusions: Discrepancy Hypothesis

Concerns about the validity of the IQ-discrepancy classification hypothesis have led some to essentially reject the concept of LD (Ysseldyke, Algozzine, Shinn, & McGue, 1982), leading to fierce disagreements on whether LD and low achieving groups differ, all in defense of the concept of LD, not the validity of a hypothetical classification (Algozzine, Ysseldyke, & McGue, 1995; Kavale, 1995; Kavale, Fuchs, & Scruggs, 1994). The question is not so much whether children defined as IQ discrepant and low achieving are different, but how much they differ and whether the differences are meaningful for research and practice. The evidence reviewed above for prognosis, response to intervention, neurobiological factors, behavior, achievement, and cognitive abilities suggests that the IQ-discrepancy classification hypothesis lacks strong evidence for external validity. The psychometric evidence shows that the classification has problems with reliability. The criteria derived from the two-group classification produce groups of underachievers that overlap substantially; the differences that emerge are not strongly related to academic performance or to treatment and prognosis. Differences in behavior and cognitive abilities independent of the criteria used to form the groups are negligible. There is evidence for differences in the heritability of RD, but the differences are small and difficult to attribute solely to genetic factors, and there is little evidence supporting the need to single out the IQ-discrepant group. There is no evidence from neuroimaging studies of a need to differentiate the groups; such studies routinely combine IQ-discrepant and low achieving children. Thus, consistent with the call of many researchers, the viability of the IQ-discrepancy classification hypothesis must be questioned.
