Chapter 8: Test-Retest Reliability

Test-retest reliability is computed using the correlation between scores obtained on two occasions over a specified period of time for the same youth by the same rater. Measures with stable scores are expected to have high correlations, indicating little change in scores from one administration to another. The test-retest reliability of the Conners 4 was assessed by computing the correlation of T-scores obtained on two separate administrations over a 2- to 4-week interval (14 to 30 days) within a subset of youth from the general population portion of the Normative Sample (N = 81 for Parent, N = 61 for Teacher, and N = 68 for Self-Report). In Appendix F, see Table F.1 for demographic characteristics of the youth being rated and Table F.2 for demographic characteristics of the parent and teacher raters.
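As an illustration of the computation described above, the following minimal sketch calculates a Pearson correlation between paired Time 1 and Time 2 T-scores. The function name `pearson_r` and the six pairs of scores are hypothetical and are not drawn from the Conners 4 test-retest samples.

```python
import math


def pearson_r(time1, time2):
    """Pearson correlation between paired Time 1 and Time 2 scores."""
    if len(time1) != len(time2) or len(time1) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    n = len(time1)
    mean1 = sum(time1) / n
    mean2 = sum(time2) / n
    # Sum of cross-products and sums of squares of the deviations.
    cross = sum((a - mean1) * (b - mean2) for a, b in zip(time1, time2))
    ss1 = sum((a - mean1) ** 2 for a in time1)
    ss2 = sum((b - mean2) ** 2 for b in time2)
    if ss1 == 0 or ss2 == 0:
        raise ValueError("correlation is undefined when scores do not vary")
    return cross / math.sqrt(ss1 * ss2)


# Hypothetical T-scores for six youth rated twice by the same rater.
time1_scores = [45, 52, 61, 48, 70, 55]
time2_scores = [44, 50, 63, 47, 68, 57]
r = pearson_r(time1_scores, time2_scores)
print(f"test-retest r = {r:.2f}")
```

Because each youth's two scores are paired by position, the two lists must be kept in the same order.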

Correlation coefficients provide us with a statistical measure of the degree of association between two variables. The reliability coefficients are Pearson’s correlations, ranging from -1 to 1, with higher values indicating greater consistency or agreement between ratings. Although there are several approaches to interpretation, the correlation coefficients are categorized herein as follows: absolute values lower than .20 are classified as very weak; values of .20 to .39 are considered weak; values of .40 to .59 are moderate; values of .60 to .79 are strong; and absolute values greater than or equal to .80 are very strong (Evans, 1996).
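The interpretive categories listed above can be expressed as a small lookup. This is a sketch only; the function name `classify_correlation` is hypothetical, and the cutoffs are those cited from Evans (1996) in the preceding paragraph.

```python
def classify_correlation(r):
    """Return the descriptive label for a correlation, based on |r|."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    magnitude = abs(r)
    if magnitude < 0.20:
        return "very weak"
    if magnitude < 0.40:
        return "weak"
    if magnitude < 0.60:
        return "moderate"
    if magnitude < 0.80:
        return "strong"
    return "very strong"


# Example: two corrected correlations reported for the Parent form.
for value in (0.83, 0.99):
    print(value, classify_correlation(value))
```

Coefficients are assumed to be rounded to two decimal places before classification, as they are in the tables.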

The obtained correlations, as well as those corrected for variation (Bryant & Gokhale, 1972), are provided in Tables 8.9 to 8.11. These tables also show the means, medians, and SDs at each time point. Overall, the results provide evidence of excellent test-retest reliability for the Conners 4 scales and indicate that the effect of time across administrations is negligible (i.e., corrected correlations ranged from .83 to .99 for Parent, .81 to .97 for Teacher, and .63 to .86 for Self-Report, all p < .001). As further evidence of score stability over the course of the retest period, mean scores from each time point are closely aligned, as seen in Tables 8.9 to 8.11. The stable nature of the scores, as demonstrated by the test-retest reliability coefficients, provides assurance that changes observed in scores over time are due to a true change in the symptoms or impairments, as opposed to imprecise measurement.

The stability of the Conners 4 scores was further evaluated in the test-retest samples by calculating the difference between each individual’s Time 1 and Time 2 ratings. If a score increased or decreased by 10 or more T-score points (i.e., 1 SD or greater), the change was considered notable. Tables 8.12 to 8.14 present the percentage of the sample with increases and decreases in scores; on every scale, most of the sample showed differences of fewer than 10 points. These tables also present the mean differences, as well as differences in SDs, between ratings from Time 1 to Time 2 (positive differences indicate that scores increased at Time 2, while negative differences indicate that scores decreased at Time 2). The differences in scores from Time 1 to Time 2 were slight for Parent, Teacher, and Self-Report (mean differences ranged from -2.4 to 1.0 points across all forms), indicating consistency in responses across the time interval. Additionally, the differences between the SDs were quite small (ranging from -1.7 to 2.2 across all forms), showing a similar dispersion of scores from Time 1 to Time 2. Taken together, these results (the proportion of the sample with minimal change in their scores, and the marginal mean differences in scores across a specific time interval) provide support for excellent stability of the Conners 4 scale scores across administrations.
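The change analysis described above can be sketched as follows. The function name `summarize_change` and the eight pairs of scores are hypothetical. The use of the sample SD (`statistics.stdev`) is an assumption, because the text does not state which SD formula was applied.

```python
import statistics

NOTABLE_CHANGE = 10  # T-score points, i.e., 1 SD on the T-score metric


def summarize_change(time1, time2, threshold=NOTABLE_CHANGE):
    """Summarize Time 1 to Time 2 change for one scale."""
    if len(time1) != len(time2) or len(time1) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    n = len(time1)
    # Positive difference: the score was higher at Time 2.
    diffs = [b - a for a, b in zip(time1, time2)]
    decreased = sum(d <= -threshold for d in diffs)
    increased = sum(d >= threshold for d in diffs)
    return {
        "pct_time1_higher": 100 * decreased / n,
        "pct_within_1sd": 100 * (n - decreased - increased) / n,
        "pct_time2_higher": 100 * increased / n,
        "mean_difference": statistics.mean(time2) - statistics.mean(time1),
        # Assumption: sample SD; the manual does not specify the formula.
        "sd_difference": statistics.stdev(time2) - statistics.stdev(time1),
    }


# Hypothetical T-scores for eight youth on a single scale.
time1_scores = [50, 62, 45, 58, 70, 48, 55, 41]
time2_scores = [49, 50, 47, 57, 71, 60, 54, 42]
summary = summarize_change(time1_scores, time2_scores)
print(summary)
```

The three percentages correspond to the three "Percentage of Test-Retest Sample" columns in Tables 8.12 to 8.14, and they sum to 100 for each scale.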


Table 8.9. Test-Retest Reliability: Conners 4 Parent

| Scale | Obtained r | Corrected r | Time 1 M | Time 1 Mdn | Time 1 SD | Time 2 M | Time 2 Mdn | Time 2 SD |
|---|---|---|---|---|---|---|---|---|
| Content Scales | | | | | | | | |
| Inattention/Executive Dysfunction | .90 | .95 | 47.1 | 46 | 8.2 | 45.9 | 44 | 8.3 |
| Hyperactivity | .84 | .92 | 48.3 | 47 | 8.0 | 46.7 | 45 | 8.3 |
| Impulsivity | .85 | .90 | 48.2 | 47 | 8.1 | 47.7 | 45 | 9.5 |
| Emotional Dysregulation | .82 | .94 | 48.9 | 47 | 7.9 | 47.3 | 45 | 6.8 |
| Depressed Mood | .69 | .83 | 48.4 | 47 | 8.1 | 48.2 | 46 | 7.9 |
| Anxious Thoughts | .64 | .88 | 47.5 | 44 | 6.2 | 47.8 | 44 | 7.3 |
| Impairment & Functional Outcome Scales | | | | | | | | |
| Schoolwork | .86 | .96 | 46.4 | 44 | 7.5 | 46.2 | 45 | 6.4 |
| Peer Interactions | .94 | .98 | 47.4 | 45 | 7.6 | 47.2 | 45 | 7.6 |
| Family Life | .79 | .91 | 48.4 | 46 | 7.0 | 47.5 | 45 | 8.0 |
| DSM Symptom Scales | | | | | | | | |
| ADHD Inattentive Symptoms | .89 | .94 | 47.0 | 45 | 8.2 | 46.3 | 45 | 8.4 |
| ADHD Hyperactive/Impulsive Symptoms | .88 | .94 | 48.1 | 47 | 7.7 | 46.9 | 45 | 8.7 |
| Total ADHD Symptoms | .91 | .96 | 47.5 | 46 | 7.8 | 46.4 | 45 | 8.6 |
| Oppositional Defiant Disorder Symptoms | .82 | .91 | 48.2 | 45 | 8.5 | 46.9 | 45 | 7.8 |
| Conduct Disorder Symptoms | .78 | .99 | 47.5 | 45 | 4.6 | 47.2 | 45 | 4.0 |

Note. N = 81. All correlations significant, p < .001. Guidelines for interpreting |r|: very weak < .20, weak = .20 to .39, moderate = .40 to .59, strong = .60 to .79, very strong ≥ .80.


Table 8.10. Test-Retest Reliability: Conners 4 Teacher

| Scale | Obtained r | Corrected r | Time 1 M | Time 1 Mdn | Time 1 SD | Time 2 M | Time 2 Mdn | Time 2 SD |
|---|---|---|---|---|---|---|---|---|
| Content Scales | | | | | | | | |
| Inattention/Executive Dysfunction | .88 | .89 | 49.4 | 48 | 9.5 | 49.3 | 46 | 9.9 |
| Hyperactivity | .91 | .91 | 50.1 | 49 | 9.8 | 49.1 | 46 | 10.4 |
| Impulsivity | .85 | .81 | 50.1 | 48 | 10.4 | 49.8 | 45 | 11.4 |
| Emotional Dysregulation | .90 | .92 | 48.4 | 45 | 8.5 | 49.3 | 44 | 10.7 |
| Depressed Mood | .77 | .88 | 49.0 | 46 | 8.4 | 48.0 | 45 | 7.9 |
| Anxious Thoughts | .81 | .83 | 49.4 | 47 | 9.7 | 48.6 | 44 | 9.4 |
| Impairment & Functional Outcome Scales | | | | | | | | |
| Schoolwork | .90 | .89 | 50.4 | 47 | 10.8 | 49.7 | 47 | 10.0 |
| Peer Interactions | .92 | .93 | 48.8 | 44 | 9.7 | 48.9 | 46 | 9.8 |
| DSM Symptom Scales | | | | | | | | |
| ADHD Inattentive Symptoms | .86 | .86 | 49.5 | 48 | 10.0 | 49.6 | 47 | 10.1 |
| ADHD Hyperactive/Impulsive Symptoms | .92 | .91 | 50.0 | 47 | 10.0 | 49.3 | 46 | 10.5 |
| Total ADHD Symptoms | .91 | .90 | 49.8 | 47 | 10.0 | 49.5 | 45 | 10.4 |
| Oppositional Defiant Disorder Symptoms | .90 | .91 | 48.9 | 45 | 9.3 | 49.2 | 44 | 10.5 |
| Conduct Disorder Symptoms | .84 | .97 | 47.6 | 45 | 5.7 | 48.0 | 45 | 6.9 |

Note. N = 61. All correlations significant, p < .001. Guidelines for interpreting |r|: very weak < .20, weak = .20 to .39, moderate = .40 to .59, strong = .60 to .79, very strong ≥ .80.


Table 8.11. Test-Retest Reliability: Conners 4 Self-Report

| Scale | Obtained r | Corrected r | Time 1 M | Time 1 Mdn | Time 1 SD | Time 2 M | Time 2 Mdn | Time 2 SD |
|---|---|---|---|---|---|---|---|---|
| Content Scales | | | | | | | | |
| Inattention/Executive Dysfunction | .62 | .71 | 48.4 | 48 | 8.2 | 48.0 | 47 | 9.7 |
| Hyperactivity | .76 | .78 | 49.7 | 50 | 9.8 | 47.5 | 45 | 9.5 |
| Impulsivity | .63 | .73 | 49.3 | 47 | 9.3 | 47.1 | 46 | 8.1 |
| Emotional Dysregulation | .67 | .79 | 49.3 | 47 | 8.7 | 47.8 | 47 | 8.2 |
| Depressed Mood | .71 | .76 | 49.5 | 47 | 9.7 | 47.2 | 44 | 9.0 |
| Anxious Thoughts | .75 | .76 | 49.7 | 46 | 10.4 | 48.6 | 45 | 9.4 |
| Impairment & Functional Outcome Scales | | | | | | | | |
| Schoolwork | .68 | .73 | 48.8 | 48 | 8.8 | 48.1 | 47 | 9.8 |
| Peer Interactions | .70 | .72 | 48.5 | 45 | 9.6 | 48.6 | 45 | 9.8 |
| Family Life | .71 | .86 | 49.2 | 48 | 8.3 | 48.0 | 48 | 7.1 |
| DSM Symptom Scales | | | | | | | | |
| ADHD Inattentive Symptoms | .57 | .63 | 49.0 | 48 | 9.0 | 48.1 | 47 | 9.5 |
| ADHD Hyperactive/Impulsive Symptoms | .76 | .81 | 49.3 | 48 | 9.7 | 47.2 | 46 | 8.8 |
| Total ADHD Symptoms | .69 | .76 | 49.1 | 48 | 9.1 | 47.5 | 47 | 8.8 |
| Oppositional Defiant Disorder Symptoms | .53 | .73 | 49.5 | 48 | 8.5 | 47.2 | 46 | 6.8 |
| Conduct Disorder Symptoms | .55 | .70 | 49.0 | 46 | 8.7 | 48.9 | 46 | 7.7 |

Note. N = 68. All correlations significant, p < .001. Guidelines for interpreting |r|: very weak < .20, weak = .20 to .39, moderate = .40 to .59, strong = .60 to .79, very strong ≥ .80.


Table 8.12. Difference Between Time 1 and Time 2 T-scores: Conners 4 Parent

| Scale | % of Test-Retest Sample: Time 1 ≥ 1 SD higher than Time 2 | % of Test-Retest Sample: Scores differed by less than 1 SD | % of Test-Retest Sample: Time 2 ≥ 1 SD higher than Time 1 | M Differences | SD Differences |
|---|---|---|---|---|---|
| Content Scales | | | | | |
| Inattention/Executive Dysfunction | 1.2 | 97.5 | 1.2 | −1.1 | 0.1 |
| Hyperactivity | 3.7 | 95.1 | 1.2 | −1.6 | 0.3 |
| Impulsivity | 2.5 | 92.6 | 4.9 | −0.5 | 1.4 |
| Emotional Dysregulation | 4.9 | 93.8 | 1.2 | −1.6 | −1.0 |
| Depressed Mood | 3.7 | 92.6 | 3.7 | −0.2 | −0.2 |
| Anxious Thoughts | 2.5 | 92.6 | 4.9 | 0.3 | 1.1 |
| Impairment & Functional Outcome Scales | | | | | |
| Schoolwork | 2.5 | 97.5 | 0.0 | −0.2 | −1.0 |
| Peer Interactions | 1.2 | 98.8 | 0.0 | −0.2 | 0.0 |
| Family Life | 3.7 | 95.1 | 1.2 | −0.9 | 1.0 |
| DSM Symptom Scales | | | | | |
| ADHD Inattentive Symptoms | 2.5 | 96.3 | 1.2 | −0.8 | 0.2 |
| ADHD Hyperactive/Impulsive Symptoms | 1.2 | 96.3 | 2.5 | −1.2 | 1.0 |
| Total ADHD Symptoms | 1.2 | 98.8 | 0.0 | −1.1 | 0.8 |
| Oppositional Defiant Disorder Symptoms | 3.7 | 92.6 | 3.7 | −1.3 | −0.7 |
| Conduct Disorder Symptoms | 1.2 | 97.5 | 1.2 | −0.4 | −0.6 |

Note. N = 81. 1 SD is equivalent to 10 T-score points. A positive M difference indicates that the parent's ratings were higher at Time 2 than at Time 1.


Table 8.13. Difference Between Time 1 and Time 2 T-scores: Conners 4 Teacher

| Scale | % of Test-Retest Sample: Time 1 ≥ 1 SD higher than Time 2 | % of Test-Retest Sample: Scores differed by less than 1 SD | % of Test-Retest Sample: Time 2 ≥ 1 SD higher than Time 1 | M Differences | SD Differences |
|---|---|---|---|---|---|
| Content Scales | | | | | |
| Inattention/Executive Dysfunction | 3.3 | 95.1 | 1.6 | −0.1 | 0.4 |
| Hyperactivity | 3.3 | 96.7 | 0.0 | −1.0 | 0.6 |
| Impulsivity | 4.9 | 91.8 | 3.3 | −0.4 | 1.1 |
| Emotional Dysregulation | 3.3 | 91.8 | 4.9 | 1.0 | 2.2 |
| Depressed Mood | 4.9 | 93.4 | 1.6 | −1.0 | −0.5 |
| Anxious Thoughts | 9.8 | 86.9 | 3.3 | −0.7 | −0.3 |
| Impairment & Functional Outcome Scales | | | | | |
| Schoolwork | 3.3 | 95.1 | 1.6 | −0.7 | −0.8 |
| Peer Interactions | 1.6 | 96.7 | 1.6 | 0.0 | 0.0 |
| DSM Symptom Scales | | | | | |
| ADHD Inattentive Symptoms | 3.3 | 95.1 | 1.6 | 0.1 | 0.1 |
| ADHD Hyperactive/Impulsive Symptoms | 0.0 | 96.7 | 3.3 | −0.7 | 0.5 |
| Total ADHD Symptoms | 3.3 | 95.1 | 1.6 | −0.3 | 0.5 |
| Oppositional Defiant Disorder Symptoms | 1.6 | 96.7 | 1.6 | 0.3 | 1.2 |
| Conduct Disorder Symptoms | 0.0 | 95.1 | 4.9 | 0.3 | 1.2 |

Note. N = 61. 1 SD is equivalent to 10 T-score points. A positive M difference indicates that the teacher's ratings were higher at Time 2 than at Time 1.


Table 8.14. Difference Between Time 1 and Time 2 T-scores: Conners 4 Self-Report

| Scale | % of Test-Retest Sample: Time 1 ≥ 1 SD higher than Time 2 | % of Test-Retest Sample: Scores differed by less than 1 SD | % of Test-Retest Sample: Time 2 ≥ 1 SD higher than Time 1 | M Differences | SD Differences |
|---|---|---|---|---|---|
| Content Scales | | | | | |
| Inattention/Executive Dysfunction | 8.8 | 79.4 | 11.8 | −0.5 | 1.4 |
| Hyperactivity | 10.3 | 86.8 | 2.9 | −2.2 | −0.3 |
| Impulsivity | 13.2 | 79.4 | 7.4 | −2.2 | −1.2 |
| Emotional Dysregulation | 10.3 | 83.8 | 5.9 | −1.5 | −0.5 |
| Depressed Mood | 11.8 | 83.8 | 4.4 | −2.3 | −0.7 |
| Anxious Thoughts | 10.3 | 83.8 | 5.9 | −1.1 | −1.1 |
| Impairment & Functional Outcome Scales | | | | | |
| Schoolwork | 10.3 | 77.9 | 11.8 | −0.6 | 0.9 |
| Peer Interactions | 7.4 | 80.9 | 11.8 | 0.1 | 0.2 |
| Family Life | 8.8 | 91.2 | 0.0 | −1.2 | −1.2 |
| DSM Symptom Scales | | | | | |
| ADHD Inattentive Symptoms | 8.8 | 77.9 | 13.2 | −0.9 | 0.6 |
| ADHD Hyperactive/Impulsive Symptoms | 10.3 | 83.8 | 5.9 | −2.1 | −0.9 |
| Total ADHD Symptoms | 11.8 | 80.9 | 7.4 | −1.6 | −0.3 |
| Oppositional Defiant Disorder Symptoms | 16.2 | 76.5 | 7.4 | −2.4 | −1.7 |
| Conduct Disorder Symptoms | 10.3 | 82.4 | 7.4 | −0.1 | −1.0 |

Note. N = 68. 1 SD is equivalent to 10 T-score points. A positive M difference indicates that the youth's self-report ratings were higher at Time 2 than at Time 1.

