This FAQ answers multiple questions:
Why are some of Let's Go Learn's DORA scores unexpected?
Why do some scores seem very high or very low?
When using a computer-mediated diagnostic, formative assessment like DORA, you may come across scores that look a little different from the behavior you see in the classroom. In this document, Let’s Go Learn addresses some of the most common misunderstandings about DORA scoring.
High Scores on the Word Recognition Subtest
The Word Recognition subtest assesses students’ abilities in decoding skills, using a combination of criterion-referenced real words and phonetically regular invented words. Words are presented to students orally, and they are asked to identify the correct word from four choices. While this subtest does accurately assess students’ word recognition ability, it is an out-of-context activity with an oral component. As such, teachers may sometimes see unexpectedly high scores from students who have strong decoding abilities but perhaps low comprehension scores. While students can use decoding skills to isolate the correct word, they may still struggle to read and comprehend the word in context. This subtest evaluates decoding and word analysis skills in isolation; it does not assess contextual reading skills.
These scores, however, are usually an accurate reflection of students’ graphophonic skills. The reliability of this subtest is particularly high (grade level delta = 0.19, SE = 0.12), meaning that when the test is administered repeatedly with no instructional time between assessments, students’ scores differ, on average, by less than two school months. Further, this subtest has been correlated to both the Diagnostic Assessment of Reading (Riverside) (r = 0.81) and the Woodcock Word Identification Test (r = 0.92), both statistically and practically significant levels of correlation, indicating very high validity for the subtest.
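The “less than two school months” claim above can be checked by converting the grade level delta into school months; a minimal sketch, assuming a 10-month school year (the month count is an assumption, not stated in this document):

```python
# Convert the Word Recognition test-retest delta (in grade levels)
# into school months, assuming a 10-month school year.
delta_grade_levels = 0.19    # average score change on retest, from the text
months_per_school_year = 10  # assumption: one grade level spans 10 school months

delta_months = delta_grade_levels * months_per_school_year
print(f"Average retest difference: {delta_months:.1f} school months")
# → Average retest difference: 1.9 school months
```

At 1.9 school months, the average retest difference indeed falls just under two school months.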
High Scores on the Oral Vocabulary Subtest
It is important to note, when examining the scores on the Word Meaning subtest, that this subtest does not assess students’ reading vocabulary (words that they can both read and define). Instead, it assesses students’ oral vocabulary, as oral vocabulary is often considered predictive of students’ future reading ability. Students cannot read words that do not exist in their oral vocabularies, so an assessment of oral vocabulary helps identify a gap that would impede students’ reading achievement. It is a particularly important subtest for second language learners and for students with developmental language delays. For other students, however, the Word Meaning subtest may appear to give unexpectedly high scores.
These scores, however, are rarely errors. The Word Meaning subtest has been shown to be reliable, with a statistically significant test-retest correlation (r = .60, SE = 0.19). It has also been correlated to the word meaning subtest of the Diagnostic Assessment of Reading (DAR), with a moderate to high level of correlation (r = .60). Further, in 2003, items in the Word Meaning subtest underwent major revision to ensure that test scores were not consistently higher than expected.
Low Spelling Scores
Spelling is the most challenging subtest on DORA because the answers are completely student generated rather than multiple-choice. If students are performing well on classroom spelling tests, consider the difference in the task. On Monday they are given a list of 10-25 words (depending on the teacher), and they spend the entire week memorizing those words: writing, re-writing, creating flash cards, drilling themselves, drilling their peers, taking sample tests in class and at home over breakfast. The students are given many opportunities to memorize those words, and most will do well on the Friday spelling test.
When they come to take DORA, they are seeing words that they have not just spent a week practicing, and they have only one chance to spell each word correctly. It is a sample of how the student spells, in general, without any practice. Do you administer a spelling pre-test at the beginning of the week? Do your students get the same score on Monday as they do on Friday? Probably not, but the Monday score is a better indicator of how well the child really spells. With practice, children will learn to memorize and spell words. Continued work on classroom spelling lists will improve students’ diagnostic spelling scores, as they continue to be exposed to more words and more complex spelling patterns.
Let’s Go Learn’s Spelling subtest has been correlated to two other nationally recognized spelling assessments: the spelling subtest of the Diagnostic Assessment of Reading (r = .78) and the spelling subtest of the Wide Range Achievement Test (r = 0.85, SE = 0.210). Both studies indicated statistically significant correlations, supporting the validity of the DORA Spelling subtest.
Low Comprehension Scores
Many factors affect a student’s ability to successfully comprehend a text. Some students struggle with decoding the text they encounter or with the language structures (i.e., phrases and idioms) used. Other students may possess limited background knowledge about the topic of the text or they may not be interested in what they’re reading. While Let’s Go Learn’s comprehension test presents students with non-fiction topics that they are likely to have encountered in school, some groups of students may have less familiarity with the subject matter in DORA than in other comprehension assessments.
Using non-fiction passages with topics taught in most classrooms across the nation provides less variability in assessment results. The language involved in generating non-fiction passages is easier to standardize, as it does not contain conversational colloquialisms that are often regionalized in the U.S. Also, non-fiction passages offer a range of topics common to many classrooms, reducing bias due to race, gender, and culture. While non-fiction is sometimes more difficult for children to read than fiction, Let’s Go Learn has made a conscious effort to control for this by writing comprehension questions that are not overly difficult and by creating an adaptive administration protocol that raises and lowers the difficulty of passages according to the student’s success, ensuring that children see only questions within their comfort level.
Another factor that can make scores on DORA seem lower is if your students have been tested using traditional teacher-mediated, pen-and-paper assessments. On these assessments there is more room for discrepancy, as teachers often ask follow-up questions to clarify students’ responses and students often become familiar with the administration protocol. Let’s Go Learn’s DORA removes some of this variability often associated with teacher-mediated assessments.
Often comprehension tests, like those utilized in annual state assessments, allow students to re-read the passage after they have seen the questions. This type of assessment can lead to false positive scores, as students learn strategies for skimming that may not be an indication of absolute comprehension ability. Allowing students to re-read passages introduces a new variable to the assessment that is difficult to control for. That is, some students choose to re-read the passage over again while other students choose not to re-read the passage. Allowing students to re-read a passage thus increases the variability of the comprehension sub-test score. By allowing students to read the passages only once, DORA provides a better indicator of how well students will perform in realistic reading situations. This gets back to the purpose of DORA, which is to provide diagnostic data for teachers to guide instruction, but could consequently result in scores on the comprehension subtest that are lower than teachers or parents might expect.
Also, because DORA is criterion-referenced—that is, based on a set of criteria identified by experts—it is possible that the items might differ from other criterion-referenced assessments you may have encountered. This does not preclude the utility or meaningful information produced by DORA‘s comprehension sub-test. It just means that one must consider its difficulty relative to other available comprehension tests.
The avoidance of false positives, as mentioned in the previous question, is also a factor that can make scores appear lower. If other comprehension measures used in the past have a lower degree of false positive aversion, then the difference when comparing DORA to this other measure may appear significant. Our philosophy is that it is worth it to avoid incorrectly labeling a low comprehension student as high, even if it means on occasion labeling a high comprehension student as slightly lower than his or her real ability. And have no doubt, comprehension measures must choose one or the other possibility. There is no way to avoid biases.
Another factor that should be considered is the student’s motivation. Longer assessments do run a higher risk of fatiguing the student, and the factor that causes the greatest test score variance is student motivation. Therefore, students need to be properly introduced to the idea of DORA. Teachers should stress that this assessment will help them do a better job of instructing the students. Also, the assessment should be broken up into manageable sessions, and students should be monitored during testing. If some students seem fatigued, the teacher should consider stopping the assessment and resuming it later. The comprehension subtest is the final test of the assessment, intentionally placed last so that the other subtests can better inform the student’s starting point for the comprehension test; as a result, however, the comprehension subtest may be the most affected by student fatigue or lack of motivation.
Finally, the age of the student should also be considered. Sometimes a lower comprehension score is the result of a younger student taking a computer-mediated test for the first time. Unfamiliarity with the medium can result in lower scores, as students may struggle with how the test is organized. Preparing students for the assessment by showing sample questions or discussing how the assessment is organized will help eliminate this confound.
The Comprehension subtest scores have also been validated, to ensure that the scores a student receives are not abnormally high or low. In test-retest analysis, when students took the DORA Comprehension subtest repeatedly, the average grade level delta was 0.35 (SD = 0.13). In other words, when students retake the test, 95% of scores will differ by between 0.09 and 0.61 grade levels; nearly all students will score within about half a grade level of their original score. This indicates the consistency of DORA’s Comprehension subtest. Further, the Comprehension subtest has been correlated to both the Diagnostic Assessment of Reading (DAR) comprehension subtest and the Gray Oral Reading Test, with both indicating medium-high to high levels of correlation to these other assessments.
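The 0.09-to-0.61 range quoted above follows from the usual two-standard-deviation rule for a roughly normal distribution; a minimal sketch of that arithmetic:

```python
# Approximate 95% range of test-retest grade level deltas for the
# Comprehension subtest, using mean ± 2 standard deviations.
mean_delta = 0.35  # average grade-level difference on retest, from the text
sd_delta = 0.13    # standard deviation of that difference

low = mean_delta - 2 * sd_delta
high = mean_delta + 2 * sd_delta
print(f"95% of retest deltas fall between {low:.2f} and {high:.2f} grade levels")
# → 95% of retest deltas fall between 0.09 and 0.61 grade levels
```

Note that this assumes the retest deltas are approximately normally distributed, which the two-SD rule requires.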
In summary, many factors might make it appear, on occasion, that students’ scores on DORA’s Silent Reading (comprehension) sub-test are lower than their true reading ability relative to other reading measures. However, when one examines the biases of each measure, considers the statistical soundness of DORA’s subtest validity, and interprets DORA in light of what it seeks to do, these discrepancies, if any, can usually be explained or accounted for. Furthermore, there is a low probability that any discrepancy between measures will be large enough to negatively affect any particular student’s instructional plan.
Support document 749
Tags: Why do individual DORA measures vary in range for individual students high variance low high scores very too high