By John Myers
My Master of Teaching students are at the beginning of their second year and have had two practicum experiences. They have an assignment to design rubrics that measure QUALITY, not QUANTITY.
This is the most challenging assignment in the course and I am happy to report on the results.
The following perspective examines the issues involved in moving from quantity to quality. There are a few questions in boldface for readers to consider.
Since the early days of public mass education, letter grades or percentage marks have served as shorthand indicators of quality. But are they? What does an “A+” or “C-” really mean? Is a mark of 76% really better than one of 74%?
The world of assessing student learning beyond the retention of facts, events, dates, and so on is very complex (Ercikan and Seixas, 2015; Mazur, 2015; Nokes, 2015; Swartz and McGuinness, 2014). Current Ontario assessment policy, articulated in its 2010 document Growing Success, moves us towards a clearer sense of quality and its determination: “Determining a report card grade will involve teachers’ professional judgement and interpretation of evidence and should reflect the student’s most consistent level of achievement, with special consideration given to more recent evidence” (p. 39). Some excerpts of particular importance appear earlier in the reader, as does the article by Ken O’Connor (Reader pp. 111-13). Yet what is quality? What exactly are we judging? What do grades actually mean?
Given the new emphasis on disciplinary thinking in the curriculum (see, for example, The Big Six (Seixas and Morton, 2013) applied to History curricula), the question of quality thinking is back. Seixas and Morton (op. cit.) refer to “limited” and “powerful” understandings. What might these mean in practice, especially in the area of assessment and evaluation across curricular disciplines?
The Ontario Achievement Charts imply that quality can be based on descriptors such as “limited”, “some”, “considerable” and “thorough”. The implication is of four levels of quality (YES!, YES but, NO but and NO!). A fifth level, below a 50% mark, is implied though not clearly stated. Ken O’Connor, noted above, has done as good a job as anyone of connecting levels to marks and examining issues in grading. Many of us use a plus/minus system of conversion, with a mark of 58, 68, 78 or 88 being a plus and 52, 62, 72 or 82 being a minus. Level four in Ontario spans a range of 20 marks, which may add to the complexity.
Experimenting with these descriptors and levels for more than two decades with teachers, student teachers and students has resulted in almost universal dissatisfaction with the achievement chart terms due to their lack of precision for both teachers and students (Myers, 2004). Work in the U.K. has also seen a frustration with the limits of using vague generic levels (Historical Association, 2014, 2004).
Is this a frustration in other curricular subjects? What are some other ways of seeing quality?
The rest of this article offers both familiar and new ideas for identifying quality in student work across the curriculum.
Bloom’s Taxonomy for the cognitive domain (he and his colleagues also developed systems for classifying educational objectives in the affective and psychomotor domains) is the most popular system for designing test questions and learning outcomes. Teacher-candidates need to be familiar with it, yet also be able to look at it with a critical eye.
Among its advantages are the following:
- almost all teachers have heard of it and many try to use it in their teaching and testing
- it seems to make sense
Among its disadvantages are the following:
- it suggests a hierarchy which may not exist, especially at the “highest” levels: the differences between the categories of thinking may be less real than apparent, and some tasks that elicit knowledge or recall may be harder for students than tasks striving to elicit so-called higher levels of thinking
- it suggests that you can’t teach or assess students at higher levels unless you do so first at the lower levels: an idea that has been proven to be false
- while certain verbs may be used to elicit responses at various levels of thinking, a learner’s prior knowledge (or lack of it) is a better means of determining whether or not a certain level of thinking has been achieved
Originally designed as a method for constructing test questions, the taxonomy has had a huge impact on thinking in education. Do its higher-level thinking categories, such as evaluation, synthesis, and analysis, represent a higher quality of student response than categories considered lower level, such as comprehension, recall and knowledge? The taxonomy stressed product (assessment OF learning), not process, growth, or the power of metacognition (assessment FOR and AS learning). The individual paths students might take in learning were not considered or widely acknowledged.
What we now know about the brain and the nature of learning does not support the suggested hierarchy. Recall can be simple: “Who was Canada’s first Prime Minister?” or complex: “Who have served Canada as Prime Ministers throughout our history?” As for evaluation, we often make judgements through our gut reactions: “I like Donald Trump but I don’t like Kathleen Wynne.”
Two readable critiques of Bloom’s taxonomy can be found online as downloadables:
So if Bloom’s, which still has important uses, is not a place to determine differences in the quality of student responses to questions in subject areas what other systems offer ideas for differentiating student quality?
The original taxonomy was published in 1956, and since that time we have learned more about the ways that children learn. Teachers have also revised the ways they plan and implement instruction in the classroom. Anderson and Krathwohl (2001) revised Bloom’s original taxonomy by combining the cognitive process and knowledge dimensions. This new, expanded taxonomy sets out levels and corresponding actions as follows:
Remember: recognizing, recalling
Understand: interpreting, exemplifying, classifying, summarizing, inferring, comparing, explaining
Apply: executing, implementing
Analyze: differentiating, organizing, attributing
Evaluate: checking, critiquing
Create: generating, planning, producing
Degrees of Quality?
The chart by McTighe and Wiggins suggests other ways of looking at quality in the construction of rubrics, using categories such as the following:
| DEGREES OF UNDERSTANDING | DEGREES OF FREQUENCY | DEGREES OF EFFECTIVENESS | DEGREES OF INDEPENDENCE | DEGREES OF ACCURACY | DEGREES OF CLARITY |
|---|---|---|---|---|---|
| Thorough or complete | Usually / consistently | Highly effective | Students successfully complete the task independently | Completely accurate; all facts, concepts, mechanics, computations correct | Exceptionally clear; easy to follow |
| Substantial | Frequently | Effective | With minimal assistance | Generally accurate; minor inaccuracies do not affect overall result | Generally clear; able to follow |
| Partial / incomplete | Sometimes | Moderately effective | With moderate assistance | Inaccurate; numerous errors detract from result | Lacks clarity; difficult to follow |
| Serious misconceptions / misunderstanding | Rarely / never | Ineffective | Only with considerable assistance | Major inaccuracies; significant errors throughout | Unclear; impossible to follow |
McTighe & Wiggins (2004), p. 192.
Which of the degree categories seem to be:
- closest to the achievement chart categories
- easier for us to do since they are what we have usually done
- easiest to convert to marks or grades?
Hard to determine.
Although the achievement chart category combines knowledge and understanding, the latter is seen as stemming in large part from the former. Once again the work of Wiggins and McTighe (2005) is helpful: they view understanding as comprising six facets:
EXPLANATION: Sophisticated and apt explanations and theories, providing knowledgeable and justified accounts of events, actions, and ideas; e.g., Was the 1917 conscription crisis avoidable? Why did the Weimar Republic fall?
INTERPRETATION: Interpretations, narratives, and translations that provide meaning; e.g., What does In Flanders Fields reveal about human beings and war? What does Picasso’s Guernica tell us about our capacity for evil?
APPLICATION: The ability to use knowledge effectively in new situations and diverse contexts (a separate category in Ontario’s achievement charts); e.g., How should Canada memorialize the centenary of WW1 to best honour the legacy? What does Canada’s response to refugees throughout our history tell us about how we should respond to the current refugee crisis in Syria?
PERSPECTIVE: Critical and insightful points of view; e.g., How did events in WW1 look from the point of view of Francophones? recent immigrants from England? others? How similar were the views Enlightenment philosophes held about the role of the state in western societies?
EMPATHY: The ability to identify with another person’s feelings and world view; e.g., How might it feel if one of your family members went off to war? Why could Thomas Jefferson write that all men are created equal in the Declaration of Independence yet be a slave owner?
SELF-KNOWLEDGE: The wisdom to know one’s ignorance and how one’s patterns of thought and action inform as well as prejudice understanding; e.g., What do you believe is worth fighting for? As an Irish Canadian, how does my background limit my understanding of the 1916 Easter Rebellion in Ireland?
Does this unpacking help us monitor quality in student responses through the use of questions illustrating each facet of understanding?
The SOLO Taxonomy
One approach, hardly ever seen in North American education but, according to John Hattie (Hattie and Yates, 2014), a more convincing taxonomy for assessing thinking and conceptual learning, comes from two Australian educators, John Biggs and Kevin Collis (1982).
Building on the work of Jean Piaget, they designed a scheme for observing learning in any subject and grade. The levels of complexity observed and assessed move from one idea to many, and from the concrete to the abstract. Their Structure of Observed Learning Outcomes (SOLO) taxonomy begins with a level (prestructural) at which a student needs help just to get started with grasping an idea. Typical student responses: “I don’t know.” “What am I supposed to do?”
The next two levels represent surface understanding. The unistructural level represents the grasp of a single idea: the student draws a conclusion without reference to other data; e.g., when asked to examine sources offering conflicting versions of an event, students reference only one idea.
The multistructural level may refer to more than one idea but does not connect these ideas to a thesis. James Duthie (2012) has called this the “narrative trap” (p. 124), in which students respond as if the question or topic were “Tell me all you know about . . . .” Even if students recognize a contrary position, they dismiss it rather than assess its merits.
A deeper level of understanding comes when students can connect ideas and see their links, so that they weigh alternatives and draw a conclusion reconciling the original positions in conflicting sources. This relational level is a qualitative shift in thinking.
With practice, students can reach an extended abstract level at which they can step outside the task, bring in examples from elsewhere in the curriculum or beyond, and generalize to broader conclusions; for example, they might determine criteria for the reliability of a source.
The use of graphic organizers helps students focus their thinking to make sense of data; e.g., a Venn diagram that helps students move from listing to comparing is one of dozens of graphical ways to help students visualize their thinking.
The nature of the observed thinking makes it clearer for both teachers and students when establishing success criteria and providing feedback. Differentiation of learning tasks can also be based on the SOLO levels. Teaching strategies using graphic organizers to help students avoid Duthie’s “narrative trap” may be an easy place to start in grades 7-10.
For those wishing to see how this taxonomy compares to Bloom’s, as well as a general overview of learning outcomes, check out http://www1.uwindsor.ca/ctl/system/files/PRIMER-on-Learning-Outcomes.pdf
and the site of New Zealand consultant Pam Hook at www.pamhook.com for more ideas and free resources. Closer to home, there is another adaptation of this work:
The ICE Approach: a three-stage framework for assessing learning growth (Young and Wilson, 2000). The authors of this system see it as a simpler way to use the work of Biggs and Collis in our busy teaching lives.
As you look below at the adaptation ask yourself if this is so and why.
IDEAS are demonstrated when students . . .
- convey the fundamentals
- present basic facts
- provide vocabulary & definitions
- offer details
- demonstrate simple understandings of elementary concepts

CONNECTIONS are made when students . . .
- demonstrate relationships or connections among basic concepts
- demonstrate relationships or connections between what they have learned and what they already know

EXTENSIONS are revealed when students . . .
- use their new learning in novel ways, apart from the initial learning situation
- answer the hypothetical questions: So what does this mean? How does this shape my view of the world?
It combines thinking from Bloom’s Taxonomy and work in cognitive development inspired in part by SOLO. Like SOLO it stresses growth and promotes self-assessment and self-adjustment. This framework can be applied to rubric design in which levels represent the stages as follows:
QUALITATIVE RUBRIC FOR A HISTORY PROJECT (adapted from Young and Wilson, ibid., p. 40)
| | IDEAS | CONNECTIONS | EXTENSIONS |
|---|---|---|---|
| Comprehensiveness | Contains thesis and concluding statements, a body of content and a bibliography. | Links between facts and opinions, inferences and evidence. | Links made between the past and the present. Elaborates on how the project shaped their thinking. |
| Accuracy | All statements are accurate. | Conditions and qualifiers stated where appropriate. | Analogies made with similar situations studied previously or with current events. |
| Presentation | Report is legible or clear. | Attention paid to the needs of the reader/audience. | Attempts made to present ideas in a novel way. |
Given the need to convert quality into grades, which of these systems offers the best approach for quantity / quality integration?
Anderson, L.W. and Krathwohl, D.R., et al. (Eds.) (2001). A Taxonomy for Learning, Teaching, and Assessing: A Revision of Bloom’s Taxonomy of Educational Objectives. Boston, MA: Allyn & Bacon (Pearson Education Group).
Biggs, J.B. and Collis, K.F. (1982). Evaluating the Quality of Learning: The SOLO Taxonomy (Structure of the Observed Learning Outcome). New York: Academic Press.
Duthie, J.A. (2012). A Handbook for History Teachers. Lanham, MD: University Press of America.
Ercikan, K. and Seixas, P. (Eds.) (2015). New Questions in Assessing Historical Thinking. New York: Routledge.
Hattie, J. and Yates, G. (2014). Visible Learning and the Science of How We Learn. New York: Routledge.
Historical Association (2014). Teaching History: Assessment Edition (#157, December).
Historical Association (2004). Teaching History: Assessment Without Levels (#115, June).
McTighe, J. and Wiggins, G. (2004). Understanding by Design: Professional Development Workbook. Alexandria, VA: Association for Supervision and Curriculum Development.
Mazur, E. (2015). Assessment: The Silent Killer of Learning. https://www.youtube.com/watch?v=zB-MxdOjl9w
Myers, J. (2004). Tripping over the levels: Experiences from Ontario. Teaching History: Assessment Without Levels (#115, June), 52-59.
Nokes, J.D. (2015). Cutting edge theory and research on assessing historical thinking. Theory and Research in Social Education, 43(3), 434-440.
Seixas, P. and Morton, T. (2013). The Big Six: Historical Thinking Concepts. Toronto: Nelson Education.
Wiggins, G. and McTighe, J. (2005). Understanding by Design: Expanded Second Edition. Alexandria, VA: Association for Supervision and Curriculum Development.
Young, S.F. and Wilson, R.J. (2000). Assessment & Learning: The ICE Approach. Winnipeg: Portage & Main.
Which of the above systems promotes the professional judgement considered essential to sound grading?
John Myers teaches at OISE. He is a regular contributor to Rapport.