On the case for standardised testing

Not just in favour, but in favour of more

Jan 27, 2022

I’ve had a thought experiment bubbling away for a while now. It started last year during the pandemic, when there was a leaked discussion about the cancellation of final examinations, and a credible whisper that results might be based solely on the English examination. At that time, it seemed far more practicable to run one compulsory subject than try to roll out the whole catastrophe. The logic was that English results are a pretty reliable predictor of overall HSC success, which has a ring of truth about it, given that other predictors like NAPLAN, even the earliest NAPLAN, are a good indicator of future success.

I’ve written before about the pushback against exams: the question of bias in standardised testing materials, the conflation of numerical results with self-image, and the general anti-intellectualism that pervades teaching. So some might find this particular thought experiment a bit radical. Hear me out. Let me test my thinking. There will be gaps in my vision, but I do sense a rethink in the bureaucratic air.

On the proposal

In my vision¹, students in Years 7-12 would be tested twice a year, using nationally prescribed materials. Like the HSC, the selection would offer an element of teacher choice but be standardised enough to allow fair, comparable judgements. The rest of the year’s programming would be free choice, within the constraints of the syllabus.

Teachers could sign up to marking sprints, conducted remotely and using Daisy Christodoulou’s Comparative Judgement. Because Daisy’s online system is essentially a ranking method that relies on judging whether one script is simply better or worse than another, marks would be far more reliable. Teachers spend less time on each response, but each response is seen by many more professional eyeballs than in traditional marking. It would also give teachers around the country access to the valuable professional development that comes from this experience. After the initial benchmarking by expert teams, marking could be completed asynchronously. Alternatively, benchmarks could be set after the fact, based on the quality of the responses of the cohort as a whole.
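For readers curious how a pile of "this script is better than that one" decisions can become a ranking, here is a minimal sketch in Python. It assumes a Bradley-Terry model fitted by simple iteration, which is a common way to analyse paired comparisons; it is not a description of how any particular Comparative Judgement platform works, and the function name, script labels and small pseudo-count are all my own illustrative choices.

```python
from collections import defaultdict


def rank_scripts(judgements, iterations=200):
    """Rank scripts from pairwise (winner, loser) judgements.

    Illustrative Bradley-Terry fit; not any real platform's algorithm.
    """
    scripts = sorted({s for pair in judgements for s in pair})
    wins = defaultdict(int)
    pair_counts = defaultdict(int)
    for winner, loser in judgements:
        wins[winner] += 1
        pair_counts[frozenset((winner, loser))] += 1

    # Every script starts with the same estimated quality.
    strength = {s: 1.0 for s in scripts}
    for _ in range(iterations):
        updated = {}
        for s in scripts:
            denom = 0.0
            for other in scripts:
                if other == s:
                    continue
                n = pair_counts.get(frozenset((s, other)), 0)
                if n:
                    denom += n / (strength[s] + strength[other])
            # The 0.1 pseudo-count is an ad hoc assumption that keeps
            # scripts with no wins from collapsing to exactly zero.
            updated[s] = (wins[s] + 0.1) / denom if denom else strength[s]
        # Rescale so the strengths stay on a stable scale.
        total = sum(updated.values())
        strength = {s: v * len(scripts) / total for s, v in updated.items()}

    return sorted(scripts, key=lambda s: strength[s], reverse=True)


# Six hypothetical judgements across three scripts, including one
# "upset" where script_c was preferred to script_b.
judgements = [
    ("script_a", "script_b"),
    ("script_a", "script_c"),
    ("script_b", "script_c"),
    ("script_a", "script_b"),
    ("script_b", "script_c"),
    ("script_c", "script_b"),
]
print(rank_scripts(judgements))
```

The point of the sketch is that no single judge needs to assign a mark: many quick better-or-worse decisions, including the occasional disagreement, are pooled into one ordering of the scripts.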

The shift would potentially save money, because it would be faster and would require little, if any, moderation by senior markers. The testing could be a combination of online and hand-written tests, and could be weighted more heavily as students approach their senior years.

On adversity-proofing

With 12 tests over a student’s academic career, the system would essentially be pandemic-proof. If a student missed one test, the remaining data would still be a reliable enough predictor of their future success at university. The net effect could also be a lowering of stakes, with one big fat test replaced by a normalised schedule of smaller ones. Potentially this could be a more accurate indicator of what students can do.
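To make the arithmetic concrete, here is a small hypothetical sketch in Python of how an overall result might be calculated when a test is missed. The weights, the scores and the function name are assumptions for illustration only; the proposal itself does not specify a formula.

```python
def overall_result(scores, weights):
    """Weighted average over the tests a student actually sat.

    A missed test is recorded as None and simply left out, so the
    remaining tests share the full weight between them.
    """
    if len(scores) != len(weights):
        raise ValueError("each test needs exactly one weight")
    sat = [(s, w) for s, w in zip(scores, weights) if s is not None]
    if not sat:
        raise ValueError("no tests were sat")
    total_weight = sum(w for _, w in sat)
    return sum(s * w for s, w in sat) / total_weight


# Hypothetical student: three tests, the second one missed, and the
# last (most senior) test counting double.
result = overall_result([60, None, 80], [1, 1, 2])
print(round(result, 1))
```

With eleven remaining data points rather than two, a single gap would shift the overall result far less than it does in this toy example.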

On time for teachers

Examination and curriculum materials would be externally written, freeing up time for teachers to focus on formative assessment. Marking loads would be halved. Quality criteria would be standardised and set out in detailed progression statements. And again, teachers would be able to spend time on improvement, using timely data to inform their future actions.

On equality of outcomes

Using Comparative Judgement would also enable teachers to track student progress against their peers nationally. This might sound objectionable to some, but how to use this data could be left to each school’s discretion. A detailed national curriculum and materials would potentially ensure a more cohesive and higher standard of knowledge and skills in the junior years, and a common testing system could be the key to fair, equitable and reliable measures of student learning.

I’m sure this plan is full of holes. All experimental thinking is. And I’m prepared for you to poke more deeply into it. I’m interested in the ethical, the pastoral and the pragmatic. But right now, I’m feeling that opening up these conversations could be a good start in our common aim of getting the best out of assessment, for our students and for us. It might be time for radicalism.

¹ This method may not map neatly onto mathematics, which could be marked by computer, but it could work even for Mathematics Extension 2.

John Brown

It is hugely heartening to hear fresh thinking about assessment from the ground up. You propose averaging grades across a number of secondary years, proofing against black swan events like COVID. One objection might be that, by loading all the stakes at the end of secondary school, students get to enjoy most of their schooling consequence-free. There might be an argument about maturation: that only mid-teens can grasp that their efforts have long-term consequences. Would we really want adult opportunities to be influenced by choices we made at thirteen? I would not! Also, if grades mattered from age 11, parents and schools would game the system and put pressure on early teens, taking away their last innocent years. It would also be a system that favoured those who matured early, when we are really interested in how people end up, not how smoothly and evenly they transition.


Rebecca Birch - On Education
