Exams: call time on the academy's Hunger Games

<ÁñÁ«ÊÓƵ class="standfirst">University examinations teach students how to compete but teamwork is the vital life skill, says Kevin Fong
March 5, 2015

I hate exams. I hate them more now I have to set them than I did when I had to sit them. This is what I'm thinking while we're hunkering down for the annual exam scrutiny meeting. "Does anyone need to leave early?" asks the chairman. This strikes me as a brave way to start. A forest of hands shoots up and I swear I can hear someone saying "Ooooh! Oooooh! Me! Me! Me! Me!"

I have deep respect for the chair. This is a difficult act of leadership. Like being the captain of a shot-up Second World War bomber that's aflame and limping home over the English Channel, hoping that your crew will stay and help you wrestle it back to Blighty, but at the same time feeling obliged to acknowledge the pile of parachutes in the middle of the cabin.

The time comes for my exam to be scrutinised. The dwindling panel pore over the text, trying to eliminate ambiguity. There's a particular syntax that goes with multiple choice questions that I have yet to understand fully. My colleagues patiently hammer the phraseology into shape, the way parents might try to resculpt their three-year-old son's first attempt at icing fairy cakes.

There's worse to come. My multiple choice questions aren't hard enough this year. We have to make them harder. Somebody suggests negative marking: awarding a mark for a correct answer, giving a zero for no response and subtracting one mark for an incorrect answer. This sounds like an easy fix. Better still, there would be no requirement for rewriting questions. But there's a problem. Medical schools used to love that format because, superficially, it looks like an examination method that rewards certainty and penalises guesswork. When examiners looked at the outcomes properly, though, they discovered that it did exactly the opposite. Bizarrely, when you look at the stats, the optimum technique for this type of paper is to answer absolutely everything, even stuff that you have only the vaguest of inklings about. And so multi-choice becomes multi-guess and the reckless chancers emerge triumphant.
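A back-of-the-envelope expected-value sketch (rough arithmetic of my own, not the examiners' analysis) shows why. Under the +1 / 0 / -1 scheme, a candidate who answers a question with probability p of being right expects

E(answer) = p x (+1) + (1 - p) x (-1) = 2p - 1, against E(blank) = 0.

A blind guess among four options (p = 1/4) loses half a mark on average, but any partial knowledge that lifts p past 1/2 makes answering the better bet; and under the milder penalty of -1/(k - 1) for a k-option question that is sometimes used instead, even a blind guess breaks even, so the faintest inkling tips the balance towards answering everything.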

So we abandon that idea and instead add an extra option to each of the stems, so that students choose from five answers rather than four in each question. This scheme is marginally better, although a trained chicken could still get 20 per cent. We'll have to do some rewriting.

We wring our hands about grade inflation, wondering why, year on year, students seem to be doing better and better. But really, if the same teachers are chucking the same bodies of knowledge at successive classes year after year, and then examining them in the same format, then anyone who isn't seeing inflating grades should probably ask themselves what they're doing wrong. Good students learn how to take exams. It's a technique and by the time they reach university most of them have got it down pat. If, year on year, your course or the way you teach it is getting better, then the students should be getting progressively better grades.

And yes, you can normalise the scoring so that only a tiny proportion of your class ever get a first-class degree, but that scheme also has significant flaws. Not least because, by pitting student against student, you create a sort of academic Hunger Games when what we really should be preparing them for is teamwork within complex systems.

The problem, I think, is that the exam isn't fit for purpose. Not my exam but all exams. In a world where everything is getting leaner and meaner, when employers are deluged with application forms and don't have the luxury of sorting through the personal statements one at a time, the final grade becomes everything.

Charles Goodhart, emeritus professor of banking and finance at the London School of Economics, once suggested that "when the measure becomes a target it ceases to be a useful measure". And in that lies the problem with our system of final examination. For the students it is no longer a measure of their abilities. It has become the target, and one that I fear gets in the way of any attempt at delivering a proper education.

I'm not sure what the solution is. Maybe we should seriously consider the idea of a more multidimensional assessment of student performance. One that gives them the option of counting extracurricular activities alongside exam results, something that tells us more about the person than their aptitude for scholarly testing. Maybe we should abolish final examinations altogether. And yes, all of that would be a total nightmare to administer, but we've got to do something. Left untended, as the students get hungrier (for premium exam grades, if not literally) and universities become more discriminating about their intake, we can only expect further grade inflation and a measure that becomes progressively less useful. And fine, if everyone's fine with that. But I rather suspect that they're not.

<ÁñÁ«ÊÓƵ class="pane-title"> Reader's comments (1)
Kevin's description highlights the all-too-common situation in assessment within (particularly undergraduate) university courses in which there would appear to be no routine psychometric input. A valid assessment will start with input from a team of individuals with logistical expertise, content expertise and psychometric expertise. For example, assessment items are structured in a particular way to minimise construct-irrelevant variance and to match the predefined purpose of the test.

Also, there is no such thing as an item that is too easy or too hard; there is only the issue of constructive alignment with the curriculum. Defining the performance standard is a separate exercise (standard setting), as is ensuring the consistency of the standard between years (equating). The proper application of standard setting and equating also deals with 'grade inflation' from a statistical point of view (but not a political one: if the college down the road gives more firsts than you, why should you moderate your numbers?).

Statistically, there are optimum numbers of distractors (options) for MCQs (again, depending on the purpose of the test). Adding non-functioning distractors will add nothing to your test. 'Normalising' scores has a particular meaning in psychometrics, but in practice the term may be used to describe many different and magical procedures that turn unacceptable score distributions into acceptable ones.

Kevin may very well be correct about assessment needing an overhaul, but any and all methods will suffer in the areas mentioned above without the proper input from the appropriate experts. We wouldn't dream of trying to put together an exam without content experts, but it is still common to create, administer, score and make decisions without appropriate psychometric input.
<ÁñÁ«ÊÓƵ class="pane-title"> Sponsored
<ÁñÁ«ÊÓƵ class="pane-title"> Featured jobs