Open exams and brackets



Recently, I was told by a senior academic that I should ensure that only about 25% to 30% of the students in my class are awarded a grade of A, that 40% to 50% are awarded a B, and that the remainder receive a C or below. This started me wondering: what is the right proportion per grade? And, once we decide that, how does a teacher achieve it?

Harvard: the most frequently awarded mark is an A

Back in 2013, Matthew Q. Clarida and Nicholas P. Fandos wrote an article in The Crimson that quoted Harvard’s Dean of Undergraduate Education, Jay M. Harris, as saying that the most frequently awarded mark is an A.

Now, Harvard is one of the premier universities worldwide, so one would expect its students all to be top notch.

Criterion-referenced Assessment vs Norm-referenced Assessment

After digging into this idea a little, I found that there is a distinction between teachers who grade to a curve and those who expect their students to master the material.

Mastery of the material means that the teacher has laid out for the student all the things that the student has to know or be capable of doing before they can pass the class. A student who can do all of them can achieve an A. One way to do this is to use what is called criterion-referenced assessment.

Grading to a curve means that, regardless of the assessment content, the students are ranked from strongest to weakest with the top fixed percentage achieving an A. This approach is called norm-referenced assessment.
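To make the idea concrete, here is a minimal sketch of grading to a curve. It assumes the upper end of the proportions the senior academic suggested (30% A, 50% B, the rest C or below); the function name, the student labels, and the scores are all made up for illustration.

```python
def grade_on_curve(scores, a_percent=30, b_percent=50):
    """Assign grades by rank alone; the absolute score never matters.

    a_percent and b_percent are assumptions taken from the suggested
    distribution, not a standard.
    """
    # Rank students from strongest to weakest. Ties keep their input order.
    ranked = sorted(scores, key=scores.get, reverse=True)
    n = len(ranked)
    a_cut = n * a_percent // 100            # how many students get an A
    b_cut = a_cut + n * b_percent // 100    # how many get an A or a B

    grades = {}
    for position, student in enumerate(ranked):
        if position < a_cut:
            grades[student] = "A"
        elif position < b_cut:
            grades[student] = "B"
        else:
            grades[student] = "C or below"
    return grades


# Hypothetical class of ten in which everybody scored 90 or more.
scores = {f"s{i}": 100 - i for i in range(1, 11)}
grades = grade_on_curve(scores)
print(grades)
```

Even though every student in this made-up class scored at least 90, only three of the ten receive an A, because the grade depends on rank rather than on what each student can do.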

Criterion-referenced Assessment

Criterion-referenced assessment requires that the teacher enumerate the knowledge and skill areas that the students require, as well as the level needed in each area to achieve the different grades. The easiest way to do this is to use rubrics: matrices with each row indicating the required knowledge or skill and each column indicating the level (A through F).

Such tests are often used in the civil arena. Driver’s license tests are one example.
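A rubric of this kind can be sketched as a small matrix. The rows, the descriptors, and the rule for combining rows into one grade below are all hypothetical; in particular, taking the weakest row as the overall grade is just one assumption that matches the idea that an A requires mastery of every item.

```python
LEVELS = ["A", "B", "C", "D", "F"]  # columns, ordered best to worst

# Hypothetical rubric: each row is a skill, each column a level descriptor.
RUBRIC = {
    "explains the concept": {
        "A": "correct, clear, detailed, concise",
        "B": "correct but missing detail",
        "C": "mostly correct, some gaps",
        "D": "major gaps",
        "F": "incorrect or no answer",
    },
    "applies the concept": {
        "A": "solves unfamiliar problems",
        "B": "solves familiar problems",
        "C": "solves problems with hints",
        "D": "starts but cannot finish",
        "F": "cannot start",
    },
}


def overall_grade(assessed):
    """Combine per-skill levels into one grade (assumed rule: weakest row)."""
    missing = set(RUBRIC) - set(assessed)
    if missing:
        raise ValueError(f"no level recorded for: {sorted(missing)}")
    # The level with the highest index in LEVELS is the weakest one.
    return max(assessed.values(), key=LEVELS.index)


print(overall_grade({"explains the concept": "A", "applies the concept": "A"}))
print(overall_grade({"explains the concept": "A", "applies the concept": "C"}))
```

Unlike grading to a curve, nothing here looks at any other student: if the whole class reaches level A in every row, the whole class earns an A.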

Norm-referenced Assessment

Norm-referenced assessment aims to compare the students with each other, or with an overall average.

Such tests are used for entrance into colleges and to identify specific abilities or disabilities.

Taking extremes

In many areas, I like to think about what would happen if an idea were taken to its extreme.

Extreme criterion-referenced assessment

Many of the students I teach do not have English as their first language. Also, my English tends to fall on the British (Australian) side of the divide, and I’m teaching in US institutions. Sometimes, this has led to misunderstandings of exam questions during standard exam conditions.

The pandemic has led me to move to oral exams taken online via Zoom, Discord, or Teams. This allows me to correct these misunderstandings at exam time.

However, more recently I’ve moved to publishing my exam paper before the exam (making it “public” to the class). I generally include far more questions than can adequately be covered in a given oral exam, but I make it clear that only a subset of the stated questions will be asked of the student. I generally release the exam two weeks before the exam period.

I do this because publishing the actual exam questions gives students a clear idea of the task ahead of them. Also, because it’s published and I will answer questions about the paper before the first student takes the exam, I can correct any misunderstandings before the students take the exam. Questions and answers must be public to the class as a whole.

This approach allows me to state the things I’m looking for in an A student:

  • correct answers
  • clear answers
  • detailed answers
  • concise answers
  • no incorrect information

which starts the formation of rubrics for each skill or knowledge area.

Extreme norm-referenced assessment

I am not a fan of norm-referenced assessment, but I think some aspects of it might help students to learn.

I haven’t implemented this yet, but my idea is that we can take a leaf out of the March Madness NCAA Basketball Tournament and line the students up in brackets to see which student performs the best.

Rather than taking place over the whole semester, I see this as working on a weekly basis. Rewarding students for answering more questions, more accurately, and more quickly might be a way to gamify education and so engage students.

Of course, for those students who are averse to competition, this approach might not be the best. So I don’t see it working for all students.

It works

The last two semesters I’ve published my midterm and final exams at least a week before the students are required to take them. Believe it or not, students still fail.

One thing that open exams do is take away the excuse that a student performed poorly because of the surprise element.

Where it doesn’t work is in following the advice of that senior academic. For some topic areas, this approach seems to yield the suggested distribution. For others, it doesn’t. And I suspect that this has more to do with my mastery of the relevant topics than the students’.