The practice of averaging grades needs to be stopped



Your grades were averaged throughout your own schooling. Your teachers calculated grades the same way when you were a student, and it seemed to work. It seems to work for your students, too.

But is it true? Just as we teach our students, we should not fall for the argumentum ad populum fallacy: something is not true or good simply because many people believe it to be so. With that in mind, let's examine the case against averaging grades.

Taking refuge in mathematics

Just because a task can be solved with mathematics doesn't make it an effective teaching practice. The 100-point scale tempts teachers to average because averaging seems objective and trustworthy; it's math, after all. But statistics can be used and manipulated to produce many different outcomes.

A single percentage point can determine whether a student is accepted to or rejected from graduate school. A student who scores 90% earns an A; a student who scores 89% earns a B. Yet that one-point difference does not necessarily reflect a meaningful difference in mastery of the subject. Both students may understand the material equally well, but the 90% student is rewarded with scholarships and advanced courses while the 89% student is not. That is a flawed system.

Early in my career, I had a student who earned 93.4% in my course. The school set the A range at 94-100, leaving him just 0.6% short of an A. He asked me to adjust the score to 94% so that he could have straight As across his classes. I pointed out that the score was 93.4%, not 93.5%; had it been 93.5%, I might have justified rounding up, but from 93.4% I could not.

Instead of questioning the student thoroughly about what he had learned over the grading period and using that evidence to make a fair call, I hid behind a tiny numerical difference. I knew I should have done better, but the math felt safer than making a hard judgment. Looking back, I regret that I didn't have the courage to engage more deeply with the student's actual understanding.

We cannot trust averaging just because its mathematical veneer makes it look reliable. The stakes are too high.

Distorting the grade record

Imagine a teacher who allows Martin to take the final exam twice. Martin fails the first attempt with an F, then studies hard and earns an A on the second. If tests are a dependable measure of a student's understanding, that A shows Martin has now mastered the material. Yet the teacher averages the two attempts and records a C, a grade that reflects neither attempt and misrepresents what Martin actually knows against the standards.
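The arithmetic behind Martin's C can be sketched in a few lines. The point values below are assumptions for illustration (an F attempt of 52 and an A retake of 95), as is the 10-point letter-grade scale:

```python
def letter(score):
    """Map a 100-point score to a letter, assuming common 10-point bands."""
    if score >= 90: return "A"
    if score >= 80: return "B"
    if score >= 70: return "C"
    if score >= 60: return "D"
    return "F"

first_attempt = 52   # F: Martin hadn't mastered the material yet
retake = 95          # A: after studying, he demonstrably has

averaged = (first_attempt + retake) / 2   # 73.5
print(letter(retake))     # what Martin currently knows: A
print(letter(averaged))   # what the gradebook reports: C
```

The averaged score, 73.5, corresponds to no moment in Martin's learning; it is an artifact of the calculation, not a description of his mastery.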

When we blend the endpoints of the scale, an A with an F, and treat the result as meaningful, the report is wildly inaccurate. The same distortion, in smaller doses, occurs whenever we average grades that sit apart on the scale: a B with a D, a B with an F, an A with a C. Each blend compromises the integrity of the grade report.

Consider a larger data set: Cheryl earns 97, 94, 26, 35, and 83 on her tests, which translate to A, A, F, F, and B under the school's grading scale. When these scores are averaged, each counts equally, and the result is 67, a D. That D describes none of Cheryl's actual performances against the individual standards.

Fortunately, many schools are moving toward disaggregation, in which students earn separate grades for specific standards. This approach greatly reduces the distortions created by omnibus grades that blend everything into one compact symbol, and it helps resolve teachers' concerns about students gaming the system when zeros are converted to 50s on the 100-point scale. Those students aim to do the bare minimum: skip some assessments, excel on others, and pass on the arithmetic alone. In classrooms where teachers don't average grades, that strategy doesn't work.
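The loophole that worries teachers is easy to demonstrate. The score pattern and the passing bar of 60 below are assumptions chosen to illustrate the mechanics:

```python
# A student skips two assessments entirely and aces the other three.
raw = [0, 0, 0, 100, 100]

# Under a "no score below 50" policy, the zeros are raised to 50.
floored = [max(s, 50) for s in raw]

raw_avg = sum(raw) / len(raw)           # 40.0 -- failing
floored_avg = sum(floored) / len(floored)  # 70.0 -- passing comfortably

print(raw_avg, floored_avg)
```

With averaging plus the 50-point floor, a student can ignore a majority of the assessments and still pass; with disaggregated, standards-based grades, each skipped standard remains visibly unmet.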

Students should focus on mastering the subject matter, not on gaming the arithmetic.

Responding to the accusation

Norm-referenced labels such as "average," "above average," and "below average" no longer suffice in modern classrooms, which aim to be outcomes-based, with evidence-driven, criterion-referenced assessment and grading. Rather than comparing students to one another, educators want to measure each student's ability to accomplish specific tasks and meet the standards for their grade level and subject. A teacher's opinion that Toby is performing "above average" matters less than knowing whether Toby can write an expository essay or interpret a graph. The emphasis is on each student's progress and achievement against the standards, not on how they compare to their peers.

We need specific criteria when making instructional decisions, providing feedback, and documenting progress. An average reported without context is confusing and can misrepresent student performance against the standards. Grade reports must be accurate, and averaging, misapplied, undermines that accuracy.

Averaging was developed in statistics to reduce the impact of individual sample errors within a consistent experimental design. Bringing that concept into the classroom helps clarify what averaging is actually for.

Picture a student taking a test on particular content, delivered in a particular format. He may have eaten breakfast, or skipped it. He may have slept peacefully or poorly the night before. His parents may or may not be going through a divorce. He may or may not be in a relationship. He may or may not have studied. He may or may not have a high-stakes drama, music, or sports event later that afternoon. His performance on this test, at this moment, reflects all of these factors.

Three weeks later, we administer another test on the new material covered in our course. Have the students changed in that time? Hormonally, if nothing else. The second test also covers different content and may use a different format. Before the first exam, our student had eaten well but hadn't studied; he slept well, though his parents argued constantly.

He had performed well in a drama, music, or sports event, and he wasn't in a relationship at the time. Before the second exam, he had a girlfriend and had studied, but he slept badly and skipped breakfast. His parents, on the other hand, had stopped arguing, which made home calmer.

The conditions of the second test have changed substantially; the consistent experimental framework that averaging assumes no longer exists. It is therefore no longer legitimate to blend the results of the first test with those of the second in order to reduce the impact of isolated anomalies, which was the original statistical rationale for averaging in the first place.
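The statistical point can be made concrete with a small simulation. All numbers here are invented for illustration: a "true" score of 80 with random noise stands in for repeated measurements under identical conditions, while 80 and 55 stand in for two tests of different content under different conditions:

```python
import random
random.seed(0)  # fixed seed so the sketch is reproducible

# Averaging works when we repeatedly measure the SAME quantity under
# the same conditions: the noise tends to cancel out.
true_score = 80
repeated = [true_score + random.gauss(0, 5) for _ in range(50)]
print(sum(repeated) / len(repeated))   # lands close to 80

# But two classroom tests measure different things under different
# conditions, like averaging one reading of 80 with one reading of 55:
print((80 + 55) / 2)                   # 67.5, which describes neither test
```

In the first case the average converges on something real; in the second it manufactures a number that no measurement ever produced.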

The digital grading system

Electronic gradebooks average grades because policy says so, not because anyone determined it was the most appropriate method educationally; the district then buys technology that enforces the policy. It would be better to decide how we want to grade first, and then find technology that supports it, rather than compromising sound grading practices because of a software limitation.

What do we do when administrators or a school board direct us to do something we know is educationally wrong? It's a difficult position. My advice is to do what is ethical within our own classrooms, and then describe what we do in school or district language so that we keep our jobs.

In our own classrooms, we can run an experiment: compare a group of students' grades calculated with and without averaging, and see which correlates better with their standardized test results. Working through the process ourselves deepens our understanding far more than being told the answer by someone else.

There are many ways to engage with the topic of grading and averaging: read the articles, join the online discussions, talk with colleagues, and even volunteer for the committee responsible for revising the gradebook format.

We are dealing with real people, not just numbers and figures. Our students have genuine hopes, fears, and futures ahead of them. They deserve teachers thoughtful enough to break with traditional techniques when those techniques distort the truth, and honest enough to recognize the ethical problem with knowingly reporting inaccurate grades. Let us rise to that responsibility and free the next generation from the distortions of averaging.


Author: Ryan Wormeli

