TMCnet Feature
May 03, 2013

Controversy Erupts over Using Software, Rather Than Human Beings, to Grade Student Essays

By Ed Silverstein, TMCnet Contributor

Can computer software – instead of an instructor – effectively grade essays written by elementary and secondary school students – or maybe even those in college classes?

Already, software programs are being used in Louisiana, North Dakota, Utah and West Virginia to grade written responses in secondary schools, news reports said.

Some argue that such software can effectively assess academic work. One promising yet controversial example in higher education comes from EdX, a joint project of Harvard and MIT. It will be available for free online and pledges to give college students instant feedback while freeing up professors’ time. Stanford University startups Coursera and Udacity, which develop “massive open online courses,” also back automated assessment.

“The attraction is obvious: once programmed, machines might reduce the costs otherwise associated with the human labor of reading, interpreting, and evaluating the writing of our students,” according to a recent statement from the National Council of Teachers of English (NCTE).

Yet, critics contend a human being with advanced skills is needed for the task – whatever the grade level.

“When we consider what is lost because of machine scoring, the presumed savings turn into significant new costs – to students, to our educational institutions, and to society,” the statement adds.

“Logic, clarity, accuracy, ideas relevant to a specific topic, innovative style, effective appeals to audience, different forms of organization, types of persuasion, quality of evidence, humor or irony, and effective uses of repetition” are among the factors computers cannot judge, NCTE argues.

“Computers use different, cruder methods than human readers to judge students' writing,” the statement adds. “For example, some systems gauge the sophistication of vocabulary by measuring the average length of words and how often the words are used in a corpus of texts; or they gauge the development of ideas by counting the length and number of sentences per paragraph.”
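The metrics NCTE describes can be sketched in a few lines of code. The snippet below is purely illustrative – commercial scoring engines are proprietary, and the function name and tokenization rules here are assumptions – but it shows how crude such surface measures are: average word length stands in for vocabulary sophistication, and sentence counts per paragraph stand in for the development of ideas.

```python
import re

def surface_metrics(essay: str) -> dict:
    # Hypothetical sketch of the surface features NCTE mentions;
    # not the actual method of any real scoring product.
    words = re.findall(r"[A-Za-z']+", essay)
    avg_word_len = sum(len(w) for w in words) / len(words) if words else 0.0
    # Treat blank-line-separated blocks as paragraphs, and count
    # runs of sentence-ending punctuation as a crude sentence tally.
    paragraphs = [p for p in essay.split("\n\n") if p.strip()]
    sents_per_para = [len(re.findall(r"[.!?]+", p)) for p in paragraphs]
    return {
        "avg_word_length": avg_word_len,
        "sentences_per_paragraph": sents_per_para,
    }

print(surface_metrics("Short words here. More words!\n\nA second paragraph."))
```

Note that nothing in these measures touches meaning: an essay of long, empty words spread evenly across paragraphs scores as well as a genuinely well-argued one.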

Machine scoring also leads to other negative results in classrooms, critics claim.

"It sends a message to teachers to design the most stereotypical, dull assignments that can be graded by a machine," Chris M. Anson, director of the writing-and-speaking program at North Carolina State University, told The Chronicle of Higher Education.

Also, such software can be fooled. MIT’s Les Perelman has written “nonsense essays that have fooled software grading programs into giving high marks,” according to a recent report in The New York Times.

On the other hand, computer software can evaluate grammar, spelling and punctuation. Moreover, Mark D. Shermis, a professor at the University of Akron, argues that human graders make more errors than computers. “Machines were about as reliable as human graders in evaluating short essays written by junior-high and high-school students,” The Chronicle reports from his study on the subject.

Shermis also called the NCTE statement an example of "political posturing" and a "knee-jerk reaction," The Chronicle reported. "There is no evidence to suggest that scoring models for longer writing products are somehow 'worse' than for short, impromptu prompts," he claims.

Software can analyze the “accuracy of an argument based on probability” and “can identify a number of words and phrases that are highly likely to lead to the correct conclusion,” he told The Chronicle. 

Edited by Stefania Viscusi
» More TMCnet Feature Articles