4 September 2011

AI makes the grade

Software can now mark essays, but will it make students write like robots?
IT IS an idea any overworked teacher would welcome - computers that automatically mark piles of exams and homework. Tens of thousands of students around the US are already being evaluated by such systems. But can we trust the artificial intelligence that powers them to make appropriate judgements? Two new real-world tests suggest that it can work surprisingly well.
In one experiment, conducted at the University of Central Florida in Orlando between January and May this year, Pam Thomas tracked the impact of an automated grading system on the performance of the 1000 or so students in her first-year biology class.
The students answered questions using SAGrader, an online service developed by Idea Works of Columbia, Missouri. SAGrader parsed their answers, which could be several paragraphs long, using artificial intelligence techniques designed to extract meaning from text. For example, given the phrase "the heart pumps blood", SAGrader would identify two entities in the text - "heart" and "blood" - and the relationship between them. These details were then analysed using a rubric compiled by Thomas.
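As a rough illustration of this kind of approach, the sketch below matches a student's answer against a rubric of entity-relationship triples. It is a toy example under invented assumptions - the pattern-matching rule and the rubric contents are made up for illustration - and is not SAGrader's actual method.

```python
# Toy illustration of rubric-based scoring of entity-relationship triples.
# NOT SAGrader's implementation; the extraction rule and rubric are invented.

import re

# Rubric: (entity, relation, entity) triples the answer should mention,
# e.g. encoding the fact "the heart pumps blood".
RUBRIC = {
    ("heart", "pumps", "blood"),
    ("blood", "carries", "oxygen"),
}

def extract_triples(text):
    """Naive extraction: look for 'the X <verb> Y' patterns in each sentence."""
    triples = set()
    for sentence in re.split(r"[.!?]", text.lower()):
        match = re.search(r"the (\w+) (\w+) (\w+)", sentence)
        if match:
            triples.add((match.group(1), match.group(2), match.group(3)))
    return triples

def score(answer, rubric=RUBRIC):
    """Return the fraction of rubric triples found, plus any missing ones."""
    found = extract_triples(answer) & rubric
    return len(found) / len(rubric), rubric - found

if __name__ == "__main__":
    answer = "The heart pumps blood around the body. The blood carries oxygen."
    grade, missing = score(answer)
    print(f"score: {grade:.0%}, missing: {missing}")
```

A real system would need far more robust language analysis, but the same basic idea - compare extracted facts against an instructor's rubric - is what allows feedback to be returned within a minute.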
She says her students like SAGrader because it provides feedback on their work in less than a minute. They can then resubmit their work, having taken the feedback into account, and may gain a higher score.
The software also had a fringe benefit. In end-of-term multiple-choice tests, the average mark scored by Thomas's students was 12.5 per cent higher than in previous years. "We were amazed," she says. She attributes the improvement to SAGrader encouraging students to work through problems multiple times. "It taught them how to think through the test questions."
Andrew Klobucar made similar observations after watching around 30 of his students at the New Jersey Institute of Technology in Newark use a tool called e-rater. Developed by ETS in Princeton, New Jersey, the system analyses essays and provides feedback on everything from grammar and spelling to the use of paragraphs.
Like Thomas, Klobucar says that the software's feedback encourages students to produce several drafts of each assignment. Students using e-rater wrote almost three times as many words as their peers who did not use it, and also showed a "substantial improvement" in performance. But Klobucar says that after just two years of trials, it is too early to attribute the boost to e-rater.
Students may be more receptive to critical feedback from software as it can seem less punitive, Klobucar adds. He also says students enjoy the "gamification" side of e-rater, treating the system as a game in which doing well means redrafting work to get a higher score.
Both e-rater and SAGrader, as well as a widely used tool developed by Pearson Education, based in Upper Saddle River, New Jersey, have performed well in internal company tests. In some cases, the tools even produced more consistent scoring than human graders, perhaps because people can tire and make errors. The systems are also gaining traction - Pearson's software is already used to evaluate 70,000 students in state-wide tests in Maryland.
But some teachers argue that automated marking tools teach students poor writing style. Les Perelman, who teaches writing at the Massachusetts Institute of Technology, says the tools often ignore the structure of an argument within an essay, instead rewarding students who pack information into lengthy pieces.
The auto-graders also use overly simple rules, Perelman says. A student might, for example, be rewarded for using "egregious" instead of "bad", or for discussing a topic using three examples - a common rhetorical device. New students at MIT often have to unlearn such practices. "The first thing we have to do is deprogram them," Perelman says.
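To see why such rules can be gamed, consider a deliberately naive scoring function of the kind Perelman describes: it rewards sheer length, rare vocabulary and a three-example structure, without examining the argument at all. This is a hypothetical illustration, not code from e-rater, SAGrader or Pearson's tool.

```python
# A deliberately naive essay scorer of the kind Perelman criticises:
# it rewards length, "fancy" vocabulary and a three-example structure,
# and never checks whether the argument makes sense.
# Hypothetical illustration only.

FANCY_WORDS = {"egregious", "plethora", "myriad", "paradigm"}

def naive_score(essay: str) -> float:
    words = essay.lower().split()
    length_points = min(len(words) / 500, 1.0) * 50                       # up to 50 points for length
    fancy_points = sum(w.strip(".,") in FANCY_WORDS for w in words) * 10  # 10 points per rare word
    example_points = 20 if essay.lower().count("for example") >= 3 else 0 # rhetorical device bonus
    return min(length_points + fancy_points + example_points, 100.0)

# A padded essay stuffed with rare words and three "for example"s scores highly,
# whatever its argument actually says; a short, well-reasoned answer does not.
padded = ("This egregious situation presents a plethora of issues. For example, one. "
          "For example, two. For example, three. " + "Padding sentence here. " * 200)
print(naive_score(padded))                          # close to the maximum
print(naive_score("A short, well-argued answer."))  # scores low despite quality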
