Electronic School: The School Technology Authority School Board Corner




Cover Story: June 2000

Electronic Exams: Throw away the No. 2 pencils -- here comes computerized testing

By Kevin Bushweller

 

A teenage boy in Anaheim, Calif., goes online to complete a part of his chemistry final exam. Meanwhile, an Anchorage, Alaska, third-grader clicks her computer mouse to select an answer on a district reading test. Across the country, a Chesterfield County, Va., fifth-grader takes a practice test on the web for a state-mandated math exam.

These three students have reached what the Educational Testing Service (ETS) calls "the next frontier" in testing: computer-based exams. This brave new world of testing is technology's response as the nation's schools place greater reliance -- some would say overreliance -- on testing. Rightly or wrongly, pioneers of computerized testing are forcing educators to ask: Should paper-and-pencil tests be scrapped in favor of computerized exams?

Advocates argue that computerized tests are more appropriate for today's technology-savvy youngsters. What's more, they say, computers can provide instant analysis of the strengths and weaknesses of individual students, whole classes, and entire schools and districts, while teachers and administrators must wait weeks and sometimes months to see how students perform on paper tests. Once computers are in place, too, advocates say, online exams can be edited and updated at minimum expense, and printing costs virtually disappear.

But critics of computer-based testing say educators are accepting this trend all too easily. The children, they argue, will suffer the most. "The subtext of computerized testing ... is standardized children to fit into a point-and-click workplace where their job is mainly to feed into the computer whatever it asks of them," says Clifford Stoll, a former University of California-Berkeley scientist and Internet pioneer turned cyber-cynic. "They feed into each other -- standardized tests, standardized computer programs, standardized children," adds the author of High Tech Heretic: Why Computers Don't Belong in the Classroom and Other Reflections by a Computer Contrarian. "If there's one thing human beings are not, it's standardized."

The ease with which computers can be used for testing also bothers Alfie Kohn, an educational philosopher and harsh critic of the nation's increasing emphasis on standardized testing. Kohn has no problem with using computers for instruction but says that if technology makes testing more efficient, he worries the nation could whip itself into a testing frenzy. "U.S. students are tested more than ever in history," says Kohn. "We need to figure out ways to cut back on the insanity, not make it more efficient to test."

Still, the computerized testing revolution marches on. Already, prospective graduate students in liberal arts and business take the Graduate Record Exam (GRE) and Graduate Management Admissions Test (GMAT) via computer. (One especially futuristic twist of the GMAT is its e-rater, a software program that, in partnership with human evaluators, grades essay answers.) Prospective teachers, too, can now choose to take computerized or paper versions of the Praxis I test for beginning teachers.

High-stakes exams for high school students are heading in the same direction. The Scholastic Assessment Test (SAT) is still a paper-and-pencil test, but SAT officials envision a day when it will follow in the footsteps of the GRE and GMAT. In the not-so-distant future, the ACT college entrance exam will also be computerized, according to ACT officials.

Adaptive and analytical

A big draw of computer-based exams is that they are more adaptive than paper-and-pencil tests, says Tom Ewing, a spokesman for ETS. As a test-taker answers questions correctly, the computer makes the questions a little bit harder. If the person starts answering incorrectly, the level of difficulty drops. On computerized exams, then, unlike paper exams, test-takers do not necessarily answer the same number of questions -- or even the same questions -- and the order of the questions is often different. Says Ewing: "It sort of personalizes everybody's exam."
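The up-and-down behavior Ewing describes can be illustrated with a minimal sketch. It is only an illustration, not ETS's method: the 1-to-10 difficulty scale, the starting level, and the function names are all assumptions made for the example, and operational adaptive exams generally rely on more elaborate statistical models than this simple staircase.

```python
def next_difficulty(level, answered_correctly, lowest=1, highest=10):
    """Step difficulty up after a correct answer, down after a miss.

    The 1-10 scale is a hypothetical choice for this sketch.
    """
    step = 1 if answered_correctly else -1
    # Clamp so the level never leaves the assumed scale.
    return max(lowest, min(highest, level + step))


def run_adaptive_session(responses, start_level=5):
    """Return the difficulty level presented for each question.

    `responses` is a list of booleans: True for a correct answer.
    """
    level = start_level
    presented = []
    for correct in responses:
        presented.append(level)
        level = next_difficulty(level, correct)
    return presented


# Two correct answers, a miss, a correct answer, then two misses.
levels = run_adaptive_session([True, True, False, True, False, False])
print(levels)  # [5, 6, 7, 6, 7, 6]
```

Two test-takers who give different answers see different sequences of difficulty levels, which is the sense in which the exam is "personalized."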

Recently, the Northwest Evaluation Association (NWEA) in Portland, Ore., a nonprofit educational research organization, designed a computerized adaptive testing program that school districts can tailor to their local or state standards. This kind of testing "gets you to the knowledge a student has much quicker," says Jeff Bristow, director of testing and evaluation for California's Capistrano Unified School District, which plans to use NWEA's computerized adaptive tests. "When you have a program that can change what questions are given to the student based on the student's responses, you can do very effective teaching."

Several studies, however, have found students scored lower on computerized tests than on paper exams when there were tight restrictions on time, according to Steve Garrison, a testing researcher for the Anchorage, Alaska, schools. Garrison says the studies offered several possible explanations for this finding, including trouble reading a monitor or using a computer mouse, and confusion about how to use the software.

Generally, though, Garrison says, most research -- including a study he conducted in his own school district -- has found the format of a multiple-choice/essay test (paper vs. computer) doesn't have much impact on achievement. Students do well or poorly because they know the material or they don't -- it's as simple as that, he says.

The real benefits of computerized tests, Garrison points out, are that they provide immediate and sophisticated analyses. Teachers can use that information to make adjustments to their teaching within days after the exams are administered. If a group of students, or an individual student, is having trouble pinpointing the main idea of a reading passage, for example, the computer picks up on that, and the teacher can focus more attention on honing that skill. Computerized test questions can also be edited and updated at any time.

In Anchorage, for the second year, third-graders are taking an online reading exam. The two-part test evaluates students' abilities to read independently. On the multiple-choice part of the exam, questions appear on the screen, students click on what they think is the correct answer, and the computer spits back instant performance reports. The second part of the exam consists of short essay questions. These questions are also online, but the youngsters compose their responses on paper, and the essays are graded by teachers.

"Taking tests is not a fun thing, but for the kids, taking [tests] by computer is more exciting than getting out a test booklet," says Alison Haigler, a third-grade teacher who is in her second year of teaching at Kasuun Elementary School in Anchorage. "They all enjoy working on computers -- I think that gets them a little more motivated. If I had had the option to take tests by computer, I would have done it."

The equity question

But schools should be aware that computerized testing raises "tremendous equity issues," says Jane Healy, an educational psychologist and author of Failure to Connect: How Computers Affect Our Children's Minds and What We Can Do About It. "Parents and teachers should be very concerned that the conditions under which their children are tested are fair. ... Using computers may put some children at a disadvantage and others at an advantage."

Indeed, a Boston College study highlights Healy's concern -- at least for essay exams. What's more, the study adds a cyber-age twist to the equity argument: Paper-and-pencil tests might put computer-savvy students at a disadvantage.

The college's Center for the Study of Testing, Evaluation, and Educational Policy compared groups of eighth-graders in Worcester, Mass., taking essay-style exams. Study results showed that students who were accustomed to writing on a computer did substantially better on test questions they answered electronically than on those they answered by hand. However, the study also found the opposite effect: Kids who were not used to writing on a computer and who took computerized tests scored lower than their low-tech counterparts who took paper-and-pencil tests.

That's why educators who are using computers to test children should offer paper-and-pencil tests as options, Healy says. In Anchorage, about 25 percent of the district's third-graders opted to take paper-and-pencil versions of the reading assessment.

Beyond access, Healy says computerized tests pose other problems. She says some young children have vision difficulties that can make it hard to read a computer screen. And, although she thinks it's appropriate to use computers to test high school students, she worries the practice could be developmentally inappropriate for younger children: "An 8-year-old is a totally different thing [than a high school student]."

High-stakes practice

Still, the push for cyber testing is strong, especially for high-stakes tests. Virginia's rigorous Standards of Learning (SOL) tests, for example, are likely to be administered online by 2003 or earlier, says Kirk T. Schroder, president of the Virginia Board of Education.

The quicker turnaround time of computerized assessments, Schroder says, will offer students who fail the SOLs more opportunities to shore up weak spots before retaking the tests. Virginia's high school graduating class of 2004 will be required to pass the SOLs to graduate.

At Bellwood Elementary School in Chesterfield County, Va., students are gearing up by taking computerized SOL practice tests. This year, the exams -- designed by eduTest.com of Richmond, Va. -- are being used primarily to improve youngsters' math and science skills. Once the practice tests are taken, the computer calculates mean scores for each class, shows individual student performances, and pinpoints areas in need of improvement, such as identifying the main idea of a reading passage or adding fractions.

"It's very helpful," says Bellwood Principal Ernest Hicks, whose struggling school -- where nearly 60 percent of the students are classified as living in poverty -- has been administering computerized practice tests for two years. In 1998, 38 percent of his third-graders passed the SOL reading test. Last year, the percentage jumped to 53 percent. During the same period, the percentage passing the math exam increased from 52 to 57 percent; in science, the percentage increased from 44 to 64; and in history, it increased from 33 to 52 percent.

Fifth-graders have also improved, with the percentage passing the reading and writing exam increasing from 61 to 70 percent. The percentage passing also increased from 33 to 35 percent in math, from 50 to 66 percent in science, and from 33 to 52 percent in history.

Hicks says the computerized tests are one of several tactics his school uses to improve its SOL scores. The teachers, he says, feel "very strongly" that the cyber practice exams are making a big difference.

"It makes life a whole lot easier," says Linda Mustain, a third-grade teacher at Bellwood. "I have a better idea of how the group and individual kids are doing. It allows me to zero in on weaknesses."

The Anchorage public schools are also using web-based practice tests to help students prepare for a state exam that high school seniors must pass to graduate. The practice tests are a combination of multiple-choice and essay questions. The computer gives students feedback on how they perform on the multiple-choice questions, pinpointing areas of weakness and suggesting tactics to bolster scores.

Model answers for the essay questions are also provided -- however, unlike the GMAT's e-rater, the computer does not evaluate student writing. Since practice tests were put on the web in December, about 13,000 students and 1,500 teachers across the state have registered to use them, Garrison says.

In the classroom

More and more, teachers are turning to computer-based testing for regular classroom exams as well. Last year, students in Marcia Sprang's Advanced Placement (AP) chemistry class at Esperanza High School in Anaheim, Calif., logged on to the Internet to complete a small portion of their final exam. IMMEX computer testing software developed by UCLA researchers evaluated her students' abilities to predict the products of chemical reactions and identify unknown substances -- necessary skills for success on the national AP chemistry exam.

Unlike many computerized exams, the online portion of Sprang's exam was not multiple choice. Rather, students used the computer to work toward an answer, showing their work along the way by conducting virtual experiments and accessing the program's reference library. Some information was deliberately not made available -- this was the specific content knowledge that students needed for the national AP exam.

Using an Internet browser, Sprang tracks how students go about solving problems on the IMMEX web site. What experiments did they conduct? Did they search for the right information in the program's cyber library? And, if they failed to answer a question correctly, did they follow most of the right steps? If so, they could receive partial credit. With multiple-choice questions, there's no such thing as partial credit -- you're 100 percent right or wrong. Says Sprang: "I can have a window into their thinking."

It's also tougher to cheat. Since the computer doles out "clone" questions -- in other words, students solve similar problems but with different answers -- peeking over someone's shoulder to steal an answer doesn't do much good, Sprang says. The teacher's IMMEX homework assignments use clone questions, which Sprang says prevents the all-too-common practice of students copying each other's homework answers.
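One way to picture "clone" questions is a single problem template filled in with different numbers for each student. The sketch below is hypothetical and does not describe how IMMEX actually generates its problems: the chemistry prompt, the `clone_question` function, and the idea of seeding a random generator with the student's ID are all assumptions made for illustration.

```python
import random

MOLAR_MASS_NACL = 58.44  # grams per mole of sodium chloride


def clone_question(student_id, assignment="hw1"):
    """Build one student's variant of a shared problem template.

    Seeding with the assignment and student ID (a hypothetical scheme)
    means a student always sees the same variant, while classmates
    usually see different numbers.
    """
    rng = random.Random(f"{assignment}:{student_id}")
    grams = rng.randrange(10, 200)
    answer = round(grams / MOLAR_MASS_NACL, 3)
    prompt = f"How many moles are in {grams} g of NaCl?"
    return prompt, answer


for student in ("student-a", "student-b"):
    prompt, answer = clone_question(student)
    print(student, "|", prompt, "|", answer)
```

Because each variant has its own answer, a number copied from a neighbor's screen is unlikely to be correct, even though every student practices the same skill.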

Charlene Krider, a senior in Sprang's class, likes doing homework online and looks forward to the computerized portion of the chemistry final exam. "I can get sloppy on paper," she says. "The Internet helps me be more organized."

Other students aren't so thrilled. To begin with, they must learn how to use Internet browsers to navigate skillfully through problems, and they get frustrated when the IMMEX web site freezes while they are in the middle of solving a problem. "Not all students are equally comfortable and confident about their computer skills," Sprang says. To help, she takes students through practice runs of the software and encourages them to spend time practicing on school computers.

Still, using the web to do homework and take exams is "a bad deal for me," says Chris Reese, a senior in Sprang's class. "I'm just really not comfortable with that -- I prefer to have the book there in front of me or the teacher there. That's just the way I learn better. Online, you just have to go to too many different places."

Sprang is aware of the potential drawbacks of relying too heavily on computers. Indeed, she thinks it's a bad idea for students to learn and be evaluated entirely online. Conducting real lab experiments (not virtual ones) and drawing scientific diagrams by hand are important experiences for her students. That's why she administers only a portion of her final exam online.

The students' choice

Increasingly, though, teachers are finding students prefer the online format. Nancy Moreau, a physics teacher at Roy C. Ketcham High School in Wappingers Falls, N.Y., offered some of her classes the option of taking midterm exams online.

Nearly nine of every 10 students chose the computer option. They used an interactive testing and homework assignment program called WebAssign, which was designed by North Carolina State University professors and is used in hundreds of high schools and colleges across the country.

"I love WebAssign and so do my students," says Moreau. "Using the computer, I have been able to provide immediate feedback, generate individual materials, ... and use the problems in the textbook more effectively."

Kevin Bushweller is a senior editor of Electronic School.


Sidebar: Intelligent Machines: Can they really grade student writing?

Computers can play chess, but can they grade essays? Developments in artificial intelligence both delight and dismay teachers, who are intrigued by the potential time savings of essay-grading software but skeptical that a machine can understand the nuances of the written word as well as a human can.

"I have mixed emotions about [essay-grading] software," says Terri Washer, an English teacher at Crossroads Academy, a public alternative high school in Grovetown, Ga. "My initial reaction [when I heard about it] was similar to the old Alka-Seltzer jingle, 'Plop, plop, fizz, fizz, oh what a relief it is.' It would be great to have the drudgery of grading essays off my shoulders. However, what about the personal contact with students?"

And what about results? Does the new grading software dismiss Shakespeare as gibberish, or recognize well-written, reasoned responses?

In the case of the e-rater -- essay-grading software used for the Graduate Management Admissions Test (GMAT) -- the results are mixed. It would likely do a poor job judging Shakespeare because it is incapable of appreciating creativity or literary innovation. But, so far, the e-rater has done a good job evaluating the analytical essays of future business school graduate students.

The e-rater, in partnership with a human grader, evaluates each GMAT essay. In the past, two human graders read each essay, and if there was a discrepancy in scores, a third human scorer served as the arbitrator. Today, the human arbitrator is used if the e-rater and the human scorer differ by more than a point on a six-point scale. But the arbitrator is rarely needed, says Rich Swartz, senior development leader at the Educational Testing Service (ETS), which administers the GMAT. The e-rater and human scorer are within a point of each other 48 percent of the time, and they are an exact match 50 percent of the time, according to Swartz.
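The arbitration rule is simple enough to write down. The sketch below covers only the rule stated above (a third reader steps in when the two scores differ by more than one point); the function name is hypothetical, and how the final score is computed from the two readings is not described here, so the sketch does not attempt it.

```python
def needs_arbitrator(machine_score, human_score):
    """Return True when the two scores differ by more than one point.

    Scores are assumed to be whole numbers on the six-point scale.
    """
    return abs(machine_score - human_score) > 1


# A few (machine, human) score pairs, invented for illustration.
pairs = [(4, 4), (4, 5), (3, 5), (6, 2)]
for machine, human in pairs:
    print(machine, human, needs_arbitrator(machine, human))
```

Reading Swartz's figures as 50 percent exact matches plus 48 percent that differ by exactly one point (an interpretation, not something the figures spell out), the two scorers would agree closely enough about 98 percent of the time, leaving roughly 2 percent of essays for the arbitrator.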

Relying on a branch of artificial intelligence used in speech-recognition technologies, the e-rater is "taught" how to be a good writing evaluator. Batches of model essays -- excellent, mediocre, and bad -- are fed into the computer so it can distinguish a good essay from a poor one.

Still, Swartz says, "We do not believe the e-rater should be used as a single scorer for a high-stakes exam." That's because, as Swartz concedes, savvy test-takers could probably figure out clever ways to write essays that fool the machine into thinking they understand the topic, when really they don't. But the presence of a human reader makes this less likely.

Also, Swartz says ETS researchers have experimented with fooling the machine by using key words and phrases, but "the result is the e-rater usually scores those efforts kind of low."

Cindy Matthews, a sixth-grade teacher at Platt Middle School in Boulder, Colo., appreciates the strengths and weaknesses of computerized essay evaluators. Her classroom is pilot testing a program called Summary Street, an essay evaluator developed by University of Colorado psychologist Thomas Landauer, who stirred quite a bit of controversy a few years ago when he and some colleagues released the Intelligent Essay Assessor. They claimed it could grade certain types of student essays as well as a real professor.

Summary Street evaluates how well students summarize long passages of text on a variety of topics. It assesses the accuracy, comprehensiveness, and length of summaries, and it critiques problems such as redundancy, poor spelling, and extraneous sentences.

Using the program has improved her students' summarizing skills, Matthews says. They are better at pinpointing the main ideas of reading passages and using appropriate examples to support generalizations. And they are writing more -- producing two or three rough drafts before turning in a final essay, where they once wrote only one rough draft because that's all Matthews had time to read.

But the teacher cautions: Summary Street "only does part of the job." It doesn't evaluate an essay's structure very well or appreciate clever or imaginative ways of organizing a piece of writing, she says. Beyond that, she says it is incapable of judging the style of a student's writing. For instance, Matthews says a skilled sixth-grade writer might use metaphors -- describing terrace farms as "floating gardens," for example -- but the literal-minded computer would not appreciate this clever metaphorical leap and might even penalize the student for it.

Landauer concedes that essay-grading software has limitations. But, he points out, "It doesn't get tired and give bad grades because it's in a grumpy mood."

Still, Matthews warns educators not to rely too much on computerized evaluations. "I don't see the day when a machine replaces a teacher," she says. "What I see is the machine enhancing writing with the teacher." -- K.B.

 


FOR MORE INFORMATION

Here is a partial list of nonprofit organizations and for-profit companies that have produced testing software for use by students, parents, and educators:

Reproduced with permission from the June 2000 issue of Electronic School. Copyright © 2000, National School Boards Association. Electronic School is an editorially independent publication of the National School Boards Association. Opinions expressed by this magazine or any of its authors do not necessarily reflect positions of the National School Boards Association. This article may be printed out and photocopied for individual or educational use, provided this copyright notice appears on each copy. This article may not be otherwise transmitted or reproduced in print or electronic form without the consent of the Publisher. For more information, call (703) 838-6739.
