Electronic School: The School Technology Authority School Board Corner

Cover Story: September 1999
Smart Data: Mining the school district data warehouse. By Lars Kongshem

A reliable gut feeling goes a long way: School leaders have always had an uncanny knack for sensing which students are headed for trouble, which curriculum programs work well, and how best to improve student achievement.

But in today's complex, modern school systems, many educators are looking for ways to augment their instincts with solid data -- and to back up their hunches with hard facts. As pressure mounts on public schools to increase their effectiveness and accountability, pioneering school districts are investigating the use of technology to support the decision-making process.

Enter data-based decision making, a strategy that aims to give all stakeholders access to easy-to-use computing tools that can help them analyze, make sense of, and act on information about every student and every facet of the district's operations.

There's certainly enough data to go around. School systems collect a mind-boggling variety and amount of information for internal use as well as for state and federal reporting purposes: test scores, grades, attendance and discipline reports, demographic and ethnicity information, medical data, and information on participation in special education, English as a Second Language, and free and reduced-price meal programs. Combine all that with data from scheduling, personnel, financial, transportation, and other district management systems, and you've got a potential gold mine of knowledge -- if only school leaders could get to it.

And there's the rub: Too often, the school district's own data is not accessible in a useful form to the people who need it the most. For starters, the information is typically entered and stored on many different computer systems, each serving its own purpose and using its own format. Quite often, lack of consistency makes it extremely difficult to correlate data by drawing on information from several databases. What's more, the level of technical difficulty involved usually makes it impractical for administrators to perform their own interactive queries on the data; instead, they must wait for infrequent reports from the data processing department.

The end result is that school districts have become data-rich but knowledge-poor. Many questions that school districts could -- and should -- be asking go unanswered, such as: What is the relationship between attendance and literacy? What is the connection between teacher training and student test scores? What characteristics are shared by students who drop out, and what attributes are common to those who succeed? Why are some teachers more effective than others, and how can the district use that information to help other teachers improve? Which programs are the most cost-effective? What is the relationship between early childhood education and later academic success?

Digging deep

School districts looking for answers to these types of questions are turning to data warehousing, a technology first popularized by retail chains such as Wal-Mart. Data warehouses are vast repositories of information that import, standardize, and integrate data from the district's operational systems -- the databases and computer systems used for day-to-day operations. By gathering all the data in one place and tying it together, data warehouses allow administrators to "mine" for valuable information and relationships between data elements that would otherwise remain hidden or inaccessible in mountains of unstructured and disconnected facts. The result: a tool for smarter decision-making.
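As a rough illustration of the import, standardize, and integrate steps described above, the following Python sketch loads records from two hypothetical operational systems into a single in-memory SQLite table. The system names, field names, and ID formats are assumptions made up for this example; they are not drawn from any district's actual systems.

```python
import sqlite3

# Hypothetical extracts from two operational systems. Each system
# formats the student ID differently, which is the kind of
# inconsistency a warehouse load has to resolve.
attendance_system = [
    {"StudentID": "000123", "DaysAbsent": "4"},
    {"StudentID": "000456", "DaysAbsent": "17"},
]
testing_system = [
    {"student_no": 123, "reading_score": 82},
    {"student_no": 456, "reading_score": 61},
]


def standardize_id(raw):
    """Convert either ID format used above to a plain integer."""
    return int(str(raw).strip())


def build_warehouse(attendance_rows, testing_rows):
    """Load both sources into one integrated table keyed by student ID."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE student_facts ("
        "student_id INTEGER PRIMARY KEY, "
        "days_absent INTEGER, "
        "reading_score INTEGER)"
    )
    for row in attendance_rows:
        db.execute(
            "INSERT INTO student_facts (student_id, days_absent) VALUES (?, ?)",
            (standardize_id(row["StudentID"]), int(row["DaysAbsent"])),
        )
    # A student missing from the attendance extract would be skipped by
    # this UPDATE; a real load would need to handle that case.
    for row in testing_rows:
        db.execute(
            "UPDATE student_facts SET reading_score = ? WHERE student_id = ?",
            (row["reading_score"], standardize_id(row["student_no"])),
        )
    db.commit()
    return db


warehouse = build_warehouse(attendance_system, testing_system)
integrated = warehouse.execute(
    "SELECT student_id, days_absent, reading_score "
    "FROM student_facts ORDER BY student_id"
).fetchall()
print(integrated)
```

Once both sources share one key and one table, questions that span them become a single query rather than a manual comparison of printouts.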

Although data warehouses have been used in the corporate world for over a decade, the technology is still a relative newcomer to K-12 education. One of the pioneers has been the Broward County (Fla.) Public Schools. As the fifth largest school district in the country, Broward County encompasses 201 schools and enrolls 230,000 students -- and, naturally, produces a lot of data.

Yet when the district's administrators applied for a Reinventing Education grant from IBM in 1996, their proposal described a pressing issue: "Our concern was that people with decision-making ability didn't have access to the data to make those decisions," says Nancy Terrel, the district's director of strategic planning and accountability. "What we were asking for was a data warehouse, but we didn't know that at the time." The grant proposal was successful, and the district won $2 million worth of in-kind services from IBM to implement a data warehouse -- the first in a K-12 school district, Terrel says.

As a first step, IBM representatives met with administrators and teachers to gain an understanding of the kinds of data district employees needed access to. Not surprisingly, this "varied widely by constituent group," Terrel says. Typically, teachers look for data about students, while administrators seek an overall view of how a particular school or the district as a whole is doing. Accordingly, the data warehouse was designed to accommodate a variety of information needs.

This process also revealed that teachers were looking for certain types of data that were not being collected by the district, says Frank Petruzielo, Broward's superintendent at the time. One example: "We realized that we needed to be collecting data on student learning styles, so we developed an inventory of preferential learning styles," Petruzielo says. "The idea is to give teachers a heads-up on students. We wanted to get the data from our computers to our teachers. It's a step toward making education more scientific and data-based."

The district worked with IBM to build a solution that fit within the existing structure of the district. Because the school system is populated with both Macs and PCs, the district selected dual-platform query software to ensure that all decision-makers would have fingertip access to the information in the warehouse from any brand of desktop computer, Terrel says. And since the data warehouse resides on the district's existing IBM AS/400 mini-mainframe computer, no additional server hardware was needed.

The district started small, with a pilot project involving just three schools: one elementary, one middle, and one high school. The staff received training from IBM on how to use the query software and how to analyze the data -- an important part of the process, Terrel says.

In fact, the district found that the instructional staff were very interested in inservice training in data analysis. "People were saying, 'OK, now I have this data -- so what do I do with it?'" Terrel says. "Professional development opportunities were filled up before they were even announced. It was gratifying to see people wanting to learn how to read and analyze data in order to make schools more effective and increase student achievement."

Another unexpected effect of giving instructional staff access to the warehouse was a much greater sense of ownership in the data-entry process, Terrel points out: Previously, student record-keeping and data entry were seen as tiresome chores performed solely for the benefit of others; but with access to the data warehouse, teachers saw the need for having up-to-date and accurate information on students. "If you're going to be using the data, you have much more of an investment in making sure the data is correct," Terrel adds.

Although the district is still just beginning to scratch the surface of what's possible, the benefits of the data warehouse were obvious from the start.

"Before we had the data warehouse, we were manually going through printouts for information. Now we can develop many different arrangements and presentations of the information, and we can drill down into the data to ask further questions," Terrel says. "Teachers used to have to go through individual student folders to find test scores, whereas now it's right on their computer screens."

Catching patterns of behavior while there is still time to do something about them is another early benefit, Terrel says. "Schools have reported finding patterns of absenteeism. Now, we can intervene sooner." Similarly, the warehouse can alert administrators to teachers whose students are not testing well, giving the district an opportunity to prescribe professional development. Another example: A specific school might have high overall test scores, but hidden in the statistics could be a group of children who are not doing well. With the data warehouse, finding those kids is just a matter of a few mouse clicks.
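The "hidden group" scenario can be sketched in a few lines of Python. The scores and group labels below are invented for illustration, and the threshold of 70 is an arbitrary assumption, not a figure from Broward County.

```python
from collections import defaultdict
from statistics import mean

# Invented records: (student, subgroup label, test score).
records = [
    ("A", "group_a", 92), ("B", "group_a", 88), ("C", "group_a", 95),
    ("D", "group_a", 90), ("E", "group_a", 85), ("F", "group_a", 91),
    ("G", "group_b", 58), ("H", "group_b", 64),
]

THRESHOLD = 70  # arbitrary cutoff chosen for this example


def subgroup_averages(rows):
    """Return the mean score for each subgroup."""
    by_group = defaultdict(list)
    for _student, group, score in rows:
        by_group[group].append(score)
    return {group: mean(scores) for group, scores in by_group.items()}


# The school-wide average looks healthy...
school_average = mean(score for _student, _group, score in records)

# ...but breaking the same scores out by subgroup tells a different story.
averages = subgroup_averages(records)
struggling = sorted(g for g, avg in averages.items() if avg < THRESHOLD)

print(school_average, averages, struggling)
```

In this made-up data the school-wide average sits well above the cutoff while one subgroup falls below it, which is the pattern a drill-down query is meant to expose.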

Since September 1997, when the pilot phase of Broward's data warehouse project was complete, the district has been slowly rolling out the project to all the schools in the district. Today, access to the warehouse is provided to the superintendent, principals, assistant principals, guidance counselors, most teachers, and central office staff -- including those in the research and evaluation department, dropout prevention, psychological services, and social services, Terrel says.

"Today, at least one computer in every school has access to the warehouse," says Terrel, who adds that she expects the project to be complete within the next two to three years: "I think people will find it hard to believe that there was a time when the data warehouse wasn't there."

Warehouse shopping

The growing interest in K-12 data warehousing has led to the recent development of several solutions that focus specifically on the school market and that are sensitive to the unique needs of educators. Naturally, costs are coming down, too.

One example is the Educational Information Management System (EIMS), a joint public-private partnership between the non-profit Connecticut Academy for Education in Mathematics, Science & Technology and technology consulting firm KPMG Peat Marwick. Dubbed Learning Landscape, the data warehouse solution was developed with financial support from the National Science Foundation and the U.S. Department of Education's National Center for Education Statistics.

"When I was a superintendent, I was frustrated that I couldn't get my hands on the kind of data that I needed," says Philip Streifer, who is the director of the EIMS project and an associate professor of educational leadership at the University of Connecticut. "Decision-support tools have been used by corporations for a long time. It's high time somebody took this technology and made it affordable to schools."

Indeed, cost was a crucial criterion during Learning Landscape's development, Streifer says: "I told KPMG, 'If schools are going to use this thing, it has to be under $100,000.' Most companies just laughed at that, but we met the price point."

Currently fully operational in two Connecticut school districts, Learning Landscape is now being marketed to school systems around the country at a cost of $89,000 for districts with enrollments under 8,000 students -- plus a $12,300 annual fee that covers hosting services and technical support. (The price for larger districts depends on enrollment.) The system's cost includes monthly extractions of data from the district's operational systems, as well as training and professional development on the use of the warehouse. Because the data warehouse is hosted off-site at GTE Internetworking, districts can avoid the cost of purchasing and maintaining expensive mainframe servers, Streifer says. And because the warehouse is Internet-enabled, district employees need only a web browser to gain access to it.

To make the system user-friendly, EIMS built standard templates into Learning Landscape that allow quick access to common queries such as longitudinal benchmarking, equity issues, cost control, and what Streifer calls "dipsticking" -- identifying areas for improvement. "Of course, the types of questions you can ask are limited only by the kinds of data you've collected," he adds. "It's cool -- I'm ready to go back to the superintendency and use this."

One immediate benefit of the system is the ability to automate the generation of mandated state and federal reports, Streifer says. A massive financial report required by the state of Connecticut used to take school district business offices four to six weeks to complete manually over the summer. Now, he says, "this report can be generated by clicking a button. That's why data warehousing is going to be useful for states as well."

Another recent entry in the K-12 data warehouse market is Vision Associates' eScholar, an education-specific software solution built around technology from IBM and Brio. The package is priced at $2 to $3 per student, plus an annual licensing fee of 75 cents per student, and is designed to be implemented in as little as two to three months.
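To put the two quoted pricing models side by side, here is a back-of-the-envelope calculation in Python for a hypothetical 7,500-student district. It assumes the annual fees apply in the first year and ignores hardware, staff time, and any costs the article does not mention, so treat the results as rough arithmetic rather than a quote.

```python
ENROLLMENT = 7_500  # hypothetical district, under the 8,000-student tier

# Learning Landscape figures quoted in the article (whole dollars).
LL_PRICE = 89_000
LL_ANNUAL_FEE = 12_300

# eScholar figures quoted in the article, in cents to avoid rounding.
ESCHOLAR_LOW_CENTS = 200     # $2 per student
ESCHOLAR_HIGH_CENTS = 300    # $3 per student
ESCHOLAR_ANNUAL_CENTS = 75   # 75 cents per student per year


def learning_landscape_first_year():
    """Purchase price plus one year of hosting and support, in dollars."""
    return LL_PRICE + LL_ANNUAL_FEE


def escholar_first_year(per_student_cents, enrollment):
    """Per-student price plus one year of licensing, in dollars."""
    total_cents = (per_student_cents + ESCHOLAR_ANNUAL_CENTS) * enrollment
    return total_cents / 100


ll_cost = learning_landscape_first_year()
es_low = escholar_first_year(ESCHOLAR_LOW_CENTS, ENROLLMENT)
es_high = escholar_first_year(ESCHOLAR_HIGH_CENTS, ENROLLMENT)
print(ll_cost, es_low, es_high)
```

The two totals are not directly comparable: the Learning Landscape price includes off-site hosting, data extraction, and training, while the article does not say whether the eScholar price covers a server or services.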

Currently installed in two school systems and with 55 new installations under way, eScholar is intended for any size district. The data warehouse can run on a low-cost Windows NT-based PC, a medium-range Unix server, or a full-strength IBM AS/400 mini-mainframe, says eScholar brand manager Keith Gile: "Accountability is not just for big school districts -- and neither is data warehousing. We're finding that every school district needs this." As with Learning Landscape, access to the information in the warehouse is provided via web browser.

"Without a data warehouse, most of the time is spent looking for data, rather than analyzing it," says Chris Watkins, alliance manager for Vision Associates. Because data warehouses allow for interactive follow-up questions, they are much more useful than canned reports from the district's data processing department, adds Gile: "A report is an endpoint, whereas a data warehouse is a starting point. There could be trends out there that no one is aware of."

In Elizabethtown, Ky., administrators at the 13,000-student Hardin County Public Schools are beginning to search for these hidden trends. The district started using eScholar in May, and the staff expects to be running queries on live data this fall. "We have a lot of data, but it's all in different formats and software packages," says Superintendent Lois Gray. "Before we got the data warehouse, we could look at attendance, but we couldn't easily correlate it to students on the free and reduced-price lunch program, gender, test scores, and other variables."
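The kind of cross-source question Gray describes might look like the following once the data sits in one place. This Python sketch uses an in-memory SQLite database; the table names, column names, and sample rows are hypothetical, and the invented numbers imply nothing about real students.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript(
    """
    CREATE TABLE attendance (student_id INTEGER, days_absent INTEGER);
    CREATE TABLE meal_program (student_id INTEGER, reduced_price INTEGER);
    INSERT INTO attendance VALUES (1, 3), (2, 12), (3, 5), (4, 14);
    INSERT INTO meal_program VALUES (1, 0), (2, 1), (3, 0), (4, 1);
    """
)

# Join the two sources on student ID, then compare average absences
# for students in and out of the meal program.
rows = db.execute(
    """
    SELECT m.reduced_price, AVG(a.days_absent)
    FROM attendance AS a
    JOIN meal_program AS m ON m.student_id = a.student_id
    GROUP BY m.reduced_price
    ORDER BY m.reduced_price
    """
).fetchall()

# Map the 0/1 program flag to False/True for readability.
avg_absences = {bool(flag): avg for flag, avg in rows}
print(avg_absences)
```

A difference between the two averages shows only an association; deciding what lies behind it is still a job for the people reading the result.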

One of the issues the district is eager to investigate: In a group of at-risk students, there are typically a few students who do well academically even though most of their peers struggle. What is different about them? "Is it the parents? We'd like to know what the factors are," Gray says.

This fall, once the first phase of the data warehouse implementation is complete -- the district has spent more than $100,000 so far -- administrators hope to explore possible causes of low reading scores at the high school level, says Assistant Superintendent Ron Bryan. "We might look at teacher attendance, how much professional development teachers have received in reading instruction, and how much money is spent on instructional materials," he says. "Combining data from three different sources like that wouldn't be impossible to do without a data warehouse, but it would be very difficult."

With Fort Knox next door, the district is also taking advantage of the warehouse to generate the reports necessary to receive federal impact aid. "Every report we have to complete is taken into account," Gray says. Even the bus routes are integrated into the warehouse, she adds: "It knows where the special education students live and can tell us whether there's an attendance problem in a particular geographic area."

Putting the pieces together

Building a data warehouse involves much more than simply purchasing and installing software and hardware, experts who talked with Electronic School agree. Here's a brief guide to building your district's data warehouse:

* Find an experienced vendor. The first step should be to look for a vendor that has extensive experience building data warehousing solutions in a K-12 environment, says Jane Lockett, a senior IBM consultant and former educator who has helped several school districts implement data warehouses.

"A data warehouse is not something you buy off the shelf," Lockett says. "It requires a specific methodology, not just technology." Lockett advises school leaders to ask prospective vendors about their experience and track record, especially since experienced vendors will be able to build a data warehouse relatively quickly: "What types of data warehouses have they built? Have they built warehouses in the manufacturing sector or the public sector?" As for cost, Lockett advises school districts to budget for $250,000 to $350,000 -- depending on the district's current technology infrastructure and level of preparation.

* Analyze the district's needs. Because a data warehouse needs to be aligned with your district's strategic goals and objectives in order to be effective, any data warehouse implementation should begin with a dialogue between the vendor and district staff that focuses on the school system's goals and business needs.

"We start with a workshop that includes the superintendent, technology coordinator, and other administrators and technical staff," says Keith Gile of Vision Associates. "We try to find out what the business rules of the district are, such as reporting requirements. We approach the school district as if it were any other business."

* Cleanse the data. The old computer adage "Garbage in, garbage out" applies in no small measure to data warehousing. There are two parts to this problem: First, ensuring that the data is being collected and entered accurately into the district's operational computer systems; second, extracting the data from those systems on a regular basis and moving it to the data warehouse for analysis.

Dealing with gaps in data collection can be tricky, says Lockett. "At one client site, over 20 percent of the students had no valid ethnicity listed in the district's student information system -- and some students had no gender," she recalls. A likely cause: "Many clerks simply don't have time to input all the necessary data." The only real solution to this problem is for the district to make accurate data entry a high priority.

The process of extracting the data from the district's operational computer systems and getting it into the data warehouse can be quite labor-intensive -- at least initially -- because of the wide variety of data formats in use. In most cases, however, subsequent updates can be largely automated, requiring little manual intervention.

"Our consultants come onsite for the initial extraction of data," says Philip Streifer of EIMS. "That process forces the district to do a lot of data cleansing. We identify a lot of things during the first initial process that help make later uploads 'clean,'" he says.

The process might soon get easier: As school management software vendors build support for Microsoft's proposed Schools Interoperability Framework (SIF) into their products, the data extraction process should become much simpler, says Microsoft technical evangelist Manish Sharma. Announced earlier this year, SIF defines a common data format for data exchange between operational systems -- an innovation that will also benefit data warehousing, Sharma says.
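The gaps Lockett describes in this step, such as missing ethnicity or gender, can be surfaced with a simple completeness check before data is loaded. The following Python sketch assumes hypothetical field names and an invented list of valid codes.

```python
# Hypothetical valid codes; a real district would use its own code tables.
VALID_CODES = {
    "gender": {"F", "M"},
    "ethnicity": {"A", "B", "H", "I", "W"},
}


def find_invalid(records, valid_codes=VALID_CODES):
    """Return {field: [student IDs whose value is missing or not a valid code]}."""
    problems = {field: [] for field in valid_codes}
    for record in records:
        for field, allowed in valid_codes.items():
            # Treat None and blank strings alike; ignore case and padding.
            value = (record.get(field) or "").strip().upper()
            if value not in allowed:
                problems[field].append(record["student_id"])
    return problems


def invalid_rate(records, field, valid_codes=VALID_CODES):
    """Share of records with a missing or invalid value for one field."""
    bad = find_invalid(records, valid_codes)[field]
    return len(bad) / len(records)


sample = [
    {"student_id": 1, "gender": "F", "ethnicity": "H"},
    {"student_id": 2, "gender": "", "ethnicity": "W"},
    {"student_id": 3, "gender": "m", "ethnicity": None},
    {"student_id": 4, "gender": "F", "ethnicity": "X"},
    {"student_id": 5, "gender": "M", "ethnicity": "B"},
]

report = find_invalid(sample)
print(report, invalid_rate(sample, "ethnicity"))
```

A report like this only flags the records; correcting them still depends on the district making accurate data entry a priority.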

* Start small, train users, and go slow. The effectiveness of a data warehouse solution is often proportional to the number of people who have access to it. On the other hand, many experts caution that school districts should start small and roll out access to the data warehouse slowly, in part because users need to be adequately trained.

"This takes time, education, and training," says IBM's Lockett. "It isn't just something you give to everybody. We can't make any assumptions about users' ability to understand data." Broward County's Nancy Terrel agrees: "You have to lay the foundation very carefully. We rushed into our pilot sites when we may have been better off laying more groundwork. We also made too many assumptions about the users' knowledge of working with data query tools. You have to start with baby steps and get the idea institutionalized."

Who should get access first? The consensus: Start with the superintendent and top administrative and technology staff, then slowly push the access further down the organizational chain. That helps prevent surprises, warns Vision Associates' Keith Gile: "Are you prepared to deal with the answers you could be getting? A phased rollout is best, so that the superintendent has a handle on it first."

The school district should also be vigilant in assigning levels of access on a need-to-know basis, Terrel says: "There have to be safeguards in place. For example, a guidance counselor may have access to records that a teacher can't see, such as psychological tests."

Eventually, districts might choose to allow students and parents to connect to the data warehouse from home, using a web browser to view a limited portion of information relevant to them, says Gile: "We see data warehousing as a tool for the masses. Students might want to know how they stand in relation to other students. A parent might say, 'I'd like to see all the data on my daughter.'"

That's a controversial idea, however. "To share information in this way requires a culture change," Lockett says. "Parents are not trained. It would be more appropriate for a teacher to present a view of the data to the parent, rather than the parent accessing it directly."
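Need-to-know access of the kind Terrel describes is often modeled as a mapping from roles to the fields each role may see. The roles and field names in this Python sketch are hypothetical and simplified; a real system would also need authentication and auditing.

```python
# Hypothetical mapping of roles to the record fields each may view.
ROLE_FIELDS = {
    "teacher": {"name", "test_scores", "attendance"},
    "guidance_counselor": {"name", "test_scores", "attendance", "psych_tests"},
}


def visible_record(record, role, role_fields=ROLE_FIELDS):
    """Return only the fields of a student record that the role may see."""
    allowed = role_fields.get(role, set())  # unknown roles see nothing
    return {field: value for field, value in record.items() if field in allowed}


student = {
    "name": "Sample Student",
    "test_scores": [78, 85],
    "attendance": 0.96,
    "psych_tests": "on file",
}

teacher_view = visible_record(student, "teacher")
counselor_view = visible_record(student, "guidance_counselor")
print(sorted(teacher_view), sorted(counselor_view))
```

Defaulting unknown roles to an empty view is a deliberate choice here: access has to be granted explicitly, which suits a phased rollout.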

* Sell the concept. Streifer puts it bluntly: "It's not considered 'sexy' to pay for administrative support tools. This can make it hard to sell the concept to the board and the community." Once skeptics come to see the technology as a tool to boost student achievement and school effectiveness, however, chances are good that a data warehouse will enjoy wide support.

Big Brother?

With the advent of data warehousing, it appears the concept of a "permanent record" has finally become reality. Could data warehouses backfire on school districts by arousing paranoid anxieties among the public?

Not likely, says Philip Streifer of EIMS: "I interviewed a number of superintendents in Connecticut, and they all said, 'School districts are being beat up in terms of accountability. School administrators need this, period.'"

The bottom line? Perhaps IBM's Jane Lockett puts it best: "Any information that is going to help a child to learn, a parent's going to want the district to have it."

Lars Kongshem is associate editor and webmaster of Electronic School.

Reproduced with permission from the September 1999 issue of Electronic School. Copyright © 1999, National School Boards Association. Electronic School is an editorially independent publication of the National School Boards Association. Opinions expressed by this magazine or any of its authors do not necessarily reflect positions of the National School Boards Association. This article may be printed out and photocopied for individual or educational use, provided this copyright notice appears on each copy. This article may not be otherwise transmitted or reproduced in print or electronic form without the consent of the Publisher. For more information, call (703) 838-6739.
