College campuses are hothouses of data, including course schedules, degree requirements, and grades. But much of the information remains spread out across software systems or locked on university servers. Now a crowd of start-ups has emerged with hopes of prying out those rich data sets to build an app economy for universities—a world of new personalized services that could transform the student experience.
The idea of opening data to consumers has already spread to such industries as health care and energy.
In 2010 the Department of Veterans Affairs introduced a "blue button" for online health records that lets patients download their information with a single click. Consumers looking to reduce their energy bills can use a similar tool: The White House recently promoted a "green button" for utility companies that lets customers download their energy-consumption information.
Now a "MyData button" for students is on the horizon. A government campaign is urging colleges and companies that hold student data to make information like grades and test scores more portable and user-friendly.
The thinking behind data buttons goes like this: Armed with information, consumers can plug it into smartphone apps and Web tools to make better decisions and save money.
But advocates of unlocking more data in higher education face tough challenges.
Chief among them are privacy concerns and questions about how to interpret federal student-privacy laws in light of new technologies. And college officials say making data available safely requires expensive systems they can't easily afford.
Some say there's more reason than ever for colleges to dole out data, however. Students are less and less likely to follow the traditional path to a degree, graduating from the same institution they first entered. A report this year from the National Student Clearinghouse Research Center found that one-third of all students switched institutions at least once before graduation. If students can take their data with them across institutional lines more easily, some argue, they'll be better prepared to earn degrees.
Education companies that hold student data could start displaying the MyData button in their programs in a matter of weeks, though the timeline is up to the participants, says an official at the U.S. Education Department familiar with the effort. This summer the button will also let students download their federal financial-aid data in a machine-readable format, which software applications can digest easily.
Some education-technology entrepreneurs haven't waited for the MyData program to try to unlock some university data. Four years ago, Ben Greenberg and Rui Xia created a textbook-price-comparison tool called TextYard while they were students at Indiana University at Bloomington. TextYard "scraped" book information from college stores' Web sites, essentially running a software robot that plucked information from publicly available listings.
Others have used scrapers to build applications for course scheduling and degree planning. TextYard, Mr. Greenberg acknowledges, operated in a legal gray area, since the developers of some scrapers have been accused of violating Internet-trespassing laws.
Mr. Greenberg helped build the site, he says, because the formal process for obtaining the lists of textbooks assigned to specific courses was cumbersome, requiring him to file slow-motion requests for public records. In February, before leaving TextYard behind to move on to other projects, the pair posted the tool's source code online, allowing other developers to take it and build their own versions. Mr. Greenberg included a legal defense of TextYard's scraping tactics for would-be copycats.
He believes that most colleges have been too slow to adopt open data as a means of helping students, and that developers will continue to work around the system until vendors and administrators embrace open data.
"Whenever it's a level playing field—which is what open data does—the more innovative, efficient side will always win," Mr. Greenberg says.
Scrapers are popular tools for extracting information from college Web sites, but they have drawbacks. Badly programmed ones can slow down loading times for other visitors. They are also fragile: When colleges redesign their sites, many scrapers stop working.
Realizing that students are already digging for data on their sites, some institutions are creating their own platforms to satisfy the demand.
This year Harvard University welcomed a contractor who joined its information-technology staff in an effort to create live data feeds from things like course catalogs, for use in computer-science programming projects. The team calls its work "data wrangling"—a not-so-subtle hint at the difficulty of coaxing entrenched software systems into providing data that easily plug into applications.
In the past, says Katie L. Vale, director of academic technology at Harvard's Faculty of Arts and Sciences, computer-science courses had to use "dummy" data feeds, which only mimicked real information.
"The data was coming from really old, crusty systems, and it was not in a state that would make it easy for us to get a data feed that the students could use" for their programming projects, she says.
A long-term goal of Harvard's effort is to give students a Web portal where they can access data on courses, shuttle schedules, and dining-hall menus, among other things, says Ian T. Wall, the university's associate director of enterprise data and business-intelligence services. Students have even asked him about creating real-time data for campus laundry facilities so they can tell when washing machines are empty.
At the University of Waterloo, in Canada, an open-data platform is part of a larger regional effort to increase transparency in government. Some departments were hesitant to play along at first, says Giles D. Malet, a software-integration specialist at the university.
"There was a bit of reluctance, because people often said, 'That's our data, we can't just release that,'" he says. "But we made the argument that students are grabbing it anyway, and probably screwing it up and making mistakes, so the best thing we can do is give them some good, clean data with a bit of control."
Any information that's publicly available and not hidden behind a password-protected screen is fair game for developers, he adds. So far, course and exam-schedule information is available, and there are plans to extend the platform.
The goal is to empower the university's network of student programmers. "They look at some of our systems and say, 'That's terrible. I could do a better job,'" Mr. Malet says. "So we say, 'Here's the data; see if you can.'"
A laundry app may sound like a trivial use of university data, but entrepreneurs believe they can use other kinds of information to transform the college experience—like an application that helps students select which college to attend on the basis of their career goals. Or a degree-planning tool that could help students graduate with less debt.
First, a Blueprint
With access comes responsibility, though. Some administrators have been wary that using third-party applications might lead to violations of federal student-privacy laws like the Family Educational Rights and Privacy Act, known as Ferpa.
Theresa Rowe, chief information officer at Oakland University, says her role is to act as a steward and protector of students' data. So she can't accept a lower standard of security from a third-party vendor than what she would use at her own institution.
Limited budgets also make it hard for some information-technology departments to take the time to build the kinds of open-data platforms that start-ups are clamoring for. Rich Hershman, vice president for government relations at the National Association of College Stores, says that when Congress debated the Higher Education Opportunity Act, it decided against requiring bookstores to give textbook-adoption information to third-party entrepreneurs like Mr. Greenberg, because building those systems can be costly.
Those added costs might well be passed on to students, says Ms. Rowe. "Is it really worth it to have my students pay more tuition so that I have more funds to do projects that enable your business to be successful?" she asks of entrepreneurs who want colleges to wrestle data into application-friendly formats.
She also hopes the Education Department's involvement in open-data efforts will provide clearer guidance on questions of compliance with student-privacy laws.
Michael D. Sessa, executive director of the Postsecondary Electronic Standards Council, which works with vendors and institutions to hammer out digital standards for higher-education data, urges open-data advocates to draw a blueprint before building the house. Instead of focusing on mere access to data, he says, stakeholders need to determine what kind of data they want and what they'll be using it for.
"Can we figure out what's three steps down the road in this progression?" he asks. "Or do we have to go through every step every time?" The heightened interest in open data is good for higher education, he says, but start-ups need to do a better job of proving that students will benefit from using the companies' products.
In the case of the MyData button being promoted by the Education Department, it's not clear how many different types of information will be made available, although the data will exist in machine-readable, open formats. Participants will be required to specify how the exported data are formatted. Because participants are not required to export data in an identical format, a department official explains, developers may have to do more work upfront, but the information will get into students' hands more quickly.
At least one company, Fidelis Education, has committed itself to using the data students can download from the Department of Veterans Affairs' blue button.
As an enterprise that helps veterans pursue higher education and training for civilian careers, Fidelis plans to use the blue button's military-service data in the admissions process to verify that applicants are who they claim to be. Gunnar Counselman, a co-founder and chief executive of the company, says having access to an even more robust set of data about alumni satisfaction and employment could provide students with a personalized way to pick colleges that goes beyond rankings.
He's not convinced that such data will be available anytime soon. But the emergence of start-ups has had a "Hawthorne effect" on universities, he says—they're more open as a result of being observed so intently by outsiders.
The blue button for health care was "very useful in that it got the ball rolling," Mr. Counselman says. But he would be disappointed if the type of education data made available were limited to course schedules and textbook information.
"The question of what's the power of open data is just a lot more fundamental than that," he says.
Even so, he says, there is tremendous promise to be found in partnerships between scrappy, data-focused enterprises and traditional colleges.
"Institutions that have autonomy don't change overnight," he says. "So you start with what's easy, and then you keep the pressure on to get to the goal."
Correction (4/23/2012, 1:49 p.m.): This article originally misstated the job title of Katie L. Vale at Harvard University. She is director of academic technology for Harvard's Faculty of Arts and Sciences, not for the entire university. The article has been updated to reflect this correction.