San Diego–“We may already be in the red in terms of our ability to store information,” said Christopher L. Greer last week to an interested, and vaguely intimidated, audience of scientists and other academics. Gene sequences, distant pulsar signals, YouTube videos, e-mail — it’s all too much to keep track of.
Or perhaps not. Mr. Greer, who works on networking policy for the White House Office of Science and Technology Policy, was addressing a session called “Managing the Exaflood” at the annual meeting of the American Association for the Advancement of Science. It was actually an optimistic gathering, where researchers presented ideas for getting a handle on all this data — an exabyte is one billion billion bytes — and using it productively.
Larry Smarr, a professor at the University of California at San Diego, demonstrated a method of coupling genetic sequences from ocean bacteria (and there are lots and lots and lots of those) with environmental information about those organisms. And there is lots and lots of that, too. At the California Institute for Telecommunications and Information Technology, which he directs, Mr. Smarr has a tiled wall of flat-screen monitors. (Imagine a war-room set from a spy movie.) Starting with a global view of the planet’s oceans, he can zoom in to a specific gene sequence of a specific bacterium, annotated with information about its environment and many other things. As gene sequences get cheaper and easier to amass, Mr. Smarr said, this kind of visualization is one way to work with them.
And Google, a major source of information overload, can also help manage it, according to Google’s chief economist. Hal Varian, who is a professor at the University of California at Berkeley as well as a researcher for the search-engine giant, showed off an analytic tool called Google Insights for Search. Using the number and distribution of individual Google searches for information on Toyota automobiles, he said, he was able to predict sales volume with some accuracy days before industry figures were released. A similar strategy, using searches for information on travel to Hong Kong, could produce predictions of the number of foreign visitors to the city, if the city’s tourism bureau were interested.
These strategies present challenges for accurately tagging data and archiving it, the presenters warned. But they do offer hope for ways to survive the flood.




One Response to Professors Find Ways to Keep Heads Above ‘Exaflood’ of Data
d_fevens - February 25, 2010 at 8:48 am
Using the number and distribution of individual Google searches for information on Toyota automobiles, he said, he was able to predict sales volume with some accuracy days before industry figures were released. –Hal Varian
I wonder what the results of a “Google Books Copyrights infringement” search would show?
Douglas Fevens
Halifax, Nova Scotia
The University of Wisconsin, Google, & Me