August 01, 2006, 01:53 PM ET

Text-Mining the 'Times'

Researchers at the University of California at Irvine are trumpeting an advance in the field of "text mining": They’ve managed to get computers to analyze the topics of some 330,000 New York Times stories in just hours.

Text-mining efforts have typically been inefficient because computers require a lot of guidance to categorize text. But the Irvine scientists were able to expedite the process by using a technique called topic modeling, which teaches computers to search for patterns of words that tend to occur together in articles on specific subjects. It’s a pretty safe bet, for example, that articles that include the words "rider," "bike," "race," "Lance Armstrong" and "Jan Ullrich" are about the Tour de France. —Brock Read
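
For readers curious what "patterns of words that tend to occur together" looks like in practice, here is a minimal Python sketch. It is not the Irvine researchers' method: real topic modeling fits a statistical model, while this toy only counts how often pairs of words share an article. The sample articles, the function names, and the `min_docs` threshold are all made up for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy "articles", each reduced to a set of keywords.
# These are invented for illustration, not drawn from the Times archive.
articles = [
    {"rider", "bike", "race", "stage"},
    {"rider", "bike", "race", "mountain"},
    {"senate", "vote", "bill"},
    {"senate", "vote", "budget"},
    {"bike", "race", "rider", "sprint"},
]


def cooccurrence_counts(docs):
    """Count how many documents contain each unordered pair of words."""
    counts = Counter()
    for words in docs:
        # Sorting makes ("bike", "race") and ("race", "bike") the same key.
        for pair in combinations(sorted(words), 2):
            counts[pair] += 1
    return counts


def frequent_pairs(docs, min_docs=2):
    """Keep only the pairs that appear together in at least min_docs documents."""
    return {
        pair: n
        for pair, n in cooccurrence_counts(docs).items()
        if n >= min_docs
    }


pairs = frequent_pairs(articles)

# Print the most frequent pairs first, breaking ties alphabetically.
for pair, n in sorted(pairs.items(), key=lambda item: (-item[1], item[0])):
    print(pair, n)
```

On this toy data, the cycling words cluster together and the politics words cluster together, with no human labeling the articles first. A true topic model goes further by estimating, for each article, a mix of such word clusters.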

Categories: Research
