August 01, 2006, 01:53 PM ET
Text-Mining the 'Times'
Researchers at the University of California at Irvine are trumpeting an advance in the field of "text mining": They’ve managed to get computers to analyze the topics of some 330,000 New York Times stories in just hours.
Text-mining efforts have typically been inefficient because computers require a lot of guidance to categorize text. But the Irvine scientists were able to expedite the process by using a technique called topic modeling, which teaches computers to search for patterns of words that tend to occur together in articles on specific subjects. It’s a pretty safe bet, for example, that articles that include the words "rider," "bike," "race," "Lance Armstrong" and "Jan Ullrich" are about the Tour de France. —Brock Read
Categories: Research

