Washington—Digital humanists converged on the headquarters of the National Endowment for the Humanities yesterday to talk about cutting-edge work with big data in history, linguistics, literature, and other fields. This year’s Digging Into Data Challenge Conference features eight projects supported by the endowment’s Office of Digital Humanities and several partners, including Britain’s Joint Information Systems Committee. All involve scholars and computer scientists using text-mining and other digital strategies and tools to ask research questions using enormous amounts of data.
For instance, how can we trace the patterns of scholarly communication during the Enlightenment? Letters are one good way to track the circulation of ideas. One project, “Digging Into the Enlightenment: Mapping the Republic of Letters,” has been mapping thousands of letters sent by key figures of the period. It produces visualizations that track who wrote to whom, and where.
Were Voltaire, Franklin, et al., as cosmopolitan as they aspired to be? Not always, according to the patterns identified by the project. Seventy percent of Voltaire’s correspondents were in France, noted Dan Edelstein, an associate professor of French and history at Stanford University, who is one of the researchers involved. “Only 1 percent of his correspondence is to England, and even there he was writing to people who aren’t of historical importance to us,” Mr. Edelstein said. “He wasn’t plugged into the network there.”
Two other projects took the stage on the first day of the conference: an examination of “Railroads and the Making of Modern America” and “Harvesting Speech Datasets for Linguistic Research on the Web.”
Along with the big data on display, big questions about methodology and evolving research philosophy threaded through the day’s discussions. How can scholars catch up to the enormous amounts of data that humanity is generating? Can metadata capture the kind of specific context and individual detail that has driven so much humanistic scholarship? Are lab data less useful than data gathered from the real world, such as snippets of voice recordings captured online for linguistic analysis? And how do computer scientists and scholars build tools that do new things without making them so fancy that they’re useless?
Digital tools are increasingly sophisticated, but they are useful only if they do what researchers need them to do. “Interaction is an essential glue,” Chris Weaver, an assistant professor in the school of computer science at the University of Oklahoma and a member of the Digging Into the Enlightenment team, said during a presentation on visualizations.