Munoz – Question 4
4. What has been your most enlightening moment in your work with data curation?
Associate Director, Maryland Institute for Technology in the Humanities (MITH); Assistant Dean of Digital Humanities Research, University Libraries – University of Maryland
Preparing the lectures for an introductory training course on data curation for humanists led to my most enlightening moment so far. In trying to explain the role of social factors in supporting preservation and the re-use of data, I started to see more points of convergence between data curation and work on “crowdsourcing” and closely related ideas of public humanities. This convergence is an effect of the changed information environment for “scholarly” data. Specifically, library and information science (LIS) represents a body of theory and practice on connecting users with information.1 New LIS research on questions of data “value,” which is closely tied to re-use and thus preservation, draws on this older tradition of service to user communities to develop a thesis that “highly valuable research data are anti-fragile; they are capable of not just persisting over time but actually gaining in value as they are stressed via application to new settings, used in diverse contexts, and transferred amongst a network of systems.”2 Attention is a crucial factor in the survival of any particular dataset. The current configuration of Internet technologies has lowered the barrier to the loosely coordinated publishing of information to platforms (like Wikipedia), and these now concentrate an immense amount of attention (due in part to the market dominance of Google’s particular search engine technology).
Thus, my moment of enlightenment came from realizing that data curators should be asking what public, web-based resources need to be curated alongside “local” data in order to increase the integration of a research project with a larger information network, which is a public good or a “commons.” Curation of one’s own data might well mean working on someone else’s data—for example, by improving the coverage of women’s history in Wikipedia,3 which serves as a source, via things like DBpedia, for identifiers to be used in the “linked data cloud.”
The digital humanities exist at the intersection of computational technologies with the theories, practices, and research questions of the humanities. Digital humanists have been publishing scholarship to “the Web” since its inception (as documented through crucial early projects like Voice of the Shuttle). The persistence of interest on the part of digital humanists in putting more of certain kinds of humanities scholarship into “public” spaces is supported by numerous examples. The enthusiasm for both using and providing application programming interfaces (APIs) to digitized collections, and for creating additions to massively popular publishing software such as WordPress for scholarly purposes, demonstrates this. In other words, the digital humanities engage with “the Web” for scholarly communication, advocacy, etc., but this engagement also has important curatorial and preservation consequences. Palmer, Renear, and Cragin envision that data curators “will build and maintain not only digital libraries and curated data sets, but also the associated indexing systems, metadata standards, ontologies, and retrieval systems.”4 I would argue that these indexing systems and ontologies include open, web-based resources like Wikipedia (or maybe structured data resources derived from Wikipedia).5 Wikipedia is the quintessential “crowdsourced” project, and its use as a shared data resource suggests that data curation needs will push scholars and librarians further toward participating in a public information sphere on the web as a way to support local datasets.
1. Carole L. Palmer, Allen H. Renear, and Melissa H. Cragin, “Purposeful Curation: Research and Education for a Future with Working Data,” December 2, 2008, https://www.ideals.illinois.edu/handle/2142/9764.
2. Nicholas M. Weber, Karen S. Baker, Andrea K. Thomer, Tiffany C. Chao, and Carole L. Palmer, “Value and Context in Data Use: Domain Analysis Revisited,” Proceedings of the American Society for Information Science and Technology 49, no. 1 (2012): 1–10. doi:10.1002/meet.14504901168.
3. Mia Ridge, “New Challenges in Digital History: Sharing Women’s History on Wikipedia – My Talk Notes,” Open Objects, March 23, 2013, http://openobjects.blogspot.com/2013/03/new-challenges-in-digital-history.html.
4. Palmer, Renear, and Cragin, “Purposeful Curation: Research and Education for a Future with Working Data.”
5. For example, http://www.freebase.com/.