Kouper – Question 1
How is data curation a part of your job?
CLIR/DLF Data Curation Postdoctoral Fellow, Data to Insight Center – Indiana University
Three factors shape my perspective on data curation: my research interest in the changes in knowledge production over time and across domains; the goals of my CLIR/DLF postdoctoral fellowship; and the mission and activities of the research center where I work, the Data to Insight Center (D2I) at Indiana University Bloomington.
As a data curation fellow at D2I, I help develop tools and best practices for data management, sharing, and preservation; I conduct research in the areas of data analytics and stewardship as well as digital humanities and digital libraries; and I advance data curation as both an important component of science and a professional activity.
My primary involvement is with the Sustainable Environment, Actionable Data (SEAD) project. The National Science Foundation (NSF) funded SEAD through its “Sustainable Digital Data Preservation and Access Network Partners (DataNet)” program to create national and global data research infrastructure. SEAD is a collaboration between the University of Michigan, Indiana University, the University of Illinois at Urbana-Champaign, and Rensselaer Polytechnic Institute that works to develop a digital environment that will support data curation and management workflows in sustainability science. D2I is developing Virtual Archive, a solution that provides federated deposit and access capabilities across multiple institutional repositories. A researcher or data curator can use Virtual Archive to automatically deposit data in one of the partner repositories or to search across those repositories and find the data that they need.
For SEAD Virtual Archive, I co-lead architectural, policy, and user-engagement activities and solutions. For example, I use my metadata and information organization expertise to improve the Virtual Archive data model and interface to support the dual needs of researchers and data curators. Currently, we are working on identifying layers of semantic information, such as the relationships between different data types within a dataset or the possible uses of data, which can help to describe heterogeneous or multi-variable datasets in sustainability science. I also use my expertise in social theory and the sociology of knowledge to identify barriers to data sharing and find suitable approaches to data sharing and preservation in relevant SEAD disciplines, such as ecology or hydrology.
I am also involved in the HathiTrust Research Center (HTRC) project. This project develops cyberinfrastructure to support digital humanities scholars in their work with massive amounts of text derived from the HathiTrust Digital Library. My role in this project is primarily concerned with engagement and community development. If we want this project to succeed, it is crucial to let the community know about this environment and its capabilities, gather feedback so that we can continue improving it, and perhaps learn together about innovative research opportunities in text computation.
Engagement and outreach are crucial for my data curation work. Many efforts to facilitate data sharing and preservation exist, yet sharing and preserving data is still a relatively new component of scholarship and research. I work to help articulate the value of sharing and preserving data to research communities and to create opportunities to do so. We often talk about how important it is to ground our data tools in actual use cases, so I spend a significant amount of time thinking about how to connect to researchers and their data practices, how to find “data champions,” and how to create a framework to engage researchers in data exchanges. The latter is the focus of my efforts as an active member of the new global initiative called Research Data Alliance (RDA), within which I, along with others, formed an engagement interest group. We had a successful planning meeting during the RDA plenary in March, and about 26 data practitioners and scientists attended. We shared many interesting ideas, such as creating a library of best practices and solutions. Our group is still in the early stages of development and we invite all who are interested to join. (The RDA-Engage forum thread can be found here; to subscribe to the RDA-Engage mailing list visit here.)