Improving research data reusability through data conversations
Bridging gaps in metadata supply and demand
Abstract
Efficient and inclusive data reuse across research disciplines depends on high-quality metadata that bridge the gap between data producers and consumers. This gap, referred to as the metadata gap, arises when the metadata provided by producers do not meet the needs of consumers. Through a comprehensive analysis of metadata supply and demand, this thesis identifies the motivations and barriers faced by producers in creating metadata, along with the challenges consumers face when reusing datasets. To address these issues, the thesis introduces context-bridging data conversations, a framework designed to make metadata creation a more collaborative and adaptive process. Its proof-of-concept is built on four key mechanisms: involving consumers as co-creators, recognising and incorporating contextual metadata, leveraging real-time dialogue, and dynamically adapting metadata elicitation questions. Qualitative interviews were conducted to identify the factors that shape metadata practices, and AI-generated summaries were evaluated as a scalable tool to synthesise the insights from these conversations. The findings are applied to the data management plans of CropXR, an interdisciplinary research institute. This case study illustrates how the metadata gap analysis can identify specific areas for improvement in metadata practices and how the context-bridging data conversations framework can provide actionable recommendations to enhance data reusability. By analysing data reusability as a dynamic and context-dependent process, this thesis advances both practical methodologies and theoretical understanding of metadata management. These contributions offer actionable strategies to close the metadata gap, foster collaboration across scientific domains, and promote more efficient and inclusive research practices.