ETD-MS v2.0: A Proposed Extended Standard for Metadata of Electronic Theses and Dissertations

Metadata

Title

ETD-MS v2.0: A Proposed Extended Standard for Metadata of Electronic Theses and Dissertations

Description

The growth of Electronic Theses and Dissertations (ETDs) in academic repositories calls for comprehensive and robust metadata schemas that comply with the FAIR (Findability, Accessibility, Interoperability, and Reusability) principles. Dublin Core and ETD-MS v1.1 were established as metadata standards for general scholarly documents and ETDs, respectively. We identified several gaps between these existing schemas and what is needed to represent ETDs comprehensively in support of better digital services. Content-level data, including the objects that make up ETDs, are becoming increasingly crucial for developing machine learning models that mine scientific knowledge from ETDs, and for scholarly big data services in general. By organizing content-level data into a new schema, this paper addresses the critical need to enhance the expressiveness and depth of ETD metadata. The proposed schema includes a Core Component that builds on the existing ETD-MS v1.1 schema and an Extended Component that captures the objects in ETDs, their provenance, and user interactions. The schema covers 28 entities with a total of 160 metadata fields. To demonstrate its applicability, we implemented the schema in MySQL and populated it with data derived from 1,000 ETDs collected from U.S. university libraries. This work provides a comprehensive and flexible approach that addresses the limitations of existing standards by enabling the description of content-level data, laying the groundwork for integrating advanced AI techniques into academic repositories.

Date

2024