Module 4: Metadata
Learning Objectives
- Learning Objective 1 (LO1): Summarise and present the definition, history and applications of metadata.
- Learning Objective 2 (LO2): Criticise and appraise the quality of descriptive metadata records.
- Learning Objective 3 (LO3): Describe the applications of structural metadata.
- Learning Objective 4 (LO4): Identify the applications of administrative/legal metadata.
Total Duration
8 hours 40 minutes
Learning Objective 1
LO1: Summarise and present the definition, history and applications of metadata.
Learning Activities
- Lecture (45 mins): The lecture can focus on defining metadata (Pomerantz 2015, Resource 6, is a suggested source). The history of metadata can be covered, from ancient libraries to the present. The lecture can also cover metadata types in the context of data stewardship (descriptive, structural, and administrative), as well as metadata in everyday life and in the news. The driving point is to open students up to the reality that metadata is used all the time and everywhere in everyday life, in small, invisible but meaningful ways (ranging from the contents description on a can of soup to the metadata your phone records about a call). The instructor can then move on to how metadata fits within a data steward's portfolio of tasks.
- Discussion (90 mins): Discussion about the metadata used in the Edward Snowden case (Resources 8, 9). What does the metadata reveal? How can different bits of metadata contribute to a broader picture of meaning? How does it "tell a story" (true or not), or make visible something that carries semantic meaning? The instructor can also choose another news source.
Materials to Prepare
- Lecture slides to be drawn from material in Resources 4–7.
- Facilitate discussion by presenting the Edward Snowden case or another news source and broad points about the Snowden leaks to guide the conversation.
Instructor Notes
Lecture:
- Metadata is a key tool in everyday life. It is a kind of infrastructure: it becomes visible upon breakdown.
- Go through the types of metadata and highlight their use in a data stewardship context. Outline the importance of descriptive, structural, and administrative metadata, and how they intersect with the work of data stewards, both as data curators in repositories but also as consultants on projects.
- Metadata is a crucial part of not only Findability but all elements of the FAIR principles.
- Relay the history of metadata from early libraries through the structuring of digital technologies. See Gartner 2016 and Pomerantz 2015 (Resources 6, 7) for excellent summaries on this topic.
- Highlight a few instances of metadata as a tool that enables our digital lifestyles.
Discussion:
- The goal of the discussion is to engage with a real case (Resources 8, 9) and have the learners discuss it using their understanding of metadata from the lecture. The instructor can ask the learners to consider the questions either in group discussions or alone. Drive home the point that labels are important in giving structure and description to order, meaning, or even a story (such as who you are and what you are doing, as revealed by your phone metadata).
Resources
- "Metadata MOOC". YouTube. http://www.youtube.com/playlist?list=PLkp3pG2Rd3yqfIn313V32fXG4nng9Tb-H. Accessed 28 Mar. 2025.
- Architecture for Information in Digital Libraries. 27 Mar. 2017. https://web.archive.org/web/20170327015501/http://www.dlib.org/dlib/february97/cnri/02arms1.html.
- Peralta, Eyder. "U.S. Appeals Court Overturns Decision That NSA Metadata Collection Was Illegal". NPR, 28 Aug. 2015. https://www.npr.org/sections/thetwo-way/2015/08/28/435506021/u-s-appeals-court-overturns-decision-that-nsa-metadata-collection-was-illegal.
- DDI Training Group. DDI Basics: What Is Metadata? Aug. 2021. DOI.org (Datacite). https://doi.org/10.5281/ZENODO.5180481.
- DDI Training Group. Understanding Metadata. Aug. 2024. DOI.org (Datacite). https://doi.org/10.5281/ZENODO.13310567.
- Pomerantz, Jeffrey. Metadata. The MIT Press, 2015. DOI.org (Crossref), https://doi.org/10.7551/mitpress/10237.001.0001.
- Gartner, Richard. Metadata. Springer International Publishing, 2016. DOI.org (Crossref), https://doi.org/10.1007/978-3-319-40893-4.
- MacAskill, Ewen, et al. "NSA Files Decoded: Edward Snowden's Surveillance Revelations Explained." The Guardian, 1 Nov. 2013. https://www.theguardian.com/world/interactive/2013/nov/01/snowden-nsa-files-surveillance-revelations-decoded.
- "The Secret Things You Give Away through Your Phone Metadata". PBS News, 2 Jun. 2016. https://www.pbs.org/newshour/science/your-phone-metadata-is-more-revealing-than-you-think.
Learning Objective 2
LO2: Criticise and appraise the quality of descriptive metadata records.
Learning Activities
- Lecture (45 mins): The instructor can cover the following: key-value pairs; Dublin Core elements and an introduction to descriptive schemes; the metadata record; thesauri, controlled vocabularies and classification (how to gather and disambiguate records by their attributes); data repositories; knowledge (records) organisation; domains and cross-domain integration (Data Documentation Initiative, DDI); descriptive metadata within data files (covered further under the structural metadata Learning Objective); the connection to ontologies and linked data (see the Ontologies Module of this curriculum); and the semantic triple and a brief view of the future.
- Metadata scavenger hunt part 1 (100 mins): In groups, look in a library catalogue or bibliographic database and analyse the contents of one or two records, then take up the findings together as a class. Points of interest could be how a search engine produced a particular result, which can point to issues with weighting in such a search. Consider how tags and descriptive metadata enable findability. A variation on this can be done on a platform that allows for folksonomies. If time allows, this scavenger hunt could compare and contrast records that use authority files with records that use folksonomic tagging (this could be accomplished on platforms such as YouTube or Reddit, which do not require logins).
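To make the semantic triple mentioned in the lecture outline concrete, the instructor could show a small sketch such as the one below. It is only an illustration, not a full RDF implementation: triples are modelled as plain Python tuples, the predicates are assumed to come from the Dublin Core terms namespace, and the helper names `objects` and `to_key_value` are hypothetical. The bibliographic values are taken from the Pomerantz entry in the Learning Objective 1 resources.

```python
# Illustrative sketch: a metadata record as subject-predicate-object triples.
# Plain tuples stand in for an RDF library; helper names are hypothetical.
DCTERMS = "http://purl.org/dc/terms/"  # Dublin Core terms namespace

# Subject: the DOI of Pomerantz's book (Resource 6 under Learning Objective 1).
BOOK = "https://doi.org/10.7551/mitpress/10237.001.0001"

triples = [
    (BOOK, DCTERMS + "title", "Metadata"),
    (BOOK, DCTERMS + "creator", "Pomerantz, Jeffrey"),
    (BOOK, DCTERMS + "publisher", "The MIT Press"),
    (BOOK, DCTERMS + "date", "2015"),
]


def objects(subject, predicate, graph):
    """Return every object asserted for the given subject and predicate."""
    return [o for s, p, o in graph if s == subject and p == predicate]


def to_key_value(subject, graph):
    """Collapse one subject's triples back into a flat key-value record."""
    return {p.rsplit("/", 1)[-1]: o for s, p, o in graph if s == subject}


if __name__ == "__main__":
    print(objects(BOOK, DCTERMS + "title", triples))
    print(to_key_value(BOOK, triples))
```

The second helper shows that a key-value record is a set of triples with the subject left implicit, which can be a useful bridge from descriptive records to linked data.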
Materials to Prepare
- Lecture on descriptive metadata.
- Prepare a physical or online scavenger hunt.
Instructor Notes
Lecture:
- Key-Value Pairs are a simple but flexible way to represent an object. Touch briefly on the various formats, highlighting JSON and XML. Refer to prior technologies, such as library card catalogues.
- Present Dublin Core as the foundation for contemporary description. If the instructor is comfortable, they can refer to Ontologies and the RDA format as next technological steps in the domain of descriptive metadata.
- Cover how thesauri and authority files help to disambiguate terms that might otherwise semantically overlap and how this enables searching and systems of knowledge organisation.
- Different data repositories enable and hinder data discovery in various ways. Consider how metadata interacts with these, and what best practices might improve this. This does not need to be terribly technical, but the instructor can draw attention to how data discovery is largely contingent on relatively simple search wizards that parse complex metadata representations.
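The point about key-value pairs and their formats can be demonstrated with a short sketch like the following. It assumes a record built from a handful of Dublin Core element names and serialises the same pairs as JSON and as XML using only the Python standard library. The wrapper element `record` and the function names are hypothetical choices for illustration; the values come from the Pomerantz entry in the Learning Objective 1 resources.

```python
import json
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set.
DC_NS = "http://purl.org/dc/elements/1.1/"

# One record as key-value pairs; keys are Dublin Core element names.
RECORD = {
    "title": "Metadata",
    "creator": "Pomerantz, Jeffrey",
    "publisher": "The MIT Press",
    "date": "2015",
    "identifier": "https://doi.org/10.7551/mitpress/10237.001.0001",
}


def to_json(record):
    """Serialise the key-value pairs as a JSON object."""
    return json.dumps(record, indent=2)


def to_xml(record):
    """Serialise the same pairs as XML, one dc: element per key."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("record")  # hypothetical wrapper element
    for key, value in record.items():
        ET.SubElement(root, f"{{{DC_NS}}}{key}").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_json(RECORD))
    print(to_xml(RECORD))
```

Showing both outputs side by side lets learners see that the format changes while the underlying pairs stay the same.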
Metadata scavenger hunt part 1:
- This practical exercise will help learners analyse the contents of a database and allow them to appraise the quality of the records. Try putting students on databases with different forms of digital (or even physical) object description. How does the metadata differ?
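If the scavenger hunt compares authority files with folksonomic tagging, a small sketch can show what an authority file does mechanically. The example below is a simplified assumption, not how any particular catalogue works: the authority file, its terms and the sample tags are all invented, and matching is reduced to case-insensitive lookup of known variants.

```python
# Hypothetical authority file: each preferred term lists its known variants.
AUTHORITY = {
    "Metadata": {"metadata", "meta-data", "meta data"},
    "Surveys": {"survey", "surveys", "questionnaire"},
}


def build_lookup(authority):
    """Map every variant (and the preferred term itself) to the preferred term."""
    lookup = {}
    for preferred, variants in authority.items():
        lookup[preferred.casefold()] = preferred
        for variant in variants:
            lookup[variant.casefold()] = preferred
    return lookup


def normalise(tags, lookup):
    """Split free-form tags into controlled terms and unmatched tags."""
    controlled, unmatched = set(), set()
    for tag in tags:
        key = " ".join(tag.split()).casefold()  # collapse whitespace, ignore case
        if key in lookup:
            controlled.add(lookup[key])
        else:
            unmatched.add(tag)
    return controlled, unmatched


if __name__ == "__main__":
    user_tags = ["Meta-Data", "questionnaire", "cats"]  # invented folksonomy tags
    print(normalise(user_tags, build_lookup(AUTHORITY)))
```

The unmatched tags are a useful discussion prompt: they are what a folksonomy keeps and what a controlled vocabulary would either reject or have to add.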
Resources
- Alemu, Getaneh. "Metadata Standards and Models". Encyclopedia of Libraries, Librarianship, and Information Science, Elsevier, 2025, pp. 532-45. DOI.org (Crossref), https://doi.org/10.1016/B978-0-323-95689-5.00035-3.
- Mayernik, Matthew. Metadata. 2020. https://www.isko.org/cyclo/metadata#3.2.
- Biagetti, Maria Teresa. Ontologies (as knowledge organization systems). 2020. https://www.isko.org/cyclo/ontologies.
- Rafferty, Pauline. Tagging. 2018. https://www.isko.org/cyclo/tagging.
- DDI Training Group. Representations -- Codes and Categories. Aug. 2021. DOI.org (Datacite). https://doi.org/10.5281/ZENODO.5180593.
- DDI Training Group. Foundational DDI Metadata: Unit, Unit Type, Universe, and Population. Feb. 2024. DOI.org (Datacite). https://doi.org/10.5281/ZENODO.10659410.
- DDI-CDI. https://ddialliance.org/ddi-cdi. Accessed 28 Mar. 2025.
- "Ddi-Cdi/README.Rst at Main -- Ddi-Cdi/Ddi-Cdi". GitHub. https://github.com/ddi-cdi/ddi-cdi/blob/main/README.rst. Accessed 28 Mar. 2025.
- RDF 1.1 Primer. https://www.w3.org/TR/rdf11-primer/#section-triple. Accessed 28 Mar. 2025.
Learning Objective 3
LO3: Describe the applications of structural metadata.
Learning Activities
- Lecture (90 mins): Give a walk-through of a survey data file online. How do descriptive and structural metadata make it meaningful? The instructor can help learners discuss how this is achieved or not. The instructor is invited to use Resource 4, an openly available survey conducted on people affected by various conflicts in the Balkans. This survey is interesting, as the respondents and questions are divided up in order to paint a broader picture of how people of different ethnicities and from different countries were impacted by violence, political unrest, and war.
- Learning activity: Going through a survey's structure alone or in groups (30 mins): Instruct students to search within a research data repository for an openly available social science survey. Ask the students to identify the structural metadata, and discuss how this metadata helps to give the data meaning. If the course is being held in person, students can write these down on a whiteboard, which can in turn be presented.
Materials to Prepare
- Slides based on the content of the instructor notes.
- Familiarity with Resource 4.
Instructor Notes
Lecture:
- The lecture can cover the following:
- variables and relationships—structuring datasets,
- question texts and documentation (Social sciences),
- studies, datasets, data files—meaningful descriptions for data's structure,
- databases, and
- data preservation.
- Draw attention to the importance of a codebook and supporting documentation, which helps to give meaning to various metadata. Something coded as "1" must stand for a semantic phrase in order to draw meaning from the data.
- Discuss how metadata works within a dataset in order to give the raw data their meaning. Materials from this curriculum from the modules on Data Documentation and Storage, and FAIR data can also be reused here.
- Discuss how metadata structures datasets: Question texts, variables, and relationships can be used and applied to datasets.
- Draw attention to how these metadata enable longer term preservation of data (can refer to the module on data preservation and archiving for more information).
- Using Resource 4, point out how the survey gets divided depending on the nationality of the respondent. Draw students' attention to how this categorisation (through structural metadata) provides a broader picture through the segmentation of the universe.
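The codebook point above can be illustrated with a minimal sketch of how value labels give raw codes their meaning. Everything in it is invented for teaching purposes: the variable name, question text and codes are hypothetical and are not taken from the survey in Resource 4.

```python
# Hypothetical codebook entry; the variable name, question text and codes
# are invented and do not come from any real survey.
CODEBOOK = {
    "q1_affected": {
        "label": "Affected by conflict",
        "question_text": "Were you personally affected by the conflict?",
        "values": {1: "Yes", 2: "No", 9: "No answer"},
    }
}


def decode(variable, raw_values, codebook):
    """Translate raw codes into labels; flag codes the codebook does not document."""
    labels = codebook[variable]["values"]
    return [labels.get(value, f"undocumented code {value}") for value in raw_values]


if __name__ == "__main__":
    raw_column = [1, 2, 9, 7]  # the 7 has no entry in the codebook
    print(CODEBOOK["q1_affected"]["question_text"])
    print(decode("q1_affected", raw_column, CODEBOOK))
```

The undocumented code in the sample column shows what happens when documentation is incomplete: the value is preserved, but its meaning is lost.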
Learning Activity:
- Questions to consider:
- What is the purpose of the metadata?
- What role does this metadata play in the data life cycle?
- How does the metadata ensure integrity and quality?
Resources
Lecture:
- DDI Training Group. Variables and the Variable Cascade. Aug. 2021. DOI.org (Datacite). https://doi.org/10.5281/ZENODO.5180568.
- DDI Training Group. Questions. Aug. 2021. DOI.org (Datacite). https://doi.org/10.5281/ZENODO.5180575.
- Library of Congress. Metadata Encoding and Transmission Standard: Primer and Reference Manual. 2010. https://www.loc.gov/standards/mets/METSPrimer.pdf.
- Lesschaeve, Christophe, et al. 2018 ELWar Public Opinion Survey. https://doi.org/10.7802/2396.
Learning Objective 4
LO4: Identify the applications of administrative/legal metadata.
Learning Activities
- Lecture (30 mins): The lecture can focus on file format metadata, data formats and versioning, access rights (for example, copyright, Creative Commons) and representing and implementing GDPR with respect to sensitive data.
- Metadata scavenger hunt part 2 (90 mins): Go to a data repository and walk through a record's administrative and structural metadata.
Materials to Prepare
- Slides on administrative metadata based on the content of the instructor notes.
- Have a data repository ready for the scavenger hunt. If unsure what to use, browse re3data for something suitable.
Instructor Notes
Lecture:
- Briefly discuss legal and rights metadata. This need not be a comprehensive account; it is enough to convey that these metadata are required in many representations of digital objects. Point out the difference between open and closed licences, and how their terms determine how one can interact with and use an object.
- Dating and versioning metadata enable reproducibility. Highlight how various pieces of software and systems, such as GitHub and data repositories like Dataverse, use metadata to represent the changes made to a given dataset. Point out how this metadata informs replication and reproduction studies. Moreover, this metadata enhances transparency, so that outsiders can understand how a piece of data came to look the way it does.
- Talk about file format metadata, and how description can help guide a potential re-user of data through a metadata record about it.
- To get the learning activity going, point towards some data repositories that students can search within. Suggest one of the various Dataverse installations, or searching for repositories with re3data.
- Questions for the learning activity: What are the potential challenges if this metadata is missing? Look for places where the metadata correlates with elements in the dataset documentation; for example, if the dataset files are not published, is there a clarification as to why in the documentation or in the metadata fields provided within the repository? Or the other way around? What ethical considerations need to be taken into account in these metadata descriptions? How does this contribute to the Interoperability principle of FAIR?
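To support the points on versioning and file format metadata, the sketch below shows one way administrative metadata could be recorded for each version of a file. It is an assumption-laden illustration rather than the model used by GitHub, Dataverse or any other system: the field names, the licence string and the sample file contents are all hypothetical, and a SHA-256 checksum stands in for whatever fixity mechanism a repository actually uses.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class FileVersion:
    """Hypothetical administrative metadata for one version of a data file."""
    version: str
    file_format: str  # e.g. a media type such as "text/csv"
    licence: str
    sha256: str       # checksum of the file contents


def describe(content, version, file_format, licence):
    """Build the administrative record, computing the checksum from the bytes."""
    digest = hashlib.sha256(content).hexdigest()
    return FileVersion(version, file_format, licence, digest)


def content_changed(old, new):
    """Report whether the file contents differ between two recorded versions."""
    return old.sha256 != new.sha256


if __name__ == "__main__":
    v1 = describe(b"id,age\n1,34\n", "1.0", "text/csv", "CC-BY-4.0")
    v2 = describe(b"id,age\n1,35\n", "1.1", "text/csv", "CC-BY-4.0")
    print(v1)
    print("Contents changed:", content_changed(v1, v2))
```

Comparing the two records makes the transparency argument tangible: the version number says that something changed, and the checksum lets an outsider verify it.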
Resources
Lecture:
- Wildgaard, L., et al. Open Licenses for Data, Software and Code. 1.0.0, Zenodo. https://doi.org/10.5281/zenodo.12703494.
- Creative Commons License Chooser. https://chooser-beta.creativecommons.org/.