
Module 6: FAIR Data Sharing and Publication

Learning Objectives

  • Learning Objective 1 (LO1): Recognise the benefits, requirements, and limitations of sharing/publishing data.
  • Learning Objective 2 (LO2): Explain the importance of making data FAIR.
  • Learning Objective 3 (LO3): Recommend tools, workflows, and technical strategies to make research data FAIR.
  • Learning Objective 4 (LO4): Identify different forms of data publication and infrastructure to make outputs discoverable.
  • Learning Objective 5 (LO5): Apply CoreTrustSeal's requirements and CURATED checklist for datasets to certified repositories.

Total Module Duration

9 hours 10 minutes

Learning Objective 1

LO1: Recognise the benefits, requirements, and limitations of sharing/publishing data.

Learning Activities

  • Lecture (45 mins): Prepare a lecture on the benefits of data sharing/publication for researchers, the research community, funders, and the public. The lecture can also cover limitations (confidentiality, industrial exploitation, data protection) and what to do if the data cannot be made publicly available (for example, archive data and publish metadata with access rules). Requirements from institutional policies, funders and publishers (like data availability statements) should also be addressed.
  • Group Discussion (30 mins): Learners are confronted with a case in which a researcher resists sharing data and they need to provide arguments to convince the researcher. The activity can be done in small breakout groups, with results from each group discussed in plenum.

Materials to Prepare

  • Presentation slides on benefits, limitations, and requirements of data sharing and publication. Also introduce the reproducibility crisis.
  • Prepare a short input for the group task. The input should describe a detailed scenario of a researcher who is reluctant to share their data. Include some specific arguments from the researcher that the learners need to refute.

Instructor Notes

Lecture:

  • These are some of the key takeaways the instructor needs to cover in their presentation (see Resources 1–10 for information and inspiration):
    • Sharing data promotes collaboration, accelerates discoveries, and supports evidence-based decision-making across disciplines.
    • Open data allows for verification of results, reducing errors, bias, and misconduct in research and data-driven initiatives.
    • While open data drives progress, responsible sharing requires balancing openness with privacy, security, and proprietary concerns.
    • Institutions, funders, or publishers may recommend specific venues for publishing data.
    • Making sure that guidelines from different stakeholders are known and followed is important to not violate any laws or good scientific practice (see Resources 11–14 for examples).
    • Discuss briefly the reproducibility crisis. The instructor can highlight the 'publish or perish' culture that is one of the factors that has led to this crisis.
  • If the audience is known, search for relevant requirements from institution/funders/publishers for the audience (see generic example requirements under Resources).
  • For further details about repositories and data infrastructure, please also refer to the module Data Preservation and Archiving of this curriculum.

Resources

Benefits/Limitations of sharing your data—for instructors:

  1. NFDI4Chem. Data Publishing | NFDI4Chem Knowledge Base. https://knowledgebase.nfdi4chem.de/knowledge_base/docs/data_publishing/. Accessed 11 Nov. 2024.
  2. DataONE Community Engagement & Outreach Working Group. Data Sharing. 2017. https://dataoneorg.github.io/Education/lessons/02_datasharing/index.html.
  3. Colavizza, Giovanni, et al. "The Citation Advantage of Linking Publications to Research Data." PLOS ONE, edited by Jelte M. Wicherts, vol. 15, no. 4, Apr. 2020, p. e0230416. DOI.org (Crossref), https://doi.org/10.1371/journal.pone.0230416.
  4. Lortie, Christopher J. "The Early Bird Gets the Return: The Benefits of Publishing Your Data Sooner." Ecology and Evolution, vol. 11, no. 16, Aug. 2021, pp. 10736--40. DOI.org (Crossref), https://doi.org/10.1002/ece3.7853.
  5. Piwowar, Heather A., and Todd J. Vision. "Data Reuse and the Open Data Citation Advantage." PeerJ, vol. 1, Oct. 2013, p. e175. DOI.org (Crossref), https://doi.org/10.7717/peerj.175.
  6. Smith, Jade. LibGuides: Research Data Management: Why Share Research Data. 28 Mar. 2025. https://libguides.ucd.ie/data/share.
  7. GRIC UPRM. LibGuides UPRM: Research Data Management (RDM): Publishing Your Data. 25 Aug. 2023. https://libguides.uprm.edu/datamanagement-en/share.
  8. Eynden, Veerle van den. Managing and Sharing Data: Best Practice for Researchers. 3rd ed., Fully rev, UK Data Archive, 2011. Open WorldCat. https://dam.ukdataservice.ac.uk/media/622417/managingsharing.pdf.
  9. Schönbrodt, Felix, et al. Open Science - Open Data and Open Material I. 17 Oct. 2019. https://oer.vhb.org/edu-sharing/components/render/14521893-fda4-413b-ad4b-7434cb4ea983.
  10. RDA Session 8 - Persistent Identifiers, Data Citation and Open Data. Directed by CODATA, 2021. Vimeo. https://vimeo.com/620062523.

Exemplary requirements or guidelines for funding or publishing:

Funding from EU and related institutions:

  1. European Commission. Guidelines to the Rules on Open Access to Scientific Publications and Open Access to Research Data in Horizon 2020. 21 Mar. 2017. https://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-pilot-guide_en.pdf.
  2. Open Research Europe. Open Data, Software and Code Guidelines. https://open-research-europe.ec.europa.eu/for-authors/data-guidelines. Accessed 2 Apr. 2025.

National funders:

  1. Deutsche Forschungsgemeinschaft (DFG). "Providing Public Access to Research Results." Wissenschaftliche Integrität. https://wissenschaftliche-integritaet.de/en/code-of-conduct/providing-public-access-to-research-results/. Accessed 2 Apr. 2025.
    Guideline from the German Research Foundation (DFG)

Publishers:

  1. MIT Libraries. Journal Requirements | Data Management. https://libraries.mit.edu/data-management/share/journal-requirements/. Accessed 2 Apr. 2025.
    List of some guidelines from publishers

Learning Objective 2

LO2: Explain the importance of making data FAIR.

Learning Activities

  • Debate (45 mins): Should all data be FAIR? Divide learners into two groups to argue for and against.
  • Role-play (45 mins): Advocate for FAIR data to a funding agency or a sceptical researcher.
  • Mind mapping (50 mins): Create a mind map showing the connections between FAIR principles and other key concepts like metadata, data curation, and preservation (covered in other modules). Based on the mind mapping activity, identify areas where learners feel less confident about FAIR data and address them.

Materials to Prepare

  • Debate: Prepare debate topics and provide examples of ethical and legal frameworks as learner preparation when arguing for and against making all data FAIR.
  • Facilitate the role play.

Instructor Notes

Overall:

  • The focus of this learning objective is the practical case for advocating FAIR data: the why rather than the how, which is addressed in LO3. Address how FAIR data increases transparency and impact in research, as well as common concerns researchers may have about making data FAIR (such as time, cost, and privacy issues). Also cover the advantages, for example how implementing the FAIR principles helps address the reproducibility crisis (discipline-dependent).
  • Focus on balancing benefits with realistic constraints.
    • There are common concerns about working FAIR (intellectual property, data security, lack of technical skills) as well as advantages (for example, making data FAIR can help secure funding and collaborations), and the data steward will need to mediate between them. The concerns are valid, but they can be managed with the right strategies.
    • FAIR data benefits the scientific community and increases research visibility.
  • The activities encourage the learners to articulate the FAIR agenda and consider FAIR from the perspective of researchers and data stewards.
  • In this last lesson, talk about how FAIR principles are interwoven in other data stewardship concepts. Make sure learners understand how FAIR connects with broader data stewardship topics and encourage learners to identify their learning needs for future modules.
  • FAIR is a foundational concept that informs all aspects of data stewardship, and the principles will be applied repeatedly throughout the curriculum. Other lessons dive deeper into specific technical strategies for implementing FAIR. Identify, for example, how other modules in the curriculum build on the FAIR principles (metadata, data quality). This encourages learners to think critically across the curriculum and to appreciate, at a deeper level, the complexity of the tasks they work with as data stewards.

Debate:

  • The topic can be: Should all data be FAIR?
  • Guidance on preparing the debate, along with practical tips, can be found in Resource 1.

Role Play:

  • Advocate for FAIR data to a funding agency or a sceptical researcher. The "What I Need From You" activity is an engaging format for structuring the role play (Resource 2).

Mind map:

  • This activity serves as an assessment to connect different ideas. Mind mapping templates can be prepared (Resource 3) and used in the assessment activity. Prepare to discuss with the learners the areas they feel less confident about FAIR, the skills they may wish to improve and where they can learn more.

Resources

How to set up the debate and role-play activity, an instructor guide:

  1. 'Classroom Debates | Center for Innovative Teaching and Learning'. Northern Illinois University. https://www.niu.edu/citl/resources/guides/instructional-guide/classroom-debates.shtml. Accessed 24 Mar. 2025.
  2. 'What I Need From You (WINFY)'. SessionLab. https://www.sessionlab.com/methods/what-i-need-from-you-winfy. Accessed 24 Mar. 2025.

Brainstorming tools for assessment activities:

  1. "Free Online Mind Maps." Canva, https://www.canva.com/graphs/mind-maps/. Accessed 19 Mar. 2025.

Background reading for instructor preparation. These resources are also suitable for sharing with learners:

  1. "RDM Starter Kit". GO FAIR. https://www.go-fair.org/resources/rdm-starter-kit/. Accessed 19 Mar. 2025.
  2. Elixir Europe. "FAIRplus Webinar - What is the value of FAIR data?" YouTube. https://www.youtube.com/watch?v=2iWf4XtnzkI. Accessed 19 Mar. 2025.
  3. Wildgaard, Lorna, et al. Milestone: Pilot learning path for Data Stewards. Zenodo, 13 Aug. 2024. Zenodo. https://doi.org/10.5281/zenodo.13309349.
  4. 4EU+. "Research Data Management - Introduction to FAIR and Open Data." YouTube. https://www.youtube.com/watch?v=gK5ZPKVk4RA. Accessed 19 Mar. 2025.

Learning Objective 3

LO3: Recommend tools, workflows, and technical strategies to make research data FAIR.

Learning Activities

  • Discussion activity (180 mins): The activity aims to familiarise learners with tools and practical strategies to make data FAIR and identify the competencies they need to develop further to support the implementation of FAIR research data. The learners are introduced to a toolbox via a handout, which will help them identify further what tools can facilitate the achievement of each FAIR principle.

Materials to Prepare

  • Handouts: overview of tools to support FAIR data with key takeaways.

Instructor Notes

  • The goal of this learning objective is to explore tools and strategies to make data FAIR through a hands-on activity using a handout and discussion format.
  • The instructor may wish to tailor the content to the learners' specific discipline if the audience comes from a particular disciplinary background. Certain tools or strategies may be more common in some disciplines than in others.

Discussion:

  • Begin with a brief recap of the FAIR principles and the importance of making data FAIR (depending on how the instructor is teaching the module, these can be covered by materials in Learning Objective 1).
  • Present and share the handout, which provides a summary of key elements to make data FAIR (for example, (machine-readable) metadata, PIDs, repositories, open and interoperable formats, licences). The activity will give an overview of tools supporting FAIR data (for example, Zenodo, Figshare, Dataverse). The purpose is to provide a more "hands-on" introduction to workflows for metadata generation, data storage, and archiving (more information can be found in the module on Data Preservation and Archiving). It is important to spend time explaining licensing options (for instance, Creative Commons) that align with FAIR principles.
  • Key Takeaways are:
    • Not all tools are equally suitable for every type of data or research project.
    • Many tools (such as repositories, metadata creators, README file templates) exist to facilitate FAIR data; some are more complicated than others. Test the appropriateness of a tool before recommending it to learners or including it in the workshop material.
    • Choosing the right tool depends on the specific research needs.
    • Well-structured metadata is crucial for making data findable and reusable.
  • The instructor can highlight that each FAIR principle plays a crucial role in ensuring that data can be easily discovered, accessed, integrated, and reused by other researchers and stakeholders. Assigning persistent identifiers (PIDs), such as DOIs, to datasets ensures they are reliably referenced and can be located over time, which enhances the findability and credibility of research outputs. Using standardised formats and controlled vocabularies facilitates interoperability, allowing datasets to be integrated and analysed alongside other data; this is vital for collaborative research and data sharing across disciplines. Familiarity with tools and resources (for example, metadata generators, repositories, FAIR evaluation tools) is essential for effectively implementing FAIR principles in practice, and these tools help streamline the process of making data FAIR. FAIR principles are also the first step towards reproducible research.
  • Learners can also be asked to share their own inputs about other tools or strategies they know of that can help make data FAIR.
Handout for Discussion Activity

The suggested handout can be a list that prompts the learners to think about the tools and workflows for each principle or developed further as an online resource with links to examples and more information. It can also be used by the learners to assess the FAIRness of repositories and other services.

Prepare a detailed explanation of the technical strategies that can be employed for each FAIR principle. The list below can serve as a starting point:

  1. Findable
    • Persistent Identifiers (PIDs): Assign globally unique and persistent identifiers (DOIs, Handles) to your data and metadata so that it can be reliably found and referenced. Examples include:
      • DOIs (Digital Object Identifiers) for datasets (DataCite, Zenodo).
      • ORCID iDs for author identification.
    • Metadata: Create and publish rich metadata using standard schemas (e.g., Dublin Core, DataCite). Ensure metadata includes PIDs and is indexed in searchable databases and repositories.
      Tools: DataCite Metadata Generator, Zenodo, Figshare.
    • Data Catalogues/Repositories: Deposit your data in open, FAIR-compliant repositories that ensure data and metadata are indexed and easily searchable.
      Examples: Zenodo, Figshare, Dryad, Dataverse.
  2. Accessible
    • Data Licensing: Use appropriate open licences (e.g., Creative Commons licences) to make data accessible within legal and ethical constraints.
    • Tools: Creative Commons License Chooser.
    • Standardised Protocols: Provide data using open, universally accessible protocols like HTTP or FTP. Ensure metadata remains accessible even if the data itself is restricted or sensitive.
    • Repositories like Zenodo and Figshare support these protocols.
    • Authentication and Authorisation: For sensitive or restricted data, implement secure access protocols (for instance, OAuth, OpenID) and maintain controlled access to data through repositories that provide these services.
    • Example repositories: dbGaP, controlled-access repositories.
  3. Interoperable
    • Standardised Formats: Use open, non-proprietary, and widely recognised file formats that enable interoperability across platforms (for example CSV, JSON, XML, NetCDF). Ensure these formats are compatible with common tools and software used by the research community.
    • Example formats: HDF5, NetCDF (for scientific data); RDF (for semantic web).
    • Controlled Vocabularies and Ontologies: Use community-endorsed vocabularies and ontologies to describe your data consistently and unambiguously.
    • Tools: FAIRsharing, BioPortal (to find vocabularies and ontologies).
    • Linked Data and Ontologies: Make data interoperable by using linked data principles (e.g., RDF, SPARQL) and connecting datasets with ontologies.
    • Examples: OWL, RDF, SPARQL endpoints.
  4. Reusable
    • Detailed Metadata: Provide comprehensive and rich metadata that describes not just the data but also its context, provenance, and how it can be reused (methodology, instruments, software used).
    • Data Licensing: Use appropriate licences (such as Creative Commons licences) to clearly specify how your data may be reused with regard to legal and ethical constraints.
    • Tools: Metadata schemas like DataCite, Dublin Core, and discipline-specific standards like ISO 19115 for geospatial data.
    • Data Provenance: Track and document data provenance (how data was collected, processed, or transformed) so future users can assess the quality and suitability of the data for reuse.
    • Tools: ProvONE (for provenance tracking), RO-Crate (for research objects).
    • Data Versioning: Use version control to ensure that data changes are tracked and documented, enabling users to reference the exact version of data they are using.
    • Tools: Git/GitHub/GitLab, Zenodo versioning, Dataverse versioning.
    • Citable and Open Documentation: Publish clear, citable documentation about how the data was generated, what standards it follows, and how to reuse it. This may include methods papers, data dictionaries, or readme files.
    • Tools: README generator tools, Jupyter Notebooks (for reproducibility).
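To make the Findable strategies above concrete, the handout could include a short sketch of assembling a minimal metadata record along the lines of the DataCite Metadata Schema's mandatory properties. All values here (DOI, title, creator) are placeholders, and a real deposit would be validated against the official schema rather than this simplified structure:

```python
import json

def build_datacite_record(doi, title, creators, publisher, year):
    """Assemble a minimal metadata record loosely following the
    mandatory properties of the DataCite Metadata Schema.
    This is an illustrative sketch, not a validated deposit."""
    return {
        "identifier": {"identifier": doi, "identifierType": "DOI"},
        "creators": [{"name": name} for name in creators],
        "titles": [{"title": title}],
        "publisher": publisher,
        "publicationYear": str(year),
        "types": {"resourceTypeGeneral": "Dataset"},
    }

record = build_datacite_record(
    doi="10.5281/zenodo.0000000",   # placeholder DOI, not a real record
    title="Example survey dataset",
    creators=["Doe, Jane"],
    publisher="Zenodo",
    year=2025,
)
print(json.dumps(record, indent=2))
```

Walking through such a record with learners makes it tangible that "rich metadata" is structured, machine-readable content rather than free text.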
(Optional) Detailed Plan of Discussion Activity

Part A: Hands-on Exercise: Making a Dataset FAIR (60 mins)

Check that participants have internet access and their own computers with them so they can explore the tools later in the workshop.

Objective:

  • Learners will apply strategies to make a provided dataset FAIR.

Instructions:

  1. Split participants into small groups (3–5 people per group).
  2. Provide each group with a sample dataset (ideally one with minimal FAIR compliance: incomplete metadata, no PIDs, and so on).
  3. Assign each group the task of improving the dataset's FAIRness by making it:
    • Findable: Adding metadata to make the dataset findable using a tool like the DataCite Metadata Generator.
    • Accessible: Deciding on an appropriate licence using the Creative Commons License Chooser, and selecting a repository like Zenodo to host the dataset.
    • Interoperable: Converting the dataset into an interoperable format (such as from Excel to CSV or JSON).
    • Reusable: Writing a comprehensive README file explaining the data's context, collection process, and potential for reuse. Add detailed metadata to enhance reusability.
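The Interoperable step (converting a spreadsheet export into an open format) can be demonstrated with a small stand-alone sketch like the one below; the column names and values are invented sample data, not the workshop dataset itself:

```python
import csv
import io
import json

def csv_to_json(csv_text):
    """Convert a flat CSV table into JSON: one record per row,
    keyed by the header line. Values stay as strings, mirroring
    what a spreadsheet export typically contains."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows, indent=2)

# Illustrative data standing in for the sample dataset.
sample = "site,date,temperature_c\nA,2024-05-01,17.3\nB,2024-05-01,16.8\n"
print(csv_to_json(sample))
```

The point for learners is that both CSV and JSON are open, non-proprietary formats, so either output can be read without the software that produced the original spreadsheet.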

Materials to Prepare:

  • A basic dataset in Excel or CSV format (could be real or fictional).
  • Access to tools such as:
    • DataCite Metadata Generator (for metadata).
    • Zenodo or Figshare (for repository selection).
    • Creative Commons License Chooser (for licensing).
    • README file generator or templates.
    • Internet access for groups to access tools.

Part B: Evaluation and Group Presentations (60 mins)

Objective:

  • Evaluate each group's work and learn from each other's approaches.

Instructions:

  1. Each group presents their FAIRification process, explaining the changes they made to the dataset in relation to each FAIR principle.
  2. The group should walk through how they improved the findability, accessibility, interoperability, and reusability of the dataset.
  3. After presentations, facilitate a group discussion focusing on:
    • What challenges did they encounter while making the data FAIR?
    • Which tools were most useful?
    • What additional steps could be taken to further enhance the dataset's FAIRness?

Materials to Prepare:

  • Presentation slides or flip charts for each group to summarise their process.
  • A rubric or checklist to evaluate the improvements made to the datasets according to the FAIR principles.

Part C: Wrap-up and Reflection (30 mins)

Objective:

  • Reflect on the learning experience and reinforce the key takeaways.

Instructions:

Ask participants to reflect on:

  1. Which technical strategies stood out as most important for FAIR data?
  2. How would they apply these strategies in their own research or work?
  3. What future skills or tools would they like to explore in more detail?

Conclude with a Q&A session to clarify any outstanding questions about the technical strategies for making data FAIR.

Assessment:

  • Each group's dataset should be assessed based on how well they addressed each of the FAIR principles.
  • Provide feedback on their use of tools, completeness of metadata, and choice of repositories.

Optional Extension:

  • Learners could evaluate the FAIRness of their own datasets and present improvements in a follow-up session or as a homework task.

Materials to Prepare:

  • Pre-select a dataset for each group with suboptimal FAIR compliance.
  • Set up access to necessary tools (Zenodo, DataCite).
  • FAIRness checklist or rubric for the evaluation stage.
  • Prepare a slide deck or handout summarising key FAIR principles and strategies.

Resources

Input for tutorials and tools to include in the workshop:

  1. "RDM Starter Kit". GO FAIR. https://www.go-fair.org/resources/rdm-starter-kit/. Accessed 19 Mar. 2025.
  2. RDMkit. https://rdmkit.elixir-europe.org/metadata_management. Accessed 19 Mar. 2025.
  3. "DataCite Training". DataCite Support. https://support.datacite.org/docs/datacite-training. Accessed 19 Mar. 2025.
  4. Knowledge Exchange Webinar -- Persistent Identifiers (PID's) in Academia: Risk and Trust | Danish e-Infrastructure Consortium. https://www.deic.dk/da/event/knowledge-exchange-PID. Accessed 19 Mar. 2025.

Input for additional tools and platforms that assess the FAIRness of data:

  1. FAIRsharing. https://fairsharing.org/?lang=en. Accessed 19 Mar. 2025.
    This resource helps identify community standards for datasets and metadata.
  2. Assessment. https://fairplus.github.io/the-fair-cookbook/content/recipes/assessing-fairness.html. Accessed 19 Mar. 2025.
    This FAIR Cookbook recipe provides services, tools, and indicators to assess data against the FAIR principles.
  3. "F-UJI Automated FAIR Data Assessment Tool". FAIRsFAIR, 22 Sep. 2020. https://www.fairsfair.eu/f-uji-automated-fair-data-assessment-tool.
    This resource is an automated tool for assessing FAIR data objects.

Guidance on licensing data before sharing:

  1. Jasinska, Agnes, et al. Open Licences for Data. https://doi.org/10.5281/zenodo.14921877.
    Note the open learning object: an interactive checklist to help learners choose a licence before sharing data.

Background reading on suggested skills for data stewards:

  1. Wildgaard, Lorna, et al. Milestone: Pilot learning path for Data Stewards. Zenodo, 13 Aug. 2024. Zenodo. https://doi.org/10.5281/zenodo.13309349.

Learning Objective 4

LO4: Identify different forms of data publication and infrastructure to make outputs discoverable.

Learning Activities

  • Lecture (45 mins): Lecture on different forms of data publication, infrastructures, persistent identifiers. The lecture should include a live demonstration of publishing data to some demo infrastructure.
  • Discussion activity (45 mins): Learners should take a dataset from their research and:
    • Choose the best form and licence for publishing their dataset (take into consideration possible relevant laws or guidelines from institutions).
    • Find a data journal suitable for a given dataset.
    • Create a network of different related publications (datasets, research papers, software); the publications may not yet exist, but can also be planned for the future.

Materials to Prepare

  • Presentation slides on data publication, infrastructures, persistent identifiers.
  • Prepare notes and practice live demonstration of publishing a dataset to a demo repository (for example, Zenodo Sandbox, see Resource 18).
  • If available, learners should bring their own datasets for the discussion activity. If they do not have any datasets, the instructor should prepare some sample datasets that can be used by the learners.
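For the live demonstration, a minimal sketch of preparing a Zenodo Sandbox deposit may be useful as speaker notes. The deposition endpoint and metadata fields follow Zenodo's public developer documentation; the dataset details are placeholders, and the actual HTTP call is left as a comment so nothing is published by accident:

```python
import json

# Zenodo Sandbox deposition endpoint (see developers.zenodo.org).
SANDBOX_URL = "https://sandbox.zenodo.org/api/deposit/depositions"

def deposit_payload(title, description, creators, upload_type="dataset"):
    """Build the JSON body Zenodo expects when creating a deposition.
    Only a few common metadata fields are shown here."""
    return {
        "metadata": {
            "title": title,
            "description": description,
            "upload_type": upload_type,
            "creators": [{"name": name} for name in creators],
        }
    }

payload = deposit_payload(
    title="Demo dataset for the live demonstration",
    description="Published only to the Zenodo Sandbox; not a real record.",
    creators=["Doe, Jane"],  # placeholder creator
)
# A live demo would send this with a Sandbox access token, e.g.:
#   requests.post(SANDBOX_URL, params={"access_token": TOKEN}, json=payload)
print(json.dumps(payload, indent=2))
```

Rehearsing with the payload visible on a slide helps learners see that a repository deposit is, underneath the web form, just structured metadata plus files.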

Instructor Notes

Lecture:

  • Resources 1–10 already provide content (some under open licences) that can be used for creating the slides.
  • Content to include in lecture or presentation:
    • Different forms of data publication (see Resources 1–7)
      • Standalone Dataset
      • Attachment to research paper
      • Data paper/publication
    • Data Citations (see Resource 8)
    • Licences for data (see Resource 9)
    • Promote your data publication (see Resource 10)
    • Infrastructure
      • Data repositories/Data registries/Research data centres (see Resources 11–20)
        • discipline-specific
        • institutional repositories
        • general repositories
        • national & international repositories
      • How to determine trustworthy repositories (see Resources 21–26)
        • Criteria:
          • moderation of deposit
          • guarantee about sustainability
          • offering of persistent identifiers
          • listed in a registry such as re3data or FAIRsharing
          • TRUST principles
        • Certification
          • CoreTrustSeal
      • Data journals (see Resources 27–30)
    • Persistent identifiers (PID; see Resources 31–33)
      • DOI
      • ROR
      • ORCID
      • Accession numbers (for library collections)
  • If applicable, adapt the licence section to country-specific legislation.
  • If available, include research data policy from the home institution or discipline of the learners.

Discussion:

  • Discuss the advantages of different publication forms. No single form of data publication fits all needs. Researchers should select the most appropriate (or combine multiple) method(s) based on their goals, the nature of their data, and disciplinary norms:
    • Use dataset repositories when enabling direct access and reuse is a priority.
    • Publish a data paper when contextualisation and academic recognition are important.
    • Attach supplementary materials when data primarily supports a research article without independent reuse.
  • Assessing FAIRness (for instance with F-UJI or FAIReva, see Resources 34, 35) ensures compliance with best practices, making data more useful for future research and innovation.
  • A clear data licence defines how others can access, use, and build upon research. Selecting the right licence balances openness with necessary restrictions, fostering trust and ethical reuse while protecting intellectual contributions.
  • Data citations, just like research article citations, give proper credit to data creators and encourage responsible reuse.
  • Connecting datasets to code, research software, and publications documents the provenance of the research process and helps advance science. Breaking a single large journal publication into these smaller pieces values each step of the research process: the researcher does not only focus on publishing interpretations, but also puts effort into data collection and publication. This supports reproducibility, verification, and reuse.
  • If the audience is known, modify the examples for the discussion to match the discipline or institution of the learners.

Resources

Learning materials that can be reused:

  1. NFDI4Ing. Datenlebenszyklus Phase 4: Daten Teilen Und Publizieren. https://nfdi4ing.pages.rwth-aachen.de/education/education-pages/dlc-datalifecycle/html_slides/dlc4new.html#/. Accessed 2 Apr. 2025.
    Self-paced learning materials about Data Sharing & Publication (CC BY 4.0, in German language)
  2. Bezjak, Sonja, et al. The Open Science Training Handbook: Open Research Data and Materials. https://open-science-training-handbook.github.io/Open-Science-Training-Handbook_EN/02OpenScienceBasics/02OpenResearchDataAndMaterials.html. Accessed 3 Feb. 2025.
    Learning Unit about Open Data and Materials (CC0 1.0 Universal)

General information for creating the training materials:

  1. CESSDA Training Team. Data Management Expert Guide: Data Publishing Routes. https://dmeg.cessda.eu/Data-Management-Expert-Guide/6.-Archive-Publish/Data-publishing-routes. Accessed 2 Apr. 2025.
    Guide on RDM in Social Sciences: Data Publishing Routes (CC BY-SA 4.0)
  2. Leibniz Information Centre for Economics (ZBW). Selecting the Suitable Repository for Research Data | Open Economics Guide of the ZBW. https://openeconomics.zbw.eu/en/knowledgebase/selecting-the-suitable-repository-for-research-data/. Accessed 2 Apr. 2025.
    Things to keep in mind when selecting repositories (CC BY 4.0)
  3. Leibniz Information Centre for Economics (ZBW). Data Repositories and Data Portals | Open Economics Guide of the ZBW. https://openeconomics.zbw.eu/en/knowledgebase/data-repositories-and-data-portals/. Accessed 31 Jan. 2025.
    Overview of where to search for data repositories and data journals (CC BY 4.0)
  4. Universität Würzburg. Research Data Management: Data Publication and Archiving. https://www.uni-wuerzburg.de/en/rdm/information/data-publication/. Accessed 3 Feb. 2025.
    Short overview of different data publication forms
  5. Charles University Open Science Support Centre. "How to Share Research Data." Open Science Support Centre. https://openscience.cuni.cz/OSCIEN-53.html. Accessed 3 Feb. 2025.
    Short overview of different data publication forms
  6. Data Citation Synthesis Group. Joint Declaration of Data Citation Principles. Force11, 2014. DOI.org (Datacite), https://doi.org/10.25490/A97F-EGYK.
    Video including information about data citation.
  7. Licenses for Research Data | RADAR. https://radar.products.fiz-karlsruhe.de/en/radarfeatures/lizenzen-fuer-forschungsdaten. Accessed 11 Nov. 2024.
    Information about licences for research data (for Germany)
  8. CESSDA Training Team. Data Management Expert Guide: Promoting Your Data. https://dmeg.cessda.eu/Data-Management-Expert-Guide/6.-Archive-Publish/Promoting-your-data. Accessed 2 Apr. 2025.
    Guide on RDM in Social Sciences: Promoting Your Data (CC BY-SA 4.0).

Search engines to find repositories for data publication:

  1. re3data.org - Registry of Research Data Repositories. https://doi.org/10.17616/R3D.
  2. Datacite Commons. https://commons.datacite.org/repositories.
  3. FAIRsharing.org. https://fairsharing.org/search?page=1&recordType=repository.
  4. OpenDOAR (quality-assured directory of open access repositories). https://v2.sherpa.ac.uk/opendoar/about.html.
  5. ROAR - Registry of Open Access Repositories. https://roar.eprints.org/.
  6. RIsources (RI = Research Infrastructure). https://risources.dfg.de/home_en.html.

Exemplary general, interdisciplinary repositories:

  1. Zenodo. https://zenodo.org/.
  2. Zenodo Sandbox. https://sandbox.zenodo.org/.
    Can be used for testing publishing data to Zenodo without actually publishing the data.
  3. Figshare. https://figshare.com/.

Exemplary research data centres:

  1. KonsortSWD -- Research Data Centres. https://www.konsortswd.de/en/services/research/all-datacentres/.

Help for selecting trustworthy repositories; see also module Data Preservation/Archiving:

  1. Lazzeri, Emma. Update of the Study on the Readiness of Research Data and Literature Repositories to Facilitate Compliance with the Open Science Horizon Europe MGA Requirements. 1.0, Zenodo, 14 Oct. 2024. DOI.org (Datacite). https://doi.org/10.5281/ZENODO.13919643.
    Recommendations on selecting suitable repositories for research output (CC BY 4.0)
  2. Science Europe. Criteria for the Selection of Trustworthy Repositories. https://www.scienceeurope.org/media/ffkb51ei/se-rdm-template-2-criteria-for-the-selection-of-trustworthy-repositories.docx. Accessed 31 Jan. 2025.
    Short list of criteria for selecting trustworthy repositories (CC BY 4.0)
  3. De Lamotte, Frédéric, et al. Selecting a Trustworthy Subject-Specific Repository for Self-Depositing Data: Methodology and Analysis of Existing Services. Ministère de l'enseignement supérieur et de la recherche, 2024. DOI.org (Crossref), https://doi.org/10.52949/81.
    Report on how trustworthy repositories for different disciplines have been selected (CC BY-ND 4.0)
  4. Lin, Dawei, et al. "The TRUST Principles for Digital Repositories." Scientific Data, vol. 7, no. 1, May 2020, p. 144. DOI.org (Crossref), https://doi.org/10.1038/s41597-020-0486-7.
    TRUST: Guiding principles to demonstrate digital repository trustworthiness
  5. CoreTrustSeal Standards And Certification Board. CoreTrustSeal Requirements 2023-2025. V01.00, Zenodo, 5 Sept. 2022. DOI.org (Datacite). https://doi.org/10.5281/ZENODO.7051012.
  6. Witt, Michael, et al. RDA Common Descriptive Attributes of Research Data Repositories. 1 Dec. 2023. https://www.rd-alliance.org/wp-content/uploads/2024/01/RDA20Common20Descriptive20Attributes20of20Research20Data20Repositories_0.pdf.

Exemplary data journals:

  1. Nature Scientific Data. https://www.nature.com/sdata.
  2. Earth System Science Data. https://www.earth-system-science-data.net.
  3. List of data journals from forschungsdaten.org. https://www.forschungsdaten.org/index.php/Data_Journals.

Service to find an open access (data) journal in which to publish a (data) paper:

  1. B!SON -- the Open-Access journal recommender. https://service.tib.eu/bison/.

Important persistent identifiers:

  1. DOI. https://www.doi.org/.
    Identifier for various (digital) objects, widely used for scientific publications.
  2. ROR. https://ror.org/.
    Identifier for research organisations.
  3. ORCID. https://orcid.org/.
    Identifier for researchers.
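
All three identifier systems above resolve through standard HTTPS URLs. As a brief illustration (a hypothetical helper, not part of any of the listed services), the sketch below checks the basic DOI syntax (a prefix beginning with `10.`, a slash, and a registrant-chosen suffix) and builds the canonical resolver link:

```python
import re

# A DOI has the form "10.<registrant>/<suffix>"; appending it to
# https://doi.org/ yields a resolvable, citable link.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

def doi_to_url(doi: str) -> str:
    """Return the canonical https://doi.org/ resolver URL for a DOI string."""
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a valid DOI: {doi!r}")
    return f"https://doi.org/{doi}"

# Example using a DOI cited in this module:
print(doi_to_url("10.5281/ZENODO.7051012"))
# → https://doi.org/10.5281/ZENODO.7051012
```

Learners can use such a check when preparing data availability statements, since a DOI printed without its resolver URL is harder to follow up.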

Tools for FAIR assessment:

  1. FAIRsFAIR. "F-UJI Automated FAIR Data Assessment Tool". FAIRsFAIR. https://www.fairsfair.eu/f-uji-automated-fair-data-assessment-tool. Accessed 2 Apr. 2025.
  2. Aguilar Gómez, Fernando, and Isabel Bernal. "FAIR EVA: Bringing Institutional Multidisciplinary Repositories into the FAIR Picture." Scientific Data, vol. 10, no. 1, Nov. 2023, p. 764. DOI.org (Crossref), https://doi.org/10.1038/s41597-023-02652-8.

Learning Objective 5

LO5: Apply CoreTrustSeal's requirements and the CURATE(D) checklist for datasets to certified repositories.

Learning Activities

  • Lecture (20 mins): The instructor can introduce CoreTrustSeal as a certification standard for repositories and its advantages. The learners can also be introduced to the CURATE(D) checklist for curating data prior to publication.
  • Exercise (45 mins): The goal of the exercise is to apply the CoreTrustSeal requirements as a checklist for individual datasets. Query the NASA dataset repository: Ask participants to analyse which curatorial decisions NASA has made to curate their studies on the planet Venus. The exercise can be done in pairs or small groups so that there is discussion during the activity.

Materials to Prepare

  • Slides for lecture on the CoreTrustSeal.
  • General familiarity with the NASA dataset catalogue and how its datasets are curated, as well as with the CoreTrustSeal requirements for data repositories and the CURATE(D) checklist for curating data prior to publication.

Instructor Notes

Lecture:

  • The instructor can link to materials in the ontologies and metadata modules for the lecture.
  • The learners can be introduced to CoreTrustSeal by walking through its requirements. Discuss how to prepare a dataset for deposit in a CoreTrustSeal-certified repository, and how a local repository would need to be administered to meet these requirements.
  • Discuss the advantages of obtaining the CoreTrustSeal for organisations and data stewards, such as increased credibility, improved data sharing and collaboration opportunities, and enhanced user trust. Highlight case studies or examples of organisations that have successfully achieved certification and the positive impacts it had on their data management practices (Resource 7).
  • Explain the steps involved in the assessment and certification process for obtaining the CoreTrustSeal. Discuss the self-assessment tools, documentation requirements, and the role of external audits. Provide insights into how data stewards can prepare their repositories for evaluation.
  • Go through the CURATE(D) checklist: Check, Understand, Request, Augment, Transform, Evaluate, Document.
  • This can either be done by way of a theoretical discussion of each step in which a data steward would review a dataset, or practically, by means of a sample dataset, that you go through together.

Exercise:

  • For the learning activity, search for and review NASA's curatorial choices and how NASA curates its data. How would a data steward go through this dataset while checking it against the requirements for CoreTrustSeal repositories and the CURATE(D) checklist for curating datasets? A way to get the conversation going could be to look at CoreTrustSeal's guidance with respect to digital object management (p. 13 of the 2023–2025 requirements document). Does NASA note any changes to the data and metadata (versioning)? How does the repository handle provenance? With respect to preservation, does NASA document how these files will be preserved long term? Some items of the checklist are, of course, internal workflows at NASA that are not completely transparent, but try to tease out the elements of CoreTrustSeal and CURATE(D) that can be assessed from what is openly available.

Resources

Overall:

  1. CoreTrustSeal Standards And Certification Board. CoreTrustSeal Requirements 2023-2025. Zenodo, 5 Sept. 2022. DOI.org (Datacite), https://doi.org/10.5281/ZENODO.7051012.
  2. CoreTrustSeal Standards And Certification Board. CoreTrustSeal Trustworthy Data Repositories Requirements: Glossary 2023-2025. Zenodo, 5 Sept. 2022. DOI.org (Datacite), https://doi.org/10.5281/ZENODO.7051125.
  3. CoreTrustSeal Standards And Certification Board. CoreTrustSeal Trustworthy Digital Repositories Requirements 2023-2025 Extended Guidance. Zenodo, 5 Sept. 2022. DOI.org (Datacite), https://doi.org/10.5281/ZENODO.7051096.
  4. NASA dataset query. https://nssdc.gsfc.nasa.gov/nmc/DatasetQuery.jsp.
  5. "Data-Primers/Curated.Md at Main -- DataCurationNetwork/Data-Primers". GitHub. https://github.com/DataCurationNetwork/data-primers/blob/main/curated.md. Accessed 28 Mar. 2025.
  6. Data Curation CURATE(D) checklist: https://github.com/DataCurationNetwork/data-primers/blob/main/curated.md#check-step.
  7. CoreTrustSeal-AMT. https://amt.coretrustseal.org/certificates.