Module 1: Personal Data and GDPR
Learning Objectives
- Learning Objective 1 (LO1): Recognise applicable laws and identify personal data in research settings.
- Learning Objective 2 (LO2): Apply techniques for preserving the privacy of individual data subjects.
Total Module Duration
4 hours
Learning Objective 1
LO1: Recognise applicable laws and identify personal data in research settings.
Learning Activities
- Lecture (60 mins): Define personal data in research, and explain the different types of personal data (directly identifiable and indirectly identifiable) and special categories of personal data. Highlight what the General Data Protection Regulation (GDPR) states in relation to the right to protection of personal data.
- Exercise (30 mins): The instructor provides the learners with a semi-blank list (Resource 13) and asks them to reflect on how some of this information can be combined with other data points to identify someone.
Materials to Prepare
- Lecture introducing the basics of GDPR, the purpose of GDPR and the principles for processing personal data in research.
- List of exercises along with prompts to get learners thinking.
Instructor Notes
Lecture:
- The lecture could start with a short interaction with the learners on what they think constitutes personal data, for instance: names, addresses, e-mail addresses, IP addresses. Resources 1–3 can be used as inspiration for some questions for this learning objective or Learning Objective 2 in this module.
- Outline the definition of personal data: Resource 1 has a good definition that the instructor can use as a base.
- Introduce the General Data Protection Regulation (GDPR) within the EU legal system. Highlight what it states about the key data protection principles, data subjects' rights and the obligations of data controllers (Resource 1).
- Explain the different types of personal data (directly identifiable, indirectly identifiable and quasi-identifiers) and special categories of personal data.
- Overview of typical scenarios and rules applicable to researchers and research-performing organisations:
- The rights of research participants and the obligations of the researcher (Resource 1 Module 5, Purpose of GDPR).
- The instructor should introduce the following topics:
- "Research exemptions" in the GDPR,
- Purpose and scope for Processing Personal Data,
- Legal basis for Processing Personal Data and Informed consent,
- How to manage Personal Data in a research project: building data flows, and
- How to identify applicable privacy and data protection regulations on a national and institutional level.
- The instructor can highlight national GDPR implementations in different countries or highlight specifically the implementation within their country context.
- In summary, some key concepts to cover in this lecture would be:
- Definition of personal data and applicable regulation (e.g. GDPR),
- Special categories of personal data,
- Direct vs. Indirect Identifiers,
- Quasi identifiers and privacy risks to individuals,
- Pseudonymised vs. Anonymised Data,
- Personal data in the research context.
Exercise:
- The instructor can use Resource 13 or excerpts of Resource 14 to prepare an exercise in which learners discuss the different types of personal data and how easy it is to overlook some of them. The instructor can lead a discussion on how small pieces of information, when combined, can create privacy risks for an individual. The purpose of the exercise is to deepen the learners' understanding of personal data before moving to the next module where practical aspects of preserving the privacy of data subjects will be discussed.
- This exercise is meant to introduce learners to the complexity of the GDPR without going into detail on each point, unless their home institutions or roles specifically require it and the session is designed to cover more depth. The instructor can re-use a case study with prompts to help learners make their own assessment of it. Resource 1, Module 6 on complex cases, can be used for this exercise; questions are already embedded in the cases and can serve as discussion points.
- An additional exercise can be facilitated if there is time: present various case studies (Resource 4) and discuss each case with the learners before revealing the actions the Data Protection Commission took.
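To make the combination effect concrete during the discussion, the instructor could show a short sketch like the one below. It counts how many records become unique as more quasi-identifiers are combined. The records, the column names and the helper functions `smallest_group` and `unique_rows` are all hypothetical and only serve as an illustration; this is not a full risk assessment.

```python
from collections import Counter

# Hypothetical records: no names, only seemingly harmless attributes.
records = [
    {"postcode": "1011", "birth_year": 1990, "gender": "F", "department": "Physics"},
    {"postcode": "1011", "birth_year": 1990, "gender": "F", "department": "History"},
    {"postcode": "1012", "birth_year": 1985, "gender": "M", "department": "Physics"},
    {"postcode": "1012", "birth_year": 1985, "gender": "M", "department": "Physics"},
    {"postcode": "1013", "birth_year": 1972, "gender": "F", "department": "Law"},
]


def group_sizes(rows, quasi_identifiers):
    """Count how many rows share each combination of quasi-identifier values."""
    return Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)


def smallest_group(rows, quasi_identifiers):
    """Size of the smallest group (the 'k' in k-anonymity)."""
    return min(group_sizes(rows, quasi_identifiers).values())


def unique_rows(rows, quasi_identifiers):
    """Number of rows whose combination of values occurs exactly once."""
    return sum(1 for size in group_sizes(rows, quasi_identifiers).values() if size == 1)


# Add one attribute at a time and watch individuals become singled out.
for columns in (
    ["gender"],
    ["postcode", "birth_year", "gender"],
    ["postcode", "birth_year", "gender", "department"],
):
    print(columns, "k =", smallest_group(records, columns),
          "unique =", unique_rows(records, columns))
```

With gender alone nobody is unique; with all four attributes three of the five people can be singled out, although no name or e-mail address appears in the data.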
Resources
Materials for creating the lecture slides:
- Course: GDPR 4 Data Support (English) | DANS. https://danstraining.moodlecloud.com/course/view.php?id=7. Accessed 22 Apr. 2025.
- Data Protection Quiz. https://www.urmconsulting.com/data-protection/data-protection-quiz. Accessed 22 Apr. 2025.
- General Data Protection Regulation (GDPR) Guidance Note for the Research Sector. Efamro, https://esomar.org/uploads/attachments/ckv2fj3rh001jbw3vejug72q2-efamro-esomar-gdpr-guidance-note-legal-choice.pdf.
- "Case Studies | Data Protection Commission." https://www.dataprotection.ie/dpc-guidance/dpc-case-studies. Accessed 23 Apr. 2025.
- Habraken, Anja. LibGuides: Research Integrity: C) GDPR. https://libguides.uvt.nl/researchintegrity/gdpr. Accessed 23 Apr. 2025.
For inspiration:
- EDPB Guidelines 05/2020 on consent under Regulation 2016/679. https://www.edpb.europa.eu/our-work-tools/our-documents/guidelines/guidelines-052020-consent-under-regulation-2016679_en.
- Foggetti, N., Gerin Laslier, M., Di Giorgio, S., Haile Gebreyesus, N., Müller, S., van Nieuwerburgh, I., Romier, G., Van Wezel, J., Hönegger, L., Bodlos, A., & Vernet, M. (2021). Legal and Policy Framework and Federation Blueprint. Zenodo. https://doi.org/10.5281/zenodo.5647948 pp 72-85
- Sganga, C., Gebreyesus, N. H., van Wezel, J., Foggetti, N., Amram, D., & Drago, F. (2022). EOSC-Pillar Legal Compliance Guidelines for Researchers: a Checklist (interactive digital version). Zenodo. https://doi.org/10.5281/zenodo.6327668.
- Drążewski, K., Colcelli, V., Brizioli, S., Fernandes, E., Karabuga, E., Margoni, T., & Schirru, L. (2024). D3.7 - Coordinated set of guides, fact-sheets and FAQs on ELSI aspects for Civil Servants and Policy Makers. Zenodo. https://doi.org/10.5281/zenodo.13467302 p. 12 et seq.
- European Union, 'Protecting data and opening data - General Data Protection Regulation (GDPR) as a supporter for Open Data'. https://data.europa.eu/en/publications/datastories/protecting-data-and-opening-data.
- Kuchinke, W.; EUDAT Sensitive Data Working Group. (2017). How can e-infrastructures deal with the sensitive data challenge (Working Paper) [Data set]. https://b2share.eudat.eu. https://doi.org/10.23728/B2SHARE.3D1DFB9B889C4022AE7B308DF009FCC9.
- OpenAIRE, How to deal with sensitive data. https://www.openaire.eu/sensitive-data-guide.
Input for the exercise:
- "+200 Examples of Personal Data | RGPD.COM." https://rgpd.com/basics/200-examples-of-personal-data/. Accessed 22 Apr. 2025.
- Case Studies - ODPA. https://www.odpa.gg/information-hub/books-podcasts-stories/stories/. Accessed 22 Apr. 2025.
Learning Objective 2
LO2: Apply techniques for preserving the privacy of individual data subjects.
Learning Activities
- Lecture 1 (30 mins): The lecture should discuss the practical aspects of preserving the privacy of data subjects, including how to apply privacy-preserving techniques in research and the difference between anonymisation and pseudonymisation.
- Lecture 2 (30 mins): Performing a Data Protection Impact Assessment (DPIA) and using it to conduct risk assessments of collecting and storing personal data.
- Exercise 1, case study (45 mins): Discuss a real or fictitious case and prepare a consent form.
- Exercise 2, anonymising a fictitious dataset (45 mins): Look at an example dataset, evaluate it in class for privacy concerns, and work with the anonymisation tool (Resource 18).
Materials to Prepare
- Lecture on privacy-preserving techniques such as pseudonymisation and anonymisation.
- Lecture on data protection impact assessments.
- Case study for Exercise 1 on handling personal data (Resources 16 and 17 have some cases that can be used).
- Dataset for Exercise 2, the anonymisation activity, which uses the UKDS tool (Resource 18).
Instructor Notes
Lecture 1:
- The goal of this lecture is to build a general awareness of the pros and cons of different strategies for mitigating the risk of exposing personal data, while still being able to utilise the data in research. The instructor can cover the following in the presentation:
Anonymisation and pseudonymisation:
- Anonymisation and pseudonymisation of data and the associated risks: when a dataset is considered anonymous, and how anonymisation differs from pseudonymisation (Resources 1, 2, 5).
- The misunderstandings of anonymisation and risks of re-identification (Resource 3).
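One way to illustrate why pseudonymised data remains personal data is a small re-identification sketch. The example below assumes a dataset in which e-mail addresses were replaced by plain, unsalted SHA-256 hashes, and an attacker who holds a list of candidate addresses. All addresses, values and names such as `released` and `candidates` are invented for the illustration.

```python
import hashlib


def sha256_hex(value: str) -> str:
    """Plain, unsalted SHA-256 hash of a string (hex encoded)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


# Hypothetical "pseudonymised" release: e-mail addresses replaced by hashes.
released = [
    {"id": sha256_hex("a.jansen@example.org"), "diagnosis": "asthma"},
    {"id": sha256_hex("b.devries@example.org"), "diagnosis": "diabetes"},
]

# The attacker only needs a list of plausible addresses, for instance a
# staff directory. Hashing each candidate rebuilds the mapping.
candidates = [
    "a.jansen@example.org",
    "c.bakker@example.org",
    "b.devries@example.org",
]
lookup = {sha256_hex(address): address for address in candidates}

# Join the released rows with the attacker's lookup table.
reidentified = {
    lookup[row["id"]]: row["diagnosis"]
    for row in released
    if row["id"] in lookup
}
print(reidentified)
```

Both records are linked back to a person, because the set of possible identifiers is small enough to enumerate. This is a useful discussion point when contrasting pseudonymisation with anonymisation.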
Advanced methods of preserving privacy in research:
If the instructor wants to go into more depth, then the following topics can be covered:
- Differential privacy in research; an overview and privacy robustness assessment of different data storage systems; federated learning and its implications for personal data protection; and secure multiparty computation. Additional material is provided in Resources 5–10.
- Introduce various techniques for anonymisation (such as data masking, aggregation, and noise addition) and pseudonymisation (such as hashing, tokenisation). Provide examples of how these techniques can be applied to different types of data, such as structured databases and unstructured text. Discuss the trade-offs between data utility and privacy.
- Best practices: running data audits to identify sensitive information, applying anonymisation and pseudonymisation techniques during data collection, and reviewing and updating methods to comply with regulations.
- The different types of storage options, databases, file storage systems for handling sensitive data.
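The techniques listed above could be demonstrated on structured data with a minimal sketch such as the following. It uses only the Python standard library (`hmac` and `hashlib`, see Resource 21). The key, the function names and the parameter choices (10-year age bands, a noise scale of 1/epsilon for a simple count) are assumptions made for the illustration and not a recommended configuration.

```python
import hashlib
import hmac
import random

# Hypothetical secret key. In practice it would be stored separately from
# the data, because whoever holds it can recompute the pseudonyms.
SECRET_KEY = b"replace-with-a-key-stored-separately"


def pseudonymise(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Pseudonymisation by keyed hashing (HMAC-SHA256)."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def mask_email(email: str) -> str:
    """Data masking: keep the first character and the domain only."""
    local, _, domain = email.partition("@")
    return local[:1] + "***@" + domain


def generalise_age(age: int, width: int = 10) -> str:
    """Aggregation/generalisation: replace an exact age with an age band."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def noisy_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Noise addition: add Laplace noise with scale 1/epsilon to a count.

    The difference of two exponential draws with rate epsilon follows a
    Laplace distribution with scale 1/epsilon.
    """
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise


row = {"email": "a.jansen@example.org", "age": 34}
print(pseudonymise(row["email"]))   # stable pseudonym, depends on the key
print(mask_email(row["email"]))     # a***@example.org
print(generalise_age(row["age"]))   # 30-39
print(noisy_count(120, epsilon=1.0, rng=random.Random()))
```

Each function trades some utility for privacy: age bands hide exact ages but blur age-related analyses, and a smaller epsilon gives more noise and therefore less precise counts. This can lead into the discussion of the utility versus privacy trade-off.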
Penalties:
- The instructor can also highlight the risks of, and fines for, breaching the GDPR (Resource 4). The examples in the resource can also serve as discussion points on cases where personal data was not processed correctly.
Lecture 2:
- The goal of the lecture is to cover the role and practical aspects of carrying out Data Protection Impact Assessments:
- The instructor introduces the DPIA, a risk assessment tool used to anticipate risks before data is collected (Resource 11).
- The instructor demonstrates how DPIAs help organisations show that they are complying with:
- principles of data protection (such as data minimisation, storage limitation),
- data subject rights (for instance access, erasure, objection), and
- accountability requirements (keeping records of decisions and assessments).
- To dive deeper into this topic:
- Discuss the risks of disclosing personal data in Horizon and other EU-funded research projects that mandate open access publishing, and how this affects the rights of data subjects.
- Ask learners to reflect on stakeholders within their organisation they could approach for more advice.
- Evaluate possible conflicts between Open Data and the GDPR, assess good practice and propose solutions (Resources 12, 13).
Exercise 1:
- The instructor should give the learners the choice to discuss either:
- A fictional case (for instance, tracking students' study habits via an app, or studying the mental health of Master's students through qualitative interviews). Some short suggestions are provided in Resource 16; the instructor may also choose a different example, OR
- A real-life example, introduced by the instructor, to discuss what types of personal data would be processed and how. Resource 17 introduces various case studies that can be shared for discussion on processing personal data.
- Once the learners have chosen their case, ask them to reflect on the data protection principles and to draft a consent form covering the various aspects of GDPR and privacy protection, such as:
- What data will be collected?
- Why is it needed?
- How will it be stored and protected?
- Who will access it?
- What rights do participants have under the GDPR?
Exercise 2:
- The instructor should find a dataset that is discipline-relevant and includes some form of personal data. A possible example could be the dataset on support for childhood vaccination in Resource 15.
- The goal is for the learners to apply the anonymisation tool to the dataset. Does it work? Given the content of the lecture, what concerns do the learners have about applying this kind of tool? What ethical issues might concern them before sharing this data openly?
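If no suitable dataset is at hand, a fictitious one can be generated for the exercise. The sketch below is one possible approach that uses only the Python standard library; Faker (Resource 20) could produce more realistic values. The name lists, column names and the function names `make_fictitious_rows` and `to_csv` are hypothetical.

```python
import csv
import io
import random

# Small, invented value lists; none of them refer to real people.
FIRST_NAMES = ["Alex", "Sam", "Noor", "Eva", "Jonas", "Mila"]
LAST_NAMES = ["Jansen", "Meyer", "Rossi", "Novak", "Dubois", "Larsen"]
CITIES = ["Utrecht", "Graz", "Lyon", "Turku", "Porto"]


def make_fictitious_rows(n: int, seed: int = 0) -> list:
    """Create n fictitious participant records, reproducible via the seed."""
    rng = random.Random(seed)
    rows = []
    for participant_id in range(1, n + 1):
        first = rng.choice(FIRST_NAMES)
        last = rng.choice(LAST_NAMES)
        rows.append({
            "participant_id": participant_id,
            "name": f"{first} {last}",
            "email": f"{first}.{last}@example.org".lower(),
            "age": rng.randint(18, 80),
            "city": rng.choice(CITIES),
            "supports_vaccination": rng.choice(["yes", "no", "unsure"]),
        })
    return rows


def to_csv(rows: list) -> str:
    """Serialise the rows as CSV text, ready to load into a tool."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


print(to_csv(make_fictitious_rows(5, seed=1)))
```

Learners can first mark which columns are direct identifiers and which are quasi-identifiers, and then compare their assessment with the result of the anonymisation tool.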
Resources
Materials for creating lecture slides:
- Course: GDPR 4 Data Support (English) | DANS. https://danstraining.moodlecloud.com/course/view.php?id=7. Accessed 22 Apr. 2025.
- UCL. "Anonymisation and Pseudonymisation." Data Protection, 24 Apr. 2019, https://www.ucl.ac.uk/data-protection/guidance-staff-students-and-researchers/practical-data-protection-guidance-notices/anonymisation-and.
- European Data Protection Supervisor, 10 Misunderstandings related to anonymisation, https://www.edps.europa.eu/system/files/2021-04/21-04-27_aepd-edps_anonymisation_en_5.pdf.
- DPM. "20 Biggest GDPR Fines so Far 2025." Data Privacy Manager, 3 Mar. 2025, https://dataprivacymanager.net/5-biggest-gdpr-fines-so-far-2020/.
- Verhenneman, Griet. (2021). How GDPR fosters pseudonymisation in academic research. The perspective of a university hospital DPO. https://www.edps.europa.eu/system/files/2021-12/06_griet_verhenneman_en.pdf.
- Erotokritou et al. (2024). Simplifying Differential Privacy for Non-Experts: The ENCRYPT Project Approach. ICCR London. https://doi.org/10.5281/zenodo.13960367.
- A Practical Beginners' Guide to Differential Privacy - CERIAS Security Seminar, Purdue University. https://www.youtube.com/watch?v=Gx13lgEudtU.
- Mollakuqe E, Hamdiu E, Fishekqiu NS et al. Comparison of cloud storage in terms of privacy and personal data - Sync, pCloud, IceDrive and Egnyte [version 1; peer review: awaiting peer review]. Open Res Europe 2024, 4:128. https://doi.org/10.12688/openreseurope.16631.1.
- European Data Protection Supervisor, Federated Learning. https://www.edps.europa.eu/press-publications/publications/techsonar/federated-learning_en.
- Zapechnikov 2022 Secure multi-party computations for privacy-preserving machine learning, Procedia Computer Science, (213)2022, 523-527, https://doi.org/10.1016/j.procs.2022.11.100.
- Data Protection Impact Assessment (DPIA). GDPR.Eu, 9 Aug. 2018, https://gdpr.eu/data-protection-impact-assessment-template/.
- Protecting Data and Opening Data | Data.Europa.Eu. https://data.europa.eu/en/publications/datastories/protecting-data-and-opening-data. Accessed 24 Apr. 2025.
- Open Data - The Turing Way. https://book.the-turing-way.org/reproducible-research/open/open-data. Accessed 15 May 2025.
- Data with personal information in DORIS | Swedish National Data Service. https://snd.se/en/doris-researchers/describe-and-share-data-doris/data-personal-information-doris.
- Chiavenna, Chiara. Replication Data for: Personal Risk or Societal Benefit? Investigating Adults' Support for COVID-19 Childhood Vaccination. Harvard Dataverse, 2022. https://doi.org/10.7910/DVN/Y3WAJL.
- DPIA - Case Studies and Examples. https://www.linkedin.com/pulse/dpia-case-studies-examples-siddharth-srinivasan. Accessed 24 Apr. 2025.
- UK Government Web Archive. https://webarchive.nationalarchives.gov.uk/ukgwa/20211223140711/https://esrc.ukri.org/funding/guidance-for-applicants/research-ethics/ethics-case-studies/. Accessed 22 Apr. 2025.
- UK Data Service. "Anonymising Qualitative Data". https://ukdataservice.ac.uk/learning-hub/research-data-management/anonymisation/anonymising-qualitative-data/.
- ARX -- Data Anonymization Tool -- A Comprehensive Software for Privacy-Preserving Microdata Publishing. https://arx.deidentifier.org/.
- Welcome to Faker's documentation! --- Faker 37.1.0 documentation. https://faker.readthedocs.io/en/master/.
- "Hashlib --- Secure Hashes and Message Digests". Python Documentation. https://docs.python.org/3/library/hashlib.html.
- Terrovitis, Manolis, Dimitris Tsitsigkos and Nikolaos Dimakopoulos. Amnesia Anonymization Tool - Data anonymization made easy. https://amnesia.openaire.eu/.