On December 17, 2024, the European Data Protection Board ("EDPB" or the "Board") issued Opinion 28/2024, addressing data protection aspects related to the processing of personal data in the context of artificial intelligence ("AI") models. This opinion was requested by the Irish supervisory authority and focuses on several key issues: the anonymity of AI models, the appropriateness of legitimate interest as a legal basis for processing, and the consequences of unlawful processing during the development phase of AI models.
Anonymity of AI Models
A core component of the Board's opinion is that AI models trained with personal data cannot always be considered anonymous. For an AI model to be deemed anonymous, the likelihood of extracting personal data directly or indirectly must be insignificant. This determination should be based on a case-by-case analysis, which considers all means reasonably likely to be used by the controller or another person. The EDPB provides a non-exhaustive list of methods to demonstrate anonymity, including limiting the collection of personal data, reducing identifiability, and ensuring resistance to attacks.
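One of the methods the Board lists, reducing identifiability, is often implemented in practice by scrubbing direct identifiers from training text before it reaches the model. The sketch below is purely illustrative and not a compliance measure in itself: the regex patterns are simplistic assumptions, and real pipelines use dedicated PII-detection tooling alongside the case-by-case assessment the Opinion requires.

```python
import re

# Illustrative patterns only; they catch obvious direct identifiers
# (emails, phone-like digit runs) and will miss many real-world forms.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def scrub(text: str) -> str:
    """Replace obvious direct identifiers in a training example
    with placeholder tokens, reducing identifiability of the data."""
    text = EMAIL.sub("[EMAIL]", text)
    text = PHONE.sub("[PHONE]", text)
    return text

print(scrub("Contact jane.doe@example.com or +44 20 7946 0958."))
```

Scrubbing of this kind only addresses direct extraction; demonstrating anonymity under the Opinion also requires assessing indirect means, such as the model's resistance to attacks.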
Legitimate Interest as a Lawful Basis
In addition to the issue of anonymity, the Board's opinion outlines a comprehensive framework for assessing the use of legitimate interest as a legal basis for processing personal data in the development and deployment of AI models. This framework is built upon a three-step test that ensures the processing activities align with the principles of the GDPR.
1. Identifying the Legitimate Interest
The first step of the framework involves identifying the legitimate interest pursued by the controller or relevant third party. An interest is considered legitimate if it meets three cumulative criteria: it must be i) lawful, ii) clearly and precisely articulated, and iii) real and present (in other words, not speculative). Examples of legitimate interests in the context of AI models include developing a conversational agent to assist users, improving threat detection in information systems, or detecting fraudulent content or behavior. The EDPB emphasizes that the legitimacy of an interest should be assessed on a case-by-case basis, taking into account the specific circumstances of the processing activity and, in turn, the corresponding risk associated with it.
2. Necessity Test
The second step of this framework is the necessity test, which examines whether the processing of personal data is necessary for the pursuit of the legitimate interest. This involves two key considerations: whether the processing activity will allow for the pursuit of the legitimate interest and whether there is no less intrusive way of achieving this interest. The EDPB highlights that the processing should be proportionate to the legitimate interest at stake, in line with the data minimization principle. For instance, if the purpose can be achieved through an AI model that does not process personal data, then processing personal data would not be considered necessary. The necessity test also takes into account the broader context of the processing, including whether the controller has a direct relationship with the data subjects (first-party data) or not (third-party data).
3. Balancing Test
The third step is the balancing test, which assesses whether the legitimate interest is overridden by the interests or fundamental rights and freedoms of the data subjects. This involves identifying and describing the different opposing rights and interests at stake. The EDPB provides several factors to consider in this assessment:
- Data Subjects' Interests, Fundamental Rights, and Freedoms: The processing of personal data in the development and deployment of AI models may impact data subjects' interests, such as their interest in self-determination and control over their personal data. It may also affect their fundamental rights and freedoms, including the right to privacy, freedom of expression, and non-discrimination. The EDPB notes the resultant risks may vary depending on the nature of the data processed, the context of the processing, and the potential consequences for the data subjects.
- Impact of the Processing on Data Subjects: The impact of the processing on data subjects can be positive or negative. For example, the processing may provide benefits such as improved accessibility to services or better healthcare. However, it may also pose risks such as identity theft, discrimination, or reputational harm. The EDPB emphasizes the importance of assessing the severity and likelihood of these risks, taking into account the specific circumstances of the case.
- Reasonable Expectations of the Data Subjects: The reasonable expectations of data subjects play a crucial role in the balancing test. This includes considering whether data subjects can reasonably expect their personal data to be processed for the identified purpose. Factors influencing reasonable expectations include the information provided to data subjects, the context in which the data was collected, and the nature of the relationship between the data subject and the controller.
- Mitigating Measures: When the interests, rights, and freedoms of data subjects seem to override the legitimate interest, controllers may introduce mitigating measures to limit the impact of the processing. These measures should be tailored to the specific circumstances of the case and may include technical and organizational safeguards, such as pseudonymization, data minimization, and enhanced transparency.
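Pseudonymization, one of the mitigating measures named above, can be sketched with a keyed hash: a direct identifier is replaced by an opaque reference, and re-identification requires the separately held key. The record fields and key in this sketch are hypothetical; under the GDPR pseudonymized data remains personal data, so this reduces risk in the balancing test rather than taking the processing out of scope.

```python
import hmac
import hashlib

def pseudonymize(value: str, key: bytes) -> str:
    """Replace a direct identifier with a keyed hash (HMAC-SHA-256).
    Unlike a plain hash, the secret key can be held separately from
    the data, so linking back to the identifier requires the key."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()

# Hypothetical source record: keep only the fields the model actually
# needs (data minimization) and pseudonymize the direct identifier.
record = {"email": "jane.doe@example.com", "age_band": "30-39"}
key = b"secret-key-held-separately-by-the-controller"  # illustrative only

training_record = {
    "user_ref": pseudonymize(record["email"], key),  # stable, opaque reference
    "age_band": record["age_band"],                  # coarse-grained attribute
}
```

The same input and key always yield the same reference, so records can still be linked across the training set without exposing the underlying identifier.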
By following this three-step test, controllers can ensure that the use of legitimate interest as a legal basis for processing personal data in AI models is justified and compliant with the GDPR.
Consequences of Unlawful Processing
The EDPB discusses three scenarios regarding the impact of unlawful processing during the development phase on subsequent processing or operation of the AI model:
- Scenario 1: If personal data is retained in the AI model and processed by the same controller, the lawfulness of the subsequent processing depends on whether the development and deployment phases involve separate purposes. The lack of a legal basis for the initial processing may impact the lawfulness of the subsequent processing.
- Scenario 2: If personal data is retained in the model and processed by another controller, the new controller must assess whether the AI model was developed by unlawfully processing personal data. This assessment should consider the source of the data and any findings of infringement by supervisory authorities or courts. This in turn will determine whether and how the controller may continue utilizing the AI model.
- Scenario 3: If the AI model is anonymized after unlawful processing, the GDPR would not apply to the subsequent operation of the model. However, any subsequent processing of personal data collected during the deployment phase would remain subject to the GDPR.
Best Practices for Controllers
The framework and explanations provided by the Board suggest a number of best practices for controllers (and, in practice, any business utilizing an AI model). The best practices are outlined below:
- Demonstrate Anonymity: Ensure that AI models are designed to prevent the extraction of personal data. Use state-of-the-art techniques to anonymize data, and document these measures regularly and thoroughly.
- Legitimate Interest Assessment: Conduct the three-step test outlined above thoroughly to justify reliance on legitimate interest as a legal basis. Clearly articulate the legitimate interest, establish the necessity of the processing, and perform a balancing test weighing the interests of data subjects.
- Mitigating Measures: Implement technical and organizational measures to mitigate risks to data subjects, including pseudonymization, data minimization, and transparency measures. These measures reduce the overall risk inherent in any data processing activity and help broaden the permissible uses of that data in an AI model.
- Accountability and Documentation: Maintain detailed documentation of all processing activities, including data protection impact assessments (DPIAs) and records of processing activities, and ensure these documents are readily available to supervisory authorities.
- Compliance with GDPR: Regularly review and update data protection practices to ensure ongoing compliance. This includes monitoring technological developments (including the continued evolution of AI), following decisions by relevant supervisory authorities, and keeping abreast of related legal developments in this space (including, among others, the EU's AI Act).
By adhering to these best practices, controllers can promote responsible innovation while protecting the fundamental rights and freedoms of data subjects.