Meta Platforms has revealed plans to begin training its artificial intelligence (AI) models using publicly shared content from adult users in the European Union. This data will be drawn from platforms under the Meta umbrella, including Facebook, Instagram, WhatsApp and Messenger.
The technology giant stated, “We’ll use public content shared by adults on Meta’s products and services – like posts or comments – and interactions with our generative AI features, like AI stickers or Meta AI, including messages to Meta AI and prompts people share to generate images.”
The initiative comes shortly after Meta AI was introduced across Europe last month and marks the resumption of training efforts that were paused in 2024 due to regulatory concerns. By incorporating user-generated content, Meta aims to develop AI systems more attuned to the nuances of the European digital landscape.
Clarifying the scope of the project, the company said that data used for training will include public posts, comments, and user interactions with AI tools, such as queries and image-generation prompts. The intent is to enhance the performance of Meta’s AI by making it more inclusive, culturally aware, and reflective of local languages and social references.
“This training helps improve the performance, inclusiveness and safety of the AI experiences we’re building. It also helps us better understand the languages, geography and cultural references that are relevant in Europe,” Meta noted.
The company further explained that its models will benefit from a deeper understanding of the region’s linguistic diversity, cultural dynamics and online behaviours. “This includes everything from regional dialects to the distinct use of humour and sarcasm,” it said.
From this week, users in the EU will begin receiving notifications via Meta apps and email about the data collection practices. Importantly, they will be given the option to opt out through a straightforward process.
“We’ve developed a simple and accessible form that anyone in the EU, EEA or UK can use to object to our use of their information for AI training,” Meta confirmed. “We will honour all objections, including those we’ve already received.”
Meta reassured the public that data from individuals under 18, as well as private messages, will be excluded from the training datasets. It also emphasised that its methods align with practices already adopted by other leading technology companies.
“We’re doing what many others in the industry are already doing to train their AI models – using public content, including from the open web,” Meta said, citing Google and OpenAI, which also train their AI systems on publicly available content.
According to Meta, the move follows an opinion issued by the European Data Protection Board (EDPB) in December 2024, which the company says affirmed that its original approach met its legal obligations under EU law. Despite ongoing concerns voiced by privacy advocates, Meta maintains that its approach prioritises user rights and operates with transparency.