Meta has confirmed plans to utilise content shared by its adult users in the EU (European Union) to train its AI models.
The announcement follows the recent launch of Meta AI features in Europe and aims to enhance the capabilities and cultural relevance of its AI systems for the region’s diverse population.
In a statement, Meta wrote: “Today, we’re announcing our plans to train AI at Meta using public content – like public posts and comments – shared by adults on our products in the EU.
“People’s interactions with Meta AI – like questions and queries – will also be used to train and improve our models.”
Starting this week, users of Meta’s platforms (including Facebook, Instagram, WhatsApp, and Messenger) within the EU will receive notifications explaining the data usage. These notifications, delivered both in-app and via email, will detail the types of public data involved and link to an objection form.
“We have made this objection form easy to find, read, and use, and we’ll honor all objection forms we have already received, as well as newly submitted ones,” Meta explained.
Meta explicitly clarified that certain data types remain off-limits for AI training purposes.
The company says it will not “use people’s private messages with friends and family” to train its generative AI models. Furthermore, public data associated with accounts belonging to users under the age of 18 in the EU will not be included in the training datasets.
Meta wants to build AI tools designed for EU users
Meta positions this initiative as a necessary step towards creating AI tools designed for EU users. Meta launched its AI chatbot functionality across its messaging apps in Europe last month, framing this data usage as the next phase in improving the service.
“We believe we have a responsibility to build AI that’s not just available to Europeans, but is actually built for them,” the company explained.
“That means everything from dialects and colloquialisms, to hyper-local knowledge and the distinct ways different countries use humor and sarcasm on our products.”
This becomes increasingly pertinent as AI models evolve with multi-modal capabilities spanning text, voice, video, and imagery.
Meta also situated its actions in the EU within the broader industry landscape, pointing out that training AI on user data is common practice.
“It’s important to note that the kind of AI training we’re doing is not unique to Meta, nor will it be unique to Europe,” the statement reads.
“We’re following the example set by others including Google and OpenAI, both of which have already used data from European users to train their AI models.”
Meta further claimed its approach surpasses others in openness, stating, “We’re proud that our approach is more transparent than many of our industry counterparts.”
Regarding regulatory compliance, Meta referenced prior engagement with regulators, including a delay initiated last year while awaiting clarification on legal requirements. The company also cited a favourable opinion from the European Data Protection Board (EDPB) in December 2024.
“We welcome the opinion provided by the EDPB in December, which affirmed that our original approach met our legal obligations,” wrote Meta.
Broader concerns over AI training data
While Meta presents its approach in the EU as transparent and compliant, the practice of using vast swathes of public user data from social media platforms to train large language models (LLMs) and generative AI continues to raise significant concerns among privacy advocates.
Firstly, the definition of “public” data can be contentious. Content shared publicly on platforms like Facebook or Instagram may not have been posted with the expectation that it would become raw material for training commercial AI systems capable of generating entirely new content or insights. Users might share personal anecdotes, opinions, or creative works publicly within their perceived community, without envisaging its large-scale, automated analysis and repurposing by the platform owner.
Secondly, the effectiveness and fairness of an “opt-out” system versus an “opt-in” system remain debatable. Placing the onus on users to actively object, often after receiving notifications buried amongst countless others, raises questions about informed consent. Many users may not see, understand, or act upon the notification, potentially leading to their data being used by default rather than explicit permission.
Thirdly, the issue of inherent bias looms large. Social media platforms reflect and sometimes amplify societal biases, including racism, sexism, and misinformation. AI models trained on this data risk learning, replicating, and even scaling these biases. While companies employ filtering and fine-tuning techniques, eradicating bias absorbed from billions of data points is an immense challenge. An AI trained on European public data needs careful curation to avoid perpetuating stereotypes or harmful generalisations about the very cultures it aims to understand.
Furthermore, questions surrounding copyright and intellectual property persist. Public posts often contain original text, images, and videos created by users. Using this content to train commercial AI models, which may then generate competing content or derive value from it, enters murky legal territory regarding ownership and fair compensation—issues currently being contested in courts worldwide involving various AI developers.
Finally, while Meta highlights its transparency relative to competitors, the actual mechanisms of data selection, filtering, and its specific impact on model behaviour often remain opaque. Truly meaningful transparency would involve deeper insights into how specific data influences AI outputs and the safeguards in place to prevent misuse or unintended consequences.
The approach taken by Meta in the EU underscores the immense value technology giants place on user-generated content as fuel for the burgeoning AI economy. As these practices become more widespread, the debate surrounding data privacy, informed consent, algorithmic bias, and the ethical responsibilities of AI developers will undoubtedly intensify across Europe and beyond.
(Photo by Julio Lopez)
See also: Apple AI stresses privacy with synthetic and anonymised data

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with other leading events including Intelligent Automation Conference, BlockX, Digital Transformation Week, and Cyber Security & Cloud Expo.
Explore other upcoming enterprise technology events and webinars powered by TechForge here.
The post Meta will train AI models using EU user data appeared first on AI News.