Techno Time

Creators say they didn’t know Google uses YouTube to train AI

Thursday 19 June 2025 22:37

Google is using its expansive library of YouTube videos to train its artificial intelligence models, including Gemini and the Veo 3 video and audio generator, CNBC has learned.

The tech company is turning to its catalog of 20 billion YouTube videos to train these new-age AI tools, according to a person who was not authorized to speak publicly about the matter. Google confirmed to CNBC that it relies on its vault of YouTube videos to train its AI models, but the company said it uses only a subset of its videos for training and that it honors specific agreements with creators and media companies.

“We’ve always used YouTube content to make our products better, and this hasn’t changed with the advent of AI,” said a YouTube spokesperson in a statement. “We also recognize the need for guardrails, which is why we’ve invested in robust protections that allow creators to protect their image and likeness in the AI era — something we’re committed to continuing.”

Such use of YouTube videos has the potential to lead to an intellectual property crisis for creators and media companies, experts said.

While YouTube says it has shared this information previously, experts who spoke with CNBC said it’s not widely understood by creators and media organizations that Google is training its AI models using its video library.

YouTube didn’t say how many of the 20 billion videos on its platform are used for AI training, or which ones. But given the platform’s scale, training on just 1% of the catalog would amount to 2.3 billion minutes of content, which experts say is more than 40 times the training data used by competing AI models.
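
As a rough check on those figures, the arithmetic can be sketched in a few lines of Python. The variable names are illustrative, and the two derived values (an average video length and a ceiling on rivals' training data) are only what the reported numbers imply; neither YouTube nor CNBC stated them.

```python
# Figures as reported in the article.
catalog_videos = 20_000_000_000     # videos on YouTube
sample_percent = 1                  # hypothetical share used for training
sample_minutes = 2_300_000_000      # minutes of content in that 1% share
multiple_of_rivals = 40             # "more than 40 times" competing models' data

# 1% of the catalog, computed with integers to avoid float rounding.
sample_videos = catalog_videos * sample_percent // 100

# Implied by the reported numbers, not stated in the article:
# the average video length the 2.3 billion figure assumes.
implied_avg_minutes = sample_minutes / sample_videos

# "More than 40 times" means rivals' training data would fall below this ceiling.
implied_rival_minutes_ceiling = sample_minutes / multiple_of_rivals

print(f"Videos in a 1% sample: {sample_videos:,}")
print(f"Implied average length: {implied_avg_minutes} minutes")
print(f"Implied ceiling on rival data: {implied_rival_minutes_ceiling:,.0f} minutes")
```

Under these assumptions, a 1% sample is 200 million videos, so the 2.3 billion minute figure corresponds to an average of 11.5 minutes per video.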

The company shared in a blog post published in September that YouTube content could be used to “improve the product experience … including through machine learning and AI applications.” Users who have uploaded content to the service have no way of opting out of letting Google train on their videos.

“It’s plausible that they’re taking data from a lot of creators that have spent a lot of time and energy and their own thought to put into these videos,” said Luke Arrigoni, CEO of Loti, a company that works to protect digital identity for creators. “It’s helping the Veo 3 model make a synthetic version, a poor facsimile, of these creators. That’s not necessarily fair to them.”

CNBC spoke with multiple leading creators and IP professionals. None were aware, or had been informed by YouTube, that their content could be used to train Google’s AI models.