
Google’s and OpenAI’s AI Training Methods Under Fire Due To Copyright Issues


The techniques that tech giants OpenAI and Google use to train their advanced AI models have come under scrutiny following reporting by The New York Times. A central concern is the possible infringement of intellectual property rights arising from the use of transcribed YouTube videos.

The New York Times cited sources who said that Google and OpenAI have improved the capabilities of their AI systems by using YouTube video transcripts. This raises concerns about the rights of content creators on the site and about compliance with copyright law, and the practice has drawn criticism for potentially violating the intellectual property rights of YouTube creators.

The report describes the lengths to which these companies went to gather enough data to train their AI systems. OpenAI reportedly transcribed more than a million hours of YouTube video using its Whisper speech recognition technology. The transcripts were then used to train GPT-4, the company’s most recent AI model. The Information had previously reported similar practices, pointing to a trend of using freely accessible content, such as podcasts and YouTube videos, for AI training.

Notably, YouTube CEO Neal Mohan said in an interview with Bloomberg Originals that using YouTube material for training would go against the platform’s rules. Mohan voiced concern that OpenAI may be breaking those rules by using YouTube videos to train Sora, its text-to-video generator.

Google, for its part, stated its position clearly, stressing that its policies forbid any unauthorized downloading or scraping of YouTube content. Google spokesperson Matt Bryant told The New York Times that the company was unaware that OpenAI was using YouTube videos as training material and reaffirmed Google’s commitment to upholding creator rights.

The article raises questions about Google’s internal awareness of these practices, noting that no action was taken because Google itself uses YouTube videos to train its AI models. Google has defended its approach by saying that it only uses content whose creators have granted permission. Engadget has contacted Google and OpenAI for comment on these findings.

Relatedly, Google reportedly revised its privacy policy in June 2023 to address more explicitly the use of publicly accessible content, including Google Docs and Sheets, for AI training. The changes were intended to clarify the extent of data use and the kinds of material covered. Bryant said that Google uses this data only in accordance with users’ settings and within experimental features they choose to test.

The evolution of AI training techniques raises significant ethical questions about how data is used and whether tech companies should compensate content creators. As companies like Google and OpenAI push the limits of AI capabilities, striking a balance between innovation and copyright protection remains crucial for the technology industry.

