OpenAI underpays news partners when licensing data, sources claim.

TLDR:

  • OpenAI is reportedly offering between $1 million and $5 million per year to news publishers for licensing deals to train AI models.
  • This marks one of the first indications of how much AI companies are willing to pay for copyrighted news articles.
  • Apple is also reportedly offering at least $50 million over a multiyear period to media companies for AI training data.
  • AI developers are increasingly seeking partnerships with news organizations to avoid copyright infringement and access high-quality training data.

OpenAI has been trying to establish licensing deals with news publishers to train its AI models, and according to The Information, it is offering between $1 million and $5 million per year for them. The figures are among the first indications of how much AI companies are willing to pay for copyrighted news articles. Apple has also been in talks with media companies to obtain content for AI training, reportedly offering at least $50 million over a multiyear period.

The price of training datasets for AI models varies with the provider and with the size and content of the dataset. Some datasets, like LAION, are open source and free, while others require payment. AI developers often use web crawlers to gather training data from the internet. That practice has run into obstacles, however: companies are blocking AI crawlers from accessing their data, and publishers are bringing copyright infringement claims.

To avoid these issues, AI companies are striking licensing partnerships with news organizations. OpenAI has signed deals with publishers like Axel Springer and The Associated Press to license stories for training models like GPT-4. Apple, Google, and other AI developers are seeking similar partnerships with news publishers to access high-quality training data. Some news organizations have also experimented with generative AI tools in their own newsrooms, with mixed results.