Training a ChatGPT-style model? Media organizations: you will have to pay.
AI companies may soon have to pay for the copyrighted content they use to train models such as ChatGPT.
The Financial Times reported on Friday that several of the world's largest technology companies are negotiating with major media organizations in hopes of reaching a landmark agreement on the use of news content to train AI technology.
Citing sources familiar with the matter, the report said that OpenAI, Google, Microsoft, and Adobe have met with news executives in recent months to discuss copyright issues related to AI products such as text chatbots and image generators. Publishers including News Corp, Axel Springer, The New York Times, and The Guardian are each in discussions with at least one technology company.
The sources added that the agreements may involve paying media organizations subscription fees for content used to develop the technology behind chatbots such as OpenAI's ChatGPT and Google's Bard. The discussions are said to be at an early stage.
"Your AI is infringing on my copyright"
Copyright has been a concern ever since ChatGPT became popular worldwide. AI models require large amounts of data and content for training, and much of that material is copyrighted. Whether AI companies should pay to use it has become a focus of debate abroad.
The Wall Street Journal reported in March of this year that US-based News Corp, which owns The New York Post, Barron's, and The Wall Street Journal, is preparing to seek compensation through legal action from AI developers such as OpenAI, Microsoft, and Google.
The report cited a person familiar with the media alliance organization as saying that some news executives have recently been studying the extent to which their content is used to train AI tools such as ChatGPT and Bard, and are exploring how to obtain compensation through legal means. This person said:
"We have valuable content that is the result of hard human work, but this content is constantly being used to create revenue for others, so compensation must be obtained."
Analysts say the emergence of AI tools has exacerbated the already tense relationship between large technology companies and the publishing industry. Publishers have long relied on technology companies such as Google and Meta to bring their content to a wider audience. At the same time, more and more publishers are demanding that technology companies pay for using that content.
Publishers are concerned that AI tools may drain their website traffic and advertising revenue. Media executives hope to avoid repeating the mistakes of the early days of the internet, when many publishers gave their articles away online for free, ultimately destroying their own business models. Google, Facebook, and other large technology groups then used this content to help build online advertising businesses worth billions of dollars.
Text copyright owners are not the only ones who are angry: image copyright owners also object that AI companies do not pay for the works used to train their models.
In February of this year, US stock image company Getty Images sued Stability AI in Delaware for infringing its image copyrights. In January, a group of artists, including painter Grzegorz Rutkowski, filed a class action lawsuit against the AI image generation companies Midjourney and Stability AI. Earlier, in November of last year, a class action lawsuit was filed in a federal court in California against Microsoft, its subsidiary GitHub, and OpenAI over the GitHub Copilot system.
In 2022, the AI image generation boom was set off by DALL-E 2 from OpenAI, Stable Diffusion from Stability AI, and Midjourney's AI image tool of the same name.
Many of the works these AI tools generate imitate the style of particular photographs or famous paintings. To enable their algorithms to imitate these styles, the companies that build the tools must first copy the works from the internet and use them to train AI models. As a result, these companies have been drawn into one copyright lawsuit after another.