* Disclaimer: I realize this topic doesn't involve TV or radio (the moderators pulled another thread of mine earlier today for that reason), but it does involve "media," and I'm posting it here because I think it's a discussion topic that may interest others on this site.
This story originally appeared in the Washington Post. As it may be behind a paywall there for some, the link below is to the Yahoo version:
AI's future could hinge on one thorny legal question
www.yahoo.com
If a media outlet copied a bunch of New York Times stories and posted them on its site, that would probably be seen as a blatant violation of the Times's copyright.
But what about when a tech company copies those same articles, combines them with countless other copied works, and uses them to train an AI chatbot capable of conversing on almost any topic - including the ones it learned about from the Times?
That's the legal question at the heart of a lawsuit the Times filed against OpenAI and Microsoft in federal court last week, alleging that the tech firms illegally used "millions" of copyrighted Times articles to help develop the AI models behind tools such as ChatGPT and Bing.
First, like other recent AI copyright lawsuits, the Times argues that its rights were infringed when its articles were "scraped" - or digitally scanned and copied - for inclusion in the giant data sets that GPT-4 and other AI models were trained on. That's sometimes called the "input" side.
Second, the Times's lawsuit cites examples in which OpenAI's GPT-4 language model - versions of which power both ChatGPT and Bing - appeared to cough up either detailed summaries of paywalled articles, like the company's Wirecutter product reviews, or entire sections of specific Times articles. In other words, the Times alleges, the tools violated its copyright with their "output," too.
The Times did not specify the amount it is seeking, although the company estimates damages to be in the "billions." It is also asking for a permanent ban on the unlicensed use of its work. More dramatically, it asks that any existing AI models trained on Times content be destroyed.