The New York Times Sues OpenAI, Microsoft for "Billions" in Damages

Mikey Radio · Jan 5, 2024

* Disclaimer: I realize this topic doesn't involve TV or radio (I had another thread pulled down earlier today by the moderators for that reason), but it does involve "media," and I'm posting it here because I think it's a discussion topic that may interest others on this site.
This story originally appeared in the Washington Post. As it may be behind a paywall there for some, the link below is to the Yahoo version:

AI's future could hinge on one thorny legal question

www.yahoo.com

If a media outlet copied a bunch of New York Times stories and posted them on its site, that would probably be seen as a blatant violation of the Times's copyright.
But what about when a tech company copies those same articles, combines them with countless other copied works, and uses them to train an AI chatbot capable of conversing on almost any topic - including the ones it learned about from the Times?
That's the legal question at the heart of a lawsuit the Times filed against OpenAI and Microsoft in federal court last week, alleging that the tech firms illegally used "millions" of copyrighted Times articles to help develop the AI models behind tools such as ChatGPT and Bing.

First, like other recent AI copyright lawsuits, the Times argues that its rights were infringed when its articles were "scraped" - or digitally scanned and copied - for inclusion in the giant data sets that GPT-4 and other AI models were trained on. That's sometimes called the "input" side.
Second, the Times's lawsuit cites examples in which OpenAI's GPT-4 language model - versions of which power both ChatGPT and Bing - appeared to cough up either detailed summaries of paywalled articles, like the company's Wirecutter product reviews, or entire sections of specific Times articles. In other words, the Times alleges, the tools violated its copyright with their "output," too.

The Times did not specify the amount it is seeking, although the company estimates damages to be in the "billions." It is also asking for a permanent ban on the unlicensed use of its work. More dramatically, it asks that any existing AI models trained on Times content be destroyed.

Y2kTheNewOldies · Jan 6, 2024

ChatGPT-maker OpenAI signs deal with AP to license news stories

ChatGPT-maker OpenAI and The Associated Press said Thursday that they’ve made a deal for the artificial intelligence company to license AP’s archive of news stories.

apnews.com

OpenAI signed an agreement with the AP in the past year for access to its news content.

https://www.reuters.com/legal/microsoft-openai-hit-with-new-lawsuit-by-authors-over-ai-training-2024-01-05/

Here are more parties filing suit over how AI affects copyrights and royalties. When it comes to AI and access to content, the business model has yet to be established.

Kelly A · Jan 9, 2024

The common thread with media is that AI large language models have learned from content scraped across the Internet: all sorts of sites, social media, and traditional media. The NYT claims this is a copyright violation, and also that it enables users of Bing or ChatGPT to bypass the sites of the original author, reporter, or agency. In other words: plagiarism. As an example, let's say there's a story behind a paywall. If you ask ChatGPT to send you the first page of the story without mentioning its source, ChatGPT will send you what you asked for, so you bypass the subscription or login otherwise needed to read the original story.

boombox4 · Feb 2, 2024

I don't believe for a second that the big tech companies behind some of the AI projects (Microsoft especially) didn't have their floors of lawyers look into the possibility of legal issues before they decided to scrape the internet for info and data to use with their AI projects.

That said, if the lawsuits are successful, it will fence in the ability of AI to 'scrape and use' copyrighted materials.

Y2kTheNewOldies · Feb 3, 2024

News outlets join forces to create AI charter - ICIJ

A new set of guidelines was developed by a coalition of 17 organizations, including ICIJ.

www.icij.org

Here is one development that bears on journalism and AI.