Breaking: Comedian Sarah Kate Silverman suing ChatGPT, less 3 Weeks After Another Lawsuit about Data Privacy
New Hampshire 6.33am / London 11.33am
Comedian Sarah Kate Silverman (born December 1, 1970, in Bedford, New Hampshire) suing ChatGPT, similar lawsuit filed by her against Meta too yesterday. OpenAI has been charged with copyright infringement, accusing them of using the Sarah’s copyrighted books as training material for GPT.
More and more lawsuit to ChatGPT. June 27th, 2023, A California-based law firm is launching a class-action lawsuit against OpenAI, alleging the artificial-intelligence company that created popular chatbot ChatGPT massively violated the copyrights and privacy of countless people when it used data scraped from the internet to train its tech.
The lawsuit seeks to test out a novel legal theory — that OpenAI violated the rights of millions of internet users when it used their social media comments, blog posts, Wikipedia articles and family recipes. Clarkson, the law firm behind the suit, has previously brought large-scale class-action lawsuits on issues ranging from data breaches to false advertising.
The firm wants to represent “real people whose information was stolen and commercially misappropriated to create this very powerful technology,” said Ryan Clarkson, the firm’s managing partner.
The case was filed in federal court in the northern district of California (June 27th). A spokesman for OpenAI did not respond to a request for comment.
The lawsuit goes to the heart of a major unresolved question hanging over the surge in “generative” AI tools such as chatbots and image generators. The technology works by ingesting billions of words from the open internet and learning to build inferences between them. After consuming enough data, the resulting “large language models” can predict what to say in response to a prompt, giving them the ability to write poetry, have complex conversations and pass professional exams. But the humans who wrote those billions of words never signed off on having a company such as OpenAI use them for its own profit.
“All of that information is being taken at scale when it was never intended to be utilized by a large language model,” Clarkson said. He said he hopes to get a court to institute some guardrails on how AI algorithms are trained and how people are compensated when their data is used.
The firm already has a group of plaintiffs and is actively looking for more.
The legality of using data pulled from the public internet to train tools that could prove highly lucrative to their developers is still unclear. Some AI developers have argued that the use of data from the internet should be considered “fair use,” a concept in copyright law that creates an exception if the material is changed in a “transformative” way.
The question of fair use is “an open issue that we will be seeing play out in the courts in the months and years to come,” said Katherine Gardner, an intellectual-property lawyer at Gunderson Dettmer, a firm that mostly represents tech start-ups. Artists and other creative professionals who can show their copyrighted work was used to train the AI models could have an argument against the companies using it, but it’s less likely that people who simply posted or commented on a website would be able to win damages, she said.
“When you put content on a social media site or any site, you’re generally granting a very broad license to the site to be able to use your content in any way,” Gardner said. “It’s going to be very difficult for the ordinary end user to claim that they are entitled to any sort of payment or compensation for use of their data as part of the training.”
The suit also adds to the growing list of legal challenges to the companies building and hoping to profit from AI tech. A class-action lawsuit was filed in November against OpenAI and Microsoft for how the companies used computer code in the Microsoft-owned online coding platform GitHub to train AI tools. In February, Getty Images sued Stability AI, a smaller AI start-up, alleging it illegally used its photos to train its image-generating bot. And this month OpenAI was sued for defamation by a radio host in Georgia who said ChatGPT produced text that wrongfully accused him of fraud.
OpenAI isn’t the only company using troves of data scraped from the open internet to train their AI models. Google, Facebook, Microsoft and a growing number of other companies are all doing the same thing. But Clarkson decided to go after OpenAI because of its role in spurring its bigger rivals to push out their own AI when it captured the public’s imagination with ChatGPT last year, Clarkson said.
“They’re the company that ignited this AI arms race,” he said. “They’re the natural first target.”
OpenAI doesn’t share what kind of data went into its latest model, GPT4, but previous versions of the tech have been shown to have digested Wikipedia pages, news articles and social media comments. Chatbots from Google and other companies have used similar data sets.
Regulators are discussing enacting new laws that require more transparency from companies about what data went into their AI. It’s also possible that a court case could prompt a judge to force a company such as OpenAI to turn over information on what data it used, said Gardner, the intellectual-property lawyer.
Some companies have tried to stop AI firms from scraping their data. In April, music distributor Universal Music Group asked Apple and Spotify to block scrapers, according to the Financial Times. Social media site Reddit is shutting off access to its data stream, citing how Big Tech companies have for years scraped the comments and conversations on its site. Twitter owner Elon Musk threatened to sue Microsoft for using Twitter data it had gotten from the company to train its AI. Musk is building his own AI company.
The new class-action lawsuit against OpenAI goes further in its allegations, arguing that the company isn’t transparent enough with people who sign up to use its tools that the data they put into the model may be used to train new products that the company will make money from, such as its Plugins tool. It also alleges OpenAI doesn’t do enough to make sure children under 13 aren’t using its tools, something that other tech companies including Facebook and YouTube have been accused of over the years.
4 Months ago, a lawyer named Steven Schwartz got ChatGPT to write the submissions (plainly nonsense) and in doing so, it fabricated authority citations. When the (angry) Judge requested the cases, the lawyer went back and got ChatGPT to FABRICATE THE CASES. There’s lots of talk about how lawyers can make use of AI in their practice.
==========END————
Thank you, as always, for reading. If you have anything like a spark file, or master thought list (spark file sounds so much cooler), let me know how you use it in the comments below.
If you enjoyed this post, please share it.
If a friend sent this to you, you could subscribe here 👇. All content is free, and paid subscriptions are voluntary.
————
-prada- Adi Mulia Pradana is a Helper. Former adviser (President Indonesia) Jokowi for mapping 2-times election. I used to get paid to catch all these blunders—now I do it for free. Trying to work out what's going on, what happens next. Arch enemies of the tobacco industry, (still) survive after getting doxed. Now figure out, or, prevent catastrophic situations in the Indonesian administration from outside the government. After his mom was nearly killed by a syndicate, now I do it (catch all these blunders, especially blunders by an asshole syndicates) for free.
(Very rare compliment and initiative pledge. Thank you. Yes, even a lot of people associated me PRAVDA, not part of MIUCCIA PRADA. I’m literally asshole on debate, since in college). Especially after heated between Putin and Prigozhin. My note-live blog about Russia - Ukraine already click-read 4 millions.
=======
Thanks for reading Prada’s Newsletter. I was lured, inspired by someone writer, his post in LinkedIn months ago, “Currently after a routine daily writing newsletter in the last 10 years, my subscriber reaches 100,000. Maybe one of my subscribers is your boss.” After I get followed / subscribed by (literally) prominent AI and prominent Chief Product and Technology of mammoth global media (both: Sir, thank you so much), I try crafting more / better writing.
To get the ones who really appreciate your writing, and now prominent people appreciate my writing, priceless feeling. Prada ungated/no paywall every notes-but thank you for anyone open initiative pledge to me.
(Promoting to more engage in Substack) Seamless to listen to your favorite podcasts on Substack. You can buy a better headset to listen to a podcast here (GST DE352306207). Listeners on Apple Podcasts, Spotify, Overcast, or Pocket Casts simultaneously. podcasting can transform more of a conversation. Invite listeners to weigh in on episodes directly with you and with each other through discussion threads. At Substack, the process is to build with writers. Podcasts are an amazing feature of the Substack. I wish it had a feature to read the words we have written down without us having to do the speaking. Thanks for reading Prada’s Newsletter.