Copyright issues at OpenAI and other AI startups are ramping up (2024)

Last week, Sony Music Group sent out a letter to more than 700 companies building out or using generative AI models with a clear warning: If you’re going to use Sony’s content, you better have explicit permission.

“We have reason to believe that you and/or your affiliates may already have made unauthorized uses (including TDM) of SMG Content in relation to the training, development or commercialization of AI systems,” according to the letter, a copy of which was obtained by Term Sheet. (The letter was first reported by Bloomberg.)

The letter—sent out to the key AI industry players, like OpenAI, Microsoft, Google, and YouTube, a Sony Music spokesperson confirmed—demanded that the companies provide information about any “unauthorized” usage of its content and that they “preserve all evidence” of using Sony content to train, develop, or commercialize their models.

It’s just another example of the ongoing, raging battle between generative AI companies, who are on the hunt for all the data they can get their hands on to keep improving their models, and the creators and license holders on the other end, who have a vested interest in protecting their IP—or at least getting some of these companies to pay for it.

Copyright issues have become central to the conversation around AI, mostly because we have no idea what nearly all of these companies are using to train their models. It’s not for lack of asking: When OpenAI CTO Mira Murati was asked whether her company had used YouTube, Instagram, and Facebook videos to train its Sora model, she responded that she was “not sure about that.”

What we do know is the argument that some of these companies (and a few of their investors) laid out in letters to the U.S. Copyright Office last year, explaining why they shouldn’t have to pay for copyrighted information.

“The factual metadata and fundamental information that AI models learn from training data are not protected by copyright law. Copyright law does not protect the facts, ideas, scènes à faire, artistic styles, or general concepts contained in copyrighted works,” OpenAI wrote in its letter.

Vinod Khosla, one of OpenAI’s first investors, weighed in with his own letter: “To restrict AI from training on copyrighted material would have no precedent in how other forms of intelligence that came before AI now, train. There are no authors of copyright material that did not learn from copyrighted works, be it in manuscripts, art or music. We routinely talk about the influence of a painter or writer on subsequent painters or writers. They have all learned from their predecessors. But copyrights can still be maintained. Many if not most authors or artists have talked about others that have been inspiration, influence or training materials for them.”

Many licensing companies and creators beg to disagree, as the handful of ongoing copyright lawsuits against AI companies makes clear. To name a few: The New York Times is suing OpenAI, Getty Images is suing Stability AI, and Universal Music Group is suing Anthropic. AI companies are also trying to land licensing deals: Yesterday, OpenAI announced a multi-year licensing arrangement with News Corp that will give OpenAI access to both current and archived articles from its brands including The Wall Street Journal, Barron’s, and the New York Post. In response to a request for comment for this essay, an OpenAI spokesperson also pointed to a blog post explaining how OpenAI is developing a tool that “will enable creators and content owners to tell us what they own and specify how they want their works to be included or excluded from machine learning research and training.” The blog post also says that OpenAI’s models are trained using publicly available data, data partnerships (such as licensing deals), and human feedback.

Attitudes around copyright took a drastic shift in 2022, when OpenAI released the first commercial generative AI model, according to Ed Newton-Rex, who has worked on generative AI models since 2011 at his own startup, as well as at ByteDance, and most recently at Stability AI. “Previously everyone was a bit cautious,” he said, adding: “It was almost as if the view flipped overnight.” He described it as somewhat of a “snowball effect.”

Several companies are taking a strong stance. Adobe’s Firefly, for example, is only trained on material that doesn’t violate copyright, according to the company. Newton-Rex offers certification for companies committed to using fairer data sourcing through Fairly Trained, the non-profit organization he started. There are now 14 companies that have the certification, he said.

It’s possible that some companies may be forced to make their training data available. The EU Artificial Intelligence Act (AI Act) was passed by the European Parliament earlier this year and was approved by the EU Council earlier this week. The act will require AI companies with general purpose models intended for use in the European Union to publish a “sufficiently detailed” summary of the content they use to train their models. In the U.S., Rep. Adam Schiff (D-Calif.) has proposed the Generative AI Copyright Disclosure Act, which would require companies to disclose the copyrighted work they use to train their models.

All of this may lead to even more litigation. “I think if a lot of these datasets came out, you would see a lot more lawsuits,” Newton-Rex says. “I think one thing holding the lawsuits back is a lack of knowledge about what’s in the training data.”

In other news… Techstars CEO Maëlle Gavet said in a LinkedIn post yesterday that she was resigning at the end of this month for health reasons. Techstars cofounder and Board Chairman David Cohen will return as CEO.

See you tomorrow,

Jessica Mathews
Twitter: @jessicakmathews
Email: jessica.mathews@fortune.com
Submit a deal for the Term Sheet newsletter here.

Joe Abrams curated the deals section of today’s newsletter.

VENTURE DEALS

- Finout, a Tel Aviv, Israel-based financial operations platform for businesses, raised $26.3 million in Series B funding. Red Dot Capital led the round and was joined by Maor Investments and existing investors Team 8, Pitango, and Jibe Ventures.

- SOCRadar, a Newark, Del.-based cybersecurity platform, raised $25.2 million in funding. PeakScan Capital led the round and was joined by Oxx.

- Aerodome, a Los Angeles, Calif.-based developer of drones designed for first responder duties in law enforcement, wildfire prevention, and search & rescue, raised $21.5 million in Series A funding. CRV led the round and was joined by a16z, Karman Ventures, Ford Street Ventures, and others.

- Verse, a San Francisco-based developer of software designed to help companies transition to clean energy, raised $20.5 million in Series A funding. GV led the round and was joined by Coatue, CIV, and MCJ Collective.

- Patronus AI, a New York City-based platform designed for evaluating the accuracy and security of AI models, raised $17 million in Series A funding. Glenn Solomon at Notable Capital led the round and was joined by Lightspeed Venture Partners, Datadog, Gokul Rajaram, Factorial Capital, and others.

- Bolster, a Santa Clara, Calif.-based cybersecurity company, raised $14 million in Series B funding. M12 led the round and was joined by Thomvest Ventures, Crosslink Capital, Liberty Global Ventures, Cheyenne Ventures, Cervin Ventures, and Transform Capital.

- Averlon, a Redmond, Wash.-based platform designed to use AI to identify cloud security issues, raised $8 million in seed funding. Voyager Capital led the round and was joined by Salesforce Ventures and Outpost Ventures.

- Cellugy, a Søborg, Denmark-based company producing organic cellulose material designed to replace toxic plastic in products like cosmetics, packaging, and textiles, raised €4.9 million ($5.3 million) in seed funding. ICIG Ventures and Unconventional Ventures led the round and were joined by Joyance Partners and existing investors PSV DeepTech, The Footprint Firm, and EIFO.

- OncoveryCare, a Boston, Mass.-based provider of personalized care to cancer survivors, raised $4.5 million in seed funding. .406 Ventures led the round and was joined by the McKay Institute for Oncology Transformation at Tennessee Oncology, Oncology Ventures, and Techstars.

- STON.fi, a London, U.K.-based decentralized exchange on the TON blockchain, raised $3.6 million in funding. CoinFund led the round and was joined by Delphi Ventures, Karatage, TON Ventures, and others.

- Volt, a Tulsa, Okla.-based SMS operations platform designed for businesses, raised $3 million in seed funding. Mercury Fund led the round and was joined by Atento Capital, Uncorrelated Ventures, Stout Street Capital, and Yellow Rocks.

- Infinitform, a Los Angeles, Calif.-based AI-copilot for the designing and engineering processes in manufacturing, raised $2.3 million in seed funding from Schematic Ventures.

PRIVATE EQUITY

- Ascend, backed by Pfingsten, acquired majority stakes in Rand Group Solutions, a Houston, Texas-based professional services firm, and MicroAccounting, a Dallas, Texas-based provider of accounting, enterprise resource planning, HR, and business consulting. Financial terms were not disclosed.

- BV Investment Partners acquired a majority stake in CyberSheath, a Reston, Va.-based provider of IT services to the U.S. Department of Defense. Financial terms were not disclosed.

- CORE Industrial Partners took Fathom Digital Manufacturing Corporation, a Hartland, Wis.-based digital manufacturing platform, private. Financial terms were not disclosed.

- Cresta Fund Management acquired a majority stake in Ocean Pacific, a Southern California-based developer of compressed natural gas, renewable natural gas, and hydrogen fueling stations for truck fleets.

- Renovus Capital Partners acquired a majority stake in Case Works, an Austin, Texas-based provider of client engagement and case development services for tort law firms. Financial terms were not disclosed.

- Shur-Co, a portfolio company of Behrman Capital, acquired AB Airbags, a Carlsbad, Calif.-based supplier of airbags and cargo securement products. Financial terms were not disclosed.

EXITS

- Admiral Acquisition Limited agreed to acquire Acuren, a Tomball, Texas-based provider of engineering and lab testing services, from American Securities for $1.9 billion.

- MasterBrand (NYSE: MBC) agreed to acquire Supreme Cabinetry Brands, a Waterloo, Iowa and Howard Lake, Minn.-based cabinetry company, from GHK Capital Partners for $520 million in cash.

IPOS

- Tempus AI, a Chicago, Ill.-based company that uses AI tools to improve precision medicine, filed to go public on the Nasdaq. The company posted $562 million in revenue for the year ending March 31, 2024. Eric Lefkofsky, Keeks, and Baillie Gifford & Co. back the company.

FUNDS + FUNDS OF FUNDS

- Riata Capital Group, a Dallas, Texas-based private equity firm, raised $285 million for its second fund focused on companies in the business services, consumer, and health care services sectors.

PEOPLE

- Antler, a Singapore-based venture capital firm, hired Tobias Bengtsdahl as a partner. Formerly, he was with Osito Ventures.

- Bessemer Venture Partners, a San Francisco, Calif.-based venture capital firm, promoted Christopher Wan to vice president.

- BGF, a London, U.K.-based private equity firm, hired Christopher Olds as chief operating officer. Previously, he served as chief financial officer of Ultraleap.

This is the web version of Term Sheet, a daily newsletter on the biggest deals and dealmakers in venture capital and private equity. Sign up for free.
