Google AI Overviews = Theft? A court ruling sets a precedent

Google’s bold new vision for the future of online search, powered by AI technology, is sparking an industry-wide backlash over fears it could damage the internet’s open ecosystem.

At the center of the controversy are Google’s recently launched ‘AI Overviews’, which are generated summaries that aim to answer searches directly by pulling information from the internet.

AI summaries appear prominently at the top of results pages, potentially limiting users’ need to click through to publishers’ websites.

The move led to legal action in France, where publishers filed lawsuits accusing Google of violating intellectual property rights by using their content to train AI models without permission.

A group of French publishers won an early lawsuit in April 2024. A judge ordered Google to negotiate fair compensation for reusing snippets of their content.

Publishers in the US are raising similar concerns as Google’s new AI search summaries threaten to pull traffic away from sources. They argue that Google unfairly profits from the content of others.

The debate highlights the need for updated frameworks for using online data in the age of AI.

Publisher concerns

According to industry observers, AI Overviews could affect millions of independent creators who rely on referral traffic from Google Search.

Frank Pine, editor-in-chief at MediaNews Group, tells The Washington Post:

“When journalists do that to each other, we call it plagiarism.”

Pine’s company, which publishes the Denver Post and the Boston Herald, is among those suing OpenAI for allegedly scraping copyrighted articles to train its language models.

Google’s revenue model has long been based on driving traffic to other websites and generating revenue through paid advertising channels.

AI overviews threaten to shift that revenue model.

Kimber Matherne, who runs a food blog, is quoted in the Post article as saying:

“[Google’s] goal is to make it as easy as possible for people to find the information they are looking for. But if you take out the people who are the lifeblood of creating that information, it’s a disservice to the world.”

According to the Post’s report, Raptive, an ad services company, estimates that the changes could result in $2 billion in lost revenue for online creators.

They also believe that some websites could lose two-thirds of their search traffic.

Raptive CEO Michael Sanchez tells The Post:

“What was already not a level playing field could result in a situation in which the open internet is in danger of surviving.”

Concerns from industry professionals

Google’s AI overviews are understandably raising concerns among industry professionals, as evidenced by numerous tweets criticizing the move.

Matt Gibbs wondered how Google developed the knowledge base for its AI, stating bluntly: “They took it from publishers who did the actual work of creating the knowledge. Google is a bunch of thieves.”

Today at the top of the article ‘Generative AI in search’ from Google.

How did they develop that knowledge base?

They took it from publishers who did the actual work of creating the knowledge.

Google is a bunch of thieves. pic.twitter.com/SIkPqtWZwa

— Matt Gibbs (@ematt) May 14, 2024

In her tweet, Kristine Schachinger echoed similar sentiments, referring to Google’s AI responses as “a complete digital theft engine that prevents sites from getting clicks in the first place.”

.@sundarpichai and @Google launch AI responses at #GoogleIO2024, also known as a complete digital theft engine that prevents sites from receiving clicks in the first place.

We need the government to step in now and apply pressure to bring the sunshine.

This is AN AI RESPONSE.
Click in. pic.twitter.com/5NNtKAURxC

— Kristine (@schachin on Threads) 🇺🇦 (@schachin) May 14, 2024

Gareth Boyd retweeted a quote from the Washington Post article highlighting the troubles of blogger Jake Boly, whose site recently saw a 96% drop in Google traffic.

Boyd said, “The precedent being set by OpenAI and Google is scary…” and that “more people should be equally angry” at both companies for the “overt theft of content.”

The precedent being set by OpenAI and Google is terrifying… more people should be as angry at OpenAI as they are at Google for its blatant content theft.

To be clear, I HATE regulation, but by the time AI is quite rightly regulated, it will be too late. https://t.co/KsbNUKopeV

— Gareth Boyd (@garethaboyd) May 15, 2024

In his tweet, Avram Piltch directly accused Google of theft, stating: “the data used to train their AI came from the same publishers who allowed Google to crawl it and will now be harmed. This is theft, plain and simple. And it is a threat to the future of the internet.”

You can say Google doesn’t “owe” publishers anything, but the data used to train their AI came from the same publishers who allowed Google to crawl them and they will now be harmed. This is theft, plain and simple. And it’s a threat to the future of the internet. https://t.co/buDZgRaSuL

— Avram Piltch (@geekinchief) May 15, 2024

Lily Ray made a similar claim about Google: “Using all the content they took from the sites Google created. With little to no attribution or traffic.”

Using all the content they took from the sites Google created. With little to no attribution or traffic. https://t.co/0sNwk2ASmT

— Lily Ray 😏 (@lilyraynyc) May 14, 2024

Legal gray area

The controversy taps into broader debates around intellectual property and fair use, as AI systems are trained on unprecedented amounts of data scraped across the internet.

Google claims that its models only include publicly available web data and that publishers previously profited from search traffic.

Publishers implicitly consent to their content being indexed by search engines unless they opt out.

However, laws were not created with training AI models in mind.

What is the path forward?

This debate highlights the need for new rules around the way AI uses online data.

The path forward is unclear, but the stakes are high.

Some suggest revenue sharing or licensing fees when using publisher content to train AI models. Others propose an opt-in system that gives website owners more control over how their content is used for AI training.

The French rulings suggest that courts may step in when clear guidelines and good-faith negotiations are absent.

The internet has always relied on a balance between search engines and content creators. If that balance is disrupted without new safeguards, it could undermine the information exchange that makes the internet so valuable.

Featured image: Veroniksha/Shutterstock
