OpenAI's AI Browser Avoids News Sites Suing the Company

A new generation of AI-powered browsers, designed to act as personal agents on the internet, appears to be actively avoiding information from specific news organizations. An investigation into the behavior of ChatGPT Atlas, a browser developed by OpenAI, reveals that the tool systematically sidesteps websites of companies currently engaged in copyright litigation against its creator, raising new questions about information access and algorithmic bias.

Key Takeaways

OpenAI's browser, Atlas, demonstrates behavior that avoids direct access to news sites from companies suing OpenAI, such as The New York Times.
Instead of accessing the primary source, the AI gathers information from social media, citations, and other news outlets with existing OpenAI partnerships.
This AI agent operates by mimicking a human user, allowing it to bypass standard web protocols that block automated crawlers.
The behavior suggests a potential, built-in precautionary measure to navigate complex legal challenges surrounding AI and copyrighted content.

A New Kind of Web Navigation

AI-powered browsers represent a significant shift from traditional web surfing. Tools like ChatGPT Atlas are equipped with "agentic capabilities," meaning they are designed not just to search for information, but to perform tasks on behalf of the user, such as booking flights or making reservations. To do this, they need to navigate the web more freely than standard search engine crawlers.

Unlike older internet technology that respects `robots.txt` files—directives that tell automated bots not to access certain pages—these new AI agents operate differently. Research shows that Atlas sessions appear in server logs as normal human activity within a Chrome browser, as it is built on the open-source Chromium platform. This allows it to access pages that would typically block automated systems.

While this capability could enable more seamless user experiences, it also brings to light complex behaviors when the AI encounters sensitive territory, particularly sources of information owned by legal adversaries.

The Legal Backdrop

Several high-profile media organizations, including The New York Times and the parent company of PCMag, have filed lawsuits against OpenAI. These lawsuits allege that the AI company used millions of their copyrighted articles without permission to train its large language models, including the one that powers ChatGPT. This ongoing legal conflict forms a critical context for understanding the browser's selective information-gathering.

The Labyrinthine Path to Information

When tasked with summarizing articles from sources involved in litigation with OpenAI, the Atlas browser's behavior becomes notably evasive. Instead of going directly to the source, it embarks on a complex, roundabout journey to find the requested information elsewhere.

Case Study: PCMag

In one test, the browser was asked to summarize an article from PCMag. Rather than accessing the PCMag website, the AI agent scoured the internet for secondary references. It pieced together a summary by pulling information from social media posts discussing the article and from other news sites that had cited the original piece. The direct source was completely avoided.

Case Study: The New York Times

A similar pattern emerged with a request concerning an article from The New York Times. The AI did not visit the newspaper's website. Instead, it generated a summary based on the reporting of four different news outlets: the Guardian, the Washington Post, Reuters, and the Associated Press.

Notably, at least three of the four alternative sources used—the Guardian, the Washington Post, and the Associated Press—have existing content or search-related agreements with OpenAI, positioning them as "friendly" sources of information for the AI.

Programmed Caution or Algorithmic Bias

This behavior is being compared to a rat in a maze that has learned to avoid specific paths that lead to an electric shock. The AI appears to have an ingrained directive to steer clear of content from its parent company's legal opponents, preferring a safer, more circuitous route to fulfill a user's request.

This raises fundamental questions about the neutrality of information presented by these AI agents. If the tool is designed to favor partners and avoid litigants, users may not be receiving information from the most direct or authoritative source. This could subtly shape a user's understanding of an issue by filtering it through the lens of secondary, and potentially partnered, outlets.

"The AI's journey is far from the most direct path. It favors a safer, more AI-friendly route, which has implications for how users receive information."

Implications for the Future of Information

As AI agents become more integrated into how we access and interact with the internet, their underlying behaviors and potential biases become critically important. The ability of these tools to mimic human users allows them to operate in a gray area of web protocols, blurring the line between a user's direct action and an automated task performed on their behalf.

The key issues moving forward include:

Transparency: Will users be informed when their AI agent cannot or will not access a primary source of information, and why?
Neutrality: Can an AI agent developed by a company with extensive commercial partnerships and legal battles provide unbiased access to the web?
Regulation: How will internet norms and regulations adapt to these sophisticated agents that can bypass long-standing rules for automated bots?

The behavior of ChatGPT Atlas suggests that the digital landscape is becoming more complex. The paths to information are no longer straightforward, and the tools we use to navigate them may have their own, hidden agendas shaped by the corporate realities of their creators.

Key Takeaways

OpenAI's browser, Atlas, demonstrates behavior that avoids direct access to news sites from companies suing OpenAI, such as The New York Times.
Instead of accessing the primary source, the AI gathers information from social media, citations, and other news outlets with existing OpenAI partnerships.
This AI agent operates by mimicking a human user, allowing it to bypass standard web protocols that block automated crawlers.
The behavior suggests a potential, built-in precautionary measure to navigate complex legal challenges surrounding AI and copyrighted content.

A New Kind of Web Navigation

The Legal Backdrop

The Labyrinthine Path to Information

Case Study: PCMag

Case Study: The New York Times

Programmed Caution or Algorithmic Bias

"The AI's journey is far from the most direct path. It favors a safer, more AI-friendly route, which has implications for how users receive information."

Implications for the Future of Information

The key issues moving forward include:

Transparency: Will users be informed when their AI agent cannot or will not access a primary source of information, and why?
Neutrality: Can an AI agent developed by a company with extensive commercial partnerships and legal battles provide unbiased access to the web?
Regulation: How will internet norms and regulations adapt to these sophisticated agents that can bypass long-standing rules for automated bots?

Key Takeaways

A New Kind of Web Navigation

The Legal Backdrop

The Labyrinthine Path to Information

Case Study: PCMag

Case Study: The New York Times

Programmed Caution or Algorithmic Bias

Implications for the Future of Information

Related Articles

South African AI Tool Fights Gender Violence

AI Can Order Pizza, But It's Not Ready to Replace You Yet

Nvidia Inks Major AI Chip Deal with South Korea

Musk's Grokipedia Launches Amid Plagiarism Accusations

Key Takeaways

A New Kind of Web Navigation

The Legal Backdrop

The Labyrinthine Path to Information

Case Study: PCMag

Case Study: The New York Times

Programmed Caution or Algorithmic Bias

Implications for the Future of Information