The browser is no longer just a window to the web. It's becoming an AI agent that browses on your behalf.
In the space of 15 months, we've gone from Anthropic demonstrating computer use as a research preview to Google building agentic features into the world's most popular browser. Nearly every major tech company now has some form of AI-powered browser automation, whether as a consumer product, developer tool, or enterprise API.
This is the complete guide to the agentic browser landscape as of January 2026. It expands on the timeline I presented in my "The Jungle of Optimizing for AI Agents" keynote at Conversion Hotel.
Contents
- Timeline: The Rise of Agentic Browsers
- Consumer AI Browsers
- Developer Tools & Frameworks
- Enterprise & API Solutions
- The Foundation: Model Context Protocol
- Notable Absences
- What This Means for Your Website
- What Comes Next
Timeline: The Rise of Agentic Browsers
Before diving into specific products, here's how we got here:
| Date | Milestone |
|---|---|
| Oct 2024 | Anthropic launches Computer Use public beta |
| Nov 2024 | Anthropic releases Model Context Protocol (MCP) |
| Dec 2024 | Google announces Project Mariner research prototype |
| Jan 2025 | OpenAI launches Operator with Computer-Using Agent |
| Mar 2025 | Amazon introduces Nova Act browser automation SDK |
| Mar 2025 | Microsoft releases Playwright MCP |
| Apr 2025 | Microsoft announces Copilot Studio Computer Use |
| May 2025 | Genspark launches AI browser with on-device models |
| Jun 2025 | The Browser Company launches Dia with AI features |
| Jul 2025 | Perplexity launches Comet browser |
| Jul 2025 | Microsoft releases Edge Copilot Mode |
| Aug 2025 | Anthropic releases Claude for Chrome preview |
| Sep 2025 | Opera launches Neon agentic browser |
| Sep 2025 | Atlassian acquires The Browser Company |
| Oct 2025 | OpenAI launches ChatGPT Atlas browser |
| Nov 2025 | Manus Browser Operator launches for Chrome and Edge |
| Dec 2025 | Google debuts Disco experimental AI browser |
| Jan 2026 | Chrome ships Gemini auto browse to AI Pro and AI Ultra subscribers in the US |
The pace is accelerating. What started as research demos became developer tools, then consumer products.
Consumer AI Browsers
These are the products regular users can download and use today. The landscape breaks into two categories: standalone AI-native browsers and AI features added to existing browsers.
Standalone AI Browsers
Perplexity Comet
Perplexity's entry into the browser market came in July 2025. Comet combines their search-focused AI with full browser capabilities. You can ask questions naturally, and the browser handles the research, visiting multiple sites and synthesizing information.
The agentic features go beyond search. Comet can fill forms, compare products across sites, and complete basic transactions. It's free, which makes it the most accessible entry point for users curious about agentic browsing.
ChatGPT Atlas (OpenAI)
OpenAI launched Atlas in October 2025 as a dedicated browser product. The key feature is Agent Mode, which allows the browser to execute multi-step tasks autonomously. Ask it to "find and compare flight prices to Tokyo for next month," and it opens tabs, navigates airline sites, extracts pricing, and presents a comparison.
Atlas requires a ChatGPT subscription. The agentic features build on the Computer-Using Agent technology OpenAI first introduced with Operator in January 2025.
Dia (The Browser Company / Atlassian)
Dia launched in mid-2025 from The Browser Company, the team behind Arc. The browser was designed from the ground up with AI assistance at its core, not bolted on afterward.
Things changed in September 2025 when Atlassian acquired The Browser Company. The acquisition announcement framed Dia as becoming "the browser optimized for knowledge workers," with planned enterprise features and Atlassian integration. The browser remains available, though its roadmap is now tied to Atlassian's enterprise focus.
Fellou
Fellou differentiates itself through transparency and control. Where other agentic browsers operate as black boxes, Fellou lets you visually inspect and edit its planned workflow before execution. You can intervene at any step, which addresses a common concern about autonomous agents taking unintended actions.
The browser handles logged-in sessions across platforms like Salesforce, LinkedIn, and Reddit, making it practical for research workflows that require authenticated access. Its "agentic memory" feature learns from your browsing history and notes to provide contextual assistance without repeated prompting.
Genspark
Genspark launched in May 2025 with a distinctive proposition: on-device AI models that run locally without internet connectivity. The browser includes over 169 open-weight models from providers like OpenAI, Google, and Meta, all running on your machine rather than in the cloud.
The agentic capabilities include Autopilot Mode for autonomous browsing and a Super Agent that can make phone calls, book reservations, and draft emails based on your calendar. Genspark also features an MCP Store with over 700 tool integrations. The company raised $160 million and reached a $530 million valuation, though a September 2025 security analysis flagged concerns about its vulnerability to compromised web pages.
Sigma AI Browser
Sigma AI Browser takes a privacy-first approach to agentic browsing. Its SigmaGPT assistant runs locally by default, with no tracking or cloud dependency. The browser offers full agentic capabilities including logging into sites, filling forms, extracting data, and executing multi-step tasks.
What sets Sigma apart is accessibility: the agentic features are completely free, and the browser runs on Windows, macOS, Linux, Android, and iOS. For users who want to experiment with agentic browsing without subscriptions or waitlists, Sigma provides a low-barrier entry point.
AI Features in Existing Browsers
Chrome + Gemini
Google's biggest move came on January 28, 2026, with Chrome auto browse. The feature, powered by Gemini 3, turns Chrome into an autonomous agent that can scroll, click, type, and navigate on your behalf.
Auto browse is available to Google AI Pro and AI Ultra subscribers in the US. Given Chrome's user base of roughly 3 billion, this has the potential to become the largest deployment of agentic browser technology to date, even though the initial rollout is limited to subscribers. I wrote about the implications for website owners in Chrome Just Became an AI Agent.
Google Disco
While Chrome got Gemini integration, Google's more experimental work is happening in Disco, a separate browser launched in December 2025 through Google Labs.
Disco takes a fundamentally different approach. Rather than adding AI to traditional browsing, it generates custom web applications from your open tabs. The feature, called GenTabs, analyzes what you're working on and creates interactive tools to help. Planning a trip? Disco builds a custom travel planner with maps and booking links. Researching a topic? It generates a structured dashboard pulling from all your sources.
The browser removes the traditional URL bar entirely, replacing it with a prompt composer. It's currently waitlist-only and macOS-only, serving as Google's testing ground for ideas that may eventually reach Chrome.
Edge Copilot Mode
Microsoft launched Copilot Mode for Edge in July 2025. The feature differentiates itself with multi-tab context awareness. Copilot can see all your open tabs, understanding the full context of what you're researching to provide better assistance.
Current capabilities include natural voice navigation and integrated sidebar assistance. The announcement promised "Advanced Actions" for complex tasks like booking reservations, though these remain in preview. Copilot Mode is free and opt-in, available on Edge for Windows and Mac.
Claude for Chrome
Anthropic's approach differs from the others. Rather than building a full browser, they released Claude for Chrome as an extension that brings Claude's capabilities directly into your existing browser.
The extension launched in August 2025 as a limited preview for Max plan subscribers, expanding to Pro, Team, and Enterprise plans in December 2025. It can take actions on websites, fill forms, and integrate with Claude Code for debugging workflows.
Claude for Chrome puts significant emphasis on security, with site-level permission controls and action confirmations for sensitive operations. Anthropic published their work reducing prompt injection attack success rates from 23.6% to 11.2%.
Brave Leo
Brave's Leo AI assistant has been around since 2023, offering chat capabilities, page summarization, and content generation. It's a conversational tool, not an autonomous agent. Leo remains free for basic use, with premium tiers for more capable models.
Opera AI
Opera upgraded its built-in AI in October 2025, but the focus remains on assistance rather than automation. You can chat with pages and get summaries, but Opera doesn't offer the autonomous browsing capabilities of competitors. The AI features are free.
Opera Neon
Opera's experimental work happens in Opera Neon, a separate browser that launched in September 2025 and went public in December at $19.90 per month.
Unlike Opera AI's conversational assistant, Neon is built for autonomous action. It includes four specialized agents: Neon Do for web automation tasks, Neon Make for generating code and creative content, ODRA for deep research, and a standard chat interface. The browser integrates leading models including Gemini 3 Pro and GPT-5.1.
Opera positions Neon as its testing ground for agentic features before they reach mainstream products. Some of Neon's underlying architecture has already made its way into Opera One, delivering 20% faster AI responses.
Consumer Browser Comparison
| Browser | Agentic Features | Free Tier | Platform |
|---|---|---|---|
| Perplexity Comet | Full | Yes | Standalone |
| ChatGPT Atlas | Full (Agent Mode) | No | Standalone |
| Chrome + Gemini | Full (auto browse) | Limited | Built into Chrome |
| Edge Copilot Mode | Partial | Yes | Built into Edge |
| Claude for Chrome | Full | No | Chrome extension |
| Dia | Full | Yes | Standalone |
| Fellou | Full | Unknown | Standalone |
| Genspark | Full (Autopilot) | Limited | Standalone |
| Sigma AI Browser | Full | Yes | Standalone |
| Brave Leo | No | Yes | Built into Brave |
| Opera AI | No | Yes | Built into Opera |
| Opera Neon | Full | No ($19.90/mo) | Standalone |
Developer Tools & Frameworks
For developers building browser automation into their applications, the landscape includes open-source libraries, MCP servers, and cloud infrastructure.
Open-Source Libraries
browser-use
The browser-use library has become the go-to open-source solution for AI-powered browser automation. It provides a Python and TypeScript SDK for building agents that can interact with websites using LLM-powered decision making.
The ecosystem includes stealth browser technology for bypassing anti-bot systems, session management for authenticated workflows, and both self-hosted and cloud deployment options. The library works with multiple LLM providers and has spawned a community of custom agents and integrations.
Stagehand (Browserbase)
Stagehand positions itself as "an OSS alternative to Playwright that's easier to use and lets AI reliably read and write on the web." Built by Browserbase, it combines the predictability of traditional automation with AI adaptability.
The key feature is natural language commands. Instead of writing selectors and click handlers, you describe what you want to happen. Stagehand's self-healing capabilities mean scripts continue working even when websites change their markup.
Skyvern
Skyvern focuses on enterprise automation use cases. It's a Y Combinator company building AI agents for tasks like form filling, data extraction, and workflow automation. The platform emphasizes reliability and accuracy for business-critical processes.
AgentQL and Notte
The space includes several other notable libraries. AgentQL provides a query language specifically designed for AI agents to extract structured data from web pages. Notte focuses on research agent workflows, helping developers build systems that can gather and synthesize information across multiple sources.
MCP Servers
The Model Context Protocol has become the standard for connecting AI models to external tools, including browser automation.
Microsoft Playwright MCP
Microsoft released the official Playwright MCP server in March 2025. This provides browser automation capabilities through the MCP standard, making it compatible with any AI system that supports the protocol.
The implementation uses accessibility snapshots rather than screenshots, which means it works with non-vision models and provides faster, more reliable automation. It is published as @playwright/mcp on npm.
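To make the snapshot idea concrete, here is a toy sketch of turning HTML into a text outline that a non-vision model could read. It is not Playwright MCP's actual snapshot format: the `ROLE_BY_TAG` mapping, the `OutlineBuilder` class, and the output shape are all assumptions made for illustration, and a real accessibility tree follows the ARIA specifications.

```python
from html.parser import HTMLParser

# Toy tag-to-role mapping (an assumption for illustration only).
ROLE_BY_TAG = {"a": "link", "button": "button", "h1": "heading", "h2": "heading"}


class OutlineBuilder(HTMLParser):
    """Collect one 'role "name"' line per recognised element."""

    def __init__(self):
        super().__init__()
        self.lines = []
        self._open_tag = None  # recognised element currently being read
        self._text = []        # text fragments collected inside it

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input":
            # Inputs have no inner text, so fall back to aria-label or name.
            name = attrs.get("aria-label") or attrs.get("name") or ""
            self.lines.append(f'textbox "{name}"')
        elif tag in ROLE_BY_TAG and self._open_tag is None:
            self._open_tag = tag
            self._text = []

    def handle_data(self, data):
        if self._open_tag is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == self._open_tag:
            name = " ".join("".join(self._text).split())
            self.lines.append(f'{ROLE_BY_TAG[tag]} "{name}"')
            self._open_tag = None


def outline(html: str) -> list[str]:
    builder = OutlineBuilder()
    builder.feed(html)
    return builder.lines


page = (
    '<h1>Shop</h1><div onclick="buy()">Buy now</div>'
    '<a href="/pricing">Pricing</a><input name="email"><button>Subscribe</button>'
)
print(outline(page))
# ['heading "Shop"', 'link "Pricing"', 'textbox "email"', 'button "Subscribe"']
```

Note that the clickable div never shows up in the outline. A text-based agent only sees elements that carry a recognisable role, which is one reason semantic markup matters for this style of automation.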
Community MCP Servers
The MCP ecosystem includes multiple community-built browser automation servers. These range from Puppeteer-based implementations to specialized servers for specific use cases like web scraping or form automation. The MCP server directory catalogs available options.
Cloud Browser Infrastructure
Running browser automation at scale requires infrastructure. Several companies provide cloud browsers specifically designed for AI agents.
Browserbase
The company behind Stagehand also provides cloud browser infrastructure. Browserbase offers headless browsers with anti-detection features, proxy rotation, and session management designed for AI agent workloads.
Browserless and Steel
Browserless provides headless Chrome as a service, focusing on reliability and scale for automation workloads. Steel Browser emphasizes stealth capabilities for workflows that need to avoid bot detection.
Hyperbrowser
Hyperbrowser offers managed browser infrastructure with a focus on AI agent use cases, including built-in LLM integration and natural language automation APIs.
Enterprise & API Solutions
For businesses that need browser automation integrated into their systems, several companies offer API-first solutions.
Big Tech APIs
Anthropic Claude Computer Use
Anthropic's Computer Use API launched in October 2024 as the first major commercial offering in this space. Claude can control computer interfaces through screenshots and input commands, enabling automation of any desktop or web application.
The API is available through Anthropic's platform, Amazon Bedrock, and Google Cloud's Vertex AI. Computer Use remains in beta, with Anthropic advising developers to start with low-risk tasks.
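The screenshot-and-input pattern boils down to an observe, decide, act loop. The sketch below shows only that control flow. It does not call Anthropic's API: `Action`, `run_agent`, and the three callables are hypothetical names, and the action kinds ("click", "type", "done") are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Action:
    kind: str      # hypothetical action names: "click", "type", or "done"
    payload: dict  # e.g. coordinates or text, depending on the kind


def run_agent(
    take_screenshot: Callable[[], bytes],
    choose_action: Callable[[bytes, str], Action],
    perform: Callable[[Action], None],
    goal: str,
    max_steps: int = 10,
) -> int:
    """Observe, decide, act until the model says it is done.

    Returns the number of actions that were performed.
    """
    for step in range(max_steps):
        screenshot = take_screenshot()            # observe the current screen
        action = choose_action(screenshot, goal)  # a model picks the next input
        if action.kind == "done":
            return step
        perform(action)                           # send the click or keystrokes
    return max_steps                              # step budget exhausted


# Usage with a scripted stand-in for the model:
scripted = iter([
    Action("click", {"x": 10, "y": 20}),
    Action("type", {"text": "hello"}),
    Action("done", {}),
])
performed = []
steps = run_agent(lambda: b"", lambda shot, goal: next(scripted), performed.append, "demo")
print(steps)  # 2
```

The `max_steps` cap is the simplest form of the guardrail that low-risk experimentation calls for: the loop stops on its own even if the model never reports that it is finished.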
Google Project Mariner
Project Mariner is Google's research prototype for browser automation. It is currently available to Google AI Ultra subscribers in the US, with capabilities coming to the Gemini API for developers.
Mariner handles tasks like finding job listings, hiring service providers, and ordering groceries by interacting with websites autonomously. Google positions it as research into human-agent interaction rather than a finished product.
OpenAI Computer-Using Agent
OpenAI's Computer-Using Agent (CUA) powers both Operator and Atlas. It achieved 87% on the WebVoyager benchmark, one of the highest published scores for web automation tasks.
Microsoft Copilot Studio Computer Use
Microsoft announced Computer Use for Copilot Studio in April 2025. The feature allows Copilot Studio agents to interact with any application through its graphical interface, bridging the gap between AI assistants and legacy enterprise software.
The implementation runs on Microsoft-hosted infrastructure, keeping enterprise data within Microsoft Cloud boundaries. Target use cases include automated data entry, invoice processing, and market research.
Amazon Nova Act
Amazon launched Nova Act in March 2025 as an SDK for building browser agents. The model excels at web interaction tasks, achieving 0.939 on the ScreenSpot Web Text benchmark (compared to 0.900 for Claude and 0.883 for OpenAI CUA).
Nova Act integrates with Playwright for browser control and supports Python workflows with API calls and direct browser manipulation.
Specialized Platforms
MultiOn
MultiOn provides an API for web automation with a focus on reliability and scale. Their agents can handle complex multi-step workflows across websites, with built-in handling for authentication, CAPTCHAs, and dynamic content.
Airtop
Airtop offers browser automation infrastructure with AI integration, targeting enterprise use cases that require high reliability and compliance controls.
Manus Browser Operator
Manus Browser Operator takes a different approach. Rather than running in the cloud, it operates as a browser extension that controls your local browser. This gives it access to your authenticated sessions and trusted IP address, avoiding login prompts and CAPTCHA interruptions.
The extension launched in November 2025 for Chrome and Edge, with full user control over when and how automation runs.
The Foundation: Model Context Protocol
Connecting all of this is MCP, the Model Context Protocol.
Anthropic released MCP in November 2024 as an open standard for connecting AI models to external data sources and tools. Think of it as a universal adapter that lets any AI system talk to any tool or service through a consistent interface.
For browser automation specifically, MCP provides a standardized way for AI models to control browsers without each integration being custom-built. Microsoft's Playwright MCP is the canonical example: any MCP-compatible AI assistant can use it for browser automation.
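Under the hood, MCP messages are JSON-RPC 2.0, and a client invokes a tool with the `tools/call` method. As a rough sketch of what "a consistent interface" means in practice, the helper below builds such a request. The helper name `tools_call_request` is hypothetical, and the tool name and argument in the usage line are assumptions for illustration. A real client would first discover the available tools with `tools/list`.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests carry an id so replies can be matched


def tools_call_request(tool_name: str, arguments: dict) -> str:
    """Serialize a JSON-RPC 2.0 request that invokes one MCP tool."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


# Tool name and argument are illustrative; check the server's tools/list output.
print(tools_call_request("browser_navigate", {"url": "https://example.com"}))
```

Because every server speaks this same message shape, swapping one browser automation backend for another mostly means changing tool names and arguments, not rewriting the integration.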
The protocol has gained significant adoption. ChatGPT, Claude, Gemini, Cursor, VS Code, and GitHub Copilot all support MCP. The SDK sees over 97 million monthly downloads as of late 2025.
In December 2025, Anthropic donated MCP to the Linux Foundation, signaling its transition from a company project to an industry standard. This makes it a safer bet for developers building browser automation, as the protocol is no longer subject to single-company control.
Notable Absences
Not everyone is joining the agentic browser movement.
Vivaldi maintains an explicitly anti-AI stance, focusing on user privacy and customization over AI features. Their CEO has been vocal about concerns with AI data practices.
Firefox takes a cautious approach. Mozilla announced an "AI Kill Switch" feature, letting users disable AI features entirely. They're exploring AI assistance but prioritizing user control over autonomous capabilities.
Safari has delayed its AI overhaul according to Bloomberg reporting, with Siri and Safari AI features reportedly on hold. Apple's privacy-first positioning may limit how aggressive they can be with agentic features that require cloud processing.
What This Means for Your Website
Here's where this connects to everyone who runs a website.
These agents aren't just demos or developer tools anymore. Chrome auto browse alone means billions of potential agent visits. Add Atlas, Comet, and all the developer tools building on this infrastructure, and agent traffic is becoming a meaningful percentage of web interactions.
The sites that work well with these agents will get included in agentic workflows. The sites that don't will see agents fail, give up, and go to competitors.
What Helps Agent Browsing
- Semantic HTML: Use proper elements. Buttons should be <button>, not <div onclick>.
- Clear labels: Form inputs need labels. Buttons need descriptive text.
- Logical structure: Navigation that makes sense. Headings that establish hierarchy.
- Accessible design: Sites that work with screen readers generally work with AI agents.
- Server-rendered content: Critical information should be in the HTML, not loaded by JavaScript.
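The first two items on this list are easy to check automatically. Here is a minimal sketch using only Python's standard library. The `SemanticAudit` class and `audit` function are hypothetical names, and the checks are deliberately crude: inputs wrapped inside a label element are not recognised as labeled, for example.

```python
from html.parser import HTMLParser


class SemanticAudit(HTMLParser):
    """Flag click handlers on non-interactive elements and unlabeled inputs."""

    def __init__(self):
        super().__init__()
        self.issues = []
        self._label_targets = set()  # ids referenced by <label for="...">
        self._inputs = []            # (id, aria-label) for each non-hidden input

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("div", "span") and "onclick" in attrs:
            self.issues.append(f"<{tag} onclick> should be a <button>")
        elif tag == "label" and attrs.get("for"):
            self._label_targets.add(attrs["for"])
        elif tag == "input" and attrs.get("type") != "hidden":
            self._inputs.append((attrs.get("id"), attrs.get("aria-label")))

    def report(self):
        """Call after feed(), so labels that appear after their input still count."""
        unlabeled = [
            f"input id={input_id!r} has no label"
            for input_id, aria_label in self._inputs
            if not aria_label and input_id not in self._label_targets
        ]
        return self.issues + unlabeled


def audit(html: str) -> list[str]:
    parser = SemanticAudit()
    parser.feed(html)
    return parser.report()


sample = (
    '<div onclick="buy()">Buy</div>'
    '<label for="email">Email</label><input id="email"><input id="phone">'
)
print(audit(sample))
# ['<div onclick> should be a <button>', "input id='phone' has no label"]
```

A dedicated accessibility checker will catch far more than this, but even a rough pass like this one surfaces the patterns most likely to trip up an agent.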
What Breaks Agent Browsing
- Aggressive anti-bot measures: CAPTCHAs on every interaction. IP blocking.
- Mouse-only interactions: Hover states, drag-and-drop without keyboard alternatives.
- Infinite scroll without pagination: Agents need to know when they've reached the end.
- Content behind unlabeled buttons: "Show more" that doesn't indicate what it shows.
- Heavy client-side rendering: Blank pages until JavaScript executes.
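The client-side rendering problem is the easiest one to test for. A rough sketch, assuming you already have the raw HTML as a string (for example, saved from view source): strip out script and style content, then check whether the phrases that matter are still there. The `VisibleText` class, the `missing_from_html` function, and the sample page are all hypothetical.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text that sits outside <script> and <style> elements."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def missing_from_html(raw_html: str, required_phrases: list[str]) -> list[str]:
    """Return the phrases that do not appear in the page's static text."""
    parser = VisibleText()
    parser.feed(raw_html)
    text = " ".join(" ".join(parser.chunks).split()).lower()
    return [phrase for phrase in required_phrases if phrase.lower() not in text]


raw = (
    "<html><body><h1>Acme Widget</h1><div id='app'></div>"
    "<script>render({price: '$49'})</script></body></html>"
)
print(missing_from_html(raw, ["Acme Widget", "$49"]))
# ['$49']
```

In this made-up page the product name is in the markup, but the price only exists inside a script. An agent that does not execute JavaScript would never see it.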
Practical Steps
- Test with a screen reader. If VoiceOver or NVDA can navigate your site, agents probably can too. The text-only Lynx browser is another useful check for how agents might parse your content.
- Check your source HTML. View source on critical pages. Is the important information there?
- Add llms.txt. A simple markdown file that helps AI agents understand your site's purpose.
- Review your bot policies. Make sure you're not blocking legitimate AI crawlers.
- Run Glimpse. Glimpse by Web Performance Tools shows you how AI agents see your page.
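For the bot policy review, Python's standard library can parse a robots.txt file and tell you which user agents it blocks. This is a minimal sketch: the `blocked_agents` function is a hypothetical helper, and the agent tokens listed are examples that you should verify against each vendor's current documentation.

```python
from urllib.robotparser import RobotFileParser

# Example crawler tokens; confirm the current names with each vendor.
AI_USER_AGENTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def blocked_agents(robots_txt: str, url: str, agents=AI_USER_AGENTS) -> list[str]:
    """Return the agents that this robots.txt content forbids from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


robots = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""
print(blocked_agents(robots, "https://example.com/pricing"))
# ['GPTBot']
```

Keep in mind that robots.txt only covers crawlers that choose to honor it. Blocking at the firewall or CDN level needs a separate review.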
I cover these topics in depth in What is Agent Experience Optimization.
What Comes Next
The agentic browser landscape will continue consolidating. The current fragmentation, with dozens of tools and competing approaches, isn't sustainable. MCP is emerging as the unifying standard, and the major platforms are all converging on similar capabilities.
The distinction between "browser" and "AI assistant" is blurring. When Chrome can complete tasks autonomously, when ChatGPT can browse the web on your behalf, the traditional concept of a browser as a passive viewing tool feels outdated.
For website owners, the message is clear: your site now has two audiences, humans and agents. Optimizing for both isn't optional anymore. The agents are already here, and their numbers are growing.
2026 is the year agents go mainstream. The infrastructure exists, the products are shipping, and the rollouts are accelerating. As I keep saying, your next million visitors won't be human.
The question is whether your website is ready for them.

