AI browser assistants — Atlas · Comet · Edge Copilot · Dia · Brave Leo · Chrome Gemini
AI browser assistants — from sidebars to agents
The flow of AI moving inside the browser progressed quickly between 2023 and 2025. It started with sidebar assistants, expanded into page summarization, translation, and drafting, and reached the place where agents handle pages on the user's behalf.
1. About these tools
ChatGPT Atlas — an AI-first browser OpenAI announced in October 2025. Chromium-based, launched on macOS first. Combines assistance like page summary, search, and drafts with an "agent mode" where it operates pages on the user's behalf. Tied to a ChatGPT account.
Comet — an AI browser Perplexity released in 2025. Bundles search, page assistance, and agent-style tasks in one place. macOS · Windows are mentioned, with Pro subscribers prioritized initially.
Gemini in Chrome — Gemini features integrated into Google's Chrome browser. Rolling out gradually, covering address bar, sidebar, and page summaries. Region and account conditions apply.
Microsoft Edge Copilot — Microsoft's Bing/Edge AI assistant introduced in 2023. Later unified as "Copilot" with sidebar, page summary, and writing assistance. Backend models in the OpenAI series.
Dia — an AI-first browser hinted at by The Browser Company (makers of Arc) in 2024. macOS first, designed separately from the existing Arc Max AI integration.
Brave Leo — Brave's built-in AI, GA in 2023. Page summaries, questions, and translation in the sidebar. Free model default + paid options. Local-model (Ollama) backend option.
Opera Aria — Opera's in-browser AI assistant from 2023. Sidebar form. OpenAI-based.
Vivaldi · Firefox — emphasizing user choice, adding multiple model assistants. Firefox is experimenting with a sidebar form where you pick from several LLMs.
2. The sidebar-assistant place
- Send current page text to the model for summary, translation, and Q&A.
- Use clipboard or drag selections as input.
- Writing assistance (email, document drafts).
- Summary of search results (the AI answer panel on search pages).
Browsers can pass the page DOM, extracted text, and rendered output straight to the model, which gives stronger context than a plain chatbot.
3. Agent mode
A mode that automates page-element clicks, inputs, and navigation. Relatively new and named differently per tool:
- ChatGPT Atlas's agent mode.
- Comet's automated tasks.
- Other tools' beta and experimental modes.
The higher the autonomy, the greater the risk of irreversible actions like payments, posts, and deletions. Usually mitigated with approval gates and domain allow-lists.
4. Model backend and data
- OpenAI series — Atlas · Edge Copilot.
- Google Gemini — Chrome.
- Anthropic Claude · Mistral, etc. — some Brave Leo options.
- Local models — LM Studio · Ollama integration in some tools.
Data policies (training use, retention) differ by tool and account tier. Read the terms.
5. Other paths
Extension side — places not bundled with the browser itself:
- Official extensions for ChatGPT, Claude, Gemini.
- Integration extensions like Sider, Monica, MaxAI, Glasp.
- Page summary, translation, notes.
Search-engine side — Perplexity, You.com, ChatGPT Search, Google AI Overviews. The line between browser assistance and search assistance is blurring.
Combining with automation tools — strands that put LLMs on libraries like Playwright and Browser-use to operate pages on the user's behalf. Custom agents (LangGraph · ADK) + browser tools.
6. Use places
- Summary — long articles, papers, documents.
- Translation — partial or whole pages.
- Drafts — comments, emails, social messages.
- Appointment / booking helper — form filling.
- Research helper — comparing multiple pages, generating tables.
- Shopping helper — comparing prices and reviews.
- Personal assistant — flow of work across multiple pages.
7. The activation place
- Region — some features available in stages by region.
- Account tier — free vs paid (ChatGPT Plus · Perplexity Pro · Edge Copilot Pro).
- OS — initially macOS only, later including Windows.
- Terms consent — options for training use and retention.
8. Privacy · data
- Training use — varies by tool and account tier. Find the opt-out location.
- Retention — retention period for chats and page text.
- Local model option — Brave Leo and others offer local backends optionally.
- Enterprise policy — separate policies and admin consoles for corporate use.
9. Common pitfalls
Permission scope — agent mode handling all page input and clicks collides with password managers, payments, and important tasks. Domain allow-list.
Indirect injection — hidden instructions in pages (small text, metadata) trying to alter model behavior. Design the trust boundary.
History exposure — page text remains in chat history and logs. An option to disable auto-assist on sensitive pages (banking, health).
Wrong auto-input — auto-fill in forms and payments may proceed by mistake. Confirm each time.
Multilingual drift — the model may mix Korean responses on English pages and vice versa.
Rapid policy change — data policies for new features change quarterly. Periodic check-ups.
Conflict with extensions — multiple AI extensions running on the same page muddle usability, cost, and log visibility.
Speed and cost — large page summaries use big tokens each time. Watch the auto-call triggers.
Closing thoughts
AI browser assistants started with sidebar summary and translation and are climbing toward higher autonomy in agent mode. As autonomy grows, the risks of permission scope, indirect injection, and wrong auto-input grow with it. Turning off auto-assist on sensitive pages and putting a user-approval gate on every irreversible action like payments and posts is the safe shape.
Next
- (end of agent-tooling)
We refer to ChatGPT Atlas · Perplexity Comet · Gemini in Chrome · Edge Copilot · Brave Leo · Dia · W3C WebExtensions · Browser-use · Playwright.