Bypassing bot defenses of chatbot apps via Chrome extension
Intro
Chatbot apps like ChatGPT, Claude, Grok, etc. deploy client-side defenses to prevent botting of their service, i.e., automating the sending of prompts and the reading of responses. These defenses operate at two layers:
- Protocol-level: Cookie-based auth, client-minted tokens, and opaque protocols.
- Environment-level: Fingerprinting, behavioral signals.
This article describes how to break the protocol-level layer, based on an investigation of 10 of the most popular chatbot apps. A Chrome extension is the tool of choice for the offensive work: it has all the privileges needed to break the protocol-level defenses, and it evades the environment-level ones for free because it executes in the context of a real user.
The diagnostic
Open DevTools on the provider's chat page and send a message. Watch the Network tab and run three checks:
- Replay the request with just cookies
Copy it as cURL, strip every header except Cookie, and fire it. If it works, we have Problem A: Cookie-based authentication.
- If 403, check which headers are missing
Look for short-lived token headers with names like Sentinel-, x-statsig-id, cf-turnstile-, or opaque base64 blobs that change on every request. These are client-minted anti-bot tokens. We have Problem B: Token Gating on top of Problem A.
- Check if the protocol is opaque or stateful
Look for any of the following:
  - The transport is a WebSocket with a server-tracked handshake.
  - The payload is opaque (protobuf, custom binary framing).
  - The request requires some state accumulated across page bootstrap.
If any of these holds, we have Problem C: Protocol Opacity. We might also have Problem B if tokens are flowing through JS alongside the opaque protocol, which only becomes visible after solving C.
The defenses are composable. Most providers present one or two; ChatGPT for example presents all three.
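The three checks can be written down as a small decision function. This is only a sketch of the diagnostic as described above: the `Observations` fields are hypothetical names for notes taken by hand in DevTools, and treating Problem A as the baseline in every combination is an assumption drawn from how the article pairs the problems.

```typescript
// Hypothetical notes recorded by hand from the DevTools Network tab.
interface Observations {
  cookieOnlyReplayWorks: boolean;   // cURL replay with only the Cookie header succeeded
  hasRotatingTokenHeaders: boolean; // some headers change on every request
  usesWebSocket: boolean;           // transport is a WebSocket with a tracked handshake
  hasBinaryPayload: boolean;        // protobuf or custom binary framing
  needsBootstrapState: boolean;     // request depends on state built during page load
}

type Problem = "A" | "B" | "C";

function classify(o: Observations): Problem[] {
  // Assumption: cookie-based auth (A) underlies every combination.
  const problems: Problem[] = ["A"];
  if (o.cookieOnlyReplayWorks) {
    return problems; // a bare cookie replay worked, so nothing else is gating the API
  }
  if (o.hasRotatingTokenHeaders) {
    problems.push("B");
  }
  if (o.usesWebSocket || o.hasBinaryPayload || o.needsBootstrapState) {
    problems.push("C");
  }
  return problems;
}
```

For example, a provider whose replay fails, whose headers rotate, and whose transport is a WebSocket would classify as A, B, and C together.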
Problem A: Cookie-based authentication
The defense
The provider's web UI talks to an internal HTTP/SSE API that authenticates by reading the session cookie.
The weakness
Cookies are bearer tokens. Whoever holds the cookie is the user. An extension that declares host_permissions for the provider's API host gets the cookie attached to its fetches automatically, without ever having to read or extract it. To a server that checks only the cookie, the request is indistinguishable from the page's own.
The offense
Call the provider's internal API directly with credentials: 'include' from any extension-privileged context. The browser handles cookies. The work is:
- Watch the user's real page to identify the endpoint, auth headers (Bearer tokens, org IDs, workspace IDs), and response format.
- Construct that same request from the extension context.
- Parse the response stream.
The Chrome capability
host_permissions in the manifest declares the API origin. fetch(..., { credentials: 'include' }) makes the call. Chrome attaches the host's cookies to any extension fetch, including SameSite=Lax cookies that would be withheld from a cross-origin XHR made by a normal webpage.
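A minimal sketch of the direct call, under stated assumptions: the host `chat.example.com`, the endpoint path, and the `{ prompt }` payload are all hypothetical placeholders, and the manifest fragment is written as a TypeScript object only for illustration (in a real extension it lives in manifest.json). The fetch function is passed in as a parameter so the request-building logic can be exercised outside a browser.

```typescript
// Hypothetical manifest fragment: declares the API origin the extension may call.
const manifestFragment = {
  manifest_version: 3,
  host_permissions: ["https://chat.example.com/*"],
};

interface PromptInit {
  method: "POST";
  credentials: "include"; // ask the browser to attach the host's cookies
  headers: Record<string, string>;
  body: string;
}

// Minimal structural type so the sketch type-checks without DOM typings.
// The global fetch available in an extension context satisfies it.
type FetchLike = (
  url: string,
  init: PromptInit
) => Promise<{ ok: boolean; status: number; text(): Promise<string> }>;

function buildPromptRequest(
  prompt: string,
  extraHeaders: Record<string, string> = {}
): { url: string; init: PromptInit } {
  return {
    url: "https://chat.example.com/api/conversation", // hypothetical endpoint
    init: {
      method: "POST",
      credentials: "include",
      headers: {
        "Content-Type": "application/json",
        Accept: "text/event-stream",
        ...extraHeaders, // e.g. auth headers observed on the real page
      },
      body: JSON.stringify({ prompt }), // hypothetical payload shape
    },
  };
}

async function sendPrompt(fetchFn: FetchLike, prompt: string): Promise<string> {
  const { url, init } = buildPromptRequest(prompt);
  const response = await fetchFn(url, init);
  if (!response.ok) {
    throw new Error(`request failed with status ${response.status}`);
  }
  return response.text();
}
```

Inside the extension, the call would look like `sendPrompt(fetch, "hello")`. Note that no cookie value appears anywhere in the code; the browser supplies it.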
Problem B: Token gating
The defense
The provider requires every API request to carry a short-lived token that can only be produced by running their client-side JavaScript inside a real browser.
Three common methods:
- Proof-of-work tokens
Server issues a random seed; the client hashes until it finds an input whose digest has N leading zeros. Each bot prompt costs measurable CPU, making mass scraping uneconomical.
- CAPTCHA-derived tokens
Cloudflare Turnstile or a similar provider runs invisible behavioral fingerprinting and mints a one-shot token.
- Opaque fingerprint headers
Proprietary methods like Grok's x-statsig-id, minted by the Statsig SDK. Outside of chatbots, TikTok's msToken and X-Bogus and Reddit's session-scoped headers are well-known public examples. Entire reverse-engineering communities exist around replicating them.
All three share the same structure: the token is generated by JS code that can be obfuscated, rotated, lazy-loaded from a CDN, hidden behind closures, etc. It has a short TTL, and the server validates it before processing the request.
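To make the proof-of-work variant concrete, here is a toy sketch of the seed-and-nonce loop. Two loud assumptions: FNV-1a (a non-cryptographic 32-bit hash) stands in for whatever cryptographic digest a real provider would use, and difficulty is counted in leading zero bits. The `seed:nonce` input format is also made up for illustration.

```typescript
// FNV-1a 32-bit: a simple stand-in for a cryptographic digest (assumes ASCII input).
function fnv1a32(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i) & 0xff;
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force unsigned 32-bit
}

// Server-side check: cheap, a single hash evaluation.
function verify(seed: string, nonce: number, difficultyBits: number): boolean {
  return Math.clz32(fnv1a32(`${seed}:${nonce}`)) >= difficultyBits;
}

// Client-side work: expected cost doubles with every extra difficulty bit.
function solve(seed: string, difficultyBits: number, maxAttempts = 1000000): number {
  for (let nonce = 0; nonce < maxAttempts; nonce++) {
    if (verify(seed, nonce, difficultyBits)) {
      return nonce;
    }
  }
  throw new Error("no solution found within maxAttempts");
}
```

The asymmetry is the point of the design: `solve` burns CPU on the client for every request, while `verify` costs the server one hash.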
The weakness
The token-mint function has to execute in the browser. Whatever runs in the same JS context can observe it. No matter what the provider does, they cannot prevent code running alongside theirs from capturing the token the moment it's minted.
The offense
The provider's page needs to be running somewhere so that we can inject into its JS context. There are 2 ways to achieve this:
- The user keeps the provider's page open in a tab. Inject directly into that tab.
- If that is not possible, create a hidden execution environment (see Step 1 of Problem C).
Once the JS context is available:
- Inject into the page's own JavaScript MAIN world at document_start. The code in the content script then runs against the same window object the page sees. This must happen before the page's scripts execute.
- Hook every function the token might flow through, adding code to steal it:
  - Headers.prototype.set and Headers.prototype.append (modern code constructs a Headers object separately and passes it to fetch; patching Headers.prototype.set catches every header-set call regardless of transport)
  - window.fetch
  - XMLHttpRequest.prototype.open/send/setRequestHeader (some bot-check libraries deliberately use XHR to avoid fetch interception)
  - WebSocket constructor (for providers whose tokens flow over WS)
Alternatively, some tokens land in cookies rather than headers (often with auto-rotation). Polling document.cookie with setInterval every second picks these up without overriding anything, as long as the cookie is not HttpOnly.
- Bridge captured tokens back to the extension context. Since MAIN world scripts can't call chrome.runtime.sendMessage directly, the MAIN world script calls window.postMessage({ type: 'TOKEN_CAPTURED', value: ... }, '*'), and an ISOLATED world content script listens and forwards the token via chrome.runtime.sendMessage.
The extension now holds a stolen, fresh token. It can construct its own request with cookies (Problem A) plus the token as a header and get the same access the real page would. Note that these tokens will most likely expire quickly or after a single use, so this must run as a steady-state pipeline, not a one-shot.
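The hooking step boils down to wrapping a method on a prototype so that an observer sees the arguments before the original runs. The helper below is a generic sketch of that pattern, not any provider-specific code; `hookMethod` and `Observer` are names invented here.

```typescript
type Observer = (methodName: string, args: unknown[]) => void;

// Wraps proto[methodName] so that `observe` sees every call first.
// Returns a function that restores the original method.
function hookMethod(proto: object, methodName: string, observe: Observer): () => void {
  const target = proto as Record<string, unknown>;
  const original = target[methodName];
  if (typeof original !== "function") {
    throw new Error(`${methodName} is not a function on the given object`);
  }
  target[methodName] = function (this: unknown, ...args: unknown[]): unknown {
    try {
      observe(methodName, args);
    } catch {
      // An observer failure must never break the page's own call.
    }
    // Preserve the caller's `this` and arguments so behavior is unchanged.
    return original.apply(this, args);
  };
  return () => {
    target[methodName] = original;
  };
}
```

In a MAIN world script this could be applied as `hookMethod(Headers.prototype, "set", (_name, args) => { /* inspect args[0], args[1] */ })`. Swallowing observer errors and forwarding `this` unchanged keeps the wrapper transparent to the page.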
The Chrome capability
chrome.scripting.executeScript({ world: 'MAIN' }) for runtime injection, and manifest content_scripts with "world": "MAIN" and "run_at": "document_start" for timing-critical early hooks. This is the only sanctioned way for a third party to inject code into the page's own JS context.
Playwright's addInitScript and Puppeteer's evaluateOnNewDocument are the same primitive but they require a separate browser instance, detectable via navigator.webdriver === true and headless signatures.
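As a rough illustration of the manifest-based route, the fragment below declares one MAIN world script and one ISOLATED world relay, followed by a type guard the relay could use to validate bridged messages. The host, the file names `hook.js` and `relay.js`, and the guard itself are hypothetical; the fragment is written as a TypeScript object only for illustration and would normally live in manifest.json.

```typescript
// Hypothetical content_scripts fragment.
const contentScriptsFragment = {
  content_scripts: [
    {
      matches: ["https://chat.example.com/*"],
      js: ["hook.js"], // runs in the page's own JS context
      run_at: "document_start",
      world: "MAIN",
    },
    {
      matches: ["https://chat.example.com/*"],
      js: ["relay.js"], // has access to chrome.runtime
      run_at: "document_start",
      world: "ISOLATED",
    },
  ],
};

interface TokenCapturedMessage {
  type: "TOKEN_CAPTURED";
  value: string;
}

// Any script on the page can call postMessage, so the relay should
// validate the shape of what it receives before forwarding it.
function isTokenCapturedMessage(data: unknown): data is TokenCapturedMessage {
  if (typeof data !== "object" || data === null) {
    return false;
  }
  const candidate = data as Record<string, unknown>;
  return candidate.type === "TOKEN_CAPTURED" && typeof candidate.value === "string";
}

// relay.js would then do roughly the following (browser-only, shown as comments):
// window.addEventListener("message", (event) => {
//   if (event.source !== window) return; // ignore other frames
//   if (isTokenCapturedMessage(event.data)) {
//     chrome.runtime.sendMessage(event.data);
//   }
// });
```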
Problem C: Protocol opacity
The defense
Even with cookies and fresh tokens, a synthesized request fails because:
- The request must come from Origin: https://<provider>.com and Sec-Fetch-Site: same-origin.
- The payload is protobuf or custom binary. Without a schema, mapping field numbers to semantics takes a lot of time and effort (even with AI).
- The fingerprint is computed across multiple round-trips against state the page itself accumulated.
- The transport is a stateful WebSocket with a server-tracked handshake.
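To illustrate why a schema-less protobuf payload is opaque, the sketch below walks the wire format and recovers only what the bytes themselves reveal: field numbers, wire types, and raw values. It handles just varint (wire type 0) and length-delimited (wire type 2) fields, which is an intentional simplification, and varints above 2^53 would lose precision.

```typescript
interface RawField {
  fieldNumber: number;
  wireType: number;
  value: number | Uint8Array; // varint value, or raw bytes for length-delimited fields
}

function readVarint(bytes: Uint8Array, offset: number): { value: number; next: number } {
  let value = 0;
  let shift = 0;
  let pos = offset;
  while (pos < bytes.length) {
    const byte = bytes[pos++];
    value += (byte & 0x7f) * 2 ** shift; // arithmetic instead of << to avoid 32-bit overflow
    if ((byte & 0x80) === 0) {
      return { value, next: pos };
    }
    shift += 7;
  }
  throw new Error("truncated varint");
}

function walkFields(bytes: Uint8Array): RawField[] {
  const fields: RawField[] = [];
  let pos = 0;
  while (pos < bytes.length) {
    const tag = readVarint(bytes, pos);
    const fieldNumber = Math.floor(tag.value / 8); // upper bits of the tag
    const wireType = tag.value % 8;                // lower three bits of the tag
    pos = tag.next;
    if (wireType === 0) {
      const v = readVarint(bytes, pos);
      fields.push({ fieldNumber, wireType, value: v.value });
      pos = v.next;
    } else if (wireType === 2) {
      const len = readVarint(bytes, pos);
      const end = len.next + len.value;
      if (end > bytes.length) throw new Error("truncated length-delimited field");
      fields.push({ fieldNumber, wireType, value: bytes.slice(len.next, end) });
      pos = end;
    } else {
      throw new Error(`wire type ${wireType} is not handled in this sketch`);
    }
  }
  return fields;
}
```

The output says "field 1 holds 150" or "field 2 holds 7 bytes", but nothing about what field 1 or field 2 means, and a length-delimited field may be a string, raw bytes, or a nested message. That gap is the reverse-engineering cost.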
The weakness
The defense assumes running a full browser session covertly is infeasible, but Chrome’s offscreen-document API enables a hidden, JS-executing DOM with access to the user's cookie jar, running inside the user's real Chrome process. From the provider's perspective, every signal is a real authenticated user.
The offense
- Create a hidden, persistent DOM context.
chrome.offscreen.createDocument({ url, reasons: ['IFRAME_SCRIPTING'], justification }). Note that only one offscreen document per extension is allowed, which forces a multiplexing pattern, i.e., a single offscreen page hosts a Map<string, HTMLIFrameElement> and routes postMessage traffic by provider origin.
- Embed the provider's site as an iframe:
  - Most providers set X-Frame-Options: DENY and Content-Security-Policy: frame-ancestors 'none' to prevent embedding, but we can strip them via chrome.declarativeNetRequest rules scoped to resourceTypes: ["sub_frame"]. It is good security practice (from the user's perspective) to toggle the DNR rules on only when injecting a prompt, and to turn them back off after receiving the response.
  - Once loaded, the provider's defenses pass automatically, and the provider has few signals to distinguish this iframe from a normal tab.
- Drive the iframe by injecting a MAIN world script that hooks the page's transport:
  - HTTP-based providers: Hook window.fetch, intercept calls to the provider's API endpoint, and inject your own requests through the same function. The request inherits the page's origin, cookies, and session state automatically.
  - WebSocket-based providers: Capture the live instance and hook send and message. All frames are sent over the page's own authenticated connection.
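The multiplexing pattern from the first step can be sketched as a small origin-keyed router. `FrameRouter` is a hypothetical name, and the generic `Frame` parameter stands in for HTMLIFrameElement so the sketch runs outside a browser; in the offscreen page, `deliver` would typically call `frame.contentWindow.postMessage`.

```typescript
// One offscreen document, many providers: frames are keyed by provider origin.
class FrameRouter<Frame> {
  private readonly frames = new Map<string, Frame>();

  register(origin: string, frame: Frame): void {
    this.frames.set(origin, frame);
  }

  unregister(origin: string): boolean {
    return this.frames.delete(origin);
  }

  // Look up the frame that owns a message, e.g. by MessageEvent.origin.
  route(messageOrigin: string): Frame | undefined {
    return this.frames.get(messageOrigin);
  }

  // Hand a payload to the frame registered for `origin`.
  // Returns false when no frame is registered, so callers can create one lazily.
  dispatch(
    origin: string,
    payload: unknown,
    deliver: (frame: Frame, payload: unknown) => void
  ): boolean {
    const frame = this.frames.get(origin);
    if (frame === undefined) {
      return false;
    }
    deliver(frame, payload);
    return true;
  }
}
```

Keying strictly by origin also gives a natural place to drop messages from unexpected senders: anything whose origin is not registered is simply not routed.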
The Chrome capability
chrome.offscreen.createDocument() + chrome.declarativeNetRequest rulesets + chrome.scripting.executeScript({ world: 'MAIN' }) inside the iframe. The combination creates a hidden, JS-executing, cookie-bearing, code-injectable browser session inside the user's real Chrome.
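A header-stripping rule for declarativeNetRequest might look like the following sketch. The rule id, priority, and host are placeholder values, and `buildFrameHeaderStripRule` is a name invented here. Note that removing the whole Content-Security-Policy header drops every directive the provider set, not only frame-ancestors, which is one more reason to keep the rule enabled as briefly as possible.

```typescript
interface HeaderStripRule {
  id: number;
  priority: number;
  action: {
    type: "modifyHeaders";
    responseHeaders: { header: string; operation: "remove" }[];
  };
  condition: {
    requestDomains: string[];
    resourceTypes: ["sub_frame"]; // only iframe navigations, never top-level tabs
  };
}

function buildFrameHeaderStripRule(id: number, providerHost: string): HeaderStripRule {
  return {
    id,
    priority: 1,
    action: {
      type: "modifyHeaders",
      responseHeaders: [
        { header: "x-frame-options", operation: "remove" },
        { header: "content-security-policy", operation: "remove" },
      ],
    },
    condition: {
      requestDomains: [providerHost],
      resourceTypes: ["sub_frame"],
    },
  };
}

// Toggling around a single prompt (extension-only, shown as comments):
// const rule = buildFrameHeaderStripRule(1, "chat.example.com");
// await chrome.declarativeNetRequest.updateSessionRules({ addRules: [rule] });
// ... load the iframe, inject the prompt, read the response ...
// await chrome.declarativeNetRequest.updateSessionRules({ removeRuleIds: [rule.id] });
```

Scoping the condition to `sub_frame` means the user's ordinary tabs on the same site keep their original headers.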
Parsing the response
With the strategies above, we can build a bot that reliably injects prompts into a chatbot app and captures the responses. These responses usually arrive in one of four standard formats, each fairly easy to build a parser for:
- SSE with event-delimited blocks
- SSE with JSON nested inside JSON in the data field
- NDJSON
- WebSocket frames
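The first three formats can be handled with small incremental parsers that tolerate chunks split at arbitrary points. The sketch below is illustrative rather than complete: it ignores the SSE `id` and `retry` fields, and the nested field name passed to `parseNestedJson` is whatever the provider happens to use (hypothetical here).

```typescript
interface SseEvent {
  event: string;
  data: string;
}

function parseSseBlock(block: string): SseEvent | null {
  let event = "message"; // default event name when none is given
  const data: string[] = [];
  for (const line of block.split("\n")) {
    if (line === "" || line.startsWith(":")) continue; // blank line or comment
    const colon = line.indexOf(":");
    const field = colon === -1 ? line : line.slice(0, colon);
    let value = colon === -1 ? "" : line.slice(colon + 1);
    if (value.startsWith(" ")) value = value.slice(1); // strip one leading space
    if (field === "event") event = value;
    else if (field === "data") data.push(value);
  }
  return data.length > 0 ? { event, data: data.join("\n") } : null;
}

class SseParser {
  private buffer = "";

  // Feed raw text chunks; returns the events completed by this chunk.
  feed(chunk: string): SseEvent[] {
    this.buffer = (this.buffer + chunk).replace(/\r\n/g, "\n");
    const events: SseEvent[] = [];
    let boundary = this.buffer.indexOf("\n\n");
    while (boundary !== -1) {
      const parsed = parseSseBlock(this.buffer.slice(0, boundary));
      this.buffer = this.buffer.slice(boundary + 2);
      if (parsed) events.push(parsed);
      boundary = this.buffer.indexOf("\n\n");
    }
    return events;
  }
}

class NdjsonParser {
  private buffer = "";

  // Feed raw text chunks; returns one parsed value per completed line.
  feed(chunk: string): unknown[] {
    const lines = (this.buffer + chunk).split("\n");
    this.buffer = lines.pop() ?? ""; // last piece may be an incomplete line
    return lines.filter((l) => l.trim() !== "").map((l) => JSON.parse(l) as unknown);
  }
}

// JSON-in-JSON: the outer object carries a string field that is itself JSON.
function parseNestedJson(data: string, field: string): unknown {
  const outer = JSON.parse(data) as Record<string, unknown>;
  const inner = outer[field];
  return typeof inner === "string" ? JSON.parse(inner) : inner;
}
```

Buffering until a full delimiter arrives is the key design choice: network chunks rarely align with event or line boundaries, so parsing each chunk in isolation would fail on split JSON.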
Cheatsheet
| Problem(s) | Strategy |
|---|---|
| A only | Direct API call with cookies. |
| A + B | Steal tokens from the page's JS, replay via direct API call. |
| A + C | Run the provider's page in a hidden iframe, drive the transport. |
| A + B + C | Hidden iframe as minting substrate for tokens, replay via direct API. |