Bypassing the bot defenses of chatbot apps via a Chrome extension


Intro

Chatbot apps like ChatGPT, Claude, Grok, etc. deploy client-side defenses to prevent botting of their service, i.e., automating the sending of prompts and the reading of responses. These defenses fall into two layers:

  1. Protocol-level: Cookie-based auth, client-minted tokens, and opaque protocols.
  2. Environment-level: Fingerprinting, behavioral signals.

This article describes how to break the protocol-level layer, based on an investigation of 10 of the most popular chatbot apps. A Chrome extension is the tool of choice for the offensive work: it has all the privileges needed to break the protocol-level layer, and it evades the environment-level layer for free because it executes in a real user's browser.


The diagnostic

Open DevTools on the provider's chat page and send a message. Watch the Network tab and run three checks:

  1. Replay the request with just cookies
    1. Copy it as cURL, strip every header except Cookie, and fire it. If it works, we have Problem A: Cookie-based authentication.

  2. If it fails (typically with a 403), check which headers are missing
    1. Look for short-lived token headers with names like Sentinel-, x-statsig-id, or cf-turnstile-, or for opaque base64 blobs that change on every request. These are client-minted anti-bot tokens. We have Problem B: Token gating on top of Problem A.

  3. Check if the protocol is opaque or stateful, i.e., whether any of the following holds:
    1. The transport is a WebSocket with a server-tracked handshake.
    2. The payload is opaque (protobuf, custom binary framing).
    3. The request requires some state accumulated across page bootstrap.

    If so, we have Problem C: Protocol opacity. We might also have Problem B if tokens flow through JS alongside the opaque protocol, which only becomes visible after solving C.

The defenses are composable. Most providers present one or two; ChatGPT, for example, presents all three.
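The decision procedure above can be written down as a small classifier. This is only a sketch: the names Probe, diagnose, and chooseStrategy are hypothetical, the probe values are assumed to be collected by hand from DevTools, and the strategy strings paraphrase the cheatsheet at the end of this article.

```typescript
type Problem = "A" | "B" | "C";

// Hypothetical record of what the three manual checks found.
interface Probe {
  /** HTTP status of the replay that kept only the Cookie header. */
  cookieOnlyStatus: number;
  /** Stripped headers that looked like short-lived, per-request tokens. */
  tokenHeaders: string[];
  /** True for a stateful WebSocket, a binary payload, or bootstrap state. */
  opaqueProtocol: boolean;
}

function diagnose(probe: Probe): Problem[] {
  // Cookie auth is treated as the baseline: every cheatsheet row includes A.
  const problems: Problem[] = ["A"];
  const replayWorked =
    probe.cookieOnlyStatus >= 200 && probe.cookieOnlyStatus < 300;
  if (!replayWorked && probe.tokenHeaders.length > 0) problems.push("B");
  if (probe.opaqueProtocol) problems.push("C");
  return problems;
}

function chooseStrategy(problems: Problem[]): string {
  const hasB = problems.indexOf("B") !== -1;
  const hasC = problems.indexOf("C") !== -1;
  if (hasB && hasC) return "hidden iframe as token-minting substrate, replay via direct API";
  if (hasC) return "hidden iframe, drive the page's transport";
  if (hasB) return "capture tokens from the page's JS, replay via direct API";
  return "direct API call with cookies";
}
```

For example, a probe whose cookie-only replay returned 403 and whose stripped headers included x-statsig-id would be classified as A + B.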


Problem A: Cookie-based authentication

The defense

The provider's web UI talks to an internal HTTP/SSE API that authenticates by reading the session cookie.

 

The weakness

Cookies are bearer tokens. Whoever holds the cookie is the user. An extension declared with host_permissions for the provider's API host gets the cookie attached to its fetches automatically, without ever extracting it explicitly. As far as cookie authentication is concerned, the request is indistinguishable from the page's own.

 

The offense

Call the provider's internal API directly with credentials: 'include' from any extension-privileged context. The browser handles cookies. The work is:

  1. Watch the user's real page to identify the endpoint, auth headers (Bearer tokens, org IDs, workspace IDs), and response format.
  2. Construct that same request from the extension context.
  3. Parse the response stream.
 

The Chrome capability

host_permissions in the manifest declares the API origin. fetch(..., { credentials: 'include' }) makes the call. Chrome attaches the host's cookies to such an extension fetch, including SameSite=Lax cookies that would be withheld from a cross-site XHR issued by a normal webpage.
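A minimal sketch of the request construction follows. The endpoint URL, the { prompt } payload shape, and the helper name buildPromptRequest are all assumptions made for illustration; a real provider's endpoint, headers, and body have to be read off the Network tab as described above.

```typescript
// Shape of the arguments we would later hand to fetch(url, init).
interface PromptRequest {
  url: string;
  init: {
    method: "POST";
    credentials: "include";
    headers: Record<string, string>;
    body: string;
  };
}

// Hypothetical helper: builds the request, but does not send it.
function buildPromptRequest(
  endpoint: string,
  prompt: string,
  extraHeaders: Record<string, string> = {},
): PromptRequest {
  return {
    url: endpoint,
    init: {
      method: "POST",
      // Ask the browser to attach the host's cookies to the request.
      credentials: "include",
      headers: {
        "Content-Type": "application/json",
        Accept: "text/event-stream",
        ...extraHeaders, // e.g. Bearer token or workspace ID seen in DevTools
      },
      // Assumed payload shape; real providers differ.
      body: JSON.stringify({ prompt }),
    },
  };
}

// In an extension service worker (not executed here):
//   const { url, init } = buildPromptRequest(
//     "https://chat.example.com/api/conversation", "Hello");
//   const response = await fetch(url, init);
```

Keeping construction separate from the fetch call makes it easy to diff the built request against the one the real page sends.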


Problem B: Token gating

The defense

The provider requires every API request to carry a short-lived token that can only be produced by running their client-side JavaScript inside a real browser.

Three common methods:

  • Proof-of-work tokens
    • Server issues a random seed; the client hashes until it finds an input whose digest has N leading zeros. Each bot prompt costs measurable CPU, making mass scraping uneconomical.

  • CAPTCHA-derived tokens
    • Cloudflare Turnstile or a similar provider runs invisible behavioral fingerprinting and mints a one-shot token.

  • Opaque fingerprint headers
    • Proprietary methods like Grok’s x-statsig-id, minted by the Statsig SDK. Outside of chatbots, TikTok's msToken and X-Bogus and Reddit's session-scoped headers are well-known public examples. There are entire reverse-engineering communities around replicating them.

All three share the same structure, i.e., the token is generated by JS code that can be obfuscated, rotated, lazy-loaded from a CDN, hidden behind closures, etc. The token has a short TTL, and the server validates it before processing the request.
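To make the proof-of-work variant concrete, here is a toy sketch of the asymmetry it relies on: finding a nonce takes many hash evaluations, while checking one takes a single evaluation. It assumes a difficulty expressed in leading zero bits and, to stay dependency-free, swaps in a 32-bit FNV-1a hash where a real scheme would use a cryptographic hash. The function names and the seed:nonce input format are hypothetical.

```typescript
// Toy 32-bit FNV-1a over UTF-16 code units. NOT cryptographic; it only
// stands in for the hash a real proof-of-work scheme would use.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Server side: one hash evaluation verifies a submitted nonce.
function isValid(seed: string, nonce: number, difficultyBits: number): boolean {
  return Math.clz32(fnv1a(`${seed}:${nonce}`)) >= difficultyBits;
}

// Client side: brute-force search, roughly 2^difficultyBits attempts on average.
function solve(seed: string, difficultyBits: number, maxAttempts = 1000000): number {
  for (let nonce = 0; nonce < maxAttempts; nonce++) {
    if (isValid(seed, nonce, difficultyBits)) return nonce;
  }
  throw new Error("no solution within maxAttempts");
}
```

Each extra difficulty bit doubles the expected client work while the verification cost stays constant, which is what makes mass scraping expensive.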

 

The weakness

The token-mint function has to execute in the browser. Whatever runs in the same JS context can observe it. No matter what the provider does, they cannot prevent code running alongside theirs from capturing the token the moment it's minted.

 

The offense

The provider's page needs to be running somewhere so that we can inject into its JS context. There are two ways to achieve this:

  1. The user keeps the provider's page open in a tab. Inject directly into that tab.
  2. If that is not possible, create a hidden execution environment (see step 1 of the offense in Problem C).
 

Once the JS context is available:

  1. Inject a content script into the page's MAIN world at document_start. The script then shares the window object the page sees. Injection must happen before the page's scripts execute; otherwise the page may already hold references to the unpatched functions.
  2. Hook every function the token might flow through, adding code to steal it.
      • Headers.prototype.set and Headers.prototype.append (modern code often constructs a Headers object separately and passes it to fetch; patching these catches every header set that way, before the request is even built)
      • window.fetch
      • XMLHttpRequest.prototype.open / send / setRequestHeader (some bot-check libraries deliberately use XHR to avoid fetch interception)
      • WebSocket constructor (for providers whose tokens flow over WS)

      Alternatively, some tokens land in cookies rather than headers (often with auto-rotation). Polling document.cookie with setInterval every second picks these up without overriding anything.

  3. Bridge captured tokens back to the extension context. MAIN world scripts can't call chrome.runtime.sendMessage directly, so the MAIN world script calls window.postMessage({ type: 'TOKEN_CAPTURED', value: ... }, '*'), and an ISOLATED world content script listens for the message and forwards it via chrome.runtime.sendMessage.
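The hook and the bridge can be sketched as below, under stated assumptions. The helper names (hookHeaders, makeReporter, makeForwarder) and the name field on the message are hypothetical additions; the window and chrome calls appear only in comments so the logic can be exercised outside a browser. The sketch also pins the postMessage target to the page's own origin instead of '*', which is a design choice rather than something the steps above require.

```typescript
type HeaderMethod = (this: unknown, name: string, value: string) => void;
interface HeadersLike { set: HeaderMethod; append: HeaderMethod; }
interface HeadersCtor { prototype: HeadersLike; }

interface TokenMessage { type: "TOKEN_CAPTURED"; name: string; value: string; }

// Step 2: wrap set/append on the prototype; returns a function that unhooks.
function hookHeaders(
  ctor: HeadersCtor,
  isToken: (name: string) => boolean,
  onToken: (name: string, value: string) => void,
): () => void {
  const proto = ctor.prototype;
  const originals = { set: proto.set, append: proto.append };
  for (const method of ["set", "append"] as const) {
    const original = originals[method];
    proto[method] = function (this: unknown, name: string, value: string): void {
      if (isToken(name)) onToken(name, value);
      original.call(this, name, value); // preserve the page's behavior
    };
  }
  return () => {
    proto.set = originals.set;
    proto.append = originals.append;
  };
}

// Step 3, MAIN world side. In the page, `post` would be:
//   (m) => window.postMessage(m, window.location.origin)
function makeReporter(post: (message: TokenMessage) => void) {
  return (name: string, value: string): void =>
    post({ type: "TOKEN_CAPTURED", name, value });
}

// Step 3, ISOLATED world side. In the content script:
//   window.addEventListener("message",
//     makeForwarder(window, (m) => chrome.runtime.sendMessage(m)));
function makeForwarder(self: unknown, send: (message: TokenMessage) => void) {
  return (event: { source: unknown; data: unknown }): void => {
    if (event.source !== self) return; // ignore messages from other frames
    const data = event.data as Partial<TokenMessage> | null;
    if (
      data && data.type === "TOKEN_CAPTURED" &&
      typeof data.name === "string" && typeof data.value === "string"
    ) {
      send({ type: "TOKEN_CAPTURED", name: data.name, value: data.value });
    }
  };
}
```

In the page, the hook would be installed with hookHeaders(Headers, isToken, report). Headers passed to fetch as a plain object never go through Headers.prototype.set, which is why window.fetch and XHR are on the hook list as well.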
 

The extension now holds a stolen, fresh token. It can construct its own request with cookies (Problem A) plus the token as a header, and it gets the same access the real page would. Note that these tokens will most likely expire quickly or after a single use, so capture must run as a steady-state pipeline, not a one-shot.
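One way to model the steady-state pipeline on the extension side is a small store that keeps only the freshest token per header and hands each token out at most once. This is a sketch: the class name TokenStore and the maxAgeMs cutoff are assumptions, and a real TTL would have to be estimated by observing when the provider starts rejecting a token.

```typescript
interface StoredToken { value: string; capturedAt: number; }

class TokenStore {
  private readonly tokens = new Map<string, StoredToken>();
  private readonly maxAgeMs: number;
  private readonly now: () => number;

  // `now` is injectable so the expiry logic can be tested with a fake clock.
  constructor(maxAgeMs: number, now: () => number = Date.now) {
    this.maxAgeMs = maxAgeMs;
    this.now = now;
  }

  // Called for every TOKEN_CAPTURED message; newer captures overwrite older ones.
  put(name: string, value: string): void {
    this.tokens.set(name.toLowerCase(), { value, capturedAt: this.now() });
  }

  // Returns a token at most once, and only while it is younger than maxAgeMs.
  take(name: string): string | undefined {
    const key = name.toLowerCase();
    const entry = this.tokens.get(key);
    if (!entry) return undefined;
    this.tokens.delete(key); // treat every token as single-use
    return this.now() - entry.capturedAt <= this.maxAgeMs ? entry.value : undefined;
  }
}
```

When take returns undefined, the caller has to wait for, or trigger, the page to mint a new token before sending the next request.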

 

The Chrome capability

chrome.scripting.executeScript({ world: 'MAIN' }) for runtime injection, and manifest content_scripts with "world": "MAIN" and "run_at": "document_start" for timing-critical early hooks. This is the supported way for an extension to run code in the page's own JS context.
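For illustration, the relevant manifest fields might look as follows, written here as a TypeScript object that serializes to manifest.json. The extension name, the chat.example.com host, and the file names hook.js and bridge.js are placeholders, not taken from any real extension.

```typescript
// Sketch of a Manifest V3 file; every concrete value is a placeholder.
const manifest = {
  manifest_version: 3,
  name: "Example bridge (hypothetical)",
  version: "0.1.0",
  permissions: ["scripting"], // needed for chrome.scripting.executeScript
  host_permissions: ["https://chat.example.com/*"],
  content_scripts: [
    {
      // Early hooks that must share the page's window object.
      matches: ["https://chat.example.com/*"],
      js: ["hook.js"],
      run_at: "document_start",
      world: "MAIN",
    },
    {
      // Bridge script; content scripts default to the ISOLATED world.
      matches: ["https://chat.example.com/*"],
      js: ["bridge.js"],
      run_at: "document_start",
    },
  ],
} as const;

// Text that would be written to manifest.json.
const manifestJson = JSON.stringify(manifest, null, 2);

// Runtime alternative, from the service worker (not executed here):
//   chrome.scripting.executeScript({
//     target: { tabId }, world: "MAIN", func: installHooks });
```

The two content scripts are declared separately because the hook needs the MAIN world while the bridge needs the extension messaging API, which is only available in the ISOLATED world.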

 
💡 Playwright's addInitScript and Puppeteer's evaluateOnNewDocument are the same primitive, but they require a separate, automation-controlled browser instance, which is detectable via navigator.webdriver === true and headless signatures.


Problem C: Protocol opacity

The defense

Even with cookies and fresh tokens, a synthesized request fails for one or more of these reasons:

  • The request must come from Origin: https://<provider>.com and Sec-Fetch-Site: same-origin.
  • The payload is protobuf or custom binary. Without a schema, mapping field numbers to semantics takes a lot of time and effort (even with AI).
  • The fingerprint is computed across multiple round-trips against state the page itself accumulated.
  • The transport is a stateful WebSocket with a server-tracked handshake.
 

The weakness

The defense assumes running a full browser session covertly is infeasible, but Chrome’s offscreen-document API enables a hidden, JS-executing DOM with access to the user's cookie jar, running inside the user's real Chrome process. From the provider's perspective, every signal is a real authenticated user.

 

The offense

  1. Create a hidden, persistent DOM context by calling chrome.offscreen.createDocument({ url, reasons: ['IFRAME_SCRIPTING'], justification }). Only one offscreen document per extension is allowed, which forces a multiplexing pattern, i.e., a single offscreen page hosts a Map<string, HTMLIFrameElement> and routes postMessage traffic by provider origin.
  2. Embed the provider's site as an iframe:
      • Most providers set X-Frame-Options: DENY and Content-Security-Policy: frame-ancestors 'none' to prevent embedding, but we can strip these headers via chrome.declarativeNetRequest rules scoped to resourceTypes: ["sub_frame"]. A DNR rule removes the whole Content-Security-Policy header, not just the frame-ancestors directive, so it is good security practice (from the user’s perspective) to turn the DNR rules on only while injecting a prompt and to turn them back off after receiving the response.
      • Once loaded, the page runs the provider's own code, so its protocol-level checks pass automatically: the requests it issues carry the same origin, cookies, and tokens as those from a normal tab.
  3. Drive the iframe by injecting a MAIN world script that hooks the page's transport:
      • HTTP-based providers: Hook window.fetch, intercept calls to the provider's API endpoint, and inject our own requests through the same function. The requests inherit the page's origin, cookies, and session state automatically.
      • WebSocket-based providers: Capture the live instance and hook send and the message event. All frames are then sent over the page's own authenticated connection.
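For the WebSocket case, capturing the live instance can be sketched with a Proxy around the constructor. The names tapWebSocket, SocketLike, and onFrame are hypothetical, and the sketch is written against a minimal socket interface so it can be tried with a fake class. In the page it would be installed from the MAIN world with something like window.WebSocket = tapWebSocket(window.WebSocket, onFrame).Patched, before the page opens its connection.

```typescript
// The minimal socket surface this sketch relies on.
interface SocketLike {
  send(data: unknown): void;
  addEventListener(type: string, listener: (event: { data: unknown }) => void): void;
}
type SocketCtor<T extends SocketLike> =
  new (url: string, protocols?: string | string[]) => T;

interface SocketTap<T extends SocketLike> {
  Patched: SocketCtor<T>; // drop-in replacement for the original constructor
  live: T[];              // every socket the page has opened so far
}

function tapWebSocket<T extends SocketLike>(
  Original: SocketCtor<T>,
  onFrame: (direction: "in" | "out", data: unknown) => void,
): SocketTap<T> {
  const live: T[] = [];
  const Patched = new Proxy(Original, {
    construct(target, args, newTarget): T {
      const socket = Reflect.construct(target, args, newTarget) as T;
      const originalSend = socket.send.bind(socket);
      // Observe outgoing frames, then pass them through unchanged.
      socket.send = (data: unknown): void => {
        onFrame("out", data);
        originalSend(data);
      };
      // Observe incoming frames without disturbing the page's own listeners.
      socket.addEventListener("message", (event) => onFrame("in", event.data));
      live.push(socket);
      return socket;
    },
  });
  return { Patched, live };
}
```

Once a socket is in live, our own frames can be sent with live[0].send(...), which travels over the connection the page already authenticated.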
 

The Chrome capability

chrome.offscreen.createDocument() + chrome.declarativeNetRequest rulesets + chrome.scripting.executeScript({ world: 'MAIN' }) inside the iframe. The combination creates a hidden, JS-executing, cookie-bearing, code-injectable browser session inside the user's real Chrome.
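The first two pieces of that combination can be sketched as follows. The ExtensionApi interface is a hypothetical thin wrapper introduced here so the logic can run against a fake; the comments note which chrome.* call each method is assumed to delegate to. The rule ID, the priority, and the chat.example.com domain are placeholders.

```typescript
interface HeaderStripRule {
  id: number;
  priority: number;
  action: {
    type: "modifyHeaders";
    responseHeaders: { header: string; operation: "remove" }[];
  };
  condition: { requestDomains: string[]; resourceTypes: "sub_frame"[] };
}

// DNR rule that removes the anti-framing headers for iframe loads only.
// Note that it drops the whole CSP header, not just frame-ancestors.
function frameHeaderStripRule(id: number, providerDomain: string): HeaderStripRule {
  return {
    id,
    priority: 1,
    action: {
      type: "modifyHeaders",
      responseHeaders: [
        { header: "x-frame-options", operation: "remove" },
        { header: "content-security-policy", operation: "remove" },
      ],
    },
    condition: { requestDomains: [providerDomain], resourceTypes: ["sub_frame"] },
  };
}

// Hypothetical wrapper over the chrome.* calls used below.
interface ExtensionApi {
  // e.g. chrome.runtime.getContexts({ contextTypes: ["OFFSCREEN_DOCUMENT"] })
  hasOffscreenDocument(): Promise<boolean>;
  // chrome.offscreen.createDocument
  createOffscreenDocument(options: {
    url: string; reasons: string[]; justification: string;
  }): Promise<void>;
  // chrome.declarativeNetRequest.updateSessionRules
  updateSessionRules(options: {
    addRules?: HeaderStripRule[]; removeRuleIds?: number[];
  }): Promise<void>;
}

// Only one offscreen document may exist, so create it lazily.
async function ensureOffscreen(api: ExtensionApi, url: string): Promise<boolean> {
  if (await api.hasOffscreenDocument()) return false;
  await api.createOffscreenDocument({
    url,
    reasons: ["IFRAME_SCRIPTING"],
    justification: "Host provider iframes for scripted access",
  });
  return true;
}

// Enable the strip rule only for the duration of `work`, then remove it.
async function withFramingAllowed<T>(
  api: ExtensionApi, rule: HeaderStripRule, work: () => Promise<T>,
): Promise<T> {
  await api.updateSessionRules({ removeRuleIds: [rule.id], addRules: [rule] });
  try {
    return await work();
  } finally {
    await api.updateSessionRules({ removeRuleIds: [rule.id] });
  }
}
```

Wrapping the rule in withFramingAllowed keeps the window in which the provider's CSP is stripped as short as possible, in line with the toggling advice above.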


Parsing the response

With the strategies above, we can build a bot that reliably injects prompts into a chatbot app and captures the responses. These responses usually come in one of four standard formats, each fairly easy to build a parser for:

  • SSE with event-delimited blocks
  • SSE with JSON nested inside JSON in the data field
  • NDJSON
  • WebSocket frames
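Minimal incremental parsers for the first three formats might look like this. They are sketches that assume the stream has already been decoded to text chunks; the SSE parser handles only the event and data fields and skips comment lines, and the payload field name passed to parseNestedJson is a hypothetical example.

```typescript
interface SseEvent { event: string; data: string; }

// Feed text chunks as they arrive; emits one event per blank-line-delimited block.
function createSseParser(onEvent: (event: SseEvent) => void): (chunk: string) => void {
  let buffer = "";
  return (chunk: string): void => {
    buffer = (buffer + chunk).replace(/\r\n/g, "\n");
    let boundary = buffer.indexOf("\n\n");
    while (boundary !== -1) {
      const block = buffer.slice(0, boundary);
      buffer = buffer.slice(boundary + 2);
      let event = "message"; // default event type when no event field is present
      const data: string[] = [];
      for (const line of block.split("\n")) {
        if (line.startsWith(":")) continue; // comment / keep-alive line
        const colon = line.indexOf(":");
        const field = colon === -1 ? line : line.slice(0, colon);
        let value = colon === -1 ? "" : line.slice(colon + 1);
        if (value.startsWith(" ")) value = value.slice(1);
        if (field === "event") event = value;
        else if (field === "data") data.push(value);
      }
      if (data.length > 0) onEvent({ event, data: data.join("\n") });
      boundary = buffer.indexOf("\n\n");
    }
  };
}

// NDJSON: one JSON value per line; a trailing partial line stays buffered.
function createNdjsonParser(onValue: (value: unknown) => void): (chunk: string) => void {
  let buffer = "";
  return (chunk: string): void => {
    const lines = (buffer + chunk).split("\n");
    buffer = lines.pop() ?? "";
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed) onValue(JSON.parse(trimmed));
    }
  };
}

// JSON-in-JSON: the outer object carries a string field that is itself JSON.
function parseNestedJson(data: string, field: string): unknown {
  const outer = JSON.parse(data) as Record<string, unknown>;
  const inner = outer[field];
  return typeof inner === "string" ? JSON.parse(inner) : inner;
}
```

Both stream parsers buffer partial input, so they can be fed network chunks directly even when a chunk boundary falls in the middle of a line.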

Cheatsheet

Problem(s)   Strategy
A only       Direct API call with cookies.
A + B        Steal tokens from the page's JS, replay via direct API call.
A + C        Run the provider's page in a hidden iframe, drive the transport.
A + B + C    Hidden iframe as minting substrate for tokens, replay via direct API.