JustProxies

Bandwidth budget calculator

Bandwidth is the only metered resource on rotating plans. Estimating how much a typical job consumes before you start prevents mid-run surprises. The numbers here are empirical averages; your actual usage will vary by target and by whether you fetch full pages or API endpoints.

How bandwidth is counted

We count bytes transferred between the proxy gateway and the target (the upstream leg). That includes:

  • The HTTP request headers and body you send.
  • The HTTP response headers and body you receive.
  • TLS handshake overhead on HTTPS connections.

We do not count the connection between your client and the gateway (the downstream leg). Failed requests that produce no upstream bytes — 407 auth errors, for example — cost nothing.

Redirects count. Each hop in a redirect chain is a separate upstream request: a 301 to HTTPS adds one round-trip, and a 302 to a different page means that page is fetched and billed as well. If your target redirects aggressively, factor in 1–2 extra round-trips per initial URL.
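
The counting rules above can be turned into a rough per-URL tally. This is a minimal sketch, not the gateway's actual metering code: the `Hop` class, the `billed_bytes` function, and the 5,000-byte TLS handshake figure are all illustrative assumptions, and real handshake sizes depend on the certificate chain and TLS version.

```python
from dataclasses import dataclass

# Assumption: a placeholder for TLS handshake overhead. Measure your own targets.
TLS_HANDSHAKE_BYTES = 5_000


@dataclass
class Hop:
    """One upstream request/response pair (hypothetical helper type)."""
    request_bytes: int            # request headers + body sent to the target
    response_bytes: int           # response headers + body received
    new_tls_handshake: bool = False


def billed_bytes(hops):
    """Sum the upstream bytes for every hop in a redirect chain."""
    total = 0
    for hop in hops:
        total += hop.request_bytes + hop.response_bytes
        if hop.new_tls_handshake:
            total += TLS_HANDSHAKE_BYTES
    return total


# A plain-HTTP URL that answers 301 to HTTPS, followed by the real page.
chain = [
    Hop(request_bytes=400, response_bytes=300),
    Hop(request_bytes=400, response_bytes=120_000, new_tls_handshake=True),
]
print(billed_bytes(chain))  # 126100

# A request rejected at the gateway (for example a 407) never reaches the
# target, so it has no hops and costs nothing.
print(billed_bytes([]))     # 0
```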

Typical request sizes

| Request type | Typical size | Notes |
|---|---|---|
| Lightweight API call (JSON) | 2–15 KB | REST endpoints, price tickers, status checks |
| SERP result page | 80–200 KB | Google, Bing; HTML only, no images |
| E-commerce product page | 150–400 KB | HTML + inline JSON; varies hugely by site |
| Social media profile | 200–600 KB | Heavy on inline JS and data payloads |
| News / blog article | 50–150 KB | Mostly text; can spike with embeds |
| Full browser page (Playwright) | 1–5 MB | HTML + JS + CSS + images + fonts |
| Image / asset fetch | 10 KB – 2 MB | Avoid unless you actually need the image |
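
To turn these size ranges into a low and high budget for a job, multiply by the page count. The sketch below assumes decimal units (1 GB = 1,000,000 KB); the dictionary keys and the `budget_range_gb` function are made-up names for illustration, and only a few rows of the table are included.

```python
KB_PER_GB = 1_000_000  # assumption: decimal units

# (low, high) typical sizes in KB, copied from the table above.
SIZE_RANGES_KB = {
    "json_api": (2, 15),
    "serp": (80, 200),
    "ecommerce_product": (150, 400),
    "social_profile": (200, 600),
    "news_article": (50, 150),
}


def budget_range_gb(request_type, pages):
    """Return (low, high) bandwidth in GB for `pages` requests of one type."""
    low_kb, high_kb = SIZE_RANGES_KB[request_type]
    return (low_kb * pages / KB_PER_GB, high_kb * pages / KB_PER_GB)


low, high = budget_range_gb("serp", 1_000)
print(f"1,000 SERP pages: {low:.2f} to {high:.2f} GB")
```

The spread between the low and high figures is often 2–3×, which is one reason a measured pilot batch tends to be more useful than a table lookup.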

Per-task estimates

Rules of thumb for common scraping jobs. These assume bare HTTP requests (no headless browser) and no image fetching:

| Task | Est. per 1,000 pages | Est. per 1,000,000 pages |
|---|---|---|
| SERP scraping (1 query = 1 page of results) | ~0.1 GB | ~100 GB |
| E-commerce product pages | ~0.3 GB | ~300 GB |
| Social media profiles | ~0.4 GB | ~400 GB |
| Lightweight JSON API calls | ~0.01 GB | ~10 GB |
| Real estate / job listings | ~0.2 GB | ~200 GB |
| Playwright full-page renders | ~2 GB | ~2 TB |

Run a pilot batch of 100 requests, measure the total bytes in your proxy client, then extrapolate. That gives a project-specific number far more accurate than any generic estimate.
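
The extrapolation itself is simple arithmetic. Here is a minimal sketch, assuming decimal gigabytes and a hypothetical `safety_margin` parameter to cover redirects and retries. The function name and the 20% default are illustrative choices, not recommended values.

```python
BYTES_PER_GB = 1_000_000_000  # assumption: decimal units


def extrapolate_gb(pilot_bytes, pilot_requests, planned_requests, safety_margin=0.2):
    """Scale a measured pilot batch up to the full job, with headroom."""
    if pilot_requests <= 0:
        raise ValueError("pilot_requests must be positive")
    avg_bytes = pilot_bytes / pilot_requests
    return avg_bytes * planned_requests * (1 + safety_margin) / BYTES_PER_GB


# Example: a 100-request pilot moved 18.5 MB in total, so about 185 KB per request.
estimate = extrapolate_gb(
    pilot_bytes=18_500_000,
    pilot_requests=100,
    planned_requests=1_000_000,
)
print(f"Estimated budget: {estimate:.0f} GB")
```

If the pilot mixes page types, running one pilot per type and summing the estimates usually gives a tighter number than a single blended average.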

Reducing bandwidth spend

Small changes can cut spend significantly:

  • Fetch API endpoints, not pages: many e-commerce and social sites expose JSON APIs that return structured data in 5–20 KB instead of full HTML pages at 200–500 KB.
  • Block assets in headless browsers: images, fonts, and tracking scripts make up the bulk of a full-page load. Blocking them in Playwright with page.route() can cut browser-mode bandwidth by 60–80%.
  • Request gzip compression: most HTTP clients send Accept-Encoding: gzip automatically, and gzip typically shrinks HTML 5–10×. Verify that your client sends the header and that the server responds with Content-Encoding: gzip.
  • Fetch only what you need: if you only need the first product on a page, stop reading the response body once you have parsed it. Most HTTP clients support aborting the download mid-stream.
  • Cache aggressively on your end: pages you've already scraped don't need to be re-fetched unless the data is time-sensitive. ETag or Last-Modified headers can help, but for most scraping workloads a simple timestamp-based cache is enough.
  • Retry fewer times on hopeless targets: retrying a persistent 403 four times spends four more requests' worth of bandwidth for zero additional data. Detect quickly and move on.
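
As an illustration of the asset-blocking tip, here is a minimal sketch using Playwright's Python sync API. The set of blocked resource types, the function names, and the `proxy_server` argument are assumptions for this example; substitute your own gateway address and tune the blocked types per target. Tracking scripts cannot be identified by resource type alone, so this sketch leaves scripts untouched.

```python
# Assumption: resource types that rarely matter for data extraction.
BLOCKED_TYPES = {"image", "font", "media"}


def block_heavy_assets(route):
    """Route handler: abort heavy asset requests, let everything else through."""
    if route.request.resource_type in BLOCKED_TYPES:
        route.abort()
    else:
        route.continue_()


def render(url, proxy_server):
    """Render a page through the proxy with heavy assets blocked (sketch)."""
    # Imported here so the handler above can be used without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(proxy={"server": proxy_server})
        page = browser.new_page()
        # Intercept every request and pass it through the handler.
        page.route("**/*", block_heavy_assets)
        page.goto(url)
        html = page.content()
        browser.close()
        return html
```

Some pages depend on fonts or images for layout-driven scripts, so it is worth comparing the extracted data with and without blocking on a small sample before rolling it out.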