Concurrency and connection pooling

Concurrency limits per pool type

Start here before touching any pool size setting. The binding constraint is almost always target-side tolerance, not gateway capacity.

Pool type	Typical P50 latency	Recommended starting concurrency	Notes
Datacenter rotating	60–120 ms	50–200 concurrent	Fast; raise until the target starts returning 429s
Residential rotating	180–350 ms	30–100 concurrent	Higher latency; needs more parallelism to match datacenter RPS
Mobile	200–500 ms	16–50 concurrent	Bandwidth-priced; avoiding retries matters more here
Static datacenter	60–120 ms	20–50 per IP	Fixed IP — target may rate-limit the specific IP
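The link between the latency column and the parallelism you need can be estimated with Little's law (requests in flight = throughput × time per request). The sketch below is only a rough guide: the function name and the 200 RPS target are illustrative assumptions, and it uses the midpoint of each P50 range as a stand-in for total request time, which will understate real-world numbers.

```python
import math

def required_concurrency(target_rps: float, latency_ms: float) -> int:
    """Little's law: in-flight requests = throughput x time per request."""
    return math.ceil(target_rps * latency_ms / 1000)

TARGET_RPS = 200  # hypothetical throughput goal, not a recommendation

# Midpoints of the P50 ranges in the table above.
datacenter = required_concurrency(TARGET_RPS, 90)     # 18 in flight
residential = required_concurrency(TARGET_RPS, 265)   # 53 in flight

print(f"datacenter: {datacenter}, residential: {residential}")
```

At the same target throughput, the residential pool needs roughly three times the concurrency of the datacenter pool.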

Start at the low end, watch your 429 rate, and raise concurrency in 25% steps until that rate starts climbing. The optimal concurrency sits just below the point where the target begins rate-limiting you.
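The ramp-up procedure can be sketched as a small loop. Everything here is an assumption made for illustration: `ramp_concurrency`, the `measure_429_rate` callback (which in practice would run a batch at the given concurrency and return the fraction of 429 responses), and the 1% tolerance over the baseline.

```python
from typing import Callable

def ramp_concurrency(
    measure_429_rate: Callable[[int], float],
    start: int,
    ceiling: int,
    step: float = 0.25,
    tolerance: float = 0.01,
) -> int:
    """Raise concurrency by `step` per round and return the last level
    whose 429 rate stayed within `tolerance` of the starting baseline."""
    baseline = measure_429_rate(start)
    best = level = start
    while True:
        # Grow by 25% (at least by 1), never beyond the ceiling.
        candidate = min(ceiling, max(level + 1, int(level * (1 + step))))
        if candidate == level:
            break  # ceiling reached
        if measure_429_rate(candidate) > baseline + tolerance:
            break  # 429 rate started climbing; keep the previous level
        best = level = candidate
    return best

# Simulated target that starts rate-limiting above 60 concurrent requests.
def simulated_rate(concurrency: int) -> float:
    return 0.0 if concurrency <= 60 else 0.2

chosen = ramp_concurrency(simulated_rate, start=30, ceiling=100)
print(chosen)
```

With the simulated target, the loop steps through 30, 37, 46 and 57, sees the 429 rate jump at 71, and settles on 57.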

Python asyncio / aiohttp

aiohttp exposes connection pool limits via TCPConnector. Set limit for the total pool and limit_per_host for per-target concurrency.

aiohttp-pool.py (Python)

import asyncio
import aiohttp

PROXY = "http://USER:[email protected]:8080"

async def fetch(session: aiohttp.ClientSession, url: str) -> str:
    async with session.get(url, proxy=PROXY) as resp:
        resp.raise_for_status()
        return await resp.text()

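# gather(return_exceptions=True) returns exceptions in place of failed results,
# so the list can hold both bodies and exception objects (Python 3.10+ syntax).
async def main(urls: list[str]) -> list[str | BaseException]: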
    connector = aiohttp.TCPConnector(
        limit=100,            # total simultaneous connections
        limit_per_host=20,    # per target domain — tune down for fragile targets
        ttl_dns_cache=300,    # cache the gateway DNS result for 5 minutes
        enable_cleanup_closed=True,
    )
    timeout = aiohttp.ClientTimeout(total=20)

    async with aiohttp.ClientSession(
        connector=connector,
        timeout=timeout,
    ) as session:
        tasks = [fetch(session, url) for url in urls]
        results = await asyncio.gather(*tasks, return_exceptions=True)

    return results

if __name__ == "__main__":
    urls = [f"https://target.com/item/{i}" for i in range(500)]
    results = asyncio.run(main(urls))
    ok = [r for r in results if isinstance(r, str)]
    print(f"{len(ok)}/{len(urls)} succeeded")

limit_per_host is keyed on the endpoint as written in the request URL (host, port, and whether TLS is used), not on the resolved IP. If your target redirects to a CDN hostname, the CDN hostname gets its own limit bucket. Keep this in mind when debugging unexpected slowdowns.
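To see which URLs would share a limit bucket, you can group them by the same kind of key. This is a standard-library approximation of the (host, port, is_ssl) triple that aiohttp documents for limit_per_host, not aiohttp's own code; `endpoint_key` and the CDN hostname are hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit

def endpoint_key(url: str) -> tuple[str, int, bool]:
    """Approximate the (host, port, is_ssl) triple used for per-host limits."""
    parts = urlsplit(url)
    is_ssl = parts.scheme == "https"
    port = parts.port or (443 if is_ssl else 80)
    return (parts.hostname or "", port, is_ssl)

urls = [
    "https://target.com/item/1",
    "https://target.com/item/2",
    "https://cdn.target-assets.example/item/1.json",  # hypothetical redirect destination
]

buckets = Counter(endpoint_key(u) for u in urls)
for key, count in buckets.items():
    print(key, count)
```

The two target.com URLs land in one bucket, while the CDN URL counts against a separate one.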

Python httpx async

httpx.AsyncClient uses a connection pool internally. Control the pool size via limits:

httpx-async-pool.py (Python)

import asyncio
import httpx

PROXY = "http://USER:[email protected]:8080"

limits = httpx.Limits(
    max_connections=100,         # total pool size
    max_keepalive_connections=40, # idle connections to hold open
    keepalive_expiry=30,         # seconds before an idle connection closes
)

async def main(urls: list[str]):
    async with httpx.AsyncClient(
        proxy=PROXY,
        limits=limits,
        timeout=20,
        http2=True,   # needs the httpx[http2] extra; multiplexes requests over one connection
    ) as client:
        tasks = [client.get(url) for url in urls]
        responses = await asyncio.gather(*tasks, return_exceptions=True)
    return responses

asyncio.run(main(["https://target.com/"] * 200))

HTTP/2 multiplexing lets a few TCP connections carry many concurrent requests. It helps most when the target supports HTTP/2 and your traffic is concentrated in a small number of high-volume sessions.

Keep-alive and connection reuse

Every new TCP connection to the proxy gateway costs one round trip for the TCP handshake plus at least one more for the TLS handshake (one with TLS 1.3, two with TLS 1.2). At 200 ms round-trip latency that's at least 400 ms of overhead before the first byte of your actual request is sent. Keep-alive eliminates this for subsequent requests on the same connection.
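As a back-of-the-envelope check, the setup cost can be totalled for a whole job. The helper names are made up for this sketch, and it assumes one TCP round trip plus `tls_round_trips` TLS round trips per new connection (1 matches the 400 ms figure at 200 ms latency; TLS 1.2 would be 2).

```python
def connection_setup_ms(rtt_ms: float, tls_round_trips: int = 1) -> float:
    """One TCP round trip plus the TLS round trips for a new connection."""
    return rtt_ms * (1 + tls_round_trips)

def total_setup_ms(connections_opened: int, rtt_ms: float, tls_round_trips: int = 1) -> float:
    """Cumulative handshake time across every connection a job opens."""
    return connections_opened * connection_setup_ms(rtt_ms, tls_round_trips)

REQUESTS = 500
RTT_MS = 200

# Without keep-alive every request opens its own connection.
no_reuse = total_setup_ms(REQUESTS, RTT_MS)
# With keep-alive and a pool of 100, only the pool's connections are opened.
with_reuse = total_setup_ms(100, RTT_MS)

print(f"no reuse: {no_reuse:.0f} ms, with reuse: {with_reuse:.0f} ms")
```

These are cumulative figures across all connections, not wall-clock time, since handshakes overlap when requests run concurrently.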

aiohttp: connections are kept alive by default. Leave connector_owner=True (the ClientSession default) so that closing the session also closes the connector.
httpx: keep-alive is on by default. keepalive_expiry controls how long idle connections are held (5 seconds by default). Don't set it too high or you'll accumulate stale connections.
requests: use a Session rather than bare requests.get, because the session owns the connection pool. Bare calls create and discard a connection every time.

Don't share sessions across threads

requests.Session is not thread-safe. In threaded workers, give each thread its own session. For async code, share a single aiohttp.ClientSession across coroutines — it is designed for concurrent use.
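One common way to give each worker thread its own session is thread-local storage. The sketch below uses only the standard library so it runs without network access: `thread_session` and `FakeSession` are hypothetical names, and in real code you would pass `requests.Session` as the factory instead of the stand-in.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, TypeVar

T = TypeVar("T")
_local = threading.local()

def thread_session(factory: Callable[[], T]) -> T:
    """Return the calling thread's session, creating it on first use."""
    session = getattr(_local, "session", None)
    if session is None:
        session = _local.session = factory()
    return session

class FakeSession:
    """Stand-in for requests.Session so the sketch needs no network."""
    def get(self, url: str) -> str:
        return f"fetched {url}"

def worker(url: str) -> tuple[int, str]:
    session = thread_session(FakeSession)  # real code: thread_session(requests.Session)
    return id(session), session.get(url)

urls = [f"https://target.com/item/{i}" for i in range(20)]
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(worker, urls))

distinct_sessions = {session_id for session_id, _ in results}
print(f"{len(results)} requests served by {len(distinct_sessions)} sessions")
```

Each thread reuses its own session (and therefore its own connection pool) across requests, and no session object is ever touched by two threads.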

Per-domain concurrency caps

When scraping multiple domains in parallel, cap concurrency per domain to avoid hammering one target while another sits idle. A semaphore per domain is the standard pattern:

per-domain-cap.py (Python)

import asyncio
from collections import defaultdict
from urllib.parse import urlparse
import aiohttp

PROXY = "http://USER:[email protected]:8080"
MAX_PER_DOMAIN = 10

sems: dict[str, asyncio.Semaphore] = defaultdict(
    lambda: asyncio.Semaphore(MAX_PER_DOMAIN)
)

async def fetch(session: aiohttp.ClientSession, url: str) -> str:
    domain = urlparse(url).netloc
    async with sems[domain]:
        async with session.get(url, proxy=PROXY) as resp:
            return await resp.text()

async def main(urls):
    connector = aiohttp.TCPConnector(limit=200, limit_per_host=MAX_PER_DOMAIN)
    async with aiohttp.ClientSession(connector=connector) as session:
        return await asyncio.gather(
            *(fetch(session, u) for u in urls),
            return_exceptions=True,
        )
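To convince yourself that the semaphore pattern caps concurrency before pointing it at a real target, you can run the same structure against a simulated fetch. This is a self-contained sketch: `run_demo`, `fake_fetch`, the example domains, and the cap of 3 are all illustrative, and `asyncio.sleep` stands in for the network round trip.

```python
import asyncio
from collections import defaultdict
from urllib.parse import urlparse

MAX_PER_DOMAIN = 3

async def run_demo(urls: list[str]) -> dict[str, int]:
    """Run simulated fetches and report peak in-flight requests per domain."""
    sems: dict[str, asyncio.Semaphore] = defaultdict(
        lambda: asyncio.Semaphore(MAX_PER_DOMAIN)
    )
    in_flight: dict[str, int] = defaultdict(int)
    peak: dict[str, int] = defaultdict(int)

    async def fake_fetch(url: str) -> None:
        domain = urlparse(url).netloc
        async with sems[domain]:
            in_flight[domain] += 1
            peak[domain] = max(peak[domain], in_flight[domain])
            await asyncio.sleep(0.01)  # simulated network round trip
            in_flight[domain] -= 1

    await asyncio.gather(*(fake_fetch(u) for u in urls))
    return dict(peak)

urls = [f"https://a.example/{i}" for i in range(10)]
urls += [f"https://b.example/{i}" for i in range(2)]

peaks = asyncio.run(run_demo(urls))
print(peaks)
```

The busy domain never exceeds the cap, while the quiet domain only reaches the number of requests it actually has.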

For the full system-level view — file descriptors, ephemeral ports, conntrack — see high-throughput tuning.