In Python 3 crawling workflows, proxies are commonly used to prevent IP bans and improve scraping throughput by distributing requests across multiple IP addresses. Proxies generally fall into two groups: free proxies, which are usually unstable, and paid proxies, which are typically more reliable.
Common Python 3 crawler proxy use cases include:
Preventing IP bans: many websites enforce request-rate limits, and when a single IP crosses that limit it can be blocked. Proxies reduce that risk.
Increasing crawl speed: proxies let you open multiple connections in parallel so you can collect data faster.
Bypassing geo restrictions: some sites expose different content in different regions. If you need region-specific data, a proxy can help you reach it.
In short, proxy IPs are an important part of Python 3 crawling. Because proxy use also introduces security considerations, you should choose providers carefully and follow applicable security and compliance rules.
1. Preparation
First, you need a working proxy. A proxy is simply an IP address and port combined in the format ip:port. If the proxy requires authentication, you will also need a username and password.
On my machine, a local proxy tool exposes an HTTP proxy on port 7890, which means the proxy is `127.0.0.1:7890`. It also exposes a SOCKS proxy on port 7891, so that proxy is `127.0.0.1:7891`. Once either proxy is configured, my machine routes traffic through the connected upstream server IP instead of the local IP.
In the examples below, I use those local proxies to demonstrate the setup. You can replace them with your own working proxy details.
After configuring a proxy, use http://httpbin.org/get as a quick test URL. The response includes request metadata, and the `origin` field shows the client IP so you can verify whether the proxy is active.
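Since this verification step just compares httpbin's `origin` field against your real IP, it can be scripted. Here is a minimal sketch (the `proxy_is_active` helper and the abbreviated sample body are my own illustration, not part of httpbin's API):

```python
import json

def proxy_is_active(response_body: str, real_ip: str) -> bool:
    """Return True if the `origin` httpbin reports differs from our real IP,
    meaning requests are being routed through the proxy."""
    origin = json.loads(response_body)['origin']
    return origin != real_ip

# Abbreviated example of an httpbin.org/get response body
sample = '{"args": {}, "origin": "210.173.1.204", "url": "https://httpbin.org/get"}'
print(proxy_is_active(sample, '203.0.113.10'))  # True: reported origin is not our real IP
```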
With that ready, let’s walk through proxy configuration for each request library.
Get Python 3 Crawler Proxies
Some websites monitor repeated access and actively block suspicious traffic. A proxy server distributes request sources, reduces the chance of detection, and improves crawl success rates.
Best US Static Proxy IP
IPRoyal is a proxy provider known for accessible residential proxy plans and broad international availability.
Cheapest Static Proxy
Proxy-seller is a datacenter proxy provider that remains popular with smaller internet marketers.
Best-Value Static Proxy
Shifter.io is a well-known proxy provider focused on privacy protection and a smoother internet experience.
2. Proxy Setup in `urllib`
Let’s start with the most basic option, `urllib`, and look at how proxy configuration works there:
```python
from urllib.error import URLError
from urllib.request import ProxyHandler, build_opener

proxy = '127.0.0.1:7890'
proxy_handler = ProxyHandler({
    'http': 'http://' + proxy,
    'https': 'http://' + proxy
})
opener = build_opener(proxy_handler)
try:
    response = opener.open('https://httpbin.org/get')
    print(response.read().decode('utf-8'))
except URLError as e:
    print(e.reason)
```
The output looks like this:
```json
{
  "args": {},
  "headers": {
    "Accept-Encoding": "identity",
    "Host": "httpbin.org",
    "User-Agent": "Python-urllib/3.7",
    "X-Amzn-Trace-Id": "Root=1-60e9a1b6-0a20b8a678844a0b2ab4e889"
  },
  "origin": "210.173.1.204",
  "url": "https://httpbin.org/get"
}
```
Here we use `ProxyHandler` to configure the proxy. Its argument is a dictionary where the keys are protocols and the values are proxy addresses. You must include the scheme in the proxy value, such as http:// or https://. When the target URL is HTTP, `urllib` uses the `http` key. When the target URL is HTTPS, it uses the `https` key. In this example, both keys point to an HTTP proxy, so both HTTP and HTTPS traffic are routed through that proxy.
After creating the `ProxyHandler`, pass it into `build_opener()` to create an opener that already knows how to route requests through the proxy. Then call `open()` on that opener to fetch the target URL.
The response body is JSON, and the `origin` field shows the client IP. Because that IP matches the proxy instead of the real local IP, the proxy was configured successfully.
If the proxy requires authentication, configure it like this:
```python
from urllib.error import URLError
from urllib.request import ProxyHandler, build_opener

proxy = 'username:password@127.0.0.1:7890'
proxy_handler = ProxyHandler({
    'http': 'http://' + proxy,
    'https': 'http://' + proxy
})
opener = build_opener(proxy_handler)
try:
    response = opener.open('https://httpbin.org/get')
    print(response.read().decode('utf-8'))
except URLError as e:
    print(e.reason)
```
The only change is the `proxy` variable: add the username and password before the host. For example, if the username is `foo` and the password is `bar`, the proxy string becomes `foo:bar@127.0.0.1:7890`.
If the proxy is SOCKS5, configure it like this:
```python
import socks
import socket
from urllib import request
from urllib.error import URLError

socks.set_default_proxy(socks.SOCKS5, '127.0.0.1', 7891)
socket.socket = socks.socksocket
try:
    response = request.urlopen('https://httpbin.org/get')
    print(response.read().decode('utf-8'))
except URLError as e:
    print(e.reason)
```
This example requires the `socks` module, which you can install with:

```bash
pip3 install PySocks
```
This assumes a local SOCKS5 proxy is running on port `7891`. When it works correctly, the output matches the HTTP proxy example above:
```json
{
  "args": {},
  "headers": {
    "Accept-Encoding": "identity",
    "Host": "httpbin.org",
    "User-Agent": "Python-urllib/3.7",
    "X-Amzn-Trace-Id": "Root=1-60e9a1b6-0a20b8a678844a0b2ab4e889"
  },
  "origin": "210.173.1.204",
  "url": "https://httpbin.org/get"
}
```
Again, the `origin` field shows the proxy IP, so the proxy setup is working.
3. Proxy Setup in `requests`
In `requests`, proxy setup is straightforward: you only need to pass the `proxies` parameter.
Using the local proxy from this machine as an example, the HTTP proxy configuration looks like this:
```python
import requests

proxy = '127.0.0.1:7890'
proxies = {
    'http': 'http://' + proxy,
    'https': 'http://' + proxy,
}
try:
    response = requests.get('https://httpbin.org/get', proxies=proxies)
    print(response.text)
except requests.exceptions.ConnectionError as e:
    print('Error', e.args)
```
The output looks like this:
```json
{
  "args": {},
  "headers": {
    "Accept": "*/*",
    "Accept-Encoding": "gzip, deflate",
    "Host": "httpbin.org",
    "User-Agent": "python-requests/2.22.0",
    "X-Amzn-Trace-Id": "Root=1-5e8f358d-87913f68a192fb9f87aa0323"
  },
  "origin": "210.173.1.204",
  "url": "https://httpbin.org/get"
}
```
Like `urllib`, `requests` uses the `http` proxy for HTTP URLs and the `https` proxy for HTTPS URLs. In this example, both are routed through the same HTTP proxy.
If the `origin` value in the response matches the proxy IP, the proxy is configured correctly.
If the proxy requires authentication, prepend the username and password like this:
```python
proxy = 'username:password@127.0.0.1:7890'
```

Just replace `username` and `password` with your own credentials.
If you need a SOCKS proxy, use this configuration instead:
```python
import requests

proxy = '127.0.0.1:7891'
proxies = {
    'http': 'socks5://' + proxy,
    'https': 'socks5://' + proxy
}
try:
    response = requests.get('https://httpbin.org/get', proxies=proxies)
    print(response.text)
except requests.exceptions.ConnectionError as e:
    print('Error', e.args)
```
For this, you need to install the extra package `requests[socks]`:

```bash
pip3 install "requests[socks]"
```
The output is the same:
```json
{
  "args": {},
  "headers": {
    "Accept": "*/*",
    "Accept-Encoding": "gzip, deflate",
    "Host": "httpbin.org",
    "User-Agent": "python-requests/2.22.0",
    "X-Amzn-Trace-Id": "Root=1-5e8f364a-589d3cf2500fafd47b5560f2"
  },
  "origin": "210.173.1.204",
  "url": "https://httpbin.org/get"
}
```
There is also another approach that uses the `socks` module directly. It requires the same `socks` dependency installed earlier:
```python
import requests
import socks
import socket

socks.set_default_proxy(socks.SOCKS5, '127.0.0.1', 7891)
socket.socket = socks.socksocket
try:
    response = requests.get('https://httpbin.org/get')
    print(response.text)
except requests.exceptions.ConnectionError as e:
    print('Error', e.args)
```
This method also works for SOCKS proxies and produces the same result. Compared with the first method, this one changes socket behavior globally, so choose based on your use case.
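Because the patch is global, it is worth restoring `socket.socket` once the proxied requests are done. One way is a small context manager; this is a sketch of my own (the `patched_socket` name is illustrative, not part of PySocks):

```python
import socket
from contextlib import contextmanager

@contextmanager
def patched_socket(socket_class):
    """Temporarily replace socket.socket, always restoring the original on exit."""
    original = socket.socket
    socket.socket = socket_class
    try:
        yield
    finally:
        socket.socket = original

# With PySocks installed, usage would look like:
#   socks.set_default_proxy(socks.SOCKS5, '127.0.0.1', 7891)
#   with patched_socket(socks.socksocket):
#       response = requests.get('https://httpbin.org/get')
```

This way, only connections opened inside the `with` block go through the SOCKS proxy.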
4. Proxy Setup in `httpx`
`httpx` works a lot like `requests`, so it also uses a `proxies` argument. The main difference is that the keys must be `http://` and `https://` instead of just `http` and `https`. (Note that recent `httpx` releases deprecate the `proxies` argument in favor of `proxy` and `mounts`; the examples below follow the older API.)
For an HTTP proxy, use this setup:
```python
import httpx

proxy = '127.0.0.1:7890'
proxies = {
    'http://': 'http://' + proxy,
    'https://': 'http://' + proxy,
}
with httpx.Client(proxies=proxies) as client:
    response = client.get('https://httpbin.org/get')
    print(response.text)
```
If the proxy requires authentication, just change the `proxy` value:
```python
proxy = 'username:password@127.0.0.1:7890'
```

Replace `username` and `password` with your actual credentials.
The output is similar to the `requests` example:
```json
{
  "args": {},
  "headers": {
    "Accept": "*/*",
    "Accept-Encoding": "gzip, deflate",
    "Host": "httpbin.org",
    "User-Agent": "python-httpx/0.18.1",
    "X-Amzn-Trace-Id": "Root=1-60e9a3ef-5527ff6320484f8e46d39834"
  },
  "origin": "210.173.1.204",
  "url": "https://httpbin.org/get"
}
```
For SOCKS proxies, install the `httpx-socks` package:

```bash
pip3 install "httpx-socks[asyncio]"
```
This installs support for both synchronous and asynchronous usage.
For synchronous mode, configure it like this:
```python
import httpx
from httpx_socks import SyncProxyTransport

transport = SyncProxyTransport.from_url('socks5://127.0.0.1:7891')
with httpx.Client(transport=transport) as client:
    response = client.get('https://httpbin.org/get')
    print(response.text)
```
Here we create a `transport` object, point it at the SOCKS proxy, and pass that transport into `httpx.Client()`. The result is the same as before.
For asynchronous mode, use this version:
```python
import httpx
import asyncio
from httpx_socks import AsyncProxyTransport

transport = AsyncProxyTransport.from_url('socks5://127.0.0.1:7891')

async def main():
    async with httpx.AsyncClient(transport=transport) as client:
        response = await client.get('https://httpbin.org/get')
        print(response.text)

if __name__ == '__main__':
    asyncio.get_event_loop().run_until_complete(main())
```
The only difference from synchronous mode is that we use `AsyncProxyTransport` instead of `SyncProxyTransport`, and `AsyncClient` instead of `Client`. Everything else stays the same.
5. Proxy Setup in `Selenium`
`Selenium` can also use proxies. Here the examples use Chrome.
For a non-authenticated proxy, configure it like this:
```python
from selenium import webdriver

proxy = '127.0.0.1:7890'
options = webdriver.ChromeOptions()
options.add_argument('--proxy-server=http://' + proxy)
browser = webdriver.Chrome(options=options)
browser.get('https://httpbin.org/get')
print(browser.page_source)
browser.close()
```
The output looks like this:
```json
{
  "args": {},
  "headers": {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
    "Accept-Encoding": "gzip, deflate",
    "Accept-Language": "zh-CN,zh;q=0.9",
    "Host": "httpbin.org",
    "Upgrade-Insecure-Requests": "1",
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.149 Safari/537.36",
    "X-Amzn-Trace-Id": "Root=1-5e8f39cd-60930018205fd154a9af39cc"
  },
  "origin": "210.173.1.204",
  "url": "http://httpbin.org/get"
}
```
The `origin` field again matches the proxy IP, so the proxy is configured correctly.
If the proxy requires authentication, the setup is more involved:
```python
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
import zipfile

ip = '127.0.0.1'
port = 7890
username = 'foo'
password = 'bar'

manifest_json = """
{
    "version": "1.0.0",
    "manifest_version": 2,
    "name": "Chrome Proxy",
    "permissions": [
        "proxy", "tabs", "unlimitedStorage", "storage",
        "<all_urls>", "webRequest", "webRequestBlocking"
    ],
    "background": {
        "scripts": ["background.js"]
    }
}
"""

background_js = """
var config = {
    mode: "fixed_servers",
    rules: {
        singleProxy: {
            scheme: "http",
            host: "%(ip)s",
            port: %(port)s
        }
    }
}
chrome.proxy.settings.set({value: config, scope: "regular"}, function() {});
function callbackFn(details) {
    return {
        authCredentials: {
            username: "%(username)s",
            password: "%(password)s"
        }
    }
}
chrome.webRequest.onAuthRequired.addListener(
    callbackFn,
    {urls: ["<all_urls>"]},
    ['blocking']
)
""" % {'ip': ip, 'port': port, 'username': username, 'password': password}

plugin_file = 'proxy_auth_plugin.zip'
with zipfile.ZipFile(plugin_file, 'w') as zp:
    zp.writestr("manifest.json", manifest_json)
    zp.writestr("background.js", background_js)

options = Options()
options.add_argument("--start-maximized")
options.add_extension(plugin_file)
browser = webdriver.Chrome(options=options)
browser.get('https://httpbin.org/get')
print(browser.page_source)
browser.close()
```
This approach builds a small Chrome extension: a `manifest.json` file plus a `background.js` script that configures the proxy and answers its authentication challenge. When the code runs, it packages both files into `proxy_auth_plugin.zip` and loads that extension into the browser.
The result is the same as the previous example: the `origin` field shows the proxy IP.
SOCKS proxy setup is simpler. Change the protocol to `socks5`, like this non-authenticated example:
```python
from selenium import webdriver

proxy = '127.0.0.1:7891'
options = webdriver.ChromeOptions()
options.add_argument('--proxy-server=socks5://' + proxy)
browser = webdriver.Chrome(options=options)
browser.get('https://httpbin.org/get')
print(browser.page_source)
browser.close()
```
The result is the same.
6. Proxy Setup in `aiohttp`
In `aiohttp`, you can configure a proxy directly through the `proxy` parameter. For an HTTP proxy:
```python
import asyncio
import aiohttp

proxy = 'http://127.0.0.1:7890'

async def main():
    async with aiohttp.ClientSession() as session:
        async with session.get('https://httpbin.org/get', proxy=proxy) as response:
            print(await response.text())

if __name__ == '__main__':
    asyncio.get_event_loop().run_until_complete(main())
```
If the proxy needs a username and password, use the same pattern as `requests`:
```python
proxy = 'http://username:password@127.0.0.1:7890'
```

Replace `username` and `password` as needed.
For SOCKS proxies, install the `aiohttp-socks` helper library:

```bash
pip3 install aiohttp-socks
```
You can then use `ProxyConnector` from that package to configure the SOCKS proxy:
```python
import asyncio
import aiohttp
from aiohttp_socks import ProxyConnector

connector = ProxyConnector.from_url('socks5://127.0.0.1:7891')

async def main():
    async with aiohttp.ClientSession(connector=connector) as session:
        async with session.get('https://httpbin.org/get') as response:
            print(await response.text())

if __name__ == '__main__':
    asyncio.get_event_loop().run_until_complete(main())
```
The result is the same.
This library also supports SOCKS4, HTTP proxies, and proxy authentication. See its official docs for the full set of options.
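One detail worth noting for authenticated proxies: `from_url`-style helpers parse credentials embedded in the URL, so any special characters in the username or password should be percent-encoded first. A hypothetical helper sketching this (the `proxy_url` function is my own, not part of `aiohttp-socks`):

```python
from urllib.parse import quote

def proxy_url(host, port, username=None, password=None, scheme='socks5'):
    """Build a proxy URL, percent-encoding credentials so characters
    like '@' or ':' survive URL parsing."""
    auth = ''
    if username and password:
        auth = quote(username, safe='') + ':' + quote(password, safe='') + '@'
    return f'{scheme}://{auth}{host}:{port}'

print(proxy_url('127.0.0.1', 7891))                 # socks5://127.0.0.1:7891
print(proxy_url('127.0.0.1', 7891, 'foo', 'p@ss'))  # socks5://foo:p%40ss@127.0.0.1:7891
```

The resulting string can then be handed to `ProxyConnector.from_url()` directly.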
7. Proxy Setup in `Pyppeteer`
`Pyppeteer` uses a Chromium browser similar to Chrome, so its setup looks much like Selenium’s. For a non-authenticated HTTP proxy, pass the proxy via the `args` launch option:
```python
import asyncio
from pyppeteer import launch

proxy = '127.0.0.1:7890'

async def main():
    browser = await launch({'args': ['--proxy-server=http://' + proxy], 'headless': False})
    page = await browser.newPage()
    await page.goto('https://httpbin.org/get')
    print(await page.content())
    await browser.close()

if __name__ == '__main__':
    asyncio.get_event_loop().run_until_complete(main())
```
The output looks like this:
```json
{
  "args": {},
  "headers": {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8",
    "Accept-Encoding": "gzip, deflate, br",
    "Accept-Language": "zh-CN,zh;q=0.9",
    "Host": "httpbin.org",
    "Upgrade-Insecure-Requests": "1",
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3494.0 Safari/537.36",
    "X-Amzn-Trace-Id": "Root=1-5e8f442c-12b1ed7865b049007267a66c"
  },
  "origin": "210.173.1.204",
  "url": "https://httpbin.org/get"
}
```
Again, the proxy is clearly active.
SOCKS proxies work the same way. Just change the scheme to `socks5`:
```python
import asyncio
from pyppeteer import launch

proxy = '127.0.0.1:7891'

async def main():
    browser = await launch({'args': ['--proxy-server=socks5://' + proxy], 'headless': False})
    page = await browser.newPage()
    await page.goto('https://httpbin.org/get')
    print(await page.content())
    await browser.close()

if __name__ == '__main__':
    asyncio.get_event_loop().run_until_complete(main())
```
The result is the same.
8. Proxy Setup in `Playwright`
Compared with Selenium and Pyppeteer, `Playwright` makes proxy configuration easier because it exposes a dedicated `proxy` parameter when launching the browser.
For an HTTP proxy, use this setup:
```python
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(proxy={
        'server': 'http://127.0.0.1:7890'
    })
    page = browser.new_page()
    page.goto('https://httpbin.org/get')
    print(page.content())
    browser.close()
```
When calling `launch()`, pass a `proxy` dictionary. The required field is `server`, which should contain the HTTP proxy address.
The output looks like this:
```json
{
  "args": {},
  "headers": {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
    "Accept-Encoding": "gzip, deflate, br",
    "Accept-Language": "zh-CN,zh;q=0.9",
    "Host": "httpbin.org",
    "Sec-Ch-Ua": "\" Not A;Brand\";v=\"99\", \"Chromium\";v=\"92\"",
    "Sec-Ch-Ua-Mobile": "?0",
    "Sec-Fetch-Dest": "document",
    "Sec-Fetch-Mode": "navigate",
    "Sec-Fetch-Site": "none",
    "Sec-Fetch-User": "?1",
    "Upgrade-Insecure-Requests": "1",
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4498.0 Safari/537.36",
    "X-Amzn-Trace-Id": "Root=1-60e99eef-4fa746a01a38abd469ecb467"
  },
  "origin": "210.173.1.204",
  "url": "https://httpbin.org/get"
}
```
For a SOCKS proxy, the setup is identical. Just change the `server` value to the SOCKS proxy address:
```python
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(proxy={
        'server': 'socks5://127.0.0.1:7891'
    })
    page = browser.new_page()
    page.goto('https://httpbin.org/get')
    print(page.content())
    browser.close()
```
The output is the same as before.
If the proxy requires authentication, Playwright also keeps that simple. Add `username` and `password` to the `proxy` object. For example, if the credentials are `foo` and `bar`:
```python
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(proxy={
        'server': 'http://127.0.0.1:7890',
        'username': 'foo',
        'password': 'bar'
    })
    page = browser.new_page()
    page.goto('https://httpbin.org/get')
    print(page.content())
    browser.close()
```
That’s all you need to enable authenticated proxies in Playwright.
9. Summary
This guide covered proxy configuration across several common request libraries. The setup patterns are similar, and once you understand them, adding proxies becomes an easy way to handle IP bans and rate limits in future scraping work.
By routing traffic through proxies in different locations, you can simulate requests from specific regions and collect localized data. Proxies also hide the crawler’s real IP address, which helps reduce blocking, protect privacy, and improve success rates when sites try to detect repeated requests from a single source.
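As a closing sketch, many crawlers combine these pieces by rotating through a pool of proxies instead of reusing a single one. A minimal illustration using the `proxies` dictionary format from the `requests` section (the pool addresses are placeholders, not working proxies):

```python
import random

# Placeholder pool; replace these with your own working proxy addresses.
PROXY_POOL = [
    'http://127.0.0.1:7890',
    'http://203.0.113.5:8080',
    'http://203.0.113.6:8080',
]

def pick_proxies():
    """Choose a random proxy from the pool and format it for
    the `proxies` argument accepted by requests."""
    proxy = random.choice(PROXY_POOL)
    return {'http': proxy, 'https': proxy}

# Usage: requests.get('https://httpbin.org/get', proxies=pick_proxies())
```

A production pool would also drop proxies that fail the httpbin `origin` check, but the rotation idea stays the same.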