Using an HTTP proxy with Selenium (no username or password)
- import time
-
- from selenium import webdriver
- from selenium.webdriver.chrome.service import Service
-
- # Address of the proxy (no authentication required)
- proxy_ip = "127.0.0.1"
- proxy_port = "1080"
-
- # Route all browser traffic through the proxy
- chrome_options = webdriver.ChromeOptions()
- chrome_options.add_argument('--proxy-server=http://{}:{}'.format(proxy_ip, proxy_port))
-
- # Path to the local ChromeDriver binary
- chrome_service = Service("./chromedriver.exe")
- driver = webdriver.Chrome(service=chrome_service, options=chrome_options)
-
- driver.get('https://www.baidu.com')
-
- # Keep the window open for 30 seconds so the page can be inspected
- time.sleep(30)
-
- driver.quit()
-
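The `--proxy-server` flag above is built by plain string formatting, so a mistyped port or an empty host only shows up once Chrome fails to load pages. A minimal sketch of a helper that validates the values first is shown below. The name `build_proxy_argument` is hypothetical and not part of Selenium; it only produces the string that is passed to `add_argument`.

```python
def build_proxy_argument(host, port, scheme="http"):
    """Return the Chrome --proxy-server flag for a proxy without authentication.

    Hypothetical helper for illustration; it is not part of Selenium.
    """
    port = int(port)  # accepts "1080" or 1080, raises ValueError on non-numeric input
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    if not host or any(ch.isspace() for ch in host):
        raise ValueError(f"invalid host: {host!r}")
    return f"--proxy-server={scheme}://{host}:{port}"


# Same values as in the script above
proxy_argument = build_proxy_argument("127.0.0.1", "1080")
print(proxy_argument)  # --proxy-server=http://127.0.0.1:1080
```

The result can be passed straight to `chrome_options.add_argument(proxy_argument)`.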
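Adding a proxy to Selenium (with username and password)
Selenium-Chrome-HTTP-Private-Proxy: an HTTP proxy solution
- By default, Chrome's --proxy-server="http://ip:port" flag does not accept a username and password. As a result, "Selenium + Chrome Driver" cannot use an HTTP proxy that requires HTTP Basic Authentication. One workaround is IP address authentication, but in the domestic (China) network environment most users connect over ADSL, so their IP address changes and cannot be bound for authentication either. A way to make Chrome answer the proxy's username/password prompt automatically is therefore needed.
- Someone on Stack Overflow shared a good approach that uses a Chrome extension to supply the proxy credentials automatically. Details: http://stackoverflow.com/questions/9888323/how-to-override-basic-authentication-in-selenium2-with-java-using-chrome-driver
- Building on this idea, engineers at Kunzhipeng (鲲之鹏) automated the creation of the Chrome extension in Python: given a proxy in the form "username:password@ip:port", the code generates a Chrome proxy extension, which can then be installed in "Selenium + Chrome Driver" to configure the proxy (plugin URL: https://github.com/RobinDev/Selenium-Chrome-HTTP-Private-Proxy)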
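The "username:password@ip:port" form mentioned above has to be split into its four parts before it can be handed to an extension builder. The following is a minimal sketch of one way to do that with the standard library. `ProxyConfig` and `parse_proxy` are hypothetical names introduced here for illustration, and the sketch assumes that special characters in the password (such as "/" or "#") are percent-encoded.

```python
from typing import NamedTuple
from urllib.parse import unquote, urlsplit


class ProxyConfig(NamedTuple):
    """Hypothetical container for the parts of a proxy specification."""
    host: str
    port: int
    username: str
    password: str
    scheme: str = "http"


def parse_proxy(spec: str) -> ProxyConfig:
    """Split 'username:password@ip:port' (optionally prefixed with scheme://)."""
    # urlsplit only recognises user:pass@host:port after "//",
    # so prepend a default scheme when the caller omitted one.
    if "://" not in spec:
        spec = "http://" + spec
    parts = urlsplit(spec)
    # parts.port raises ValueError by itself for a non-numeric or out-of-range port.
    if not (parts.hostname and parts.port and parts.username) or parts.password is None:
        raise ValueError("expected username:password@host:port")
    return ProxyConfig(
        host=parts.hostname,
        port=parts.port,
        username=unquote(parts.username),  # undo percent-encoding, e.g. %40 -> @
        password=unquote(parts.password),
        scheme=parts.scheme,
    )


cfg = parse_proxy("alice:s3cret@10.0.0.1:8080")
print(cfg.host, cfg.port, cfg.username)  # 10.0.0.1 8080 alice
```

The fields map one-to-one onto the `proxy_host`, `proxy_port`, `proxy_username`, and `proxy_password` arguments used later in this section.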
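How to implement it
- 1. Download the plugin from the plugin URL and place it in the project directory.
- 2. Write the code. Note that create_proxyauth_extension below writes the extension zip itself, using the default file name Selenium-Chrome-HTTP-Private-Proxy.zip, and overwrites any existing file with that name.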
- import time
- import string
- import zipfile
- from selenium import webdriver
- from selenium.webdriver.chrome.options import Options
- from selenium.webdriver.chrome.service import Service
-
-
- def create_proxyauth_extension(proxy_host, proxy_port, proxy_username, proxy_password, scheme='http', plugin_path=None):
-     """Proxy Auth Extension
-     args:
-         proxy_host (str): domain or ip address, e.g. proxy.domain.com
-         proxy_port (int): port
-         proxy_username (str): auth username
-         proxy_password (str): auth password
-     kwargs:
-         scheme (str): proxy scheme, default http
-         plugin_path (str): absolute path of the extension
-     return str -> plugin_path
-     """
-     if plugin_path is None:
-         plugin_path = 'Selenium-Chrome-HTTP-Private-Proxy.zip'
-     # Manifest V2 is needed for the blocking webRequest listener used below.
-     # Recent Chrome releases are phasing out Manifest V2, so check that your Chrome version still loads it.
-     manifest_json = """
-     {
-         "version": "1.0.0",
-         "manifest_version": 2,
-         "name": "Chrome Proxy",
-         "permissions": [
-             "proxy",
-             "tabs",
-             "unlimitedStorage",
-             "storage",
-             "<all_urls>",
-             "webRequest",
-             "webRequestBlocking"
-         ],
-         "background": {
-             "scripts": ["background.js"]
-         },
-         "minimum_chrome_version":"22.0.0"
-     }
-     """
-     # background.js sets a fixed proxy and answers the proxy's auth challenge with the given credentials
-     background_js = string.Template(
-         """
-         var config = {
-             mode: "fixed_servers",
-             rules: {
-                 singleProxy: {
-                     scheme: "${scheme}",
-                     host: "${host}",
-                     port: parseInt(${port})
-                 },
-                 bypassList: ["foobar.com"]
-             }
-         };
-         chrome.proxy.settings.set({value: config, scope: "regular"}, function() {});
-         function callbackFn(details) {
-             return {
-                 authCredentials: {
-                     username: "${username}",
-                     password: "${password}"
-                 }
-             };
-         }
-         chrome.webRequest.onAuthRequired.addListener(
-             callbackFn,
-             {urls: ["<all_urls>"]},
-             ['blocking']
-         );
-         """
-     ).substitute(
-         host=proxy_host,
-         port=proxy_port,
-         username=proxy_username,
-         password=proxy_password,
-         scheme=scheme,
-     )
-     # Pack both files into the extension archive
-     with zipfile.ZipFile(plugin_path, 'w') as zp:
-         zp.writestr("manifest.json", manifest_json)
-         zp.writestr("background.js", background_js)
-
-     return plugin_path
-
-
- def configure_headless_browser(proxy_config):
-     """Start Chrome with the proxy-auth extension installed.
-
-     proxy_config is [host, port, username, password].
-     Note: despite the function name, no headless flag is set, so Chrome opens a visible window.
-     """
-     chrome_options = Options()
-     chrome_options.add_argument("--start-maximized")
-     proxyauth_plugin_path = create_proxyauth_extension(
-         proxy_host=proxy_config[0],
-         proxy_port=proxy_config[1],
-         proxy_username=proxy_config[2],
-         proxy_password=proxy_config[3]
-     )
-     chrome_options.add_extension(proxyauth_plugin_path)
-
-     executable_path = './chromedriver.exe'
-     service = Service(executable_path=executable_path)
-     return webdriver.Chrome(options=chrome_options, service=service)
-
-
-
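- # Replace the placeholders with [host, port, username, password]
- proxy_config = ["xxx", "xxx", "xxx", "xxx"]
-
- driver = configure_headless_browser(proxy_config)
- # httpbin.org/ip echoes the caller's IP address, so the proxy's IP should appear in the page
- driver.get('http://httpbin.org/ip')
-
- time.sleep(3)
- print(driver.page_source)
- driver.quit()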
-
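One caveat with the `string.Template` substitution above: the credentials are pasted between double quotes in the generated JavaScript, so a password containing `"` or `\` would produce a broken background.js. A minimal sketch of one way around this is to let `json.dumps` emit each value as a quoted JavaScript literal. `render_background_js` is a hypothetical name introduced here, and the template is a condensed variant of the one above with the quotes around the placeholders removed.

```python
import json
import string

# Condensed variant of the background.js template: the placeholders are NOT
# wrapped in quotes because json.dumps() supplies the quotes and escaping.
BACKGROUND_JS = string.Template("""
var config = {
  mode: "fixed_servers",
  rules: {
    singleProxy: {scheme: ${scheme}, host: ${host}, port: ${port}},
    bypassList: ["foobar.com"]
  }
};
chrome.proxy.settings.set({value: config, scope: "regular"}, function() {});
chrome.webRequest.onAuthRequired.addListener(
  function(details) {
    return {authCredentials: {username: ${username}, password: ${password}}};
  },
  {urls: ["<all_urls>"]},
  ["blocking"]
);
""")


def render_background_js(host, port, username, password, scheme="http"):
    """Render background.js with every value encoded as a JSON literal."""
    return BACKGROUND_JS.substitute(
        scheme=json.dumps(scheme),
        host=json.dumps(host),
        port=json.dumps(int(port)),  # emitted as a bare number
        username=json.dumps(username),
        password=json.dumps(password),
    )


js = render_background_js("10.0.0.1", "8080", "alice", 'pa"ss')
print(js)
```

The returned string could be written to the archive with `zp.writestr("background.js", js)` in place of the original `background_js` value.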