- import urllib.request
-
- response = urllib.request.urlopen("http://www.baidu.com/")
-
- html = response.read().decode('utf-8')
-
- print(html)
-
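The basic call above hard-codes UTF-8 and has no timeout or error handling. A minimal sketch of a more defensive variant follows; the helper name `fetch_text` and its parameters are assumptions made for illustration, not part of `urllib`. The demo call uses a `data:` URL so that it runs without network access; a real address such as `http://www.baidu.com/` can be passed the same way.

```python
import urllib.error
import urllib.request


def fetch_text(url, timeout=10, default_charset="utf-8"):
    """Hypothetical helper: return the decoded body of url, or None on failure."""
    try:
        # The response is a context manager, so the connection is closed for us.
        with urllib.request.urlopen(url, timeout=timeout) as response:
            # Prefer the charset announced in Content-Type, else fall back.
            charset = response.headers.get_content_charset() or default_charset
            return response.read().decode(charset, errors="replace")
    except urllib.error.URLError as exc:
        # HTTPError is a subclass of URLError, so HTTP error statuses land here too.
        print("request failed:", exc.reason)
        return None


# Offline demo: urllib can open data: URLs without touching the network.
print(fetch_text("data:text/plain;charset=utf-8,hello"))
```

Returning `None` on failure is only one possible design; re-raising the exception may suit a crawler that needs to retry.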
Setting request headers through the Request constructor
- import urllib.request
-
- url = "http://www.baidu.com/"
-
- headers = {
- "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.36"
- }
- request = urllib.request.Request(url=url, headers=headers)
-
- response = urllib.request.urlopen(request)  # response is a file-like object
-
- html = response.read().decode('utf-8')
-
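A `Request` object can be inspected before anything is sent, which is a convenient way to check what the `headers` dict turned into. The sketch below runs offline. The header values are placeholders chosen for illustration. Note that `Request` stores header names in capitalized form (`"User-agent"`), so lookups must use that spelling.

```python
import urllib.request

headers = {
    "User-Agent": "demo-agent/1.0",        # placeholder value for illustration
    "Accept-Language": "zh-CN,zh;q=0.9",   # placeholder value for illustration
}
request = urllib.request.Request(url="http://www.baidu.com/", headers=headers)

# Nothing is sent until urlopen() is called, so this only reads local state.
stored = dict(request.header_items())
print(stored)

# Names are stored via str.capitalize(): "User-Agent" becomes "User-agent".
print(request.get_header("User-agent"))       # the stored value
print(request.get_header("User-Agent"))       # None: exact stored spelling is required
print(request.has_header("Accept-language"))  # True
```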
Adding a request header with add_header()
- import urllib.request
-
- url = "http://www.baidu.com/"
- key = "User-Agent"
- value = "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.36"
-
- request = urllib.request.Request(url=url)
- request.add_header(key,value)
- # request.get_header("User-agent")  # read a header back; names are stored capitalized
-
- response = urllib.request.urlopen(request)  # response is a file-like object
-
- html = response.read().decode('utf-8')
-
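- # response.getcode()  returns the HTTP status code
- # response.geturl()   returns the URL the data actually came from, so a redirect can be detected
- # response.info()     returns the response headers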
-
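To see `add_header()`, `getcode()`, `geturl()` and `info()` working together without depending on an external site, the sketch below starts a throwaway HTTP server on the loopback interface. Everything about the server is an assumption for illustration: the handler class `DemoHandler`, the `/old` and `/new` paths, and the `demo-agent/1.0` string. The opener is built with an empty `ProxyHandler` so that proxy settings in the environment do not intercept the loopback request; `urllib.request.urlopen(request)` returns the same kind of response object.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: /old redirects to /new, which echoes the User-Agent."""

    def do_GET(self):
        if self.path == "/old":
            self.send_response(302)
            self.send_header("Location", "/new")
            self.send_header("Content-Length", "0")
            self.end_headers()
            return
        body = self.headers.get("User-Agent", "").encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging


server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0 = any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
base = "http://127.0.0.1:%d" % server.server_port

request = urllib.request.Request(url=base + "/old")
request.add_header("User-Agent", "demo-agent/1.0")

# Empty ProxyHandler: ignore proxy environment variables for this demo.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
with opener.open(request, timeout=5) as response:
    status = response.getcode()                         # status of the final response
    final_url = response.geturl()                       # URL after following the redirect
    content_type = response.info().get("Content-Type")  # one response header
    echoed = response.read().decode("utf-8")            # User-Agent the server saw

server.shutdown()
server.server_close()
print(status, final_url, content_type, echoed)
```

Since Python 3.9 the documentation marks these three methods as deprecated in favor of the `status`, `url` and `headers` attributes of the response, which carry the same information.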
- import urllib.parse
-
- wd = {"wd": "阿里巴巴"}
- encodedWd = urllib.parse.urlencode(wd)
- # urllib.parse.unquote(encodedWd)  # decode back to the original text (Python 3 location of unquote)
-
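The encoding step can be checked as a round trip. The sketch below encodes the query, appends it to a URL, and decodes it again. The `/s` search path is an assumption added for illustration; only the `wd` parameter comes from the notes above.

```python
import urllib.parse

params = {"wd": "阿里巴巴"}

# urlencode percent-encodes the UTF-8 bytes of each value.
query = urllib.parse.urlencode(params)
print(query)      # wd=%E9%98%BF%E9%87%8C%E5%B7%B4%E5%B7%B4

# Assumed search endpoint, shown only to illustrate where the query string goes.
full_url = "http://www.baidu.com/s?" + query
print(full_url)

# unquote reverses the percent-encoding; parse_qs rebuilds a dict of lists.
decoded = urllib.parse.unquote(query)
parsed = urllib.parse.parse_qs(query)
print(decoded)    # wd=阿里巴巴
print(parsed)     # {'wd': ['阿里巴巴']}
```

`parse_qs` returns a list for every key because a query string may repeat the same parameter.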