How to use http/urllib
Using urllib to access Baidu
# -*- coding:utf-8 -*-
import urllib.request as ur
url = "http://www.baidu.com"
conn = ur.urlopen(url)
print conn
data = conn.read()
print data
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
<ipython-input-4-a26b59fb666e> in <module>()
1
----> 2 import urllib.request as ur
3 url = "http://www.baidu.com"
4 conn = ur.urlopen(url)
5 print conn
ImportError: No module named request
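Fixing the error when accessing Baidu with urllib
Problem: import error: ImportError: No module named request
Solution: replace import urllib.request as ur with import urllib2 as ur. The urllib.request module exists only in Python 3; this session runs Python 2, where urlopen lives in urllib2.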
# -*- coding:utf-8 -*-
import urllib2 as ur
url = "http://www.baidu.com"
conn = ur.urlopen(url)
print conn
data = conn.read()
print data
<addinfourl at 110377288L whose fp = <socket._fileobject object at 0x0000000006907C00>>
<!DOCTYPE html>
<!--STATUS OK-->
<html>
<head>
<meta http-equiv="content-type" content="text/html;charset=utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=Edge">
<meta content="always" name="referrer">
<meta name="theme-color" content="#2932e1">
<link rel="shortcut icon" href="/favicon.ico" type="image/x-icon" />
<link rel="search" type="application/opensearchdescription+xml" href="/content-search.xml" title="百度搜索" />
<link rel="icon" sizes="any" mask href="//www.baidu.com/img/baidu.svg">
<link rel="dns-prefetch" href="//s1.bdstatic.com"/>
<link rel="dns-prefetch" href="//t1.baidu.com"/>
<link rel="dns-prefetch" href="//t2.baidu.com"/>
<link rel="dns-prefetch" href="//t3.baidu.com"/>
<link rel="dns-prefetch" href="//t10.baidu.com"/>
<link rel="dns-prefetch" href="//t11.baidu.com"/>
<link rel="dns-prefetch" href="//t12.baidu.com"/>
<link rel="dns-prefetch" href="//b1.bdstatic.com"/>
<title>百度一下,你就知道</title>
Most of the HTML omitted...
</body>
</html>
<script src="https://ss1.bdstatic.com/5eN1bjq8AAUYm2zgoY3K/r/www/cache/yunying/Turing2017PC/logo_1.6.js"></script>
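The fix above only runs on Python 2. As a rough sketch, assuming the same script should also run unchanged on Python 3, the import can be wrapped in try/except. The helper name fetch and the 10-second default timeout are illustrative assumptions, not part of the original code.

```python
# -*- coding: utf-8 -*-
from contextlib import closing

try:
    import urllib.request as ur   # Python 3 location of urlopen
except ImportError:
    import urllib2 as ur          # Python 2 location of urlopen


def fetch(url, timeout=10):
    """Return the raw response body of url as bytes.

    fetch and its default timeout are hypothetical names chosen for this sketch.
    """
    # closing() makes sure the connection is released on both Python versions
    with closing(ur.urlopen(url, timeout=timeout)) as conn:
        return conn.read()
```

Calling fetch("http://www.baidu.com") should return the same HTML shown in the output above, as bytes; decode it with the charset declared by the page (utf-8 here) before treating it as text.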
What is http/urllib
http is the client-server package; it consists of 4 modules:
* client handles client-side requests
* server handles server-side responses
* cookies and cookiejar handle cookies
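To make the list above more concrete, here is a small sketch that touches the client and cookie modules without sending anything over the network. It assumes Python 3, where these modules live under the http package; the cookie string and its session_id value are made up for illustration.

```python
import http.client
from http.cookiejar import CookieJar
from http.cookies import SimpleCookie

# http.client: creating the connection object does not open a socket;
# traffic only starts once request() is called.
conn = http.client.HTTPConnection("www.baidu.com", timeout=10)
print(conn.host, conn.port)

# http.client also maps status codes to their reason phrases.
print(http.client.responses[200])

# http.cookies: parse a hypothetical cookie header string.
cookie = SimpleCookie()
cookie.load("session_id=abc123; Path=/")
session = cookie["session_id"]
print(session.value, session["path"])

# http.cookiejar: a jar stores cookies across requests; a new one is empty.
jar = CookieJar()
print(len(jar))
```

http.cookies is aimed at building and parsing cookie headers by hand, while http.cookiejar is the store that a urllib opener can use to keep cookies between requests.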
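urllib is a wrapper library built on top of http; it includes 3 parts:
* request handles client-side requests
* response handles the server's responses
* parse parses URLs
The module layout is fairly loose: the same package holds both request, the client-side module that sends requests, and response, the module that wraps what the server sends back.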
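The division of labour between parse and request can be sketched as follows, assuming Python 3. The /s?wd=... search path and the User-Agent string are illustrative assumptions rather than anything taken from the original post, and no request is actually sent.

```python
from urllib.parse import parse_qs, urlencode, urlparse
from urllib.request import Request

# parse: split a URL into its components.
parts = urlparse("http://www.baidu.com/s?wd=python")
print(parts.scheme, parts.netloc, parts.path)
print(parse_qs(parts.query))

# parse: build a properly escaped query string from a dict.
query = urlencode({"wd": "urllib tutorial"})
url = "http://www.baidu.com/s?" + query

# request: describe the request (URL, headers, method) before sending it.
# The header value is a made-up example.
req = Request(url, headers={"User-Agent": "example-client/0.1"})
print(req.get_method())
print(req.full_url)
```

Passing req to urllib.request.urlopen would send it; the object returned by that call is the response side described in the list above.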
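Why use http/urllib
These standard-library modules are reorganized and improved in Python 3.x: urllib2 is merged into urllib.request, which is why the first import above fails on Python 2. To develop a client quickly, the third-party requests library is recommended instead.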