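After reading through the source of both libraries, I still don't have a deep feel for the high cohesion and low coupling of the urllib2 code. I'll write down what I see first, then analyze and absorb it gradually.
I made a mind map for this, which helps a lot in getting the rough structure of the code straight. I go through it by function and by class.
As with urllib, urllib2 also provides a urlopen function. Here is its source: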
_opener = None
def urlopen(url, data=None, timeout=socket._GLOBAL_DEFAULT_TIMEOUT,
            cafile=None, capath=None, cadefault=False, context=None):
    global _opener
    if cafile or capath or cadefault:
        if context is not None:
            raise ValueError(
                "You can't pass both context and any of cafile, capath, and cadefault"
            )
        if not _have_ssl:
            raise ValueError('SSL support not available')
        context = ssl.create_default_context(purpose=ssl.Purpose.SERVER_AUTH,
                                             cafile=cafile,
                                             capath=capath)
        https_handler = HTTPSHandler(context=context)
        opener = build_opener(https_handler)
    elif context:
        https_handler = HTTPSHandler(context=context)
        opener = build_opener(https_handler)
    elif _opener is None:
        _opener = opener = build_opener()
    else:
        opener = _opener
    return opener.open(url, data, timeout)
As the code shows, urlopen really just calls open on the object returned by build_opener, and build_opener can be handed an HTTPSHandler instance.
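To convince myself of that, here is a minimal sketch that repeats the "elif context:" branch by hand, without touching the network. The helper name build_https_opener is my own and is not part of urllib2. The try/except import is also my assumption, so the snippet runs on Python 3 as well, where the same names live in urllib.request. It needs a Python with ssl support and, on Python 2, version 2.7.9 or later for the context argument.

```python
import ssl

try:
    import urllib2  # Python 2
except ImportError:
    import urllib.request as urllib2  # Python 3 equivalent (assumption for portability)


def build_https_opener(context):
    """Hypothetical helper mirroring the `elif context:` branch of urlopen."""
    https_handler = urllib2.HTTPSHandler(context=context)
    opener = urllib2.build_opener(https_handler)
    return opener, https_handler


context = ssl.create_default_context()
opener, https_handler = build_https_opener(context)

# Collect every HTTPS handler registered on the opener.
https_handlers = [h for h in opener.handlers
                  if isinstance(h, urllib2.HTTPSHandler)]
print(len(https_handlers))
print(https_handlers[0] is https_handler)
```

The opener ends up with exactly one HTTPS handler, and it is the instance that was passed in rather than a freshly created default one. Calling opener.open(url) on this object is then what urlopen does in its last line.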
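build_opener creates an opener object from a list of handlers. The opener is an instance of OpenerDirector and comes with a set of default handlers covering http and ftp, plus https when httplib supports it.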
def build_opener(*handlers):
    import types
    def isclass(obj):
        return isinstance(obj, (types.ClassType, type))
    # Instantiate the OpenerDirector
    opener = OpenerDirector()
    # Default handler classes
    default_classes = [ProxyHandler, UnknownHandler, HTTPHandler,
                       HTTPDefaultErrorHandler, HTTPRedirectHandler,
                       FTPHandler, FileHandler, HTTPErrorProcessor]
    if hasattr(httplib, 'HTTPS'):
        # Add HTTPSHandler when httplib supports HTTPS
        default_classes.append(HTTPSHandler)
    skip = set()
    # If a passed-in handler is a subclass or an instance of a default
    # class, remember that default class in skip
    for klass in default_classes:
        for check in handlers:
            if isclass(check):
                if issubclass(check, klass):
                    skip.add(klass)
            elif isinstance(check, klass):
                skip.add(klass)
    for klass in skip:
        default_classes.remove(klass)
    # Add an instance of each remaining default class to the opener
    for klass in default_classes:
        opener.add_handler(klass())
    # Add the caller's handlers, instantiating any that were passed as classes
    for h in handlers:
        if isclass(h):
            h = h()
        opener.add_handler(h)
    return opener
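The skip logic is the part I found easiest to misread, so here is a small sketch of its effect. QuietHTTPHandler and http_handler_names are names I made up for illustration. The subclass changes no behavior and only acts as a marker. I read the result back through the opener's handlers list, which is an attribute of OpenerDirector rather than a documented API, so treat that part as an assumption. As before, the fallback import is only there so the snippet also runs on Python 3.

```python
try:
    import urllib2  # Python 2
except ImportError:
    import urllib.request as urllib2  # Python 3 equivalent (assumption for portability)


class QuietHTTPHandler(urllib2.HTTPHandler):
    """Hypothetical subclass; it adds nothing and only serves as a marker."""


def http_handler_names(opener):
    """Return the class names of all HTTPHandler-derived handlers on an opener."""
    return [h.__class__.__name__ for h in opener.handlers
            if isinstance(h, urllib2.HTTPHandler)]


default_opener = urllib2.build_opener()
custom_opener = urllib2.build_opener(QuietHTTPHandler)  # passed as a class

print(http_handler_names(default_opener))  # ['HTTPHandler']
print(http_handler_names(custom_opener))   # ['QuietHTTPHandler']
```

Because QuietHTTPHandler is a subclass of HTTPHandler, the default HTTPHandler lands in skip and is never instantiated, so the custom class replaces it instead of sitting next to it.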
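When res = urllib2.urlopen('http://python.org') is called, here is what happens along the way (writing it down so I don't forget):
urlopen calls build_opener, which instantiates OpenerDirector as opener, adds instances of the default classes to it (via add_handler), and returns the opener. urlopen then calls opener.open to carry out the actual request.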
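The flow above can be traced offline with a made-up URL scheme. This is only a sketch under my own assumptions: the "demo" scheme, DemoHandler and fetch_demo are all hypothetical, and it relies on OpenerDirector dispatching to handler methods named after the scheme (demo_request, demo_open, demo_response). install_opener is the urllib2 function that sets the module-level _opener, which is why urlopen takes the final "else" branch here. Passing None afterwards resets it so that the next urlopen call builds the default opener again.

```python
import io

try:
    import urllib2  # Python 2
except ImportError:
    import urllib.request as urllib2  # Python 3 equivalent (assumption for portability)


class DemoHandler(urllib2.BaseHandler):
    """Hypothetical handler for a made-up "demo" scheme that records each call."""

    def __init__(self):
        self.calls = []

    def demo_request(self, req):
        # Pre-processing step: runs before the request is opened.
        self.calls.append("demo_request")
        return req

    def demo_open(self, req):
        # Opening step: return a file-like response that echoes the URL.
        self.calls.append("demo_open")
        return io.BytesIO(req.get_full_url().encode("ascii"))

    def demo_response(self, req, response):
        # Post-processing step: runs on the response before it is returned.
        self.calls.append("demo_response")
        return response


def fetch_demo(url, handler):
    """Install an opener containing `handler`, fetch `url` through urlopen."""
    urllib2.install_opener(urllib2.build_opener(handler))
    try:
        return urllib2.urlopen(url).read()
    finally:
        # Reset the global opener so later urlopen calls rebuild the default.
        urllib2.install_opener(None)


handler = DemoHandler()
body = fetch_demo("demo://example/path", handler)
print(body)
print(handler.calls)
```

The recorded order is request, then open, then response, which gives a first idea of what opener.open does internally before handing the result back to urlopen.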
I'll stop here for now and continue later.