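Python grew up at roughly the same time as the Internet, and its support for the common network protocols is quite thorough. This section covers the Hypertext Transfer Protocol (HTTP) and the Hypertext Markup Language (HTML). An HTTP implementation consists of two parts: an HTTP server and an HTTP client.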
Fetching Data via HTTP
The httplib module makes it easy to fetch data from an HTTP server.
import httplib
http = httplib.HTTP('www.python.org')
Next, tell the HTTP server which resource you want and which data formats you accept.
http.putrequest('GET', '/index.html')
http.putheader('Accept', 'text/html')
http.putheader('Accept', 'text/plain')
http.endheaders()
After that, getreply() returns the server's reply: the status code, the status message, and the headers.
errcode, errmsg, headers = http.getreply()
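Here errcode is the status code returned by the server (200 means success), errmsg is the accompanying message, and headers holds the headers sent by the HTTP server. The headers object supports dictionary-style methods, so it can be printed like this: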
for key, value in headers.items():
    print key, "=", value
content-length = 15298
accept-ranges = bytes
server = Apache/2.0.54 (Debian GNU/Linux) DAV/2 SVN/1.1.4 mod_python/3.1.3 Python/2.3.5 mod_ssl/2.0.54 OpenSSL/0.9.7e
last-modified = Tue, 15 May 2007 01:54:01 GMT
connection = close
etag = "60193-3bc2-81ef6040"
date = Tue, 15 May 2007 11:29:13 GMT
content-type = text/html
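The dictionary-style access shown above can be sketched without a network connection. The example below is only an illustration: it uses the standard email package as a stand-in for the headers object (an assumption made so that the code runs offline), and RAW_HEADERS is a hypothetical name holding a few of the values from the listing above. It is written with Python 3 syntax.

```python
from email import message_from_string

# A few of the headers from the listing above, as they might appear on the wire.
# RAW_HEADERS is an illustrative name, not part of the original example.
RAW_HEADERS = (
    "content-length: 15298\n"
    "accept-ranges: bytes\n"
    "connection: close\n"
    "content-type: text/html\n"
    "\n"
)

headers = message_from_string(RAW_HEADERS)

# Header names are matched without regard to case.
length = int(headers["Content-Length"])
content_type = headers.get("CONTENT-TYPE", "application/octet-stream")

# items() yields (name, value) pairs in the order they were received.
for key, value in headers.items():
    print(key, "=", value)
```

Converting content-length to an integer this way gives the number of bytes to expect in the body.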
Use getfile() to obtain the actual data. Its length is given by content-length, which is one of the headers.
file = http.getfile()
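The returned file is a file object, so it can be handled like an ordinary file; since it carries the response body, it is normally consumed with read() or readline().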
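In Python 3 the httplib module was renamed http.client, and the request/reply sequence is driven through HTTPConnection and getresponse() instead of getreply() and getfile(). The following is a minimal sketch of the same steps under that assumption. To avoid depending on www.python.org being reachable, it talks to a small local server; DemoHandler, BODY, and fetch are hypothetical names introduced only for this illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

BODY = b"<html><body>hello</body></html>"


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a remote site; serves one fixed page."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()
        self.wfile.write(BODY)

    def log_message(self, fmt, *args):
        pass  # keep the demo quiet


def fetch(host, port, path):
    """GET path and return (status, reason, headers, body)."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        # Same steps as in the text: request line, Accept headers, end of headers.
        conn.putrequest("GET", path)
        conn.putheader("Accept", "text/html")
        conn.putheader("Accept", "text/plain")
        conn.endheaders()
        resp = conn.getresponse()
        # Read exactly as many bytes as Content-Length announces.
        length = int(resp.getheader("Content-Length", "0"))
        body = resp.read(length)
        return resp.status, resp.reason, dict(resp.getheaders()), body
    finally:
        conn.close()


server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
try:
    status, reason, headers, body = fetch("127.0.0.1", server.server_port, "/index.html")
finally:
    server.shutdown()
    server.server_close()

print(status, reason, len(body))
```

Pointing fetch at a real host and port 80 should follow the same sequence, although the headers returned will of course differ from server to server.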
Implementing an HTTP redirector
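Python can also be used to implement an HTTP server; SimpleHTTPServer.py in the standard library is one simple example. The HTTP redirector built below behaves much like a proxy: it accepts requests and forwards them to another HTTP server. The same relaying approach is what phishing-style sites rely on.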
# HTTPRedirector.py
# An HTTP server that redirects all requests to a named, remote server.

# BaseHTTPServer provides the basic HTTP server functionality.
import BaseHTTPServer
# httplib establishes our connection to the remote server.
import httplib
import socket  # For the error!

# The server we are redirecting to.
g_RemoteServerName = "www.baidu.com"

class HTTPRedirector(BaseHTTPServer.BaseHTTPRequestHandler):
    # This function is called when a client makes a GET request,
    # i.e., it wants the headers and the data.
    def do_GET(self):
        srcfile = self.send_headers("GET")
        if srcfile:
            # Copy the data from the remote server
            # back to the client.
            BLOCKSIZE = 8192
            while 1:
                # Read a block from the remote.
                data = srcfile.read(BLOCKSIZE)
                if not data: break
                self.wfile.write(data)
            srcfile.close()

    # This function is called when a client makes a HEAD request,
    # i.e., it only wants the headers, not the data.
    def do_HEAD(self):
        srcfile = self.send_headers("HEAD")
        if srcfile:
            srcfile.close()

    # A private function which handles all the redirection logic.
    def send_headers(self, request):
        # Establish a remote connection.
        try:
            http = httplib.HTTP(g_RemoteServerName)
        except socket.error, problem:
            print "Error - Cannot connect to %s: %s" \
                  % (g_RemoteServerName, problem)
            return
        # Resend all the headers we retrieved in the request.
        http.putrequest(request, self.path)
        for header, val in self.headers.items():
            http.putheader(header, val)
        http.endheaders()
        # Now get the response from the remote server.
        errcode, errmsg, headers = http.getreply()
        self.send_response(errcode, errmsg)
        # Send the headers back to the client.
        for header, val in headers.items():
            self.send_header(header, val)
        self.end_headers()
        # Only a successful reply has a body worth copying.
        if errcode == 200:
            return http.getfile()

if __name__ == '__main__':
    print "Redirecting HTTP requests to", g_RemoteServerName
    # test() lives in the BaseHTTPServer module imported above.
    BaseHTTPServer.test(HTTPRedirector)
To test, start the HTTP server:
python.exe HTTPRedirector.py
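For readers on Python 3, where BaseHTTPServer and httplib became http.server and http.client, a redirector along the same lines might look like the sketch below. It is not a port of the script line for line: make_redirector, start_in_background, and HOP_BY_HOP are hypothetical names, the remote reply is read fully into memory instead of being copied in blocks, and connection-level headers are filtered out, which the original does not do.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Headers that describe one connection only; they are not relayed.
HOP_BY_HOP = {"connection", "keep-alive", "proxy-authenticate",
              "proxy-authorization", "te", "trailer",
              "transfer-encoding", "upgrade"}


def make_redirector(remote_host, remote_port=80):
    """Return a handler class that relays GET/HEAD requests to one remote server."""

    class Redirector(BaseHTTPRequestHandler):
        def do_GET(self):
            self.relay("GET")

        def do_HEAD(self):
            self.relay("HEAD")

        def relay(self, method):
            conn = http.client.HTTPConnection(remote_host, remote_port, timeout=10)
            try:
                # Resend the client's headers; http.client supplies a Host
                # header that names the remote server.
                outgoing = {name: value for name, value in self.headers.items()
                            if name.lower() not in HOP_BY_HOP | {"host"}}
                conn.request(method, self.path, headers=outgoing)
                reply = conn.getresponse()
                body = reply.read()  # empty for HEAD
            except (OSError, http.client.HTTPException) as problem:
                self.send_error(502, "Cannot reach %s: %s" % (remote_host, problem))
                return
            finally:
                conn.close()

            # Send the remote status line, headers and body back to the client.
            self.send_response_only(reply.status, reply.reason)
            for name, value in reply.getheaders():
                if name.lower() not in HOP_BY_HOP:
                    self.send_header(name, value)
            self.end_headers()
            if method == "GET":
                self.wfile.write(body)

        def log_message(self, fmt, *args):
            pass  # silence per-request logging in this sketch

    return Redirector


def start_in_background(handler_class, port=0):
    """Serve handler_class on 127.0.0.1 in a daemon thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), handler_class)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Calling start_in_background(make_redirector("www.baidu.com"), 8000) and then opening http://127.0.0.1:8000/ in a browser would be one way to try it; port 8000 is an arbitrary choice here. Unlike the original, this version relays the remote status and body for every reply, not only for status 200.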
FTP
Python handles the FTP protocol with the ftplib module. Like HTTP, FTP involves a server and a client.
import ftplib
ftp = ftplib.FTP('ftp.gnu.org')
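After connecting, use the login method to log in. The retrlines method then runs commands that return text line by line, such as directory listings and text-mode file transfers.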
ftp.login('anonymous', '')
ftp.retrlines('LIST')  # list the files in the directory
The following downloads the file welcome.msg:
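file = open('welcome.msg', 'w')
# retrlines strips the line ending from each line, so add the newline back
ftp.retrlines('RETR welcome.msg', lambda line: file.write(line + '\n'))
file.close()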
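The download step can be wrapped in small helpers. The sketch below is written with Python 3 syntax; download_text, download_binary, and fetch_welcome are hypothetical names introduced for illustration. retrlines passes each line to its callback with the line ending removed, so the text helper writes the newline back, while retrbinary is the usual choice for files that must be copied byte for byte.

```python
import ftplib


def download_text(ftp, remote_name, local_path):
    """Fetch remote_name in text mode and save it to local_path."""
    with open(local_path, "w") as out:
        # retrlines strips the line ending from each line before calling
        # the callback, so the newline is written back explicitly.
        ftp.retrlines("RETR " + remote_name, lambda line: out.write(line + "\n"))


def download_binary(ftp, remote_name, local_path):
    """Fetch remote_name byte for byte; suitable for non-text files."""
    with open(local_path, "wb") as out:
        ftp.retrbinary("RETR " + remote_name, out.write)


def fetch_welcome(host="ftp.gnu.org", name="welcome.msg"):
    """Anonymous login followed by a text download (needs network access)."""
    ftp = ftplib.FTP(host)
    try:
        ftp.login()  # with no arguments, ftplib logs in anonymously
        download_text(ftp, name, name)
    finally:
        ftp.quit()
```

fetch_welcome reuses the host and file name from the text above; the sketch does not check whether that server still accepts FTP connections or still offers that file.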