python: text/plain, text/html

tycoon1988

Category: python-http_server

Article link: https://blog.csdn.net/tycoon1988/article/details/39991917

Date Fri 22 November 2013 Tags encoding / python / http

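The two earlier posts, "Character Encoding" and "Character Encoding in Python", gave a brief introduction to character encoding. This post moves on to encoding issues in HTTP. Once the encoding series is finished I will start a series on HTTP itself; for now, this post assumes some basic HTTP knowledge.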

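Although I started out as a Java coder, today's HTTP examples are again written in Python, because building something like this in Java takes too much ceremony. Simple is better than complex.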

`Content-Type` in the response headers

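An HTTP response message carries, after the status line, two parts: the entity headers (response headers) and the entity body (response body). The headers describe the body and tell the browser how to handle it (as text, as an image, and so on). Here is the code: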

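# coding=utf-8
# Python 2 listing: BaseHTTPServer is the Python 2 module name

from BaseHTTPServer import HTTPServer, BaseHTTPRequestHandler

class MyRequestHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        # Declare the body as plain text; no charset is specified yet
        self.send_header('Content-Type', 'text/plain')
        self.end_headers()
        self.wfile.write('hello web')

# Listen on localhost:9000 and serve until interrupted
server = HTTPServer(('127.0.0.1', 9000), MyRequestHandler)
server.serve_forever()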

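Don't worry if the code above is unfamiliar. All you need to know is that it is a simple web service (GET only) that returns a single piece of text. Run it, then open http://localhost:9000 in a browser.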

[Screenshot: the browser displays the plain-text response]

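That is the result we expected. What happens if we add some Chinese text?

# Replace self.wfile.write('hello web') with the following line
self.wfile.write('hello web 编码')

Run the server again and reload the page in the browser.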

[Screenshot: the Chinese characters in the response appear garbled]

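Garbled characters appear. The body sent to the browser is hello web 编码 and the response header is Content-Type:text/plain. That header only says the body is text; it does not say which character set should be used to decode it, so the browser falls back to the operating system's default character set (GBK in this case). If we change the header to Content-Type:text/plain;charset=utf-8 and check again, the garbling is gone. The charset parameter tells the browser how to turn the bytes of the body back into characters. By the same reasoning, the program must be sending the text over the network as UTF-8 bytes: the source file is declared as UTF-8, so the Python 2 string literal already is a UTF-8 byte string.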
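To make the role of charset concrete, here is a minimal Python 3 sketch (the listings in this post are Python 2) of what the browser does with the body bytes. The helper names `charset_from_content_type` and `decode_body` are made up for illustration, and the GBK fallback is an assumption that mirrors the default described above, not a rule that every browser follows.

```python
from email.message import Message

def charset_from_content_type(value):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg['Content-Type'] = value
    return msg.get_content_charset()

def decode_body(body, content_type, fallback='gbk'):
    """Decode body bytes the way a client might: use the declared charset,
    otherwise fall back to a local default (assumed to be GBK here)."""
    charset = charset_from_content_type(content_type) or fallback
    return body.decode(charset, errors='replace')

# The server sends UTF-8 bytes in both cases; only the header differs.
body = 'hello web 编码'.encode('utf-8')

print(decode_body(body, 'text/plain'))                # garbled: decoded as GBK
print(decode_body(body, 'text/plain;charset=utf-8'))  # decoded correctly
```

The bytes on the wire are identical in both calls; only the header changes how they are interpreted, which is why adding charset=utf-8 fixes the page without touching the body.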

`Content-Encoding` in the response headers

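Common values of Content-Encoding:

gzip        the entity is encoded with GNU zip
compress    the entity is compressed with the Unix compress program
deflate     the entity is compressed in zlib format
identity    no encoding has been applied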

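The first three are lossless compression schemes: they reduce the size of the transmitted message without losing any information. In practice gzip is the most widely used of them (gzip and deflate both rely on the DEFLATE algorithm and differ mainly in the wrapper format).
The matching request header is Accept-Encoding, in which the client lists the encodings it can accept.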
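How Accept-Encoding and Content-Encoding fit together can be sketched as follows, in Python 3. This is a simplified illustration, not a complete implementation of the negotiation rules: `parse_accept_encoding`, `choose_encoding` and `encode_body` are hypothetical helpers, only gzip, deflate and identity are handled, and an explicit "identity;q=0" is not honored.

```python
import gzip
import zlib

# Codings this sketch knows how to produce, in order of preference
SUPPORTED = ('gzip', 'deflate', 'identity')

def parse_accept_encoding(header):
    """Parse an Accept-Encoding value into a {coding: q} dict."""
    prefs = {}
    for item in header.split(','):
        item = item.strip()
        if not item:
            continue
        name, _, params = item.partition(';')
        params = params.strip()
        q = 1.0  # a coding listed without a q value has weight 1
        if params.startswith('q='):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0
        prefs[name.strip().lower()] = q
    return prefs

def choose_encoding(header):
    """Pick the supported coding with the highest q, else identity."""
    prefs = parse_accept_encoding(header)
    best, best_q = 'identity', 0.0
    for name in SUPPORTED:
        q = prefs.get(name, prefs.get('*', 0.0))
        if q > best_q:
            best, best_q = name, q
    return best

def encode_body(body, coding):
    """Apply the chosen content coding to the body bytes."""
    if coding == 'gzip':
        return gzip.compress(body)
    if coding == 'deflate':
        return zlib.compress(body)  # HTTP "deflate" means the zlib format
    return body

coding = choose_encoding('gzip, deflate')
payload = encode_body('hello web 编码'.encode('utf-8'), coding)
print(coding, len(payload))
```

The server would then send the chosen name in the Content-Encoding header, so the client knows which decoding step to apply before it interprets the body.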

The server below always returns a gzip-compressed page:

# coding=utf-8
# Python 2 listing: BaseHTTPServer and cStringIO are Python 2 module names

from BaseHTTPServer import HTTPServer, BaseHTTPRequestHandler

import gzip, cStringIO

def compressBuf(buf):
    # Compress buf in memory and return the gzip-encoded bytes
    zbuf = cStringIO.StringIO()
    zfile = gzip.GzipFile(mode = 'wb',  fileobj = zbuf, compresslevel = 9)
    zfile.write(buf)
    zfile.close()
    return zbuf.getvalue()

class MyRequestHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)

        self.send_header('Content-Type', 'text/html')
        # If this line is commented out, the client treats the gzip bytes as HTML and fails
        self.send_header('Content-Encoding','gzip')
        self.end_headers()

        content = '''<html>
        <head>
            <title>最简单的httpserver</title>
            <meta charset="utf-8"/>
        </head>
        <body>就提供这一个页面</body></html>'''

        zbuf = compressBuf(content)
        # Show the compressed bytes on the console in a printable form
        print repr(zbuf)
        self.wfile.write(zbuf)

server = HTTPServer(('127.0.0.1', 9000), MyRequestHandler)
server.serve_forever()
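For readers on Python 3, where BaseHTTPServer and cStringIO no longer exist, a rough equivalent of the listing above might look like the sketch below. It also includes the client side, to show what Content-Encoding: gzip asks the receiver to do. `GzipHandler`, `make_server`, `fetch` and `read_page` are illustrative names, and the added Content-Length header and the charset in Content-Type are my additions rather than part of the original listing.

```python
import gzip
import http.client
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = '''<html>
<head>
    <title>最简单的httpserver</title>
    <meta charset="utf-8"/>
</head>
<body>就提供这一个页面</body></html>'''

class GzipHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Encode the text as UTF-8 first, then gzip the resulting bytes
        body = gzip.compress(PAGE.encode('utf-8'))
        self.send_response(200)
        self.send_header('Content-Type', 'text/html; charset=utf-8')
        self.send_header('Content-Encoding', 'gzip')
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)

def make_server(port=9000):
    """Create (but do not start) the server on localhost."""
    return HTTPServer(('127.0.0.1', port), GzipHandler)

def fetch(port):
    """Fetch '/' and return (Content-Encoding header, raw body bytes)."""
    conn = http.client.HTTPConnection('127.0.0.1', port, timeout=5)
    try:
        conn.request('GET', '/')
        resp = conn.getresponse()
        return resp.getheader('Content-Encoding'), resp.read()
    finally:
        conn.close()

def read_page(raw_body, content_encoding):
    """Undo the content coding, then decode the bytes as UTF-8 text."""
    if content_encoding == 'gzip':
        raw_body = gzip.decompress(raw_body)
    return raw_body.decode('utf-8')
```

To try it, call `make_server().serve_forever()` in one process and `read_page(*reversed(fetch(9000)))` in another. Without the Content-Encoding header the client has no hint that the body must be decompressed, and it would try to render the raw gzip bytes as HTML.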