Today I found this error in the log:
2013-06-06 12:25:13,332 [ERROR] upload.service.UploadFileService - image open error ,url = http://img.xitisi.com/Commodity/BOBOTou_2204/RiXiFaXingNvShengJiaFa_HuaBuWu2011XinKuan_QiLiuHaiBoboBoBoTouXiuLianDuanFaZongSe20120210034904.jpg ,cannot identify image fil
I took a look at the response headers for the image:
Accept-Ranges | bytes |
Content-Encoding | gzip |
Content-Length | 452449 |
Content-Type | image/jpeg |
Date | Thu, 06 Jun 2013 05:03:08 GMT |
Etag | "8041952b9a50cd1:1a9a" |
Last-Modified | Fri, 22 Jun 2012 17:12:15 GMT |
Server | Microsoft-IIS/6.0 |
Vary | Accept-Encoding |
X-Powered-By | ASP.NET |
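So the response body is gzip-compressed (Content-Encoding: gzip), which is why PIL's Image cannot identify it. The body has to be decompressed first.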
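To see why the image library gives up, it helps to look at the first bytes of the body. Here is a minimal sketch in Python 3 (my service code is Python 2); `fake_jpeg` is a made-up stand-in for a real image body, and `looks_gzipped` is a helper name I invented for illustration.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"      # first two bytes of every gzip stream
JPEG_MAGIC = b"\xff\xd8\xff"  # first three bytes of a JPEG file


def looks_gzipped(data: bytes) -> bool:
    """Return True if the payload starts with the gzip magic number."""
    return data[:2] == GZIP_MAGIC


# Hypothetical stand-in for a JPEG body: real signature plus padding bytes.
fake_jpeg = JPEG_MAGIC + b"\xe0" + b"\x00" * 32

# What the server actually sent when it applied Content-Encoding: gzip.
compressed = gzip.compress(fake_jpeg)

print(looks_gzipped(fake_jpeg))            # the raw bytes carry the JPEG signature
print(looks_gzipped(compressed))           # the compressed bytes carry the gzip signature
print(compressed.startswith(JPEG_MAGIC))   # so a JPEG decoder has nothing to recognize
```

A check like this can also serve as a fallback when a server compresses the body without setting Content-Encoding, though I have not needed that here.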
Solutions:
1. Decompress the body with Python's gzip module
import gzip
from StringIO import StringIO

def _read_content(self, response):
    content_type = response.headers.get('Content-Type')
    content_encoding = response.headers.get('Content-Encoding')
    if response.code == 200 and content_type and 'image' in content_type:
        data = StringIO(response.read())
        if content_encoding == 'gzip':
            # The body is gzip-compressed: decompress it and wrap it in a
            # file-like object again so Image can open it.
            data = StringIO(gzip.GzipFile(fileobj=data).read())
        return data
    else:
        # The url is read from the response object.
        logger.error("can't open image, content type=%s, url=%s"
                     % (content_type, response.geturl()))
        return None
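For anyone on Python 3, the same idea can be sketched without StringIO. This is only an illustration under my own assumptions: `read_image_body` is a hypothetical function name, and it takes the status, headers and raw bytes as plain arguments so it can be tried without a network call.

```python
import gzip
import io
from typing import Optional


def read_image_body(status: int,
                    content_type: Optional[str],
                    content_encoding: Optional[str],
                    body: bytes) -> Optional[io.BytesIO]:
    """Return the image bytes as a file-like object, or None if not an image."""
    if status != 200 or not content_type or "image" not in content_type:
        return None
    if content_encoding == "gzip":
        # Undo the transfer compression before any image library sees the data.
        body = gzip.decompress(body)
    return io.BytesIO(body)


# Usage with made-up bytes standing in for a JPEG body.
payload = b"\xff\xd8\xff\xe0demo"
stream = read_image_body(200, "image/jpeg", "gzip", gzip.compress(payload))
print(stream.read() == payload)
```

The returned BytesIO can be passed to Image.open the same way the StringIO object is in the Python 2 version.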
2. Tell the server in the request headers that gzip is not accepted
    # Set up once, in the method that initializes the downloader object.
    self.headers = {}
    self.headers['User-Agent'] = """Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3 GTB6"""
    self.headers['Accept'] = 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'
    # 'identity' asks the server for an uncompressed response.
    self.headers['Accept-Encoding'] = 'identity'
    self.headers['Accept-Language'] = "zh,en-us;q=0.7,en;q=0.3"
    self.headers['Accept-Charset'] = "ISO-8859-1,utf-8;q=0.7,*;q=0.7"
    self.headers['Connection'] = "keep-alive"
    self.headers['Keep-Alive'] = "115"
    self.headers['Cache-Control'] = "no-cache"

def open(self, url):
    try:
        response = self.opener.open(urllib2.Request(url, headers=self.headers), timeout=self.timeout)
        data = self._read_content(response)
        return data
    except Exception as e:
        logger.error(url)
        logger.exception(e)
        return None
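In Python 3, urllib2 became urllib.request, and the header trick looks roughly like the sketch below. `build_request` is a name I made up and the User-Agent value is shortened; the request is only constructed here, not sent.

```python
import urllib.request


def build_request(url: str) -> urllib.request.Request:
    """Build a request that asks the server not to compress the response."""
    headers = {
        "User-Agent": "Mozilla/5.0",
        # 'identity' means "send the body as-is, with no content coding".
        "Accept-Encoding": "identity",
    }
    return urllib.request.Request(url, headers=headers)


req = build_request("http://example.com/some-image.jpg")
# Request stores header names capitalized as 'Accept-encoding'.
print(req.get_header("Accept-encoding"))
```

A server may still ignore the header and send gzip anyway, so keeping the decompression step from solution 1 as a safety net seems reasonable.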