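When scraping data, you often find that the src attribute of an img tag does not include the image's file extension, as in http://photos.prnewswire.com/prn/20100819/LA52539LOGO.
In that case, the imghdr module can identify the image type (and therefore the extension) from the file's contents; imghdr.what returns a type name such as 'jpeg', 'png', or 'gif', or None if the data is not recognized.
Example: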
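import urllib2
import imghdr

url = 'http://photos.prnewswire.com/prn/20100819/LA52539LOGO'
response = urllib2.urlopen(url)
image_data = response.read()  # raw bytes of the downloaded image
# When the second argument is given, imghdr ignores the file name and inspects those bytes.
print imghdr.what('', image_data)  # prints the detected type name, or None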
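
Since imghdr is deprecated in recent Python 3 releases and was removed in Python 3.13, the same idea can be sketched without it by checking the leading "magic" bytes directly. This is only a minimal sketch: the helper names sniff_image_type and filename_with_extension are hypothetical, only three common formats are covered, and the JPEG check is looser than what imghdr does.

```python
# Hypothetical helpers for illustration; not part of imghdr or any library.

def sniff_image_type(data):
    """Return 'png', 'gif', 'jpeg', or None based on the leading bytes."""
    if data.startswith(b'\x89PNG\r\n\x1a\n'):   # PNG signature
        return 'png'
    if data[:6] in (b'GIF87a', b'GIF89a'):       # GIF version headers
        return 'gif'
    if data.startswith(b'\xff\xd8\xff'):         # JPEG start-of-image marker
        return 'jpeg'
    return None


def filename_with_extension(url, data):
    """Use the last URL path segment as the name and append the detected type."""
    base = url.rstrip('/').rsplit('/', 1)[-1]
    kind = sniff_image_type(data)
    return base if kind is None else base + '.' + kind
```

For example, calling filename_with_extension with the URL above and data that begins with the PNG signature would give 'LA52539LOGO.png'; if the bytes are not recognized, the name is returned unchanged.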