The call I'm using is:
page = requests.get(url, headers=self.header, timeout=10, verify=flag)
The variable values are:
url = "http://www.sbacn.org"
flag = False
self.header = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:40.0) Gecko/20100101 Firefox/40.0",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "zh-CN,zh;q=0.8,en-US;q=0.5,en;q=0.3",
    "Accept-Encoding": "gzip, deflate",
}
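For reference, the pieces above assemble into this self-contained repro (a minimal sketch; the class and the dataRequest wrapper from my script are stripped out):

import requests

url = "http://www.sbacn.org"
flag = False  # verify=False: skip TLS certificate verification
header = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:40.0) Gecko/20100101 Firefox/40.0",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "zh-CN,zh;q=0.8,en-US;q=0.5,en;q=0.3",
    "Accept-Encoding": "gzip, deflate",
}

page = requests.get(url, headers=header, timeout=10, verify=flag)
print(page.status_code)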
The error output is:
Traceback (most recent call last):
  File "bing.py", line 237, in <module>
    bing.titleGet(urls)
  File "bing.py", line 195, in titleGet
    page = self.dataRequest(url)
  File "bing.py", line 86, in dataRequest
    page = requests.get( url , headers = self.header, timeout = 10 , verify = flag )
  File "/usr/lib/python2.7/site-packages/requests/api.py", line 69, in get
    return request("get", url, params=params, **kwargs)
  File "/usr/lib/python2.7/site-packages/requests/api.py", line 50, in request
    response = session.request(method=method, url=url, **kwargs)
  File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 468, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python2.7/site-packages/requests/sessions.py", line 608, in send
    r.content
  File "/usr/lib/python2.7/site-packages/requests/models.py", line 734, in content
    self._content = bytes().join(self.iter_content(CONTENT_CHUNK_SIZE)) or bytes()
  File "/usr/lib/python2.7/site-packages/requests/models.py", line 657, in generate
    for chunk in self.raw.stream(chunk_size, decode_content=True):
  File "/usr/lib/python2.7/site-packages/requests/packages/urllib3/response.py", line 326, in stream
    data = self.read(amt=amt, decode_content=decode_content)
  File "/usr/lib/python2.7/site-packages/requests/packages/urllib3/response.py", line 282, in read
    data = self._fp.read(amt)
  File "/usr/lib64/python2.7/httplib.py", line 567, in read
    s = self.fp.read(amt)
  File "/usr/lib64/python2.7/socket.py", line 380, in read
    data = self._sock.recv(left)
socket.error: [Errno 104] Connection reset by peer
What puzzles me is that the same script runs fine on my Mac, but fails with this error on the server running Ubuntu. Why is that?
Also, I'm actually crawling the URLs collected from 10 pages of Bing search results. The error shows up when I request them one after another in a loop, but if I take the failing URL out and request it by itself, it works fine. Why? (A sketch of the loop follows.)
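For completeness, the consecutive access is roughly this shape (a hypothetical simplification of my titleGet loop; the real urls list holds everything scraped from the 10 Bing result pages, and the 1-second pause is just something I'm considering trying):

import time
import requests

header = {"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:40.0) Gecko/20100101 Firefox/40.0"}
urls = ["http://www.sbacn.org"]  # placeholder; the real list comes from the Bing result pages

for url in urls:
    try:
        page = requests.get(url, headers=header, timeout=10, verify=False)
        print(page.status_code)
    except Exception as e:  # socket.error escapes un-wrapped here (see traceback), so catch broadly
        print(e)
    time.sleep(1)  # would pacing the requests like this avoid the reset?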