Scrapy、Splash和Connection被对方拒绝：10061

潮易

于 2024-07-25 06:37:00 发布

阅读量286

点赞数 5

文章标签： scrapy

本文链接：https://blog.csdn.net/wangbadan121/article/details/140678414

版权

当使用Scrapy、Splash或任何其他网络请求工具时，可能会遇到“Connection was refused by the server”或10061错误，这通常意味着服务器无法响应请求。以下是一些可能的解决方法及代码示例：

1. **检查服务器状态**：首先确认服务器是否正常运行且监听指定端口。可以使用`nc -z <server_ip> <port>`命令来检查端口是否开放。

2. **检查防火墙设置**：确保防火墙规则允许你的客户端（在本例中是Scrapy或Splash）连接到目标服务器的端口号。

3. **使用代理服务器**：如果目标服务器被限制了直接访问，你可以尝试通过一个代理服务器转发请求。在Scrapy中可以通过`DOWNLOADER_MIDDLEWARES`设置使用代理：
   ```python
   DOWNLOADER_MIDDLEWARES = {
       'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware': 110,
   }
   PROXY_LIST = ['http://proxy1.example.com:8080', 'http://proxy2.example.com:3128']
   ```

4. **使用Splash代理**：如果使用的是Splash，可以通过`HTTP_PROXY`或`HTTPS_PROXY`环境变量指定代理服务器。例如：
   ```bash
   docker run -p 8050:8050 scrapinghub/splash --with-proxy=http://your-proxy.com:3128
   ```

5. **检查网络连接**：确保你的客户端（本例中是Scrapy或Splash）能够连接到互联网。可以通过ping目标服务器来测试。

6. **查看日志**：在出现错误时，查看应用程序和服务器的日志文件可能会提供更多有用的信息。

代码示例（使用Python的requests库）：
```python
import requests
from requests.exceptions import ConnectionError

try:
    response = requests.get('http://example.com')
    print(response.text)
except ConnectionError as e:
    print("Connection Error:", e)
```

测试用例：
```python
def test_connection():
    urls = ['http://example.com', 'https://nonexistent-domain.com']
    for url in urls:
        try:
            requests.get(url, timeout=5)
            print(f"Successfully connected to {url}")
        except ConnectionError as e:
            print(f"Failed to connect to {url}:", e)

test_connection()
```

如果是在人工智能大模型方面，可以尝试将上述代码转换为使用大模型的API调用。例如，如果你正在使用OpenAI的GPT模型，可以这样做：
```python
import openai

openai.api_key = 'your-api-key'

try:
    response = openai.Completion.create(engine="text-davinci-002", prompt="Write a sentence about artificial intelligence.")
    print(response.choices[0].text.strip())
except Exception as e:
    print("Error while calling OpenAI API:", e)
```