TypeError: can only concatenate str (not "int") to str, TypeError: write() argument must be str, not

luminous_you

Category: python    Tags: python, regular expressions

Article link: https://blog.csdn.net/luminous_you/article/details/120330231

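Problem 1: TypeError: can only concatenate str (not "int") to str

Solution
Python does not implicitly convert an int to a string during concatenation. Use the str() function to convert values of other types to a string first.
Example

for num in range(24, 100):
    testurl = 'https://aaa.com/search?p=' + str(num)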
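
As a small illustration of the fix above, the sketch below builds the same URL in two ways: with str() and with an f-string, which formats the integer for you. The helper names build_url and build_url_fstring are made up for this example, and the base URL is the placeholder from the snippet above.

```python
def build_url(page):
    # Hypothetical helper: page is an int, so convert it before concatenating.
    return 'https://aaa.com/search?p=' + str(page)


def build_url_fstring(page):
    # An f-string converts the int to text as part of formatting.
    return f'https://aaa.com/search?p={page}'


try:
    'https://aaa.com/search?p=' + 24  # str + int raises TypeError
except TypeError as exc:
    error_message = str(exc)

print(build_url(24))
print(build_url_fstring(24))
print(error_message)
```

Either form works; the f-string is usually easier to read when several values are inserted into one string.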

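Problem 2: TypeError: write() argument must be str, not bytes

This error is raised when bytes are passed to write() on a file that was opened in text mode. Two things are needed to fix it:

1. Open the file in binary mode. In the mode string, a means append and b means binary.
file = open('D:/github.txt', 'ab+')

2. Use encode() to convert the string to UTF-8 bytes before writing.
file.write(string.encode('utf-8') + a.encode('utf-8'))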
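
The sketch below reproduces the error and shows two ways around it: the binary-mode approach described above, and an alternative (not from the original post) that keeps the file in text mode and lets open() handle the encoding. It writes to a temporary file instead of D:/github.txt, and the function names are hypothetical.

```python
import os
import tempfile


def append_binary(path, text):
    # Binary append mode: write() expects bytes, so encode first.
    with open(path, 'ab+') as f:
        f.write(text.encode('utf-8'))


def append_text(path, text):
    # Text append mode: write() expects str; open() does the encoding.
    # newline='' keeps '\n' unchanged on every platform.
    with open(path, 'a', encoding='utf-8', newline='') as f:
        f.write(text)


fd, demo_path = tempfile.mkstemp(suffix='.txt')
os.close(fd)

append_binary(demo_path, 'first line\n')
append_text(demo_path, 'second line\n')

try:
    with open(demo_path, 'a', encoding='utf-8') as f:
        f.write(b'bytes in text mode')  # reproduces the TypeError
except TypeError as exc:
    error_message = str(exc)

with open(demo_path, 'rb') as f:
    content = f.read()
os.remove(demo_path)

print(error_message)
print(content)
```

The rule of thumb: a mode containing b pairs with bytes, and a mode without b pairs with str.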

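Problem 4: requests.exceptions.ConnectionError: ('Connection aborted.', ConnectionResetError(10054, '远程主机强迫关闭了一个现有的连接。', None, 10054, None))

The message inside ConnectionResetError is the Windows error text "An existing connection was forcibly closed by the remote host."

After some searching, the likely causes turned out to be:

The number of HTTP connections exceeds the limit. Connections are keep-alive by default, so the server ends up holding too many open connections and cannot accept new ones.
The IP address has been blocked.
The program sends requests too quickly.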

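The fixes are as follows:

Method 1

import requests

try:
    page1 = requests.get(ap)  # ap is the URL being requested
except requests.exceptions.ConnectionError:
    page1 = None  # no response object exists when the connection fails
    print("Connection refused")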

Method 2:

Too many connections opened by requests can lead to "Max retries exceeded".

Do not use a persistent connection; set this in the request headers:

'Connection': 'close'
or
requests.adapters.DEFAULT_RETRIES = 5
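
One way to apply both settings together is through a requests.Session, sketched below. This is an assumption about how the pieces fit, not the original author's code: make_session is a hypothetical helper, and the retry count is passed to HTTPAdapter explicitly because reassigning requests.adapters.DEFAULT_RETRIES after import may not affect adapters whose defaults were already bound.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def make_session(retries=5):
    # Hypothetical helper that bundles the two settings from Method 2.
    session = requests.Session()
    # Ask the server to close the connection after each response.
    session.headers.update({'Connection': 'close'})
    # Retry failed connections, pausing longer between successive attempts.
    retry = Retry(total=retries, backoff_factor=1)
    adapter = HTTPAdapter(max_retries=retry)
    session.mount('http://', adapter)
    session.mount('https://', adapter)
    return session


session = make_session()
print(session.headers['Connection'])
```

Requests are then sent with session.get(url) instead of requests.get(url), so every call shares the same header and retry policy.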

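Method 3:

This one targets errors caused by sending requests too quickly.

The following example shows the idea: wait a few seconds after a failure, then try again.

import time
import requests

while True:
    try:
        page = requests.get(url)  # url is the address being fetched
        break  # stop retrying once the request succeeds
    except requests.exceptions.ConnectionError:
        print("Connection refused by the server..")
        print("Let me sleep for 5 seconds")
        print("ZZzzzz...")
        time.sleep(5)
        print("Was a nice sleep, now let me continue...")
        continue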
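
A loop like the one above retries forever if the server never recovers. The sketch below is a bounded variant under stated assumptions: call_with_retry is a hypothetical helper, and flaky() is a stand-in for requests.get(url) so the example runs without network access.

```python
import time


def call_with_retry(func, attempts=3, delay=5, exceptions=(Exception,)):
    # Hypothetical helper: call func(), sleeping `delay` seconds after each
    # failure, and give up after `attempts` tries.
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions as exc:
            last_error = exc
            print(f"Attempt {attempt} failed: {exc}")
            if attempt < attempts:
                time.sleep(delay)
    raise last_error


calls = {'count': 0}


def flaky():
    # Stand-in for requests.get(url): fails twice, then succeeds.
    calls['count'] += 1
    if calls['count'] < 3:
        raise ConnectionError('simulated reset')
    return 'ok'


result = call_with_retry(flaky, attempts=5, delay=0.01,
                         exceptions=(ConnectionError,))
print(result)
```

With requests, the call would look like call_with_retry(lambda: requests.get(url), exceptions=(requests.exceptions.ConnectionError,)); the requests exception has to be named explicitly because it does not inherit from the built-in ConnectionError.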

Reference
http://www.chenxm.cc/article/255.html

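Problem 5: UnboundLocalError: local variable 'list' referenced before assignment

The fix is very simple: never use list as the name of your own variable. Assigning to list inside a function makes it a local name for the whole function, so any earlier call to list() in that function refers to the not-yet-assigned local instead of the built-in.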
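
The sketch below reproduces the error in a few lines. Both function names are invented for this illustration.

```python
def broken():
    # Assigning to `list` anywhere in this function makes it a local name,
    # so the call on the next line no longer reaches the built-in.
    items = list(range(3))
    list = [1, 2]
    return items


def fixed():
    # Using a different name leaves the built-in list() untouched.
    items = list(range(3))
    extra = [1, 2]
    return items + extra


try:
    broken()
except UnboundLocalError as exc:
    error_name = type(exc).__name__

print(error_name)
print(fixed())
```

The exact wording of the message differs between Python versions, but the exception type is UnboundLocalError in each case.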

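Problem 3: Writing to a file with line breaks in Python

# soup is a BeautifulSoup object; the full script in the next section shows how it is built
file = open('D:/github.txt', 'ab+')
# soup.find_all searches for matching tags
# attrs={'class': 'Link--secondary'} matches tags whose class is Link--secondary
for classlist in soup.find_all(attrs={'class': 'Link--secondary'}):
    string = classlist.text
    a = '\n'
    # Append a newline after each entry so every item lands on its own line
    file.write(string.encode('utf-8') + a.encode('utf-8'))
file.close()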

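# About the mode parameter of open():
# 'r': read
# 'w': write (truncates an existing file)
# 'a': append
# 'r+' == r+w (read and write; raises FileNotFoundError if the file does not exist)
# 'w+' == w+r (read and write; creates the file if it does not exist, truncates it otherwise)
# 'a+' == a+r (append and read; creates the file if it does not exist)
# For binary files, add a b to any of these:
# 'rb'  'wb'  'ab'  'rb+'  'wb+'  'ab+'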
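
The following sketch exercises a few of these modes on a temporary file so the differences can be seen directly. The file and directory names are throwaway values created for the demonstration.

```python
import os
import tempfile

demo_dir = tempfile.mkdtemp()
path = os.path.join(demo_dir, 'modes.txt')

# 'r+' requires the file to exist already.
try:
    open(path, 'r+')
except FileNotFoundError:
    missing_raises = True

# 'w+' creates the file (or truncates it) and allows reading back.
with open(path, 'w+', encoding='utf-8') as f:
    f.write('one\n')
    f.seek(0)
    after_w = f.read()

# 'a+' appends; seek(0) is needed before reading because the
# position starts at the end of the file.
with open(path, 'a+', encoding='utf-8') as f:
    f.write('two\n')
    f.seek(0)
    after_a = f.read()

# 'rb' returns bytes instead of str.
with open(path, 'rb') as f:
    raw = f.read()

os.remove(path)
os.rmdir(demo_dir)

print(repr(after_w), repr(after_a), type(raw).__name__)
```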

Looping over a site's tag contents and writing them to a file

import requests
from bs4 import BeautifulSoup

header = {
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.63 Safari/537.36",
    "cookie": ""
}

for num in range(1, 100):
    testurl = 'https://aaaa.com/search?p=' + str(num) + '&q=1'
    print(testurl)
    # Send the GET request (the result is not named list, see Problem 5)
    response = requests.get(testurl, headers=header)
    soup = BeautifulSoup(response.text, 'lxml')
    # Open in binary append mode; the with block closes the file automatically
    with open('D:/github.txt', 'ab+') as file:
        # Loop over the tags whose class is Link--secondary
        for classlist in soup.find_all(attrs={'class': 'Link--secondary'}):
            # print(classlist.text)
            string = classlist.text
            # Newline to append after each entry
            a = '\n'
            # Write to the file
            file.write(string.encode('utf-8') + a.encode('utf-8'))

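Notes on using the BeautifulSoup library

find_all(name, attrs, recursive, string, limit, **kwargs)
@PARAMS:
    name: the tag name to look for; can be a string, a list, a function, True, or a compiled regular expression
    attrs: attributes the tag must have, such as class
    recursive: bool; whether to search all descendants (True) or only direct children (False)
    string: matches on text content; used alone, the results are strings rather than tags; combined with name, it finds tags matching name whose text matches string
    limit: the maximum number of results to return
    **kwargs: any other keyword argument is treated as a filter on the attribute of that name

find() is similar to find_all(). The difference is that find_all() returns every matching tag in the document as a list-like collection, while find() returns a single Tag (the first match), or None if nothing matches.
soup.find_all(attrs={'class': 'Link--secondary'})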
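
To make the difference between find() and find_all() concrete, here is a minimal sketch run against a small hand-written HTML fragment. The fragment is invented for the example, and it uses Python's built-in html.parser instead of lxml so that no extra parser needs to be installed.

```python
from bs4 import BeautifulSoup

# Made-up HTML fragment for illustration only.
html = """
<div>
  <a class="Link--secondary" href="/a">first</a>
  <a class="Link--secondary" href="/b">second</a>
  <a class="other" href="/c">third</a>
</div>
"""

soup = BeautifulSoup(html, 'html.parser')

# find_all returns every matching tag.
all_links = soup.find_all(attrs={'class': 'Link--secondary'})
texts = [tag.text for tag in all_links]

# find returns only the first match.
first_link = soup.find(attrs={'class': 'Link--secondary'})

# limit caps the number of results from find_all.
limited = soup.find_all('a', limit=1)

# find returns None when nothing matches.
missing = soup.find('table')

print(texts, first_link.text, len(limited), missing)
```

Because find() can return None, it is worth checking the result before reading .text from it.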


Notes on using the lxml library
lxml is an HTML/XML parser. Its main job is parsing HTML/XML documents and extracting data from them.
