1. Garbled Chinese text after importing into the database
UnicodeDecodeError: 'gbk' codec can't decode byte 0xa4 in position 124: illegal multibyte sequence
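Solution: pass encoding='utf-8' explicitly when opening the file. Otherwise Python falls back to the platform default encoding (gbk here). The 'rU' mode is no longer accepted since Python 3.11, so plain 'r' is used. Note that errors='ignore' silently drops any bytes that cannot be decoded.
with open(file_path, 'r', encoding='utf-8', errors='ignore') as f:
    text = f.read()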
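A minimal sketch of the fix above, assuming the file is usually UTF-8 but may sometimes be gbk. The helper name `read_text` and the fallback order are illustrative and not part of the original notes. It decodes strictly and tries the next encoding on failure, rather than discarding bytes with errors='ignore'.

```python
import os
import tempfile


def read_text(file_path, encodings=("utf-8", "gbk")):
    """Return the file content decoded with the first encoding that works.

    Hypothetical helper: the name and the fallback order are assumptions.
    """
    last_error = None
    for enc in encodings:
        try:
            with open(file_path, "r", encoding=enc) as f:
                return f.read()
        except UnicodeDecodeError as exc:
            last_error = exc  # remember the failure and try the next encoding
    raise last_error


# Usage: write a UTF-8 file, then read it back without relying on the
# platform default encoding.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write("中文数据".encode("utf-8"))
    sample_path = tmp.name

sample_text = read_text(sample_path)
os.remove(sample_path)
print(sample_text)
```

Strict decoding makes a wrong guess fail loudly, which is usually preferable before loading data into a database.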
2. Emoji
pymysql.err.DataError: (1366, "Incorrect string value: '\\xF0\\x9F\\x91\\x88 d...' for column 'key' at row 22")
No satisfactory solution found yet. Workarounds so far:
1. Filter out emoji with a regular expression
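import re

def f_emoji(emoji_text):
    # Remove every character outside the Basic Multilingual Plane (4-byte UTF-8).
    try:
        # Wide Unicode builds, including all of Python 3
        co = re.compile(u'[\U00010000-\U0010ffff]')
    except re.error:
        # Narrow Python 2 builds: match surrogate pairs instead
        co = re.compile(u'[\uD800-\uDBFF][\uDC00-\uDFFF]')
    return co.sub('', emoji_text)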
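Why the filter works: the bytes \xF0\x9F\x91\x88 in the error message are the 4-byte UTF-8 encoding of 👈 (U+1F448), and MySQL's legacy utf8 charset stores at most 3 bytes per character. The sketch below checks this on Python 3. The names `needs_4_bytes` and `strip_non_bmp` are illustrative, not from the original notes.

```python
import re

# Matches every code point outside the Basic Multilingual Plane, i.e. every
# character whose UTF-8 encoding is 4 bytes long (Python 3 only).
NON_BMP = re.compile("[\U00010000-\U0010ffff]")


def needs_4_bytes(text):
    """True if any character in text takes 4 bytes in UTF-8."""
    return NON_BMP.search(text) is not None


def strip_non_bmp(text):
    """Remove all 4-byte characters, as the filter above does."""
    return NON_BMP.sub("", text)


pointing = "\U0001F448"  # the character from the error message
encoded = pointing.encode("utf-8")
print(encoded)  # b'\xf0\x9f\x91\x88'
print(needs_4_bytes("abc" + pointing))
print(strip_non_bmp("abc" + pointing + " d"))
```

Two caveats: this pattern also removes non-emoji characters outside the BMP (for example rare CJK ideographs), and it leaves emoji that live inside the BMP, such as ☺ (U+263A), which fit in 3 bytes anyway.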
2. Convert emoji to text codes 👍
import emoji  # third-party package: pip install emoji

finger = emoji.demojize('👍')
print(finger)                 # :thumbs_up:
print(emoji.emojize(finger))  # 👍
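Another approach often suggested for error 1366, not tested in these notes, is to store emoji unchanged by switching both the table and the connection to MySQL's utf8mb4 charset (available since MySQL 5.5.3). The sketch below is only an outline under assumptions: the table name `my_table`, the helper names, and the connection arguments are placeholders. `charset` is an existing `pymysql.connect` parameter.

```python
def alter_to_utf8mb4(table_name):
    """Build the statement that converts a table to utf8mb4.

    table_name is a placeholder; it is interpolated directly, so only pass
    trusted identifiers.
    """
    return (
        "ALTER TABLE `{}` CONVERT TO CHARACTER SET utf8mb4 "
        "COLLATE utf8mb4_unicode_ci".format(table_name)
    )


def connect_utf8mb4(**connect_kwargs):
    """Open a pymysql connection that talks utf8mb4 to the server."""
    import pymysql  # imported lazily so the sketch loads without pymysql

    return pymysql.connect(charset="utf8mb4", **connect_kwargs)


print(alter_to_utf8mb4("my_table"))
```

Both sides matter: a utf8mb4 column still rejects emoji if the connection itself is negotiated as 3-byte utf8.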