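&#19996; is the decimal representation of the Unicode code point of the Chinese character “东”; the cast
char t = (char)19996;
converts that code value into the corresponding character “东”.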
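The same round trip can be sketched in Python, the language of the decoding script in this post. This is only an illustration: the helper name to_decimal_reference is hypothetical, while ord and chr are Python 3 built-ins.

```python
def to_decimal_reference(ch):
    """Return the HTML decimal character reference for a single character."""
    return '&#{};'.format(ord(ch))

code = ord('东')                     # character -> decimal code point
print(code)                          # 19996
print(chr(19996))                    # decimal code point -> character: 东
print(to_decimal_reference('东'))    # &#19996;
```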
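import re

# Company name as scraped: every character is an HTML decimal character reference.
company = '&#19996;&#33694;&#24066;&#38472;&#29642;&#26381;&#39280;&#28304;&#22836;&#21378;&#23478;'
if '&#' in company:
    # Replace each &#NNNNN; reference with the character it encodes.
    # Python 3's chr() takes the place of Python 2's unichr(); text outside
    # the references is left unchanged.
    company = re.sub(r'&#(\d+);', lambda m: chr(int(m.group(1))), company)
print(company)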
Output: 东莞市陈珊服饰源头厂家
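As an alternative to a hand-written regular expression, Python 3's standard library provides html.unescape, which decodes decimal, hexadecimal, and named character references in one call. A minimal sketch, assuming Python 3.4 or later; the sample strings and variable names below are made up for illustration:

```python
import html

# Build an encoded sample from known text so no code points are typed by hand.
expected = '东莞市'
encoded = ''.join('&#{};'.format(ord(ch)) for ch in expected)

decoded = html.unescape(encoded)
print(decoded)   # 东莞市

# Hexadecimal (&#x4E1C;) and named (&amp;) references are decoded as well.
mixed = html.unescape('&#x4E1C; &amp; &#19996;')
print(mixed)     # 东 & 东
```

Unlike a pattern that matches only decimal references, this also covers pages that mix reference styles.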