Python encode function. On strings: using the unicode() and encode() functions in Python

Latest recommended article published 2022-10-14 10:28:25

weixin_39753213

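I'm having trouble encoding a path variable and inserting it into an SQLite database. I tried to solve it with the encode("utf-8") function, which didn't help. Then I used the unicode() function, which gave me the unicode type.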

    print type(path)                   #
    path = path.replace("one", "two")  #
    path = path.encode("utf-8")        # strange
    path = unicode(path)               #

In the end I get the unicode type, but I still have the same error that occurred when the path variable was of type str.

    sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings.

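Could you help me resolve this error and explain the correct usage of the encode("utf-8") and unicode() functions? I struggle with this often.

Edit:

This execute() statement raised the error:

    cur.execute("update docs set path = :fullFilePath where path = :path", locals())

I forgot to change the encoding of the fullFilePath variable, which has the same problem, but now I'm quite confused. Should I use only unicode(), only encode("utf-8"), or both?

I can't use

    fullFilePath = unicode(fullFilePath.encode("utf-8"))

because it raises this error: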

    UnicodeDecodeError: 'ascii' codec can't decode byte 0xc5 in position 32: ordinal not in range(128)

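The Python version is 2.7.2.

Where is the code that raises the error?

Your exact problem has already been answered: stackoverflow.com/questions/2392732/…

I edited the question.

Have you converted both of the variables you use to unicode?

Understanding how Python 3 handles text and data really helped me make sense of everything. After that it is easy to apply the same knowledge to Python 2.

Here are the slides from a great talk on Unicode in Python: link

str is text represented as bytes; unicode is text represented as characters.

You decode text from bytes to unicode, and you encode unicode to bytes using some encoding.

That is: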

    >>> 'abc'.decode('utf-8')  # str to unicode
    u'abc'
    >>> u'abc'.encode('utf-8') # unicode to str
    'abc'
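To make the bytes-versus-characters distinction visible with non-ASCII text, here is a minimal sketch written for Python 3, where the bytes type plays the role of Python 2's str and str plays the role of Python 2's unicode. The two sample characters are the ones reported as problematic elsewhere in this thread (U+0161 and U+0165); the variable names are illustrative only.

```python
# Python 3 sketch: bytes ~ Python 2 str, str ~ Python 2 unicode.
text = "\u0161\u0165"          # two characters: s-caron and t-caron

# Encoding turns characters into bytes using a named encoding.
data = text.encode("utf-8")

# Decoding turns bytes back into characters, using the same encoding.
round_trip = data.decode("utf-8")

print(len(text))               # 2 characters
print(len(data))               # 4 bytes, because each character needs 2 bytes in UTF-8
print(repr(data))              # b'\xc5\xa1\xc5\xa5'
print(round_trip == text)      # True
```

Both UTF-8 sequences start with the byte 0xc5, which is the byte named in the UnicodeDecodeError quoted in the question.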

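Thank you. Very useful information.

The answer is clear and smart, thanks.

Great answer, straight to the point. I'd add that unicode represents letters or symbols, or more generally runes, whereas str represents a byte string in some encoding, which you have to decode (with the right encoding, obviously) to get the specific runes.

You are using encode("utf-8") incorrectly. Python byte strings (the str type) have an encoding; unicode does not. You can convert a unicode string to a Python byte string with uni.encode(encoding), and you can convert a byte string to a unicode string with s.decode(encoding) (or, equivalently, unicode(s, encoding)).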
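As a rough illustration of why unicode(fullFilePath.encode("utf-8")) fails: in Python 2, unicode(s) with no encoding argument decodes the byte string with the default ASCII codec, and calling encode() on a byte string also triggers an implicit ASCII decode first. The sketch below reproduces the same failure explicitly in Python 3 (where bytes corresponds to Python 2's str); the sample byte string is an assumption chosen to contain the 0xc5 byte from the error message.

```python
# Python 3 sketch of the implicit ASCII decode that Python 2 performs.
raw = "\u0161".encode("utf-8")   # b'\xc5\xa1', a UTF-8 encoded byte string

try:
    # Equivalent to what Python 2's unicode(raw) attempts by default.
    raw.decode("ascii")
    error_message = None
except UnicodeDecodeError as exc:
    error_message = str(exc)

# Naming the real encoding succeeds; in Python 2 this is raw.decode('utf-8')
# or unicode(raw, 'utf-8').
text = raw.decode("utf-8")

print(error_message)
print(repr(text))
```

The point is that the encoding has to be stated whenever bytes become text; relying on the default only works for pure ASCII data.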

If fullFilePath and path are currently of type str, you need to know how they are encoded. For example, if the current encoding is UTF-8, you would use:

    path = path.decode('utf-8')
    fullFilePath = fullFilePath.decode('utf-8')

If that doesn't fix the problem, the actual issue may be that you are not using a unicode string in the execute() call. Try changing it to the following:

    cur.execute(u"update docs set path = :fullFilePath where path = :path", locals())

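This statement still raises the error. fullFilePath is a combination of a str and a string taken from a text column of a db table, which should be UTF-8 encoded.

According to this, it can be UTF-8, UTF-16BE or UTF-16LE. Can I somehow find out which one it is?

@xralf, if you are combining different str objects, you may be mixing encodings. Can you show the result of print repr(fullFilePath)?

I can only show it before the decode() call. The problematic characters are \u0161 and \u0165.

@xralf - So it is already unicode? Try changing the execute call to use a unicode string: cur.execute(u"update docs set path = :fullFilePath where path = :path", locals()).

Thanks. That solved it, but I still wonder why it is unicode when I combined it the same way as the path variable, and the path variable is str.

I'd add your last comment to your answer, because that is what really solved the problem.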
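For reference, the pattern this thread converges on (decode once at the boundary, then pass only text values to sqlite3) could look roughly like the sketch below. It is written for Python 3, where sqlite3 accepts str values directly. The docs table and its path column come from the question; the in-memory database, the rename_path helper and the sample paths are assumptions made for illustration, and an explicit dict is used in place of locals().

```python
import sqlite3


def rename_path(conn, old_path, new_path):
    """Update matching rows, binding text values through named parameters."""
    cur = conn.cursor()
    cur.execute(
        "update docs set path = :fullFilePath where path = :path",
        {"fullFilePath": new_path, "path": old_path},
    )
    conn.commit()
    return cur.rowcount


# Hypothetical setup: an in-memory database with a single-column docs table.
conn = sqlite3.connect(":memory:")
conn.execute("create table docs (path text)")
conn.execute("insert into docs (path) values (?)", ("/one/\u0161\u0165.txt",))

# Bytes read from elsewhere are decoded once, before they reach the database.
old_path = b"/one/\xc5\xa1\xc5\xa5.txt".decode("utf-8")
new_path = old_path.replace("one", "two")

updated = rename_path(conn, old_path, new_path)
rows = [row[0] for row in conn.execute("select path from docs")]
print(updated, rows)
```

Passing an explicit dict keeps the bound values visible at the call site, so a value that was never converted to text is easier to spot than when everything is taken from locals().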

This is the wrong way to go about it.

Before running the script from the shell, make sure your locale is set, e.g.

    $ locale -a | grep "^en_.\+UTF-8"
    en_GB.UTF-8
    en_US.UTF-8
    $ export LC_ALL=en_GB.UTF-8
    $ export LANG=en_GB.UTF-8

Docs: man locale, man setlocale.
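To see which encodings the interpreter actually picked up from the environment after changing the locale variables, a small check along these lines may help. It is a sketch written for Python 3; the describe_encodings helper and its dictionary keys are made up for illustration, and the values printed depend entirely on the machine it runs on.

```python
import locale
import sys


def describe_encodings():
    """Collect the encodings Python derived from the current environment."""
    return {
        # Encoding suggested by the locale settings (LC_ALL, LANG, ...).
        "preferred": locale.getpreferredencoding(False),
        # Encoding used to convert file names to and from bytes.
        "filesystem": sys.getfilesystemencoding(),
        # Encoding of standard output; may be None if stdout was replaced.
        "stdout": getattr(sys.stdout, "encoding", None),
    }


encodings = describe_encodings()
for name, value in sorted(encodings.items()):
    print("%s: %s" % (name, value))
```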
