Changing the encoding of text files with Python: it isn't happening

Latest recommended article published on 2024-07-02 18:48:08

weixin_39765100

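Article tags: changing file encoding with Python

Copyright notice: This is an original article by the blogger, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.

Article link: https://blog.csdn.net/weixin_39765100/article/details/113513476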

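Whenever a file is copied or even barely touched, Windows changes its encoding to the default, 1252: Western European. In the text editor I use, EditPad Pro Plus, I can see and convert the encoding. I trust that this conversion works, because I have been moving files between Windows and UNIX, and I know that once my text editor changes the encoding, files that previously caused problems on UNIX are read correctly there.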

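I want to convert all of the files. So I tried Python on Windows 10, invoked either from PowerShell (with Python v3.6.2) or from Cygwin (with Python v2.7.13). I have seen both codecs and {} used for this job, along with comments saying that io is the right approach for Python 3.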

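But the files are not converted, whether I use codecs or io. The script below copies the files successfully, but my text editor still reports them as 1252. UniversalDetector (in the commented-out part of the script below) reports their encoding as "ascii".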

What needs to be done to get these files to convert successfully?

    import sys
    import os
    import io
    #from chardet.universaldetector import UniversalDetector

    BLOCKSIZE = 1048576  # copy in chunks of 1 MiB

    #detector = UniversalDetector()
    #def get_encoding( current_file ):
    #    detector.reset()
    #    # open() in binary mode works on Python 2 and 3; feed() expects bytes
    #    for line in open( current_file, 'rb' ):
    #        detector.feed(line)
    #        if detector.done: break
    #    detector.close()
    #    return detector.result['encoding']

    def main():
        src_dir = ""
        if len( sys.argv ) > 1:
            src_dir = sys.argv[1]

        if os.path.exists( src_dir ):
            # The destination is the source path minus its last two characters.
            dest_dir = src_dir[:-2]
            for file in os.listdir( src_dir ):
                # Decode each file as cp1252 and write it back out as UTF-8.
                with io.open( os.path.join( src_dir, file ), "r", encoding='cp1252') as source_file:
                    with io.open( os.path.join( dest_dir, file ), "w", encoding='utf8') as target_file:
                        while True:
                            contents = source_file.read( BLOCKSIZE )
                            if not contents:
                                break
                            target_file.write( contents )
                #print( "Encoding of " + file + ": " + get_encoding( os.path.join( dest_dir, file ) ) )
        else:
            print( 'The specified directory does not exist.' )

    if __name__ == "__main__":
        main()
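A quick way to check whether a conversion changed anything on disk is to compare the raw bytes before and after. The following is only a minimal sketch, not the script above: `transcode` is a hypothetical helper, the file names are made up, and it reads each file whole in binary mode (which also avoids newline translation) instead of copying in blocks.

```python
import os
import tempfile

def transcode(src_path, dest_path, src_enc="cp1252", dest_enc="utf-8"):
    """Re-encode src_path into dest_path; return True if the bytes changed."""
    with open(src_path, "rb") as f:
        original = f.read()
    # Decode with the assumed source encoding, then encode with the target one.
    converted = original.decode(src_enc).encode(dest_enc)
    with open(dest_path, "wb") as f:
        f.write(converted)
    return converted != original

# Demonstration on a throwaway file containing one non-ASCII character.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "in.txt")
    dest = os.path.join(tmp, "out.txt")
    with open(src, "wb") as f:
        f.write(b"na\xefve\n")  # "naive" with a diaeresis, encoded as cp1252
    changed = transcode(src, dest)
    with open(dest, "rb") as f:
        result = f.read()
    print(changed, result)
```

If `transcode` returns False for a file, the output is byte-for-byte identical to the input, so no tool can tell the two encodings apart for that file.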

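I have tried some variations, such as opening the file as UTF8, calling read() without a block size, and, originally, specifying the encoding slightly differently. All of them copy the files successfully, but none encodes them as intended.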
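One possible explanation, assuming the files contain only ASCII characters (which the "ascii" result from UniversalDetector suggests): cp1252 and UTF-8 encode every ASCII character to the same single byte, so the converted file is byte-identical to the original and an editor has nothing to distinguish them by. The sketch below illustrates this with made-up sample strings; the helper names are hypothetical.

```python
# Pure-ASCII text: every character is below U+0080.
ascii_text = "plain ASCII text\n"
# Text with one non-ASCII character (U+00E9, e with acute accent).
accented_text = "caf\u00e9\n"

def same_bytes_in_both(text):
    """Return True if cp1252 and UTF-8 encode `text` to identical bytes."""
    return text.encode("cp1252") == text.encode("utf-8")

def is_pure_ascii(data):
    """Return True if the byte string contains no byte above 0x7F."""
    try:
        data.decode("ascii")
    except UnicodeDecodeError:
        return False
    return True

print(same_bytes_in_both(ascii_text))     # True
print(same_bytes_in_both(accented_text))  # False
print(accented_text.encode("cp1252"))     # b'caf\xe9\n'
print(accented_text.encode("utf-8"))      # b'caf\xc3\xa9\n'
```

Under that assumption the script is working as written; the two encodings only produce different bytes once a file contains a character outside ASCII, which `is_pure_ascii` can check on the raw file contents.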
