How to detect the encoding/codepage of a text file

asdfgh0077

Tags: c# .net text encoding globalization

Original link: https://oldbug.net/q/Nd8/How-can-I-detect-the-encoding-codepage-of-a-text-file

Translated from: How can I detect the encoding/codepage of a text file

In our application, we receive text files (.txt, .csv, etc.) from diverse sources. When read, these files sometimes contain garbage, because they were created in a different or unknown codepage.

Is there a way to (automatically) detect the codepage of a text file?

The detectEncodingFromByteOrderMarks parameter of the StreamReader constructor works for UTF-8 and other Unicode files that carry a byte order mark, but I'm looking for a way to detect code pages such as ibm850 and windows1252.
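To illustrate why the byte order mark check only goes so far, here is a minimal sketch. The class and method names (BomDetection, DetectFromBom) are made up for this example and are not part of the question. It asks StreamReader which encoding it settled on. When the stream has no byte order mark, the fallback encoding comes back unchanged, so a file in ibm850 or windows1252 cannot be told apart this way.

```csharp
using System;
using System.IO;
using System.Text;

public static class BomDetection
{
    // Hypothetical helper: returns the encoding StreamReader picks after
    // checking the start of the stream for a byte order mark (BOM).
    // Without a BOM, the supplied fallback is returned as-is.
    public static Encoding DetectFromBom(Stream stream, Encoding fallback)
    {
        using (var reader = new StreamReader(
            stream,
            fallback,
            detectEncodingFromByteOrderMarks: true,
            bufferSize: 1024,
            leaveOpen: true))
        {
            // CurrentEncoding is only updated once the reader has
            // actually read from the stream, so force a first read.
            reader.Peek();
            return reader.CurrentEncoding;
        }
    }
}
```

Calling it on a file stream with, say, Encoding.ASCII as the fallback and inspecting the CodePage property of the result shows whether a BOM was found (65001 for UTF-8, 1200 for UTF-16 little endian) or the fallback was kept.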

Thanks for your answers, this is what I've done.

The files we receive come from end users who do not have a clue about codepages. The receivers are also end users; by now, this is what they know about codepages: codepages exist, and they are annoying.

Solution:

Open the received file in Notepad and look at a garbled piece of text. If somebody is called François or something similar, with your human intelligence you can guess this.
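The "look at the garbled text and guess" step can be supported by decoding the same bytes under several candidate code pages and comparing the results by eye. The following is only a sketch under assumptions, not the author's actual tool: the names CodepagePreview and DecodeWithCandidates are hypothetical, and the RegisterProvider call assumes .NET Core or .NET 5+, where legacy code pages such as ibm850 and windows-1252 are not available until CodePagesEncodingProvider is registered.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class CodepagePreview
{
    // Hypothetical helper: decodes the same raw bytes with each candidate
    // encoding name so a human can pick the version that reads correctly.
    public static Dictionary<string, string> DecodeWithCandidates(
        byte[] sample, IEnumerable<string> candidateNames)
    {
        // Assumption: running on .NET Core / .NET 5+, where legacy code
        // pages must be registered before Encoding.GetEncoding finds them.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        var previews = new Dictionary<string, string>();
        foreach (var name in candidateNames)
        {
            // GetEncoding throws ArgumentException for unknown names.
            previews[name] = Encoding.GetEncoding(name).GetString(sample);
        }
        return previews;
    }
}
```

For example, the bytes 46 72 61 6E E7 6F 69 73 (hex) decode to "François" under windows-1252 but to something else under ibm850, which is exactly the kind of difference a human reader spots immediately.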
