Python, file(1): Why are the numbers [7,8,9,10,12,13,27] and range(0x20, 0x100) used to determine text vs. binary fi...

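They represent the most common code points for printable text, plus newlines, spaces, carriage returns and the like. ASCII is covered up to 0x7F, and standards such as Latin-1 or Windows code page 1251 use the remaining 128 byte values for accented characters and so on.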

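You would expect text to use only those code points. Binary data would use all code points in the range 0x00-0xFF; a text file, for example, will probably not use \x00 (NUL) or \x1F (Unit Separator in the ASCII standard).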
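The check being discussed can be sketched roughly as follows. This is a minimal reconstruction from the numbers quoted in the title, not necessarily the exact code the question refers to; the function name `looks_binary` is a hypothetical name chosen here for illustration.

```python
# Byte values treated as "text": seven control codes plus 0x20-0xFF.
textchars = bytes([7, 8, 9, 10, 12, 13, 27]) + bytes(range(0x20, 0x100))


def looks_binary(data: bytes) -> bool:
    """Return True if data contains any byte outside textchars."""
    # translate(None, delete) removes every byte listed in `delete`;
    # whatever is left over is a byte the heuristic does not treat as text.
    return bool(data.translate(None, textchars))


print(looks_binary(b"plain text\r\n"))      # False
print(looks_binary(b"\x00\x01binary\x1f"))  # True
```

In practice such a check would typically be applied to the first block of a file opened in binary mode rather than to the whole file.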

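This is a heuristic at best, however. Some text files may still use C0 control codes outside the seven explicitly named characters, and I am sure binary data exists that happens not to include any of the 25 byte values missing from the textchars string.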
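Both failure modes are easy to reproduce. The sketch below rebuilds the byte set from the numbers in the title, counts the excluded values, and then feeds the check one misclassified text sample and one misclassified binary sample. The helper `looks_binary` and the sample payloads are made up for illustration.

```python
textchars = bytes([7, 8, 9, 10, 12, 13, 27]) + bytes(range(0x20, 0x100))

# Every byte value the heuristic treats as non-text.
excluded = sorted(set(range(0x100)) - set(textchars))
print(len(excluded))  # 25


def looks_binary(data: bytes) -> bool:
    # Hypothetical helper: True if any byte falls outside textchars.
    return bool(data.translate(None, textchars))


# Text using vertical tab (0x0B), a C0 control code outside the seven
# named ones, is flagged as binary.
print(looks_binary(b"col1\x0bcol2\n"))  # True

# Binary data that happens to avoid all 25 excluded values passes as text.
payload = bytes([0x89, 0xC3, 0xFF, 0x20, 0x41])
print(looks_binary(payload))  # False
```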

The author of the range probably based it on the text_chars table from the file command. It marks bytes as non-text, ASCII, Latin-1 or non-ISO extended ASCII, and includes documentation on why those code points were chosen:

/*
 * This table reflects a particular philosophy about what constitutes
 * "text," and there is room for disagreement about it.
 *
 * [....]
 *
 * The table below considers a file to be ASCII if all of its characters
 * are either ASCII printing characters (again, according to the X3.4
 * standard, not isascii()) or any of the following controls: bell,
 * backspace, tab, line feed, form feed, carriage return, esc, nextline.
 *
 * I include bell because some programs (particularly shell scripts)
 * use it literally, even though it is rare in normal text. I exclude
 * vertical tab because it never seems to be used in real text. I also
 * include, with hesitation, the X3.64/ECMA-43 control nextline (0x85),
 * because that's what the dd EBCDIC->ASCII table maps the EBCDIC newline
 * character to. It might be more appropriate to include it in the 8859
 * set instead of the ASCII set, but it's got to be included in *something*
 * we recognize or EBCDIC files aren't going to be considered textual.
 *
 * [.....]
 */

Interestingly, that table excludes 0x7F, whereas the code you found does not.
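To see what that difference means in practice, the sketch below compares the set from the title with a variant that also drops 0x7F (DEL). The variant and the names `with_del`, `without_del` and `has_non_text` are adjustments made here for illustration; they are not taken from the file source.

```python
base = {7, 8, 9, 10, 12, 13, 27} | set(range(0x20, 0x100))

with_del = bytes(sorted(base))               # 0x7F counts as text
without_del = bytes(sorted(base - {0x7F}))   # 0x7F counts as non-text


def has_non_text(data: bytes, allowed: bytes) -> bool:
    # True if data contains any byte that is not in `allowed`.
    return bool(data.translate(None, allowed))


sample = b"abc\x7fdef"
print(has_non_text(sample, with_del))     # False
print(has_non_text(sample, without_del))  # True
```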
