All files are binary files

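Every file is stored as binary data, so in that sense every file is a binary file. The usual distinction between "text files" and "binary files" is simply this: a text file contains only printable characters (plus line breaks and tabs), and any file that is not text is called a "binary file".


The statement "all files are binary" is correct in the sense that every file is stored on the storage medium as binary digits. But when people say that Unix does not distinguish between text files and binary files when opening a file for reading, they mean something more specific: whether you open a text file or an unformatted data file (a binary file), what you get is a stream of bytes to read from.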


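A common heuristic: read the first 256 bytes and check whether all of them are text characters (values >= 0x20, plus carriage return, line feed, tab, and so on). If they are, the file is very likely text. If any of them is 0x00, the file is almost certainly not text in a single-byte or UTF-8 encoding (UTF-16 text is the exception, since it routinely contains 0x00 bytes).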
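The heuristic above can be sketched in C roughly as follows. The function names `buffer_looks_like_text` and `file_looks_like_text` are hypothetical, and the check is only a guess about the content, not a reliable classifier. An empty file is reported as text here, which is an arbitrary choice.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: apply the heuristic to a buffer already in memory. */
bool buffer_looks_like_text(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        unsigned char c = buf[i];
        if (c == 0x00)
            return false;   /* a NUL byte: treat the data as binary */
        if (c < 0x20 && c != '\r' && c != '\n' && c != '\t')
            return false;   /* any other control character */
    }
    return true;            /* nothing suspicious in the sample */
}

/* Hypothetical helper: sample the first 256 bytes of a file.
 * Returns 1 for "probably text", 0 for "probably binary",
 * and -1 if the file cannot be opened. */
int file_looks_like_text(const char *path)
{
    unsigned char buf[256];
    FILE *fp = fopen(path, "rb");   /* binary mode: no newline translation */
    if (fp == NULL)
        return -1;
    size_t n = fread(buf, 1, sizeof buf, fp);
    fclose(fp);
    return buffer_looks_like_text(buf, n) ? 1 : 0;
}
```

The file is opened with "rb" so that the sample contains the bytes exactly as stored, even on systems where text mode would translate line endings.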

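On Unix, open and fopen do not distinguish between text and binary. The text/binary distinction exists so that the 0x0d 0x0a line ending stored on disk can be translated to a single 0x0a (for example on DOS and Windows). Unix line endings are a bare 0x0a with no 0x0d, so there is nothing to translate.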
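One way to observe this translation is to write a single '\n' in text mode and then count the bytes that actually reached the disk by reading the file back in binary mode. The sketch below assumes the current directory is writable; the function name `newline_size_on_disk` and the scratch file name passed to it are made up for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: write one '\n' in text mode, then count the raw
 * bytes stored in the file. Returns the byte count, or -1 on error.
 * The scratch file is removed before returning. */
long newline_size_on_disk(const char *path)
{
    FILE *fp = fopen(path, "w");    /* text mode */
    if (fp == NULL)
        return -1;
    if (fputc('\n', fp) == EOF) {
        fclose(fp);
        remove(path);
        return -1;
    }
    if (fclose(fp) != 0) {
        remove(path);
        return -1;
    }

    fp = fopen(path, "rb");         /* binary mode: bytes as stored */
    if (fp == NULL) {
        remove(path);
        return -1;
    }
    long count = 0;
    while (fgetc(fp) != EOF)
        count++;
    fclose(fp);
    remove(path);
    return count;
}
```

On a Unix-like system the result should be 1 (just 0x0a); on DOS or Windows it should be 2 (0x0d 0x0a), because the text-mode write expands the newline.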

POSIX systems do not distinguish between "text files" and "binary files"; every file is treated as a byte stream.


Q: What's the difference between text and binary I/O?

A: In text mode, a file is assumed to consist of lines of printable characters (perhaps including tabs). The routines in the stdio library (getc, putc, and all the rest) translate between the underlying system's end-of-line representation and the single \n used in C programs. C programs which simply read and write text therefore don't have to worry about the underlying system's newline conventions: when a C program writes a '\n', the stdio library writes the appropriate end-of-line indication, and when the stdio library detects an end-of-line while reading, it returns a single '\n' to the calling program.

In binary mode, on the other hand, bytes are read and written between the program and the file without any interpretation. (On MS-DOS systems, binary mode also turns off testing for control-Z as an in-band end-of-file character.)
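As a rough illustration of "without any interpretation", the sketch below writes a few awkward bytes (0x0d, 0x0a, control-Z, NUL) in binary mode and checks that the same bytes come back. The function name `binary_round_trip` is hypothetical, the 64-byte limit is arbitrary, and the sketch assumes a system that does not pad binary files; on a padding system the final comparison could fail.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write len bytes with "wb", read them back with "rb".
 * Returns 1 if the bytes read match the bytes written, 0 if they differ,
 * and -1 on an I/O error or if len exceeds the small internal buffer. */
int binary_round_trip(const char *path, const unsigned char *data, size_t len)
{
    unsigned char back[64];
    if (len > sizeof back)
        return -1;

    FILE *fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;
    size_t written = fwrite(data, 1, len, fp);
    if (fclose(fp) != 0 || written != len) {
        remove(path);
        return -1;
    }

    fp = fopen(path, "rb");
    if (fp == NULL) {
        remove(path);
        return -1;
    }
    size_t got = fread(back, 1, sizeof back, fp);
    fclose(fp);
    remove(path);           /* clean up the scratch file */

    return (got == len && memcmp(back, data, len) == 0) ? 1 : 0;
}
```

Replacing "wb" and "rb" with "w" and "r" would make the result platform-dependent, since text mode is allowed to translate line endings and, on MS-DOS, to stop at control-Z.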

Text mode translations also affect the apparent size of a file as it's read. Because the characters read from and written to a file in text mode do not necessarily match the characters stored in the file exactly, the size of the file on disk may not always match the number of characters which can be read from it. Furthermore, for analogous reasons, the fseek and ftell functions do not necessarily deal in pure byte offsets from the beginning of the file. (Strictly speaking, in text mode, the offset values used by fseek and ftell should not be interpreted at all: a value returned by ftell should only be used as a later argument to fseek, and only values returned by ftell should be used as arguments to fseek.)
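The portable pattern for text streams is therefore to treat the ftell result as an opaque bookmark and hand it back to fseek unchanged. The following is a minimal sketch of that pattern; the function name `reread_second_line` and the file contents are invented for the example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical demo: create a two-line text file, bookmark the start of
 * the second line with ftell, read that line, seek back to the bookmark
 * and read it again. Returns 1 if both reads agree, 0 if not, and -1 if
 * the scratch file cannot be created or reopened. */
int reread_second_line(const char *path)
{
    char first[32], second[32], again[32];

    FILE *fp = fopen(path, "w");            /* text mode */
    if (fp == NULL)
        return -1;
    fputs("alpha\nbeta\n", fp);
    if (fclose(fp) != 0) {
        remove(path);
        return -1;
    }

    fp = fopen(path, "r");                  /* text mode */
    if (fp == NULL) {
        remove(path);
        return -1;
    }
    int ok = 0;
    if (fgets(first, sizeof first, fp) != NULL) {
        long mark = ftell(fp);              /* opaque: do no arithmetic on it */
        if (mark != -1L
            && fgets(second, sizeof second, fp) != NULL
            && fseek(fp, mark, SEEK_SET) == 0
            && fgets(again, sizeof again, fp) != NULL)
            ok = (strcmp(second, again) == 0);
    }
    fclose(fp);
    remove(path);
    return ok;
}
```

The code never assumes that `mark` equals 6, the length of "alpha\n" as seen by the program; on a system that stores line endings as two bytes, the value may well differ.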

In binary mode, fseek and ftell do use pure byte offsets. However, some systems may have to append a number of null bytes at the end of a binary file to pad it out to a full record.

See also questions 12.37 and 19.12.

References: ISO Sec. 7.9.2
Rationale Sec. 4.9.2
H&S Sec. 15 p. 344, Sec. 15.2.1 p. 348

