Brainfuck语言 未定义行为

在C/C++编程过程中,有些规则我们不知道这是语言规定的,还是语言未定义的,

比如一些编译器的通用规则、不同平台的差异等,因为语法太多,互相组合嵌套有太多的可能性,而且语言一直在变。

在BF语言编程过程中,对这个界限有了更加清晰的设计(当然,不同语言的边界是不一样的,这个并不是关键,关键是理解编程语言设计的原理)

Brainfuck语言实战里面有我提交OJ的代码,因为一直是超时,走读验证测试了很多遍都找不到BUG,在这个过程中就对边界更有感觉了,

然后在浏览维基词条的时候发现一块内容刚好和我想到的这几个问题吻合:

Cell size

In the classic distribution, the cells are of 8-bit size (cells are bytes), and this is still the most common size. However, to read non-textual data, a brainfuck program may need to distinguish an end-of-file condition from any possible byte value; thus 16-bit cells have also been used. Some implementations have used 32-bit cells, 64-bit cells, or bignum cells with practically unlimited range, but programs that use this extra range are likely to be slow, since storing the value  into a cell requires  time as a cell's value may only be changed by incrementing and decrementing.

通常使用8-bit cells,但是16位,32位,64位甚至更大的也都有。

In all these variants, the , and . commands still read and write data in bytes. In most of them, the cells wrap around, i.e. incrementing a cell which holds its maximal value (with the + command) will bring it to its minimal value and vice versa. The exceptions are implementations which are distant from the underlying hardware, implementations that use bignums, and implementations that try to enforce portability.

无论是多少位的,输入输出都是以字节为单位。

It is usually easy to write brainfuck programs that do not ever cause integer wraparound or overflow, and therefore don't depend on cell size. Generally this means avoiding increment of +255 (unsigned 8-bit wraparound), or avoiding overstepping the boundaries of [-128, +127] (signed 8-bit wraparound) (since there are no comparison operators, a program cannot distinguish between a signed and unsigned two's complement fixed-bit-size cell and negativeness of numbers is a matter of interpretation). For more details on integer wraparound, see the Integer overflow article.

关于整数溢出、有符号无符号的讨论。

PS:BF编程才更加深刻的体会到有符号无符号只是数据的一种理解方式而已,本质上还是每个字节就是8位0/1的256个组合。

Array size

In the classic distribution, the array has 30,000 cells, and the pointer begins at the leftmost cell. Even more cells are needed to store things like the millionth Fibonacci number, and the easiest way to make the language Turing complete is to make the array unlimited on the right.

纸带大小通常是30000,初始指针在最左边。

A few implementations[12] extend the array to the left as well; this is an uncommon feature, and therefore portable brainfuck programs do not depend on it.

有的解释器会使用双边的纸带,即指针在中间。

When the pointer moves outside the bounds of the array, some implementations will give an error message, some will try to extend the array dynamically, some will not notice and will produce undefined behavior, and a few will move the pointer to the opposite end of the array. Some tradeoffs are involved: expanding the array dynamically to the right is the most user-friendly approach and is good for memory-hungry programs, but it carries a speed penalty. If a fixed-size array is used it is helpful to make it very large, or better yet let the user set the size. Giving an error message for bounds violations is very useful for debugging but even that carries a speed penalty unless it can be handled by the operating system's memory protections.

数组越界的未定义行为。

End-of-line code

End-of-line code
Different operating systems (and sometimes different programming environments) use subtly different versions of ASCII. The most important difference is in the code used for the end of a line of text. MS-DOS and Microsoft Windows use a CRLF, i.e. a 13 followed by a 10, in most contexts. UNIX and its descendants (including GNU/Linux and Mac OS X) and Amigas use just 10, and older Macs use just 13. It would be difficult if brainfuck programs had to be rewritten for different operating systems. However, a unified standard was easy to create. Urban Müller's compiler and his example programs use 10, on both input and output; so do a large majority of existing brainfuck programs; and 10 is also more convenient to use than CRLF. Thus, brainfuck implementations should make sure that brainfuck programs that assume newline = 10 will run properly; many do so, but some do not.

This assumption is also consistent with most of the world's sample code for C and other languages, in that they use '\n', or 10, for their newlines. On systems that use CRLF line endings, the C standard library transparently remaps "\n" to "\r\n" on output and "\r\n" to "\n" on input for streams not opened in binary mode.

换行符的差异

End-of-file behavior

The behavior of the , command when an end-of-file condition has been encountered varies. Some implementations set the cell at the pointer to 0, some set it to the C constant EOF (in practice this is usually -1), some leave the cell's value unchanged. There is no real consensus; arguments for the three behaviors are as follows.

Setting the cell to 0 avoids the use of negative numbers, and makes it marginally more concise to write a loop that reads characters until EOF occurs. This is a language extension devised by Panu Kalliokoski.

Setting the cell to -1 allows EOF to be distinguished from any byte value (if the cells are larger than bytes), which is necessary for reading non-textual data; also, it is the behavior of the C translation of , given in Müller's readme file. However, it is not obvious that those C translations are to be taken as normative.

Leaving the cell's value unchanged is the behavior of Urban Müller's brainfuck compiler. This behavior can easily coexist with either of the others; for instance, a program that assumes EOF = 0 can set the cell to 0 before each , command, and will then work correctly on implementations that do either EOF = 0 or EOF = "no change". It is so easy to accommodate the "no change" behavior that any brainfuck programmer interested in portability should do so.

文件尾EOF的差异

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 1
    评论
Brainfuck,是一种极小化的计算机语言,它是由Urban Müller在1993年创建的。由于fuck在英语中是脏话,这种语言有时被称为brainf*ck或brainf***,甚至被简称为BF。 【内含:BF解释器,BF解释器源码,BF写的几个小程序】 ReadMe: Brainfuck 编程语言 [图灵完全] [8条指令] 语法: > 指针加一 < 指针减一 + 指针指向的字节的值加一 - 指针指向的字节的值减一 . 输出指针指向的单元内容(ASCII码) , 输入内容到指针指向的单元(ASCII码) [ 如果指针指向的单元值为零,向后跳转到对应的]指令的次一指令处 ] 如果指针指向的单元值不为零,向前跳转到对应的[指令的次一指令处 特性: 8KB 环状内存(初始化为0) <>操作不会越界 加减操作环状 +-操作不会溢出(0xFF + 为 0x00) 文件说明: bf.exe 解释器 Usage: bf [-options] source where options include: -b buffered input (default mode) 缓冲输入(按回车才输入,默认) -i not buffered input 无缓冲输入 -e not buffered input without echo 无缓冲输入且不回显 bf.cpp 解释器的源代码(纯C实现) hello.txt HelloWorld程序 up.txt 这个程序将你的输入(小写字母)转换为大写(回车结束) add.txt 这个程序对两个一位数做加法,并输出结果(如果结果也只有一位数的话)(例如:输入2+3) mul.txt 这个程序对两个一位数做乘法,并输出结果(如果结果也只有一位数的话)(例如:输入2*3) factor.txt 这个程序分解多位数的因子,并输出结果(例如:输入1000) numwarp.txt 这个程序输入 ()-./0123456789abcdef 和空格的字符串,显示很有趣的排列结果(例如:输入520 1314) prime.txt 这个程序输入一个多位整数,输出从1到这个整数间的所有素数(例如:输入100) quine.txt 这个程序输出源代码本身 [以上程序,基本上依靠回车确认]

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论 1
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值