Chapter 22 The I/O Library

The I/O library offers two different models for file manipulation. The simple
model assumes a current input file and a current output file, and its I/O operations
operate on these files. The complete model uses explicit file handles; it
adopts an object-oriented style that defines all operations as methods on file
handles.


22.1 The Simple I/O Model

The simple model does all its operations on two current files. The library
initializes the current input file as the process standard input (stdin) and the
current output file as the process standard output (stdout). Therefore, when we
execute something like io.read(), we read a line from the standard input.

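Note: the standard input is whatever the process receives when Lua starts. On the
command line we can redirect it with <, so that a file becomes the standard input:

cmd: lua < c://test.txt

(In a UNIX shell, << introduces a here-document instead of naming a file.)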


We can change these current files with the io.input and io.output functions; in effect, they redirect the input and output of the simple model.
A call like io.input(filename) opens the given file in read mode and sets it as
the current input file. From this point on, all input will come from this file, until
another call to io.input; io.output does a similar job for output. In case of an
error, both functions raise the error. If you want to handle errors directly, you
must use the complete model. (In other words, these two functions do not report a failure through their return values.)
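Because io.input raises its error instead of returning it, one way to still observe the failure while staying in the simple model is the standard pcall function. The following is only a sketch; the file name is an assumption chosen because it should not exist.

```lua
-- Assumption: no file with this name exists in the current directory.
local missing = "no-such-file-for-this-sketch.txt"

-- io.input raises an error when it cannot open the file,
-- so pcall is used to receive the failure as ordinary values.
local ok, err = pcall(io.input, missing)

if not ok then
  io.stderr:write("could not switch input: ", tostring(err), "\n")
end
```

The complete model remains the more direct route, since io.open returns the error instead of raising it.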


The io.write function simply takes an arbitrary number of string arguments
and writes them to the current output file. It converts numbers to strings following the usual conversion
rules; for full control over this conversion, you should use the string.format
function:


> io.write("sin (3) = ", math.sin(3), "\n")
--> sin (3) = 0.14112000805987
> io.write(string.format("sin (3) = %.4f\n", math.sin(3)))
--> sin (3) = 0.1411


Avoid code like io.write(a..b..c); the call io.write(a,b,c) accomplishes the
same effect with fewer resources, as it avoids the concatenations.


As a rule, you should use print for quick-and-dirty programs or debugging,
and write when you need full control over your output:


> print("hello", "Lua"); print("Hi")
--> hello Lua
--> Hi
> io.write("hello", "Lua"); io.write("Hi", "\n")
--> helloLuaHi     (write puts no separator between its arguments)



Unlike print, write adds no extra characters to the output, such as tabs or
newlines, which is why the example above passes "\n" explicitly.
Moreover, write allows you to redirect your output, whereas print
always uses the standard output.
Finally, print automatically applies tostring
to its arguments; this is handy for debugging, but it can hide bugs if you are not
paying attention to your output.


======lession21.lua

print("--print out put test");
io.write("io.write --out put test \n");

io.output("redirect.txt");
print("--print out put test");
io.write("io.write --out put test \n");
io.flush();

--------

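==> dofile("lession21.lua");
--print out put test
io.write --out put test   -- at this point io.write still uses the standard output
--print out put test    -- even after io.output("redirect.txt"), print still uses the standard output
(The second io.write line does not appear here because it went to redirect.txt.)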


The io.read function reads strings from the current input file. Its arguments
control what to read:

“*a” reads the whole file
“*l” reads the next line (without newline)
“*L” reads the next line (with newline)
“*n” reads a number
num reads a string with up to num characters


The call io.read("*a") reads the whole current input file, starting at its
current position (the position can be changed through the seek method of the file handle, covered in Section 22.3).
If we are at the end of the file, or if the file is empty, the call
returns an empty string.
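The behaviour of "*a" with respect to the current position can be checked with a scratch file. This sketch uses the method form f:read from the complete model so that it does not depend on the standard input; the file contents are made up for illustration.

```lua
-- Assumption: io.tmpfile can create a scratch file on this system.
local f = assert(io.tmpfile())
f:write("first line\nsecond line\n")
f:seek("set")                 -- rewind before reading

local first = f:read("*l")    -- consumes "first line" and its newline
local rest  = f:read("*a")    -- everything from the current position on
local again = f:read("*a")    -- already at end of file: empty string, not nil

f:close()
```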


Because Lua handles long strings efficiently, a simple technique for writing
filters in Lua is to read the whole file into a string, do the processing to the string
(typically with gsub), and then write the string to the output:
t = io.read("*a") -- read the whole file
t = string.gsub(t, ...) -- do the job
io.write(t) -- write the file


As an example, the following chunk is a complete program to code a file’s content
using the MIME quoted-printable encoding. This encoding codes each non-
ASCII byte as =xx, where xx is the value of the byte in hexadecimal. To keep
the consistency of the encoding, the ‘=’ character must be encoded as well:


t = io.read("*a")
t = string.gsub(t, "([\128-\255=])", function (c)
return string.format("=%02X", string.byte(c))
end)
io.write(t)


The pattern used in the gsub captures all bytes from 128 to 255, plus the equal
sign.
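To try the encoding on a string rather than on the standard input, the same gsub call can be wrapped in a function. The name qp_encode is hypothetical; it is not part of any library.

```lua
-- Hypothetical helper: the same substitution as in the program above.
local function qp_encode(s)
  -- gsub returns the new string and a count; the parentheses keep only the string.
  return (string.gsub(s, "([\128-\255=])", function (c)
    return string.format("=%02X", string.byte(c))
  end))
end

-- "=" is byte 61 (hex 3D); "\200" is byte 200 (hex C8).
local encoded = qp_encode("a=b\200")
```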


The call io.read("*l") returns the next line from the current input file,
without the newline character; the call io.read("*L") is similar, but it keeps
the newline character (if present in the file). When we reach the end of file, the
call returns nil (as there is no next line to return). The pattern “*l” is the default
for read.
(That is, *l and *L read the file one line at a time.)

Usually, I use this pattern only when the algorithm naturally handles
the file line by line; otherwise, I favor reading the whole file at once, with *a, or
in blocks, as we will see later.


io.input("stdinput.txt");
k=io.read("*l");
print(k);
k=io.read("*l");
print(k);
l=io.read("*L");
print(l);
l=io.read("*L");
print(l);


Contents of stdinput.txt:

line1
line2
line3
line4
line5
line6


---------

line1
line2
line3  -- with *L the newline is kept, so print adds a blank line after it

line4  -- same here: *L kept the newline

===============


As a simple example of the use of this pattern, the following program copies
its current input to the current output, numbering each line:
for count = 1, math.huge do
  local line = io.read()
  if line == nil then break end -- nil signals end of file
  io.write(string.format("%6d ", count), line, "\n")
end


However, to iterate on a whole file line by line, we do better to use the io.lines
iterator. For instance, we can write a complete program to sort the lines of a file
as follows (io.lines returns an iterator, so it fits directly in a for loop):

local lines = {}
-- read the lines in table 'lines'
for line in io.lines() do lines[#lines + 1] = line end
-- sort
table.sort(lines)
-- write all the lines
for _, l in ipairs(lines) do io.write(l, "\n") end


The call io.read("*n") reads a number from the current input file. This is
the only case where read returns a number, instead of a string. When a program
needs to read many numbers from a file, the absence of the intermediate strings
improves its performance. The *n option skips any spaces before the number
and accepts number formats like -3, +5.2, 1000, and -3.4e-23. If it cannot find
a number at the current file position (because of bad format or end of file), it
returns nil.
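A small experiment with "*n" might look like the following sketch. It writes a few of the number formats mentioned above to a scratch file and reads them back; the text "abc" is an assumed example of input that is not a number.

```lua
local f = assert(io.tmpfile())
f:write("  -3 +5.2 1000 -3.4e-23 abc")
f:seek("set")

local a = f:read("*n")   -- leading spaces are skipped
local b = f:read("*n")
local c = f:read("*n")
local d = f:read("*n")
local e = f:read("*n")   -- "abc" is not a number, so the result is nil

f:close()
```

Each successful result is already a number, so no tonumber call is needed.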


You can call read with multiple options; for each argument, the function will
return the respective result.
Suppose you have a file with three numbers per
line:
6.0 -3.23 15e12
4.3 234 1000001

.......


Now you want to print the maximum value of each line. You can read all three
numbers with a single call to read:
while true do
  -- one call, three results; *n skips white space (including newlines) before each number
  local n1, n2, n3 = io.read("*n", "*n", "*n")
  if not n1 then break end  -- no more numbers: leave the loop
  print(math.max(n1, n2, n3))
end
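The same loop can be tried without preparing an input file. This sketch assumes a scratch file holding the two sample lines and collects the maxima in a table instead of printing them.

```lua
local f = assert(io.tmpfile())
f:write("6.0 -3.23 15e12\n4.3 234 1000001\n")
f:seek("set")

local maxima = {}
while true do
  local n1, n2, n3 = f:read("*n", "*n", "*n")
  if not n1 then break end          -- first read failed: no more numbers
  maxima[#maxima + 1] = math.max(n1, n2, n3)
end
f:close()
```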


Besides the basic read patterns, you can call read with a number n as an
argument: in this case, read tries to read n characters from the input file (a block read). If it
cannot read any character (end of file), read returns nil; otherwise, it returns
a string with at most n characters (possibly fewer than n, for instance near the end of the file). As an example of this read pattern, the
following program is an efficient way (in Lua, of course) to copy a file from stdin
to stdout:
while true do
  local block = io.read(2^13) -- buffer size is 8K
  if not block then break end
  io.write(block)
end
As a special case, io.read(0) works as a test for end of file: it returns an
empty string if there is more to be read or nil otherwise.
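The two special results of numeric reads (a short block and the end-of-file test) can be seen together in a short sketch. It assumes a scratch file with only three characters.

```lua
local f = assert(io.tmpfile())
f:write("abc")
f:seek("set")

local before = f:read(0)   -- "" because there is still data to read
local data   = f:read(10)  -- asks for 10 characters, only 3 exist
local after  = f:read(0)   -- nil because we are now at end of file
local block  = f:read(10)  -- nil as well: nothing could be read

f:close()
```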


22.2 The Complete I/O Model

A central concept in this model is the file handle, which is equivalent to a stream (FILE*) in C: it
represents an open file with a current position.


To open a file, you use the io.open function, which mimics the fopen function
in C. It takes as arguments the name of the file to open plus a mode string. This
mode string can contain

an ‘r’ for reading,

a ‘w’ for writing (which also erases any previous content of the file),

or an ‘a’ for appending, plus

an optional ‘b’ to open binary files.

The open function returns a new handle for the file. In case of an error, open returns nil, plus an error message and an error number:


print(io.open("non-existent-file", "r"))
--> nil non-existent-file: No such file or directory 2
print(io.open("/etc/passwd", "w"))
--> nil /etc/passwd: Permission denied 13


The interpretation of the error numbers is system dependent.
A typical idiom to check for errors is:
local f = assert(io.open(filename, mode))

If the open fails, the error message goes as the second argument to assert, which
then shows the message.
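When the program should recover instead of stopping, the three results of io.open can be inspected directly. A minimal sketch, assuming a file name that does not exist:

```lua
-- Assumption: no file with this name exists.
local name = "no-such-file-for-this-sketch.txt"

local f, msg, code = io.open(name, "r")
if not f then
  -- msg is a readable description; code is the system-dependent error number
  io.stderr:write("open failed: ", msg, " (code ", tostring(code), ")\n")
else
  f:close()
end
```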


After you open a file, you can read from it or write to it with the methods
read/write. They are similar to the read/write functions, but you call them as
methods on the file handle, using the colon syntax. For instance, to open a file
and read it all, you can use a chunk like this:

-- the classic idiom for reading a whole file

local f = assert(io.open(filename, "r"))
local t = f:read("*a")
f:close()


The I/O library offers handles for the three predefined C streams: io.stdin,
io.stdout, and io.stderr.
So, you can send a message directly to the error
stream with code like this:
io.stderr:write(message)


We can mix the complete model with the simple model.

We get the current  input file handle by calling io.input(), without arguments.

We set this handle  with the call io.input(handle). (Similar calls are also valid for io.output.)

For  instance, if you want to change the current input file temporarily, you can write
something like this:

local temp = io.input() -- save current file (io.input() returns its handle)
io.input("newinput") -- open a new current file
<do something with new input> -- simple-model calls such as io.read now use "newinput", no explicit handle needed
io.input():close() -- close current file
io.input(temp) -- restore previous current file
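The save and restore steps can be packaged in a small helper. The name with_input is hypothetical, and the sketch passes an already open scratch handle instead of a real "newinput" file, so the helper leaves closing to the caller.

```lua
-- Hypothetical helper: run fn with `file` (a name or a handle) as the
-- current input, then restore the previous current input.
local function with_input(file, fn)
  local previous = io.input()   -- save the current input handle
  io.input(file)                -- switch the current input
  local result = fn()
  io.input(previous)            -- restore the previous input
  return result
end

local tmp = assert(io.tmpfile())
tmp:write("hello\nworld\n")
tmp:seek("set")

-- Inside the callback, plain io.read works on the scratch file.
local first = with_input(tmp, function () return io.read("*l") end)
tmp:close()
```

If fn raises an error, this version does not restore the input; wrapping the call to fn in pcall would be one way to cover that case.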

Instead of io.read, we can also use io.lines to read from a file. As we have
seen in previous examples, io.lines gives an iterator that repeatedly reads from
a file.


The first argument to io.lines is a file name. Given a
file name, io.lines will open the file in read mode and will close the file after
reaching end of file.
To iterate over an already open file handle f, use the method f:lines();
in this case, the file is not closed after reading it. When called with
no arguments, io.lines will read from the current input file, which is also left open.
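That an open handle survives the iteration can be checked with io.type. The sketch below iterates through the method form f:lines() on a scratch file and then reuses the handle.

```lua
local f = assert(io.tmpfile())
f:write("one\ntwo\nthree\n")
f:seek("set")

local count = 0
for line in f:lines() do
  count = count + 1
end

-- The handle is still open after the loop, so it can be rewound and reused.
local state = io.type(f)
f:seek("set")
local first = f:read("*l")
f:close()
```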


Starting in Lua 5.2, io.lines also accepts the same options that io.read
accepts, after the file argument. As an example, the next fragment copies a file
to the current output, using io.lines:
for block in io.lines(filename, 2^13) do
  io.write(block)
end



A small performance trick

Usually, in Lua, it is faster to read a file as a whole than to read it line by line.
However, sometimes we must face a big file (say, tens or hundreds of megabytes)
for which it is not reasonable to read it all at once. If you want to handle such big
files with maximum performance, the fastest way is to read them in reasonably
large chunks (e.g., 8 kB each). To avoid the problem of breaking lines in the
middle, you simply ask to read a chunk plus a line:
local lines, rest = f:read(BUFSIZE, "*l")


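The variable rest will get the rest of any line broken by the chunk. (It is not a count of what remains unread: because the buffer size is fixed, a chunk rarely ends exactly at the end of a line, and rest holds the remainder of that last line.) We then concatenate the chunk and this rest of line. This way, the resulting chunk will
always break at line boundaries.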
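The concatenation step could be sketched as follows, with a deliberately tiny buffer so that chunks break lines. The file contents are made up; because "*l" strips the newline, the sketch puts it back by hand.

```lua
local BUFSIZE = 16   -- tiny on purpose, so that a chunk ends in mid-line

local f = assert(io.tmpfile())
f:write("line1\nline2\nlinerestTest\nline4\n")
f:seek("set")

local chunks = {}
while true do
  local lines, rest = f:read(BUFSIZE, "*l")
  if not lines then break end
  -- "*l" drops the newline of the completed line, so add it again
  if rest then lines = lines .. rest .. "\n" end
  chunks[#chunks + 1] = lines
end
f:close()
```

With "*L" (Lua 5.2 and later) the rest keeps its own newline, so the manual "\n" is unnecessary and a final line without a newline is reproduced exactly.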


io.input("stdinput.txt");
local lines,rest = io.read(2^4,"*l")
print(lines,"rest=",rest);
local lines,rest = io.read(2^4,"*l")
print(lines,"rest=",rest);


---------stdinput.txt

line1
line2
linerestTest  --- 2^4 = 16: the first 16 characters end at "line", so the rest of this line is restTest
line4
line5
linrestTest
sadfasfasdf
asdfasf
asfasfsafasf
asdfasdf
sadfasfasdf


------------

line1
line2
line    rest=   restTest
line4   -- the next read starts at a new line because "*l" already consumed the rest of the broken line
line5
lin     rest=   restTest


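---------- Without the line option ----

local lines,rest = io.read(2^4)
print(lines,"rest=",rest);
local lines,rest = io.read(2^4)
print(lines,"rest=",rest);

-------------- Without "*L" or "*l", each block read simply continues where the previous block ended, possibly in the middle of a line (and rest is always nil). The same holds for io.lines().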










The example in Listing 22.1 uses this technique to implement wc, a program
that counts the number of characters, words, and lines in a file. Note the use
of io.lines to do the iteration and of the option “*L” to read a line with the
newline.


Listing 22.1. The wc program:

local BUFSIZE = 2^13 -- 8K
local f = io.input(arg[1]) -- open input file
local cc, lc, wc = 0, 0, 0 -- char, line, and word counts
for lines, rest in io.lines(arg[1], BUFSIZE, "*L") do
  if rest then lines = lines .. rest end
  cc = cc + #lines
  -- count words in the chunk
  local _, t = string.gsub(lines, "%S+", "")
  wc = wc + t -- only the second result of gsub matters: the number of %S+ matches, that is, the words
  -- count newlines in the chunk
  _, t = string.gsub(lines, "\n", "\n")
  lc = lc + t
end
print(lc, wc, cc)
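The counting trick in the listing relies on the second result of string.gsub, the number of substitutions made. A minimal sketch on a plain string, with the hypothetical helper name count:

```lua
-- Hypothetical helper: count characters, words, and lines in a string.
local function count(s)
  local _, words = string.gsub(s, "%S+", "")   -- runs of non-space characters
  local _, lines = string.gsub(s, "\n", "\n")  -- newline characters
  return #s, words, lines
end

local cc, wc, lc = count("hello world\nfoo\n")
```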



Binary files

The simple-model functions io.input and io.output always open a file in text
mode (the default). In UNIX, there is no difference between binary files and text
files. But in some systems, notably Windows, binary files must be opened with
a special flag. To handle such binary files, you must use io.open, with the letter
‘b’ in the mode string.


Lua handles binary data similarly to text. A string in Lua can contain any
bytes, and almost all functions in the libraries can handle arbitrary bytes. You
can even do pattern matching over binary data, as long as the pattern does not
contain a zero byte.
If you want to match this byte in the subject, you can use
the class %z instead: %z matches the zero byte. (This restriction belongs to Lua 5.1; from Lua 5.2 on, a pattern may contain "\0" directly and %z is deprecated.) See the example below:

s="abc\0cdefg\0"
print(s:gsub("%z","ZEROBYTE"));

abcZEROBYTEcdefgZEROBYTE        2




Typically, you read binary data either with the *a pattern, that reads the
whole file, or with the pattern n, that reads n bytes. As a simple example, the
following program converts a text file from Windows format to UNIX format
(that is, it translates sequences of carriage return–newlines to newlines). It
does not use the standard I/O files (stdin–stdout), because these files are open
in text mode. Instead, it assumes that the names of the input file and the output
file are given as arguments to the program:


local inp = assert(io.open(arg[1], "rb"))
local out = assert(io.open(arg[2], "wb"))
local data = inp:read("*a")
data = string.gsub(data, "\r\n", "\n")  -- Windows ends lines with \r\n; UNIX uses just \n
out:write(data)
assert(out:close())



You can call this program with the following command line:
 lua prog.lua file.dos file.unix
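The conversion itself can be tested on a string without touching any file. The helper name dos_to_unix is hypothetical.

```lua
-- Hypothetical helper: the same substitution as in the program above.
local function dos_to_unix(data)
  -- The parentheses discard the substitution count returned by gsub.
  return (string.gsub(data, "\r\n", "\n"))
end

local converted = dos_to_unix("one\r\ntwo\r\nthree")
```

Dropping the count matters when the result is passed straight to write, which would otherwise output the number as well.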


As another example, the following program prints all strings found in a
binary file:
local f = assert(io.open(arg[1], "rb"))
local data = f:read("*a")
local validchars = "[%g%s]"  -- %g: printable characters except space; %s: space characters
local pattern = "(" .. string.rep(validchars, 6) .. "+)\0"
for w in string.gmatch(data, pattern) do
  print(w)
end



As a last example, the following program makes a dump of a binary file:


local f = assert(io.open(arg[1], "rb"))
local block = 16
for bytes in f:lines(block) do
  for c in string.gmatch(bytes, ".") do
    io.write(string.format("%02X ", string.byte(c)))
  end
  -- pad a short last block (three columns per missing byte) so the text column stays aligned
  io.write(string.rep("   ", block - string.len(bytes)))
  -- gsub is not the last argument here, so only its first result (the string) is written
  io.write(" ", string.gsub(bytes, "%c", "."), "\n")
end




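To understand the program above, I ran the following tests:

a1="b"

==> a1=string.byte(a1)
==> =a1
98  -- the ASCII code of the letter b


==> =string.gsub(a1,"%c",".")
98      0   -- the number 98 is converted to the string "98", which has no control characters, so nothing is replaced

==> =string.char(string.gsub(a1,"%c","."))
b   -- string.char receives both results of gsub ("98" and 0): the letter b followed by a zero byte

==> =string.gsub(a1,"%C","**")

****    2  -- %C matches every non-control character, so "9" and "8" are replaced one by one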


Again, the first program argument is the input file name; the output goes to the
standard output. The program reads the file in chunks of 16 bytes. For each
chunk, it writes the hexadecimal representation of each byte, and then it writes
the chunk as text, changing control characters to dots.
Listing 22.2 shows the result of applying this program over itself (in a UNIX
machine).
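The formatting of a single dump line can be examined in isolation. In this sketch, dump_line is a hypothetical helper that returns the line as a string instead of writing it; the three-space padding per missing byte is an assumption that keeps the text column aligned with the "XX " hexadecimal columns.

```lua
-- Hypothetical helper: format one block of bytes as a dump line.
local function dump_line(bytes, block)
  local hex = {}
  for c in string.gmatch(bytes, ".") do
    hex[#hex + 1] = string.format("%02X ", string.byte(c))
  end
  local pad = string.rep("   ", block - #bytes)   -- one column per missing byte
  local text = string.gsub(bytes, "%c", ".")      -- control characters become dots
  return table.concat(hex) .. pad .. " " .. text
end

-- "H" is 0x48, "i" is 0x69, newline is 0x0A; the block is one byte short.
local line = dump_line("Hi\n", 4)
```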


22.3 Other Operations on Files


The tmpfile function returns a handle for a temporary file, open in read/write
mode. This file is automatically removed (deleted) when your program ends.
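A temporary file is convenient as scratch space. The sketch below writes to one and reads the data back through the same handle; the rewind with seek is needed before switching from writing to reading.

```lua
local tmp = assert(io.tmpfile())   -- open for both reading and writing
tmp:write("scratch data")
tmp:seek("set")                    -- back to the start before reading
local content = tmp:read("*a")
tmp:close()
```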



The flush function executes all pending writes to a file. Like the write
function, you can call it as a function, io.flush(), to flush the current output
file; or as a method, f:flush(), to flush a particular file f.


The setvbuf method sets the buffering mode of a stream. Its first argument is
a string: “no” means no buffering; “full” means that the stream is only written
out when the buffer is full or when you explicitly flush the file; “line” means
that the output is buffered until a newline is output or there is any input from
some special files (such as a terminal device). In other words, in line mode the buffer is flushed either by a newline or by input arriving from a special file such as a terminal, not from an ordinary file.
For the last two options, setvbuf
accepts an optional second argument with the buffer size.
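A minimal sketch of setvbuf on two scratch files follows. The buffer size of 1024 is an arbitrary assumption; the mode is set right after opening, before any other operation on the stream.

```lua
local a = assert(io.tmpfile())
local b = assert(io.tmpfile())

local ok_full = a:setvbuf("full", 1024)  -- written out when the buffer fills or on flush
local ok_none = b:setvbuf("no")          -- every write goes out immediately

a:write("buffered")
a:flush()                                 -- explicit flush empties the buffer
b:write("unbuffered")

a:close()
b:close()
```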



In most systems, the standard error stream (io.stderr) is not buffered, while
the standard output stream (io.stdout) is buffered in line mode. So, if you write
incomplete lines to the standard output (e.g., a progress indicator), you may need
to flush the stream to see that output.
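A progress indicator of this kind might be written as in the sketch below. The loop of three steps stands in for real work.

```lua
for step = 1, 3 do
  -- ... one unit of work would go here ...
  io.write(".")   -- incomplete line: may stay in the buffer
  io.flush()      -- push it out so the dot shows up immediately
end
io.write(" done\n")
```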



The seek method can both get and set the current position of a file. Its
general form is f:seek(whence,offset), where the whence parameter is a string
that specifies how to interpret the offset. Its valid values are “set”, when offsets
are interpreted from the beginning of the file; “cur”, when offsets are interpreted
from the current position of the file; and “end”, when offsets are interpreted from
the end of the file.
Independently of the value of whence, the call returns the new
current position of the file, measured in bytes from the beginning of the file.


The default value for whence is “cur” and for offset is zero. Therefore, the
call file:seek() returns the current file position, without changing it; the call
file:seek("set") resets the position to the beginning of the file (and returns
zero); and the call file:seek("end") sets the position to the end of the file and
returns its size. The following function gets the file size without changing its
current position:
function fsize (file)
  -- save the current position for the restore below; we cannot assume it is at the beginning
  local current = file:seek()
  local size = file:seek("end") -- get file size
  file:seek("set", current) -- restore position
  return size
end
All these functions return nil plus an error message in case of error.
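The return values of seek for the three whence options can be traced on a scratch file of ten bytes. This is only a sketch; the contents are chosen so that each byte shows its own offset.

```lua
local f = assert(io.tmpfile())
f:write("0123456789")

local size  = f:seek("end")       -- position at the end equals the file size
local start = f:seek("set")       -- back to the beginning
f:read(4)                         -- consume "0123"
local here  = f:seek()            -- default "cur", 0: just report the position
local back  = f:seek("cur", -2)   -- two bytes backwards
local two   = f:read(2)
local near  = f:seek("end", -3)   -- three bytes before the end
f:close()
```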












 

































































