Go Programming: File Operations


blade2001


http://blog.chinaunix.net/uid-24774106-id-3993609.html

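File handling is something no programming language can avoid; to master a language you have to know how it works with files. Today I studied Go's support for file operations.
Go's file support lives in the os package. I don't intend to turn this post into a copy of the official documentation; I only want to discuss how to use these interfaces to work with files.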
OPEN
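Anyone familiar with file systems knows that open is one of the most complex interfaces in the whole file system. Those who know C know that it has both open and creat, declared as follows: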

 
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <fcntl.h>

    int open(const char *pathname, int flags);
    int open(const char *pathname, int flags, mode_t mode);
    int creat(const char *pathname, mode_t mode);

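For C's open, if flags contains O_CREAT, the mode argument must be supplied to specify the permissions of the newly created file. If the file already exists, O_CREAT has no effect (unless O_EXCL is also specified, in which case open fails with EEXIST). Besides O_CREAT there are many other flags: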

 
    O_RDONLY
    O_WRONLY
    O_RDWR
    O_DIRECT
    O_APPEND
    O_TRUNC
    ...

Most of these flags mean what their names suggest. For an operation as complex and all-encompassing as open, the Go counterpart is OpenFile:

 
  func OpenFile(name string, flag int, perm FileMode) (file *File, err error)
 

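Here too we see a flag and a FileMode. Suppose I want to open a file for reading and writing, create it if it doesn't exist, and append to it if it does. How would the Go code look?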

 
    f, err := os.OpenFile("test.txt", os.O_CREATE|os.O_APPEND|os.O_RDWR, 0660)
    if err != nil {
        panic(err)
    }

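As we can see, Go has these flags as well (note that the name is O_CREATE, whereas in C it is O_CREAT). The snippet above uses several of them. The os package defines: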

 
    const (
        O_RDONLY int = syscall.O_RDONLY // open the file read-only.
        O_WRONLY int = syscall.O_WRONLY // open the file write-only.
        O_RDWR   int = syscall.O_RDWR   // open the file read-write.
        O_APPEND int = syscall.O_APPEND // append data to the file when writing.
        O_CREATE int = syscall.O_CREAT  // create a new file if none exists.
        O_EXCL   int = syscall.O_EXCL   // used with O_CREATE, file must not exist
        O_SYNC   int = syscall.O_SYNC   // open for synchronous I/O.
        O_TRUNC  int = syscall.O_TRUNC  // if possible, truncate file when opened.
    )
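To make the O_CREATE and O_EXCL combination concrete, here is a minimal sketch. It assumes Go 1.16 or later (for os.MkdirTemp and io/fs); createExclusive and demoExclusive are hypothetical helper names chosen for illustration, not part of the os package.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// createExclusive creates path only if it does not already exist.
// With O_CREATE|O_EXCL the open fails when the file is already there.
func createExclusive(path string) error {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_EXCL|os.O_WRONLY, 0o660)
	if err != nil {
		return err
	}
	return f.Close()
}

// demoExclusive calls createExclusive twice on the same path inside a
// throwaway temp directory and returns both results.
func demoExclusive() (first, second error) {
	dir, err := os.MkdirTemp("", "excl-demo")
	if err != nil {
		return err, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "test.txt")
	first = createExclusive(path)
	second = createExclusive(path)
	return first, second
}

func main() {
	first, second := demoExclusive()
	fmt.Println("first call error:", first)
	fmt.Println("second call saw an existing file:", errors.Is(second, fs.ErrExist))
}
```

The first call should succeed and the second should fail, which is the usual way to use this flag pair as a simple "create only if absent" check.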

C also has creat: it creates the file if it doesn't exist and truncates it if it does. It is essentially open with O_WRONLY | O_CREAT | O_TRUNC:

 
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <fcntl.h>

    int creat(const char *name, mode_t mode);

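Ken Thompson once joked that leaving the letter e off the creat system call was what he regretted most about the design of Unix. It seems the lesson stuck: the corresponding Go interface is Create, spelled correctly this time.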

 
  func Create(name string) (file *File, err error)
 

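Compared with C's creat system call, there is no mode parameter: the default is 0666 (before umask). The access flag is also O_RDWR rather than O_WRONLY, while the create and truncate flags are still set.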
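As a rough illustration of that equivalence, the sketch below spells out the same flags with OpenFile and shows the truncation happening on a second open. It assumes Go 1.16 or later; createLike, fileSize and demoTruncate are hypothetical names used only for this example.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// createLike mimics os.Create with an explicit OpenFile call.
func createLike(name string) (*os.File, error) {
	return os.OpenFile(name, os.O_RDWR|os.O_CREATE|os.O_TRUNC, 0o666)
}

// fileSize returns the current size of the file at path.
func fileSize(path string) (int64, error) {
	info, err := os.Stat(path)
	if err != nil {
		return 0, err
	}
	return info.Size(), nil
}

// demoTruncate writes 9 bytes to a scratch file, reopens it with
// createLike, and reports the size before and after the second open.
func demoTruncate() (before, after int64, err error) {
	dir, err := os.MkdirTemp("", "create-demo")
	if err != nil {
		return 0, 0, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "test.txt")

	f, err := createLike(path)
	if err != nil {
		return 0, 0, err
	}
	_, err = f.WriteString("hello go\n")
	f.Close()
	if err != nil {
		return 0, 0, err
	}
	if before, err = fileSize(path); err != nil {
		return 0, 0, err
	}

	// Second open: the file exists, so O_TRUNC empties it.
	f, err = createLike(path)
	if err != nil {
		return before, 0, err
	}
	f.Close()
	after, err = fileSize(path)
	return before, after, err
}

func main() {
	before, after, err := demoTruncate()
	fmt.Println(before, after, err) // 9 0 <nil>
}
```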
Go's Open is not really comparable to C's open (matching C's open is OpenFile's job). Its signature is:

 
  func Open(name string) (file *File, err error)
 

Put plainly, it is just open with O_RDONLY, which is quite limited.
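A small sketch of what "read-only" means in practice: a file obtained from os.Open can be read, but a write on it returns an error. It assumes Go 1.16 or later; readThenWrite and demoReadOnly are hypothetical helpers, and the exact error text depends on the operating system.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// readThenWrite opens path with os.Open, reads from it, then tries to write.
func readThenWrite(path string) (readErr, writeErr error) {
	f, err := os.Open(path) // read-only descriptor
	if err != nil {
		return err, err
	}
	defer f.Close()

	buf := make([]byte, 5)
	_, readErr = f.Read(buf)
	_, writeErr = f.WriteString("more") // expected to fail
	return readErr, writeErr
}

// demoReadOnly prepares a 5-byte scratch file and runs readThenWrite on it.
func demoReadOnly() (readErr, writeErr error) {
	dir, err := os.MkdirTemp("", "open-demo")
	if err != nil {
		return err, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "test.txt")
	if err := os.WriteFile(path, []byte("hello"), 0o660); err != nil {
		return err, err
	}
	return readThenWrite(path)
}

func main() {
	readErr, writeErr := demoReadOnly()
	fmt.Println("read error:", readErr)
	fmt.Println("write failed:", writeErr != nil)
}
```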

CLOSE
There isn't much to say about this interface. Its signature is:

 
  func (f *File) Close() error
 

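Although there is little to say about the interface itself, Go provides defer, a feature I really like: cleanup that must always happen can be handed off to defer.
Those of us who write C run into code like this all the time: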

 
    fd = open(...);
    if (fd < 0)
    {
        ...
    }
    if (failed_1)
    {
        ...
        close(fd);
        ....
    }
    if (failed_2)
    {
        ...
        close(fd);
        ...
    }
    ....

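Once a file is open, every error path has to remember to close it, otherwise the descriptor leaks. It gets tedious; C is a language you have to wait on carefully.
Go offers defer as the way out: there is no need to keep close in mind afterwards, because the deferred Close runs when the function returns.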

 
    f, err := os.OpenFile("test.txt", os.O_CREATE|os.O_APPEND|os.O_RDWR, 0660)
    if err != nil {
        panic("open file failed")
    }
    defer f.Close()
    ...
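To show how defer covers several exit paths at once, here is a minimal sketch of a function with more than one return after the file is opened. It assumes Go 1.16 or later; countBytes and demoCountBytes are hypothetical names for illustration.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// countBytes opens path, reads it in small chunks and returns its size.
// The single deferred Close covers every return below it.
func countBytes(path string) (int64, error) {
	f, err := os.Open(path)
	if err != nil {
		return 0, err // nothing was opened, so nothing to close
	}
	defer f.Close()

	var total int64
	buf := make([]byte, 32)
	for {
		n, err := f.Read(buf)
		total += int64(n)
		if err == io.EOF {
			return total, nil // normal exit: Close runs here
		}
		if err != nil {
			return total, err // error exit: Close runs here too
		}
	}
}

// demoCountBytes writes 100 zero bytes to a scratch file and counts them.
func demoCountBytes() (int64, error) {
	tmp, err := os.CreateTemp("", "defer-demo")
	if err != nil {
		return 0, err
	}
	defer os.Remove(tmp.Name())

	_, err = tmp.Write(make([]byte, 100))
	tmp.Close()
	if err != nil {
		return 0, err
	}
	return countBytes(tmp.Name())
}

func main() {
	n, err := demoCountBytes()
	fmt.Println(n, err) // 100 <nil>
}
```

For files opened for writing, it can be worth checking the error returned by Close explicitly instead of discarding it in a defer, since some write errors are only reported at close time.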

READ and WRITE
read and write are among the most important file operations. These are the C interfaces:

 
    #include <unistd.h>

    ssize_t write(int fd, const void *buf, size_t count);
    ssize_t read(int fd, void *buf, size_t count);

For Go, the interfaces are:

 
    func (f *File) Read(b []byte) (n int, err error)
    func (f *File) ReadAt(b []byte, off int64) (n int, err error)
    func (f *File) Write(b []byte) (n int, err error)
    func (f *File) WriteAt(b []byte, off int64) (n int, err error)
    func (f *File) WriteString(s string) (ret int, err error)

Let's learn the read and write interfaces from a code snippet:

 
    read_buf := make([]byte, 32)
    var pos int64 = 0
    for {
        n, err := f.ReadAt(read_buf, pos)
        if err != nil && err != io.EOF {
            panic(err)
        }
        if n == 0 {
            fmt.Printf("\nfinish read\n")
            break
        }
        fmt.Printf("%s", string(read_buf[:n]))
        pos = pos + int64(n)
    }
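ReadAt and WriteAt take an explicit offset instead of using the file's current position. The sketch below, which assumes Go 1.16 or later and uses a hypothetical helper named patchAt, overwrites part of a file in place and reads the whole thing back from offset 0. Note that ReadAt reports io.EOF together with the data when it fills less than the whole buffer, which is why the snippet above tolerates io.EOF.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// patchAt writes "hello world" to a scratch file, overwrites the last
// word with WriteAt, and reads the result back with ReadAt.
func patchAt() (string, error) {
	f, err := os.CreateTemp("", "rwat-demo")
	if err != nil {
		return "", err
	}
	defer os.Remove(f.Name())
	defer f.Close()

	if _, err := f.WriteString("hello world"); err != nil {
		return "", err
	}
	// Overwrite 5 bytes starting at offset 6.
	if _, err := f.WriteAt([]byte("gophr"), 6); err != nil {
		return "", err
	}

	// The buffer is larger than the file, so ReadAt returns the 11 bytes
	// it found together with io.EOF.
	buf := make([]byte, 32)
	n, err := f.ReadAt(buf, 0)
	if err != nil && err != io.EOF {
		return "", err
	}
	return string(buf[:n]), nil
}

func main() {
	s, err := patchAt()
	fmt.Printf("%q %v\n", s, err) // "hello gophr" <nil>
}
```

One caveat: WriteAt returns an error on a file opened with O_APPEND, so it does not combine with the append-mode OpenFile call shown earlier.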

Here is another snippet:

 
    var buff = make([]byte, 1024)
    for {
        n, err := fi.Read(buff)
        if err != nil && err != io.EOF {
            panic(err)
        }
        if n == 0 {
            break
        }
        if _, err := fo.Write(buff[:n]); err != nil {
            panic(err)
        }
    }
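The explicit loop is useful for seeing Read and Write at work. For the same job, the standard library's io.Copy can stand in for the loop; the following is only a sketch of that alternative, assuming Go 1.16 or later, with copyFile and demoCopy as hypothetical helper names.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// copyFile copies src to dst using io.Copy and returns the byte count.
func copyFile(src, dst string) (int64, error) {
	fi, err := os.Open(src)
	if err != nil {
		return 0, err
	}
	defer fi.Close()

	fo, err := os.Create(dst)
	if err != nil {
		return 0, err
	}

	n, err := io.Copy(fo, fi)
	if err != nil {
		fo.Close()
		return n, err
	}
	// Report a failed Close on the destination instead of ignoring it.
	return n, fo.Close()
}

// demoCopy copies a small scratch file and returns the copy's content.
func demoCopy() (string, error) {
	dir, err := os.MkdirTemp("", "copy-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	src := filepath.Join(dir, "test.txt")
	dst := filepath.Join(dir, "test.bak")
	if err := os.WriteFile(src, []byte("hello world,hello go\n"), 0o660); err != nil {
		return "", err
	}
	if _, err := copyFile(src, dst); err != nil {
		return "", err
	}
	data, err := os.ReadFile(dst)
	return string(data), err
}

func main() {
	s, err := demoCopy()
	fmt.Printf("%q %v\n", s, err)
}
```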

Finally, I wrote a complete program that implements a simple cp. Let's call it mycp:

 
    manu@manu-hacks:~/code/go/self$ cat mycp.go
    package main

    import (
        "fmt"
        "io"
        "os"
    )

    func usage() {
        fmt.Printf("%s %s %s\n", os.Args[0], "filename", "newfile")
    }

    func main() {
        if len(os.Args) != 3 {
            usage()
            return
        }

        filename_in := os.Args[1]
        fi, err := os.Open(filename_in)
        if err != nil {
            panic(err)
        }
        defer fi.Close()

        filename_out := os.Args[2]
        fo, err := os.Create(filename_out)
        if err != nil {
            panic(err)
        }
        defer fo.Close()

        var buff = make([]byte, 1024)
        for {
            n, err := fi.Read(buff)
            if err != nil && err != io.EOF {
                panic(err)
            }
            if n == 0 {
                break
            }
            if _, err := fo.Write(buff[:n]); err != nil {
                panic(err)
            }
        }
    }

Result:

 
    manu@manu-hacks:~/code/go/self$ ./mycp test.txt test.bak
    manu@manu-hacks:~/code/go/self$ diff test.txt test.bak
    manu@manu-hacks:~/code/go/self$ cat test.txt
    this is test file created by go
    if not existed ,please create this file
    if existed, Please write append
    hello world,hello go
    this is test file created by go
    if not existed ,please create this file
    if existed, Please write append
    hello world,hello go


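The previous post covered Go's basic file operations. It then occurred to me that a very common scenario is processing a file line by line: think of awk, that indispensable Linux tool, or the K-means++ program I wrote earlier to process data on NBA guards. In C, fgets solves this problem. Here is its signature: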

 
  char *fgets(char *s, int size, FILE *stream);
 

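Of course, you first have to fopen the file to obtain a FILE* stream; after that, fgets can fetch the content line by line.
Here is a C program that implements a basic cat. It supports the -n option, which prints line numbers: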

 
    manu@manu-hacks:~/code/c/self/readline$ cat mycat.c
    #include<stdio.h>
    #include<stdlib.h>
    #include<string.h>
    #include<errno.h>

    int num_flag = 0;

    /* Copy file to stdout line by line, numbering lines when num_flag is set. */
    int cat(FILE* file)
    {
        char buf[1024] = {0};
        int line_no = 1;

        while(fgets(buf,1024,file) != NULL)
        {
            if(num_flag != 0)
            {
                fprintf(stdout,"%5d %s",line_no,buf);
            }
            else
            {
                fprintf(stdout,"%s",buf);
            }
            line_no++;
        }
        return 0;
    }

    int main(int argc,char* argv[])
    {
        int i = 0 ;
        int j = 0 ;
        int file_exist = 0;
        FILE* file = NULL;

        /* Find the first -n option; i stays at its index (or argc if absent). */
        for(i = 1; i < argc;i++)
        {
            if(strcmp(argv[i],"-n") == 0)
            {
                num_flag = 1;
                break;
            }
        }

        /* Every other argument is treated as a file name. */
        for(j = 1; j<argc ;j++)
        {
            if(j==i)
                continue;
            file_exist = 1;
            file = fopen(argv[j],"rb");
            if(file == NULL)
            {
                fprintf(stderr,"%s:err reading from %s:%s\n",
                        argv[0],argv[j],strerror(errno));
                continue;
            }
            cat(file);
            fclose(file);
        }

        /* No file arguments: read from standard input. */
        if(file_exist == 0)
        {
            cat(stdin);
        }
        return 0;
    }

What about Go?
Go provides the bufio package. bufio.NewReader() creates a buffered reader with the default buffer size; bufio.NewReaderSize lets you choose the size yourself:

 
    func NewReader(rd io.Reader) *Reader
        NewReader returns a new Reader whose buffer has the default size(4096).

    func NewReaderSize(rd io.Reader, size int) *Reader
        NewReaderSize returns a new Reader whose buffer has at least the
        specified size. If the argument io.Reader is already a Reader with large
        enough size, it returns the underlying Reader.

bufio provides:

 
    func (b *Reader) ReadByte() (c byte, err error)
        ReadByte reads and returns a single byte. If no byte is available,
        returns an error.

    func (b *Reader) ReadBytes(delim byte) (line []byte, err error)
        ReadBytes reads until the first occurrence of delim in the input,
        returning a slice containing the data up to and including the delimiter.
        If ReadBytes encounters an error before finding a delimiter, it returns
        the data read before the error and the error itself (often io.EOF).
        ReadBytes returns err != nil if and only if the returned data does not
        end in delim. For simple uses, a Scanner may be more convenient.

    func (b *Reader) ReadString(delim byte) (line string, err error)
        ReadString reads until the first occurrence of delim in the input,
        returning a string containing the data up to and including the
        delimiter. If ReadString encounters an error before finding a delimiter,
        it returns the data read before the error and the error itself (often
        io.EOF). ReadString returns err != nil if and only if the returned data
        does not end in delim. For simple uses, a Scanner may be more
        convenient.

ReadByte is very close to C's fgetc: it reads one byte at a time. Both ReadBytes and ReadString can read line by line, as long as delim is set to '\n'.
Here is a simple mycat implemented in Go:

 
    manu@manu-hacks:~/code/go/self$ cat mycat.go
    package main

    import (
        "bufio"
        "flag"
        "fmt"
        "io"
        "os"
    )

    var num_flag = flag.Bool("n", false, "num each line")

    func usage() {
        fmt.Printf("%s %s\n", os.Args[0], "filename")
    }

    func cat(r *bufio.Reader) {
        i := 1
        for {
            //buf,err := r.ReadBytes('\n')
            buf, err := r.ReadString('\n')
            // Print what was read before looking at err: the last line
            // may arrive together with io.EOF if it has no trailing '\n'.
            if len(buf) > 0 {
                if *num_flag {
                    fmt.Fprintf(os.Stdout, "%5d %s", i, buf)
                    i++
                } else {
                    fmt.Fprintf(os.Stdout, "%s", buf)
                }
            }
            if err != nil {
                // io.EOF is the normal end; report anything else.
                if err != io.EOF {
                    fmt.Fprintf(os.Stderr, "%s read error: %s\n", os.Args[0], err)
                }
                break
            }
        }
    }

    func main() {
        flag.Parse()
        if flag.NArg() == 0 {
            cat(bufio.NewReader(os.Stdin))
        }
        for i := 0; i < flag.NArg(); i++ {
            f, err := os.OpenFile(flag.Arg(i), os.O_RDONLY, 0660)
            if err != nil {
                fmt.Fprintf(os.Stderr, "%s err read from %s : %s\n",
                    os.Args[0], flag.Arg(i), err)
                continue
            }
            cat(bufio.NewReader(f))
            f.Close()
        }
    }
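One detail of ReadString is worth seeing in isolation: when the input does not end in '\n', the final call returns the remaining data together with io.EOF, so a loop should handle the returned data before acting on the error. A minimal sketch, with readLines and demoReadLines as hypothetical helper names that are not part of bufio:

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// readLines collects every line from r, including a final line that
// has no trailing newline.
func readLines(r io.Reader) ([]string, error) {
	br := bufio.NewReader(r)
	var lines []string
	for {
		line, err := br.ReadString('\n')
		if len(line) > 0 {
			lines = append(lines, line) // keep data even when err != nil
		}
		if err == io.EOF {
			return lines, nil
		}
		if err != nil {
			return lines, err
		}
	}
}

// demoReadLines runs readLines on input whose last line is unterminated.
func demoReadLines() ([]string, error) {
	return readLines(strings.NewReader("first\nsecond\nlast"))
}

func main() {
	lines, err := demoReadLines()
	fmt.Printf("%q %v\n", lines, err) // ["first\n" "second\n" "last"] <nil>
}
```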

If all you need is to read line by line, the bufio documentation itself says:

 
  For simple uses, a Scanner may be more convenient.
 

First, the documentation:

 
    func NewScanner(r io.Reader) *Scanner
        NewScanner returns a new Scanner to read from r. The split function
        defaults to ScanLines.

    func (s *Scanner) Text() string
        Text returns the most recent token generated by a call to Scan as a
        newly allocated string holding its bytes.

    func (s *Scanner) Err() error
        Err returns the first non-EOF error that was encountered by the Scanner.

    func (s *Scanner) Scan() bool
        Scan advances the Scanner to the next token, which will then be
        available through the Bytes or Text method. It returns false when the
        scan stops, either by reaching the end of the input or an error. After
        Scan returns false, the Err method will return any error that occurred
        during scanning, except that if it was io.EOF, Err will return nil.

So how do we use a Scanner?

 
    func cat(scanner *bufio.Scanner) error {
        for scanner.Scan() {
            fmt.Println(scanner.Text())
            //fmt.Fprintf(os.Stdout,"%s\n",scanner.Text())
        }
        return scanner.Err()
    }

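Why does Text() return the next line after each call to Scan? Because the default split function is ScanLines. If you need to split the input differently, there is func (s *Scanner) Split(split SplitFunc).

This method lets you specify the SplitFunc, so you can plug in your own split function.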
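As an illustration of swapping the split function, the sketch below uses bufio.ScanWords, one of the split functions that ships with bufio, to tokenize on whitespace instead of line breaks. The helpers words and demoWords are hypothetical names for this example.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// words returns the whitespace-separated tokens read from r.
func words(r io.Reader) ([]string, error) {
	sc := bufio.NewScanner(r)
	sc.Split(bufio.ScanWords) // must be called before the first Scan
	var out []string
	for sc.Scan() {
		out = append(out, sc.Text())
	}
	return out, sc.Err()
}

// demoWords tokenizes one line taken from the test file shown earlier.
func demoWords() ([]string, error) {
	return words(strings.NewReader("hello world,hello go\n"))
}

func main() {
	w, err := demoWords()
	fmt.Printf("%q %v\n", w, err) // ["hello" "world,hello" "go"] <nil>
}
```

A hand-written SplitFunc follows the same pattern: pass it to Split before the first call to Scan.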

Note that Scan strips the '\n' delimiter. If you print with Fprintf and don't add the \n yourself, the output will have no line breaks, as shown here:

 
  fmt.Fprintf(os.Stdout,"%s",scanner.Text())
 

 
    manu@manu-hacks:~/code/go/self$ go run mycat_v2.go test.txt
    this is test file created by goif not existed ,please create this fileif existed, Please write appendhello world,hello gothis is test file created by goif not existed ,please create this fileif existed, Please write appendhello world,hello gomanu@manu-hacks:~/code/go/self$ cat test.txt
    this is test file created by go
    if not existed ,please create this file
    if existed, Please write append
    hello world,hello go
    this is test file created by go
    if not existed ,please create this file
    if existed, Please write append
    hello world,hello go

The calling code looks like this:

 
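    f, err := os.OpenFile(flag.Arg(i), os.O_RDONLY, 0660)
    ...
    // Check the error returned by cat itself, and avoid naming a
    // variable "error", which shadows the built-in type.
    err = cat(bufio.NewScanner(f))
    if err != nil {
        fmt.Fprintf(os.Stderr, "%s err read from %s : %s\n",
            os.Args[0], flag.Arg(i), err)
    }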
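The Scanner version above prints lines without numbers. As a sketch of how the -n behaviour could be carried over, the hypothetical helper catNumbered below takes the destination as an io.Writer (an assumption made here so the function is easy to exercise) and adds back the '\n' that Scan strips.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
	"strings"
)

// catNumbered copies r to w line by line, prefixing each line with its
// number in the same "%5d " format used by the earlier mycat.
func catNumbered(w io.Writer, r io.Reader) error {
	sc := bufio.NewScanner(r)
	for n := 1; sc.Scan(); n++ {
		// Scan removed the trailing '\n', so print it explicitly.
		if _, err := fmt.Fprintf(w, "%5d %s\n", n, sc.Text()); err != nil {
			return err
		}
	}
	return sc.Err()
}

// demoCatNumbered numbers two short lines and returns the output.
func demoCatNumbered() (string, error) {
	var sb strings.Builder
	err := catNumbered(&sb, strings.NewReader("a\nb\n"))
	return sb.String(), err
}

func main() {
	if err := catNumbered(os.Stdout, strings.NewReader("hello world\nhello go\n")); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```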

I recommend Scanner; it is the simpler of the two to use.
