Golang：bytes.Buffer

最新推荐文章于 2023-10-17 08:00:00 发布

RealKeyboardMan

最新推荐文章于 2023-10-17 08:00:00 发布

阅读量906

点赞数

分类专栏： Golang

本文链接：https://blog.csdn.net/weixin_40108561/article/details/105390977

版权

Golang 专栏收录该内容

25 篇文章 1 订阅

订阅专栏

bytes.Buffer（缓冲区）

结构

bytes.Buffer开箱即用，不仅可以拼接、截断子序列，还可以顺序读取。读写一体
Buffer是一个结构体，包含四个字段，buf是一个[]byte，用来保存内容，称之为内容容器；off表示当前读到了哪个位置，写的话从len(buf)后开始写；bootstrap是一个[64]byte，是为了快速支持长度较小的内容；lastRead是readOp类型，readOp是int8的别名类型，当上一次操作是读操作时，lastRead的值被设定为1-4，表示读的Unicode字符是多少字节，用于回退。而其余方法会将lastRead设置为-1。

// A Buffer is a variable-sized buffer of bytes with Read and Write methods.
// The zero value for Buffer is an empty buffer ready to use.
type Buffer struct {
	buf       []byte   // contents are the bytes buf[off : len(buf)]
	off       int      // read at &buf[off], write at &buf[len(buf)]
	bootstrap [64]byte // memory to hold first slice; helps small buffers avoid allocation.
	lastRead  readOp   // last read operation, so that Unread* can work correctly.

	// FIXME: it would be advisable to align Buffer to cachelines to avoid false
	// sharing.
}

// The readOp constants describe the last action performed on
// the buffer, so that UnreadRune and UnreadByte can check for
// invalid usage. opReadRuneX constants are chosen such that
// converted to int they correspond to the rune size that was read.
type readOp int8

// Don't use iota for these, as the values need to correspond with the
// names and comments, which is easier to see when being explicit.
const (
	opRead      readOp = -1 // Any other read operation.
	opInvalid   readOp = 0  // Non-read operation.
	opReadRune1 readOp = 1  // Read rune of size 1.
	opReadRune2 readOp = 2  // Read rune of size 2.
	opReadRune3 readOp = 3  // Read rune of size 3.
	opReadRune4 readOp = 4  // Read rune of size 4.
)

基础方法

Cap方法返回buf切片的容量（cap(buf)），注意不是长度！Len方法返回未读长度（len(buf)-off）。
Bytes方法返回buf未读部分，返回值是切片；String方法返回未读部分，返回值是字符串。私有方法empty返回缓冲区未读部分是否为空。

// Bytes returns a slice of length b.Len() holding the unread portion of the buffer.
// The slice is valid for use only until the next buffer modification (that is,
// only until the next call to a method like Read, Write, Reset, or Truncate).
// The slice aliases the buffer content at least until the next buffer modification,
// so immediate changes to the slice will affect the result of future reads.
func (b *Buffer) Bytes() []byte { return b.buf[b.off:] }

// String returns the contents of the unread portion of the buffer
// as a string. If the Buffer is a nil pointer, it returns "<nil>".
//
// To build strings more efficiently, see the strings.Builder type.
func (b *Buffer) String() string {
	if b == nil {
		// Special case, useful in debugging.
		return "<nil>"
	}
	return string(b.buf[b.off:])
}

// empty returns whether the unread portion of the buffer is empty.
func (b *Buffer) empty() bool { return len(b.buf) <= b.off }

// Len returns the number of bytes of the unread portion of the buffer;
// b.Len() == len(b.Bytes()).
func (b *Buffer) Len() int { return len(b.buf) - b.off }

// Cap returns the capacity of the buffer's underlying byte slice, that is, the
// total space allocated for the buffer's data.
func (b *Buffer) Cap() int { return cap(b.buf) }

截断和重置

Truncate方法会截断buf，当参数n为0，重置Buffer并返回；n小于0或者n大于buf未读部分的长度，就会panic；其余情况，会将buf设置为从0到off加上n，即已读部分加上未读部分的前n个字节，后面的就截断。
Reset方法重置Buffer，buf设置为长度0，off设置为0。但它保留了底层存储以供将来写入。

// Truncate discards all but the first n unread bytes from the buffer
// but continues to use the same allocated storage.
// It panics if n is negative or greater than the length of the buffer.
func (b *Buffer) Truncate(n int) {
	if n == 0 {
		b.Reset()
		return
	}
	b.lastRead = opInvalid
	if n < 0 || n > b.Len() {
		panic("bytes.Buffer: truncation out of range")
	}
	b.buf = b.buf[:b.off+n]
}

// Reset resets the buffer to be empty,
// but it retains the underlying storage for use by future writes.
// Reset is the same as Truncate(0).
func (b *Buffer) Reset() {
	b.buf = b.buf[:0]
	b.off = 0
	b.lastRead = opInvalid
}

扩容

Grow方法会指定一个整数n，n小于0则会panic，否则会调用私有grow方法。
私有grow方法首先获取缓冲区未读部分的长度，如果未读部分长度为0，且off不为0（表示曾经有内容，但是读完了），那么就重置缓冲区。接下来调用私有方法tryGrowByReslice。
私有方法tryGrowByReslice先获取buf的长度（已读加未读），如果n小于等于剩余容量（buf的容量-buf的长度），那么把buf设置为buf[:l+n]，即将长度增加n（增加后的长度依然小于原容量），然后返回buf的原长度和true；如果n大于剩余容量，则什么都不做，返回0和false。
grow接收到tryGrowByReslice，检查第二个bool值，如果为true，说明扩容成功，返回值是buf的原长度；如果为false，说明剩余容量不足n个，需要真正的扩容。
首先判断如果buf为空，且n小于等于64（即小于bootstrap的长度），说明是个新的Buffer，那直接将buf指定为bootstrap[:n]，并返回原长度0。
如果buf不为空，或者n长度大于64，则获取buf的容量，如果指定的新增空间n小于等于buf容量的一半减去未读部分的长度，即原容量一半大于未读部分加上n，那么就会复用当前容器，将未读部分+剩余容量拷贝到头部位置，即覆盖掉已读内容；这里有两点需要注意，第一，其实只需要原容量大于未读加上n就可以了，但是为了省掉后一次扩容的开销，所以取的是原容量的一半；第二，n+未读小于原容量一半，n大于剩余容量，那么剩余+未读也小于原容量一半，而原容量=已读+未读+剩余容量，说明已读部分大于原容量一半，大于剩余+未读，那么直接将将未读部分+剩余容量拷贝到头部位置，这一步其实已读部分并未被完全覆盖。
如果n加未读部分大于原容量的一半，如果原容量2倍加上n超出整数范围，那么就panic，否则，开辟新空间，容量为原容量2倍加上n（这里也有可能panic），并将未读部分拷贝过去。
返回前将off设置为0，buf设置为未读部分加上n，最终返回未读部分长度。
Grow方法得到未读部分长度，因为之前的已经读取部分没有被完全覆盖，将buf设置为未读部分，这样就抹除了影响。

// ErrTooLarge is passed to panic if memory cannot be allocated to store data in a buffer.
var ErrTooLarge = errors.New("bytes.Buffer: too large")
var errNegativeRead = errors.New("bytes.Buffer: reader returned negative count from Read")

const maxInt = int(^uint(0) >> 1)

// tryGrowByReslice is a inlineable version of grow for the fast-case where the
// internal buffer only needs to be resliced.
// It returns the index where bytes should be written and whether it succeeded.
func (b *Buffer) tryGrowByReslice(n int) (int, bool) {
	if l := len(b.buf); n <= cap(b.buf)-l {
		b.buf = b.buf[:l+n]
		return l, true
	}
	return 0, false
}

// grow grows the buffer to guarantee space for n more bytes.
// It returns the index where bytes should be written.
// If the buffer can't grow it will panic with ErrTooLarge.
func (b *Buffer) grow(n int) int {
	m := b.Len()
	// If buffer is empty, reset to recover space.
	if m == 0 && b.off != 0 {
		b.Reset()
	}
	// Try to grow by means of a reslice.
	if i, ok := b.tryGrowByReslice(n); ok {
		return i
	}
	// Check if we can make use of bootstrap array.
	if b.buf == nil && n <= len(b.bootstrap) {
		b.buf = b.bootstrap[:n]
		return 0
	}
	c := cap(b.buf)
	if n <= c/2-m {
		// We can slide things down instead of allocating a new
		// slice. We only need m+n <= c to slide, but
		// we instead let capacity get twice as large so we
		// don't spend all our time copying.
		copy(b.buf, b.buf[b.off:])
	} else if c > maxInt-c-n {
		panic(ErrTooLarge)
	} else {
		// Not enough space anywhere, we need to allocate.
		buf := makeSlice(2*c + n)
		copy(buf, b.buf[b.off:])
		b.buf = buf
	}
	// Restore b.off and len(b.buf).
	b.off = 0
	b.buf = b.buf[:m+n]
	return m
}

// Grow grows the buffer's capacity, if necessary, to guarantee space for
// another n bytes. After Grow(n), at least n bytes can be written to the
// buffer without another allocation.
// If n is negative, Grow will panic.
// If the buffer can't grow it will panic with ErrTooLarge.
func (b *Buffer) Grow(n int) {
	if n < 0 {
		panic("bytes.Buffer.Grow: negative count")
	}
	m := b.grow(n)
	b.buf = b.buf[:m]
}

// makeSlice allocates a slice of size n. If the allocation fails, it panics
// with ErrTooLarge.
func makeSlice(n int) []byte {
	// If the make fails, give a known error.
	defer func() {
		if recover() != nil {
			panic(ErrTooLarge)
		}
	}()
	return make([]byte, n)
}

写方法

Write方法尝试写入，如果给定的参数长度大于剩余容量，那么就尝试扩容，否则就直接将数据，拷贝到现有数据后面（相当于append），返回值是参数的长度。WriteString方法同理，只不过参数类型是字符串。
WriteByte方法只写入一个字节，WriteRune方法先获取想要写入的Unicode字符的字节长度，然后写入。
WriteTo方法尝试往Writer中写入，先获取缓冲区的未读长度，如果还有未读，那么就将未读写入；如果Write方法返回值大于未读，则panic。如果正常的话，就将off增加，返回写入的长度；如果未读部分为空，则直接Reset。

// Write appends the contents of p to the buffer, growing the buffer as
// needed. The return value n is the length of p; err is always nil. If the
// buffer becomes too large, Write will panic with ErrTooLarge.
func (b *Buffer) Write(p []byte) (n int, err error) {
	b.lastRead = opInvalid
	m, ok := b.tryGrowByReslice(len(p))
	if !ok {
		m = b.grow(len(p))
	}
	return copy(b.buf[m:], p), nil
}

// WriteString appends the contents of s to the buffer, growing the buffer as
// needed. The return value n is the length of s; err is always nil. If the
// buffer becomes too large, WriteString will panic with ErrTooLarge.
func (b *Buffer) WriteString(s string) (n int, err error) {
	b.lastRead = opInvalid
	m, ok := b.tryGrowByReslice(len(s))
	if !ok {
		m = b.grow(len(s))
	}
	return copy(b.buf[m:], s), nil
}

// WriteByte appends the byte c to the buffer, growing the buffer as needed.
// The returned error is always nil, but is included to match bufio.Writer's
// WriteByte. If the buffer becomes too large, WriteByte will panic with
// ErrTooLarge.
func (b *Buffer) WriteByte(c byte) error {
	b.lastRead = opInvalid
	m, ok := b.tryGrowByReslice(1)
	if !ok {
		m = b.grow(1)
	}
	b.buf[m] = c
	return nil
}

// WriteRune appends the UTF-8 encoding of Unicode code point r to the
// buffer, returning its length and an error, which is always nil but is
// included to match bufio.Writer's WriteRune. The buffer is grown as needed;
// if it becomes too large, WriteRune will panic with ErrTooLarge.
func (b *Buffer) WriteRune(r rune) (n int, err error) {
	if r < utf8.RuneSelf {
		b.WriteByte(byte(r))
		return 1, nil
	}
	b.lastRead = opInvalid
	m, ok := b.tryGrowByReslice(utf8.UTFMax)
	if !ok {
		m = b.grow(utf8.UTFMax)
	}
	n = utf8.EncodeRune(b.buf[m:m+utf8.UTFMax], r)
	b.buf = b.buf[:m+n]
	return n, nil
}

// WriteTo writes data to w until the buffer is drained or an error occurs.
// The return value n is the number of bytes written; it always fits into an
// int, but it is int64 to match the io.WriterTo interface. Any error
// encountered during the write is also returned.
func (b *Buffer) WriteTo(w io.Writer) (n int64, err error) {
	b.lastRead = opInvalid
	if nBytes := b.Len(); nBytes > 0 {
		m, e := w.Write(b.buf[b.off:])
		if m > nBytes {
			panic("bytes.Buffer.WriteTo: invalid Write count")
		}
		b.off += m
		n = int64(m)
		if e != nil {
			return n, e
		}
		// all bytes should have been written, by definition of
		// Write method in io.Writer
		if m != nBytes {
			return n, io.ErrShortWrite
		}
	}
	// Buffer is now empty; reset.
	b.Reset()
	return n, nil
}

读方法

Read方法尝试将未读部分拷贝到目标切片中，如果目标切片长度不足，那就只拷贝一部分。返回值是拷贝的数据长度。
ReadByte只读一个字节，ReadRune只读一个Unicode字符。
Next方法返回未读部分的前n个字节，可能存在内容泄露，因为直接返回的切片和buf共用一个底层数组。
MinRead是个常量512，ReadFrom尝试从r中读取数据，它先保证有足够的剩余容量，然后尝试将数据拷贝到剩余容量中；如果Read返回值小于0，就panic，否则，表示读了m个，将buf切片的右边界右移；如果e是EOF，则表示读完了，返回读取的字节数；如果e是其他错误，那么就返回错误；如果没读完，也没发生错误，那么就自旋，循环读取。

// Read reads the next len(p) bytes from the buffer or until the buffer
// is drained. The return value n is the number of bytes read. If the
// buffer has no data to return, err is io.EOF (unless len(p) is zero);
// otherwise it is nil.
func (b *Buffer) Read(p []byte) (n int, err error) {
	b.lastRead = opInvalid
	if b.empty() {
		// Buffer is empty, reset to recover space.
		b.Reset()
		if len(p) == 0 {
			return 0, nil
		}
		return 0, io.EOF
	}
	n = copy(p, b.buf[b.off:])
	b.off += n
	if n > 0 {
		b.lastRead = opRead
	}
	return n, nil
}

// Next returns a slice containing the next n bytes from the buffer,
// advancing the buffer as if the bytes had been returned by Read.
// If there are fewer than n bytes in the buffer, Next returns the entire buffer.
// The slice is only valid until the next call to a read or write method.
func (b *Buffer) Next(n int) []byte {
	b.lastRead = opInvalid
	m := b.Len()
	if n > m {
		n = m
	}
	data := b.buf[b.off : b.off+n]
	b.off += n
	if n > 0 {
		b.lastRead = opRead
	}
	return data
}

// ReadByte reads and returns the next byte from the buffer.
// If no byte is available, it returns error io.EOF.
func (b *Buffer) ReadByte() (byte, error) {
	if b.empty() {
		// Buffer is empty, reset to recover space.
		b.Reset()
		return 0, io.EOF
	}
	c := b.buf[b.off]
	b.off++
	b.lastRead = opRead
	return c, nil
}

// ReadRune reads and returns the next UTF-8-encoded
// Unicode code point from the buffer.
// If no bytes are available, the error returned is io.EOF.
// If the bytes are an erroneous UTF-8 encoding, it
// consumes one byte and returns U+FFFD, 1.
func (b *Buffer) ReadRune() (r rune, size int, err error) {
	if b.empty() {
		// Buffer is empty, reset to recover space.
		b.Reset()
		return 0, 0, io.EOF
	}
	c := b.buf[b.off]
	if c < utf8.RuneSelf {
		b.off++
		b.lastRead = opReadRune1
		return rune(c), 1, nil
	}
	r, n := utf8.DecodeRune(b.buf[b.off:])
	b.off += n
	b.lastRead = readOp(n)
	return r, n, nil
}

// MinRead is the minimum slice size passed to a Read call by
// Buffer.ReadFrom. As long as the Buffer has at least MinRead bytes beyond
// what is required to hold the contents of r, ReadFrom will not grow the
// underlying buffer.
const MinRead = 512

// ReadFrom reads data from r until EOF and appends it to the buffer, growing
// the buffer as needed. The return value n is the number of bytes read. Any
// error except io.EOF encountered during the read is also returned. If the
// buffer becomes too large, ReadFrom will panic with ErrTooLarge.
func (b *Buffer) ReadFrom(r io.Reader) (n int64, err error) {
	b.lastRead = opInvalid
	for {
		i := b.grow(MinRead)
		m, e := r.Read(b.buf[i:cap(b.buf)])
		if m < 0 {
			panic(errNegativeRead)
		}

		b.buf = b.buf[:i+m]
		n += int64(m)
		if e == io.EOF {
			return n, nil // e is EOF, so return nil explicitly
		}
		if e != nil {
			return n, e
		}
	}
}

回退读

UnreadRune和UnreadByte方法根据lastRead，将off回退。这两个方法的前一个方法一定要是读方法。

// UnreadRune unreads the last rune returned by ReadRune.
// If the most recent read or write operation on the buffer was
// not a successful ReadRune, UnreadRune returns an error.  (In this regard
// it is stricter than UnreadByte, which will unread the last byte
// from any read operation.)
func (b *Buffer) UnreadRune() error {
	if b.lastRead <= opInvalid {
		return errors.New("bytes.Buffer: UnreadRune: previous operation was not a successful ReadRune")
	}
	if b.off >= int(b.lastRead) {
		b.off -= int(b.lastRead)
	}
	b.lastRead = opInvalid
	return nil
}

// UnreadByte unreads the last byte returned by the most recent successful
// read operation that read at least one byte. If a write has happened since
// the last read, if the last read returned an error, or if the read read zero
// bytes, UnreadByte returns an error.
func (b *Buffer) UnreadByte() error {
	if b.lastRead == opInvalid {
		return errors.New("bytes.Buffer: UnreadByte: previous operation was not a successful read")
	}
	b.lastRead = opInvalid
	if b.off > 0 {
		b.off--
	}
	return nil
}

其他读方法

ReadBytes方法接收一个字节参数delim，然后将其传递给私有方法readSlice。
readSlice定位到buf未读部分中第一个delim，记录end变量为该字符后一个字符。如果在未读部分没有找到，那么end值为buf的长度。将off设置为end，然后返回未读部分的前end字符，以delim结尾（没找到的话返回的是所有未读）。
ReadBytes拿到返回值，重新复制了一下，没有暴露内部数据，将结果返回。
ReadString方法同理，只不过返回值是字符串

// ReadBytes reads until the first occurrence of delim in the input,
// returning a slice containing the data up to and including the delimiter.
// If ReadBytes encounters an error before finding a delimiter,
// it returns the data read before the error and the error itself (often io.EOF).
// ReadBytes returns err != nil if and only if the returned data does not end in
// delim.
func (b *Buffer) ReadBytes(delim byte) (line []byte, err error) {
	slice, err := b.readSlice(delim)
	// return a copy of slice. The buffer's backing array may
	// be overwritten by later calls.
	line = append(line, slice...)
	return line, err
}

// readSlice is like ReadBytes but returns a reference to internal buffer data.
func (b *Buffer) readSlice(delim byte) (line []byte, err error) {
	i := IndexByte(b.buf[b.off:], delim)
	end := b.off + i + 1
	if i < 0 {
		end = len(b.buf)
		err = io.EOF
	}
	line = b.buf[b.off:end]
	b.off = end
	b.lastRead = opRead
	return line, err
}

// ReadString reads until the first occurrence of delim in the input,
// returning a string containing the data up to and including the delimiter.
// If ReadString encounters an error before finding a delimiter,
// it returns the data read before the error and the error itself (often io.EOF).
// ReadString returns err != nil if and only if the returned data does not end
// in delim.
func (b *Buffer) ReadString(delim byte) (line string, err error) {
	slice, err := b.readSlice(delim)
	return string(slice), err
}

额外的初始化

NewBuffer和NewBufferString方法使用[]byte或string初始化Buffer。

// NewBuffer creates and initializes a new Buffer using buf as its
// initial contents. The new Buffer takes ownership of buf, and the
// caller should not use buf after this call. NewBuffer is intended to
// prepare a Buffer to read existing data. It can also be used to size
// the internal buffer for writing. To do that, buf should have the
// desired capacity but a length of zero.
//
// In most cases, new(Buffer) (or just declaring a Buffer variable) is
// sufficient to initialize a Buffer.
func NewBuffer(buf []byte) *Buffer { return &Buffer{buf: buf} }

// NewBufferString creates and initializes a new Buffer using string s as its
// initial contents. It is intended to prepare a buffer to read an existing
// string.
//
// In most cases, new(Buffer) (or just declaring a Buffer variable) is
// sufficient to initialize a Buffer.
func NewBufferString(s string) *Buffer {
	return &Buffer{buf: []byte(s)}
}

RealKeyboardMan

关注

0
点赞
踩
1

收藏

觉得还不错? 一键收藏
0
评论
Golang：bytes.Buffer

bytes.Buffer（缓冲区）结构bytes.Buffer开箱即用，不仅可以拼接、截断子序列，还可以顺序读取。读写一体Buffer是一个结构体，包含四个字段，buf是一个[]byte，用来保存内容，称之为内容容器；off表示当前读到了哪个位置，写的话从len(buf)后开始写；bootstrap是一个[64]byte，是为了快速支持长度较小的内容；lastRead是readOp类型，r...
复制链接

扫一扫

专栏目录