String Series Source Code Analysis 02 - AbstractStringBuilder in Detail

Latest recommended article published on 2021-02-13 10:00:20

bittenji

Category: Java  Tags: AbstractStringBuilde

Article link: https://blog.csdn.net/shentianzhi2009/article/details/36892731

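Class fields:

AbstractStringBuilder has two fields, value and count. value is a char array that stores the character sequence; count is the number of characters of that array actually in use, which can be smaller than the array's length.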

    /**
     * value is a char array used for character storage.
     */
    char value[];

    /** 
     * count is the number of characters in use.
     */
    int count;
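The difference between the two fields can be observed from outside through the public subclass StringBuilder. The sketch below assumes, as in the version of the source discussed here, that length() reports count and capacity() reports value.length; the class and method names are only illustrative.

```java
class LengthVsCapacityDemo {
    // Returns {length, capacity} of a builder created with the given
    // capacity after the given text has been appended to it.
    static int[] lengthAndCapacity(int initialCapacity, String text) {
        StringBuilder sb = new StringBuilder(initialCapacity);
        sb.append(text);
        return new int[] { sb.length(), sb.capacity() };
    }

    public static void main(String[] args) {
        int[] r = lengthAndCapacity(32, "abc");
        // Three characters are in use, but the array still has room for 32.
        System.out.println("length=" + r[0] + ", capacity=" + r[1]);
    }
}
```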

Constructors:

    /** 
     * This no-arg constructor is necessary for serialization of subclasses.
     */
    AbstractStringBuilder() {
    }

    /** 
     * Creates an AbstractStringBuilder with the specified capacity.
     */
    AbstractStringBuilder(int capacity) {
        value = new char[capacity];
    }

Core methods:

1. ensureCapacity

public void ensureCapacity(int minimumCapacity);

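Ensures that the capacity of the character array is at least the specified minimum. If the current capacity is smaller than the minimumCapacity argument, a new internal array with greater capacity is allocated. The new capacity is the larger of the minimumCapacity argument and:

2 * old capacity + 2

Growing the backing array geometrically like this is a pattern shared by many JDK classes (for example ...), although the exact formula differs from class to class.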

    /**
     * Ensures that the capacity is at least equal to the specified minimum.
     * If the current capacity is less than the argument, then a new internal
     * array is allocated with greater capacity. The new capacity is the
     * larger of:
     * <ul>
     * <li>The <code>minimumCapacity</code> argument.
     * <li>Twice the old capacity, plus <code>2</code>.
     * </ul>
     * If the <code>minimumCapacity</code> argument is nonpositive, this
     * method takes no action and simply returns.
     *
     * @param   minimumCapacity   the minimum desired capacity
     */
    public void ensureCapacity(int minimumCapacity) {
        // only act when the minimumCapacity argument is greater than zero
        if (minimumCapacity > 0)
            ensureCapacityInternal(minimumCapacity);
    }
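The growth rule can be checked through StringBuilder, whose ensureCapacity documents the same contract. This is a small sketch with an illustrative helper name, not part of the JDK source.

```java
class EnsureCapacityDemo {
    // Capacity of a builder created with `initial` after ensureCapacity(minimum).
    static int capacityAfter(int initial, int minimum) {
        StringBuilder sb = new StringBuilder(initial);
        sb.ensureCapacity(minimum);
        return sb.capacity();
    }

    public static void main(String[] args) {
        System.out.println(capacityAfter(16, 17));  // 2 * 16 + 2 wins over 17
        System.out.println(capacityAfter(16, 100)); // 100 wins over 2 * 16 + 2
        System.out.println(capacityAfter(16, 10));  // already large enough, unchanged
        System.out.println(capacityAfter(16, -5));  // nonpositive argument, no action
    }
}
```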

2. ensureCapacityInternal

This method has the same contract as ensureCapacity, but it is never synchronized.

private void ensureCapacityInternal(int minimumCapacity);

    /**
     * This method has the same contract as ensureCapacity, but is
     * never synchronized.
     */
    private void ensureCapacityInternal(int minimumCapacity) {
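        // overflow-conscious code: the subtraction still triggers expansion
        // when minimumCapacity has wrapped around to a negative int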
        if (minimumCapacity - value.length > 0)
            expandCapacity(minimumCapacity);
    }
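Why the check is written as a subtraction rather than a plain comparison becomes clearer with numbers. The sketch below contrasts the two forms for a requested capacity that has overflowed int; both helper methods are hypothetical and exist only for this illustration.

```java
class OverflowConsciousDemo {
    // Naive check: gives the wrong answer once minimumCapacity has wrapped to a negative int.
    static boolean naiveNeedsGrowth(int minimumCapacity, int currentLength) {
        return minimumCapacity > currentLength;
    }

    // Subtraction-based check, the same form used in ensureCapacityInternal.
    static boolean overflowConsciousNeedsGrowth(int minimumCapacity, int currentLength) {
        return minimumCapacity - currentLength > 0;
    }

    public static void main(String[] args) {
        int count = Integer.MAX_VALUE - 2;
        int wrapped = count + 10; // overflows and becomes negative
        System.out.println(wrapped < 0);
        System.out.println(naiveNeedsGrowth(wrapped, 16));             // overflow goes unnoticed
        System.out.println(overflowConsciousNeedsGrowth(wrapped, 16)); // expansion path is taken
    }
}
```

With the subtraction form the overflowed request still reaches expandCapacity, which can then detect the negative value and fail loudly instead of silently writing past the array.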

3. expandCapacity

This method implements the expansion semantics of ensureCapacity, but without any size check or synchronization.

void expandCapacity(int minimumCapacity);

    /**
     * This implements the expansion semantics of ensureCapacity with no
     * size check or synchronization.
     */
    void expandCapacity(int minimumCapacity) {
        int newCapacity = value.length * 2 + 2; // default capacity of the new char array
        if (newCapacity - minimumCapacity < 0)
            /*
             * If the default new capacity is smaller than the requested
             * minimum capacity, use the requested minimum capacity instead.
             */
            newCapacity = minimumCapacity;
        if (newCapacity < 0) {
            if (minimumCapacity < 0) // the requested capacity itself has overflowed
                throw new OutOfMemoryError();
            newCapacity = Integer.MAX_VALUE;
        }
        value = Arrays.copyOf(value, newCapacity); // create the new char array via the Arrays utility class
    }
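To follow the branches more easily, the capacity calculation can be isolated into a pure function. The sketch below mirrors the arithmetic of expandCapacity shown above but leaves out the array copy; the method name newCapacity is an assumption made for this example.

```java
class ExpandCapacitySketch {
    // Mirrors the arithmetic of expandCapacity without touching any array.
    static int newCapacity(int oldCapacity, int minimumCapacity) {
        int newCapacity = oldCapacity * 2 + 2;    // default growth, may overflow
        if (newCapacity - minimumCapacity < 0) {  // default is too small
            newCapacity = minimumCapacity;
        }
        if (newCapacity < 0) {                    // something overflowed
            if (minimumCapacity < 0) {            // the request itself overflowed
                throw new OutOfMemoryError();
            }
            newCapacity = Integer.MAX_VALUE;      // only the doubling overflowed
        }
        return newCapacity;
    }

    public static void main(String[] args) {
        System.out.println(newCapacity(16, 17));
        System.out.println(newCapacity(16, 100));
        System.out.println(newCapacity(1_500_000_000, 1_500_000_001));
    }
}
```

The third call shows the case where doubling overflows while the request is still valid, so the result is clamped to Integer.MAX_VALUE.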

4. charAt

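Returns the char value (code unit) at the specified index of this character sequence. It simply returns the corresponding element of the char array.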

public char charAt(int index) {}

    /**
     * Returns the <code>char</code> value in this sequence at the specified index.
     * The first <code>char</code> value is at index <code>0</code>, the next at index
     * <code>1</code>, and so on, as in array indexing.
     * <p>
     * The index argument must be greater than or equal to
     * <code>0</code>, and less than the length of this sequence.
     *
     * <p>If the <code>char</code> value specified by the index is a
     * <a href="Character.html#unicode">surrogate</a>, the surrogate
     * value is returned.
     *
     * @param      index   the index of the desired <code>char</code> value.
     * @return     the <code>char</code> value at the specified index.
     * @throws     IndexOutOfBoundsException  if <code>index</code> is
     *             negative or greater than or equal to <code>length()</code>.
     */
    public char charAt(int index) {
        // reject an invalid index
        if ((index < 0) || (index >= count))
            throw new StringIndexOutOfBoundsException(index);
        return value[index];
    }
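Note that the bounds check uses count, not value.length, so an index inside the allocated array but beyond the used part is still rejected. A small sketch through StringBuilder, with an illustrative helper name:

```java
class CharAtDemo {
    // True if charAt(index) is rejected with an IndexOutOfBoundsException.
    static boolean charAtThrows(StringBuilder sb, int index) {
        try {
            sb.charAt(index);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder(16).append("abc");
        System.out.println(sb.charAt(2));           // last character in use
        System.out.println(charAtThrows(sb, 3));    // within capacity, beyond length
        System.out.println(charAtThrows(sb, -1));   // negative index
    }
}
```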

5. codePointAt

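Returns the Unicode code point at the specified index of this character sequence. Internally it delegates to a static method of Character, the wrapper class for char.

Note: a code point is a position in the Unicode character set, and each code point value stands for one actual Unicode character. Why are these code point methods needed? Because one Java char is not always one Unicode character. char is 16 bits wide and was originally designed around UCS-2, an obsolete fixed-width encoding that UTF-16 has since superseded. A char can hold at most 65,536 distinct values, far fewer than the more than 110,000 characters Unicode defines today. Java therefore represents each character beyond that range with two chars (a surrogate pair). As a result, the number of chars in a String is not necessarily equal to the number of Unicode characters.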

public int codePointAt(int index) {}

    /**
     * Returns the character (Unicode code point) at the specified
     * index. The index refers to <code>char</code> values
     * (Unicode code units) and ranges from <code>0</code> to
     * {@link #length()}<code> - 1</code>.
     *
     * <p> If the <code>char</code> value specified at the given index
     * is in the high-surrogate range, the following index is less
     * than the length of this sequence, and the
     * <code>char</code> value at the following index is in the
     * low-surrogate range, then the supplementary code point
     * corresponding to this surrogate pair is returned. Otherwise,
     * the <code>char</code> value at the given index is returned.
     *
     * @param      index the index to the <code>char</code> values
     * @return     the code point value of the character at the
     *             <code>index</code>
     * @exception  IndexOutOfBoundsException  if the <code>index</code>
     *             argument is negative or not less than the length of this
     *             sequence.
     */
    public int codePointAt(int index) {
        // reject an invalid index
        if ((index < 0) || (index >= count)) {
            throw new StringIndexOutOfBoundsException(index);
        }
        return Character.codePointAt(value, index);
    }
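The difference between charAt and codePointAt only shows up with supplementary characters. The sketch below uses U+1F600 as an arbitrary example of such a character; the helper name sample is illustrative.

```java
class CodePointAtDemo {
    // Builds 'a', U+1F600 (a supplementary character), 'b': four chars, three code points.
    static StringBuilder sample() {
        return new StringBuilder().append('a').appendCodePoint(0x1F600).append('b');
    }

    public static void main(String[] args) {
        StringBuilder sb = sample();
        System.out.println(sb.length());                            // counts chars
        System.out.println(Integer.toHexString(sb.charAt(1)));      // high surrogate only
        System.out.println(Integer.toHexString(sb.codePointAt(1))); // the whole code point
        System.out.println(Integer.toHexString(sb.codePointAt(2))); // low surrogate on its own
    }
}
```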

6. codePointBefore

Returns the Unicode code point immediately before the specified index. The implementation is similar to codePointAt.

public int codePointBefore(int index) {}

    /**
     * Returns the character (Unicode code point) before the specified
     * index. The index refers to <code>char</code> values
     * (Unicode code units) and ranges from <code>1</code> to {@link
     * #length()}.
     *
     * <p> If the <code>char</code> value at <code>(index - 1)</code>
     * is in the low-surrogate range, <code>(index - 2)</code> is not
     * negative, and the <code>char</code> value at <code>(index -
     * 2)</code> is in the high-surrogate range, then the
     * supplementary code point value of the surrogate pair is
     * returned. If the <code>char</code> value at <code>index -
     * 1</code> is an unpaired low-surrogate or a high-surrogate, the
     * surrogate value is returned.
     *
     * @param     index the index following the code point that should be returned
     * @return    the Unicode code point value before the given index.
     * @exception IndexOutOfBoundsException if the <code>index</code>
     *            argument is less than 1 or greater than the length
     *            of this sequence.
     */
    public int codePointBefore(int index) {
        int i = index - 1;
        if ((i < 0) || (i >= count)) {
            throw new StringIndexOutOfBoundsException(index);
        }
        return Character.codePointBefore(value, index);
    }
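As a rough illustration of the backward lookup, the sketch below reuses the same three-code-point sample (U+1F600 is an arbitrary supplementary character chosen for the example).

```java
class CodePointBeforeDemo {
    // Builds 'a', U+1F600, 'b': chars at 0..3 are 'a', D83D, DE00, 'b'.
    static StringBuilder sample() {
        return new StringBuilder().append('a').appendCodePoint(0x1F600).append('b');
    }

    public static void main(String[] args) {
        StringBuilder sb = sample();
        System.out.println(Integer.toHexString(sb.codePointBefore(1))); // 'a'
        System.out.println(Integer.toHexString(sb.codePointBefore(3))); // pair at 1..2 is combined
        System.out.println(Integer.toHexString(sb.codePointBefore(2))); // high surrogate alone
    }
}
```

Calling codePointBefore(0) is rejected, because there is no char before index 0.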

7. codePointCount

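Counts the exact number of Unicode code points within the specified index range. Note that this is not the number of chars, which is how it differs from length(): length() counts code units.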

public int codePointCount(int beginIndex, int endIndex) {}

    /**
     * Returns the number of Unicode code points in the specified text
     * range of this sequence. The text range begins at the specified
     * <code>beginIndex</code> and extends to the <code>char</code> at
     * index <code>endIndex - 1</code>. Thus the length (in
     * <code>char</code>s) of the text range is
     * <code>endIndex-beginIndex</code>. Unpaired surrogates within
     * this sequence count as one code point each.
     *
     * @param beginIndex the index to the first <code>char</code> of
     * the text range.
     * @param endIndex the index after the last <code>char</code> of
     * the text range.
     * @return the number of Unicode code points in the specified text
     * range
     * @exception IndexOutOfBoundsException if the
     * <code>beginIndex</code> is negative, or <code>endIndex</code>
     * is larger than the length of this sequence, or
     * <code>beginIndex</code> is larger than <code>endIndex</code>.
     */
    public int codePointCount(int beginIndex, int endIndex) {
        if (beginIndex < 0 || endIndex > count || beginIndex > endIndex) {
            throw new IndexOutOfBoundsException();
        }
        return Character.codePointCountImpl(value, beginIndex, endIndex-beginIndex);
    }
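A short sketch of the difference between length() and codePointCount, again using U+1F600 as an arbitrary supplementary character:

```java
class CodePointCountDemo {
    // Builds 'a', U+1F600, 'b': four chars, three code points.
    static StringBuilder sample() {
        return new StringBuilder().append('a').appendCodePoint(0x1F600).append('b');
    }

    public static void main(String[] args) {
        StringBuilder sb = sample();
        System.out.println(sb.length());                        // code units
        System.out.println(sb.codePointCount(0, sb.length()));  // code points
        System.out.println(sb.codePointCount(1, 3));            // the surrogate pair counts once
        System.out.println(sb.codePointCount(1, 2));            // an unpaired surrogate counts once
    }
}
```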

8. offsetByCodePoints

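Returns the index within this sequence that lies codePointOffset code points away from the given index. As I understand it, the method converts an offset measured in code points into an index measured in code units.

To get the code point at position i (counted in code points), use the following: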

String greeting = "Hello";
int i = 1; // example value: zero-based position counted in code points
int index = greeting.offsetByCodePoints(0, i);
int cp = greeting.codePointAt(index);

public int offsetByCodePoints(int index, int codePointOffset) {}

    /**
     * Returns the index within this sequence that is offset from the
     * given <code>index</code> by <code>codePointOffset</code> code
     * points. Unpaired surrogates within the text range given by
     * <code>index</code> and <code>codePointOffset</code> count as
     * one code point each.
     *
     * @param index the index to be offset
     * @param codePointOffset the offset in code points
     * @return the index within this sequence
     * @exception IndexOutOfBoundsException if <code>index</code>
     *   is negative or larger then the length of this sequence,
     *   or if <code>codePointOffset</code> is positive and the subsequence
     *   starting with <code>index</code> has fewer than
     *   <code>codePointOffset</code> code points,
     *   or if <code>codePointOffset</code> is negative and the subsequence
     *   before <code>index</code> has fewer than the absolute value of
     *   <code>codePointOffset</code> code points.
     */
    public int offsetByCodePoints(int index, int codePointOffset) {
        if (index < 0 || index > count) {
            throw new IndexOutOfBoundsException();
        }
        return Character.offsetByCodePointsImpl(value, 0, count,
                                                index, codePointOffset);
    }
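With plain ASCII text the code point offset and the char index coincide, so the effect is easier to see with a supplementary character in the sequence. The helper codePointAtOrdinal below is a hypothetical name for the lookup pattern described above.

```java
class OffsetByCodePointsDemo {
    // Builds 'a', U+1F600, 'b': chars at 0..3, code points at ordinals 0..2.
    static StringBuilder sample() {
        return new StringBuilder().append('a').appendCodePoint(0x1F600).append('b');
    }

    // Returns the i-th code point (zero-based, counted in code points).
    static int codePointAtOrdinal(StringBuilder sb, int i) {
        int index = sb.offsetByCodePoints(0, i); // translate ordinal to char index
        return sb.codePointAt(index);
    }

    public static void main(String[] args) {
        StringBuilder sb = sample();
        System.out.println(sb.offsetByCodePoints(0, 2));  // skips 'a' and the pair
        System.out.println(sb.offsetByCodePoints(4, -1)); // negative offsets walk backward
        System.out.println((char) codePointAtOrdinal(sb, 2));
    }
}
```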

9. append

Appends an object (more precisely, the string representation of the object).

public AbstractStringBuilder append(Object obj) {}

    /**
     * Appends the string representation of the <code>Object</code> 
     * argument.
     * <p>
     * The argument is converted to a string as if by the method 
     * <code>String.valueOf</code>, and the characters of that 
     * string are then appended to this sequence.
     *
     * @param   obj   an <code>Object</code>.
     * @return  a reference to this object.
     */
    public AbstractStringBuilder append(Object obj) {
        return append(String.valueOf(obj)); // String.valueOf yields the object's string representation
    }
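Because the conversion goes through String.valueOf, a null reference does not cause an exception here. A minimal sketch through StringBuilder, with an illustrative helper name:

```java
import java.util.Arrays;

class AppendObjectDemo {
    // Appends obj via the append(Object) overload and returns the result.
    static String appended(Object obj) {
        return new StringBuilder().append(obj).toString();
    }

    public static void main(String[] args) {
        System.out.println(appended(null));                 // String.valueOf(null object) is "null"
        System.out.println(appended(Arrays.asList(1, 2)));  // uses the list's toString()
        System.out.println(appended(42));                   // boxed Integer, then toString()
    }
}
```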

10. append

Appends a String.

public AbstractStringBuilder append(String str) {}

    /**
     * Appends the specified string to this character sequence.
     * <p>
     * The characters of the <code>String</code> argument are appended, in 
     * order, increasing the length of this sequence by the length of the 
     * argument. If <code>str</code> is <code>null</code>, then the four 
     * characters <code>"null"</code> are appended.
     * <p>
     * Let <i>n</i> be the length of this character sequence just prior to 
     * execution of the <code>append</code> method. Then the character at 
     * index <i>k</i> in the new character sequence is equal to the character 
     * at index <i>k</i> in the old character sequence, if <i>k</i> is less 
     * than <i>n</i>; otherwise, it is equal to the character at index 
     * <i>k-n</i> in the argument <code>str</code>.
     *
     * @param   str   a string.
     * @return  a reference to this object.
     */
    public AbstractStringBuilder append(String str) {
        if (str == null) str = "null"; // a null argument is appended as the four characters "null"
        int len = str.length();
        if (len == 0) return this; // nothing to do when the String to append is empty
        int newCount = count + len;
        if (newCount > value.length)
            expandCapacity(newCount); // grow the array if the resulting length exceeds its capacity
        str.getChars(0, len, value, count); // copy the String into the char array via String.getChars
        count = newCount; // update count (the number of characters in use)
        return this;
    }
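The three branches above (null, empty, and regular strings) can be exercised through StringBuilder. The helper build is a hypothetical name used only for this sketch.

```java
class AppendStringDemo {
    // Appends every part in order and returns the resulting text.
    static String build(String... parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append(part);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // null becomes "null", the empty string contributes nothing
        System.out.println(build("ab", null, "", "cd"));

        StringBuilder sb = new StringBuilder();
        // append returns the builder itself, which is what makes chaining possible
        System.out.println(sb.append("x") == sb);
    }
}
```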