Reading the String Source Code (Part 1)

Livingdd

Article link: https://blog.csdn.net/Livingdd/article/details/98665401

This part covers only String's constructors.

public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence {

String implements three interfaces: Serializable, Comparable<String>, and CharSequence.

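String literals are interned in the JVM's string pool. Up to JDK 6 the pool lived in PermGen; since JDK 7 it lives in the ordinary heap, and PermGen itself was removed in JDK 8. Each class file has its own constant pool, but the interned strings are shared across the whole JVM. new String("abc") creates a new String object on the heap; in the source read here, that object shares the char[] of the pooled "abc" instead of holding a pointer to the pooled String.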

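CharSequence is an interface that represents a readable sequence of char values. It gives uniform, read-only access to many different kinds of char sequences. The interface does not refine the general contracts of equals and hashCode, so the result of comparing two objects that implement CharSequence is, in general, undefined. Its implementations include CharBuffer, String, StringBuffer, and StringBuilder.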

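Both CharSequence and String can be used as the type of a string reference. CharSequence itself exposes only read operations, but some of its implementations (StringBuilder, StringBuffer) are mutable, whereas a String is always immutable.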

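An abstract class or an interface cannot be instantiated with new, but a reference of that type can be assigned an instance of an implementing class:
    CharSequence cs = "hello";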
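
To make the point about CharSequence concrete, here is a minimal sketch. The class name CharSequenceDemo and the helper count are made up for illustration; only length(), charAt(), equals() and contentEquals() come from the JDK.

```java
class CharSequenceDemo {
    // Hypothetical helper: counts occurrences of target using only the
    // read-only CharSequence API (length and charAt).
    static int count(CharSequence cs, char target) {
        int n = 0;
        for (int i = 0; i < cs.length(); i++) {
            if (cs.charAt(i) == target) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        CharSequence fromString = "hello";          // immutable implementation
        StringBuilder sb = new StringBuilder("hello");
        CharSequence fromBuilder = sb;              // mutable implementation

        System.out.println(count(fromString, 'l'));
        sb.append("ll");                            // the sequence behind fromBuilder changes
        System.out.println(count(fromBuilder, 'l'));

        // equals is not defined across implementations; contentEquals compares characters.
        System.out.println("hello".equals(new StringBuilder("hello")));
        System.out.println("hello".contentEquals(new StringBuilder("hello")));
    }
}
```

The same method accepts both a String and a StringBuilder, which is the practical benefit of programming against CharSequence.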

// value holds the characters of the String
private final char value[];

   // cached hash code
    private int hash; // Default to 0

   // serialization version ID
    private static final long serialVersionUID = -6849794470754667710L;

   // declares the serializable fields (an empty array for String)
    private static final ObjectStreamField[] serialPersistentFields =
        new ObjectStreamField[0];

// Creates an empty String; the Javadoc notes this constructor is unnecessary because Strings are immutable
public String() {
        this.value = "".value;
    }

// Creates a distinct String object with the same content as original; unnecessary unless an explicit copy is needed
 public String(String original) {
        this.value = original.value;
        this.hash = original.hash;
    }
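
A small sketch of what "same value, different reference" means in practice. StringCopyDemo and sameReference are hypothetical names used only for this example.

```java
class StringCopyDemo {
    // Hypothetical helper: true only if both variables point at the same object.
    static boolean sameReference(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "abc";            // interned in the string pool
        String copy = new String("abc");   // a new object on the heap

        System.out.println(sameReference(literal, copy));          // different objects
        System.out.println(literal.equals(copy));                  // same content
        System.out.println(sameReference(literal, copy.intern())); // intern() returns the pooled instance
    }
}
```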

// Copies the array into value, so later changes to the argument array do not affect the String
 public String(char value[]) {
        this.value = Arrays.copyOf(value, value.length);
    }
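
The defensive copy can be observed from the outside. This is only a sketch; DefensiveCopyDemo and copyThenMutate are illustrative names, not part of the JDK.

```java
class DefensiveCopyDemo {
    // Hypothetical helper: builds a String from the array, then overwrites the array.
    static String copyThenMutate(char[] chars) {
        String s = new String(chars);   // the constructor copies the array
        chars[0] = 'x';                 // mutating the caller's array afterwards
        return s;                       // the String still holds the original characters
    }

    public static void main(String[] args) {
        char[] chars = {'a', 'b', 'c'};
        System.out.println(copyThenMutate(chars));  // content taken before the mutation
        System.out.println(new String(chars));      // a fresh String sees the mutated array
    }
}
```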

// Creates a String from count chars of value, starting at offset
public String(char value[], int offset, int count) {
        if (offset < 0) {
            throw new StringIndexOutOfBoundsException(offset);
        }
        if (count <= 0) {
            if (count < 0) {
                throw new StringIndexOutOfBoundsException(count);
            }
            if (offset <= value.length) {
                this.value = "".value;
                return;
            }
        }
        // Note: offset or count might be near -1>>>1.
        if (offset > value.length - count) {
            throw new StringIndexOutOfBoundsException(offset + count);
        }
        this.value = Arrays.copyOfRange(value, offset, offset+count);
    }
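
The bounds checks above can be exercised with a few calls. OffsetCountDemo, slice and rejects are hypothetical names for this sketch; the caught type is the documented IndexOutOfBoundsException, of which StringIndexOutOfBoundsException is a subclass.

```java
class OffsetCountDemo {
    // Hypothetical wrapper around the (char[], offset, count) constructor.
    static String slice(char[] value, int offset, int count) {
        return new String(value, offset, count);
    }

    // Hypothetical helper: true if the constructor rejects the arguments.
    static boolean rejects(char[] value, int offset, int count) {
        try {
            new String(value, offset, count);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        char[] hello = "hello".toCharArray();
        System.out.println(slice(hello, 1, 3));            // three chars starting at index 1
        System.out.println(slice(hello, 5, 0).isEmpty());  // count == 0 at offset == length is allowed
        System.out.println(rejects(hello, -1, 2));         // negative offset
        System.out.println(rejects(hello, 3, 5));          // offset + count runs past the end
    }
}
```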

The next constructor takes code points as its argument.

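A code point is the number assigned to a character in Unicode. A code unit is the smallest unit a character can be encoded into: in UTF-8 a code unit is 1 byte and a character takes 1 to 4 bytes; in UTF-16 a code unit is 2 bytes and a character takes either 2 or 4 bytes.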

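Internally, String (in the char[]-based implementation read here, used through JDK 8) stores Unicode text as a char array. Because a char is 16 bits, this is effectively UTF-16. One char does not always hold one character, though: a character outside the BMP is stored as two chars (a surrogate pair).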
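
The difference between code points and code units is easy to measure. The sketch below uses made-up helper names (CodeUnitDemo, utf16Units, utf8Bytes); the JDK calls are Character.toChars, String.length, String.codePointCount and String.getBytes.

```java
import java.nio.charset.StandardCharsets;

class CodeUnitDemo {
    // Hypothetical helper: number of 16-bit chars needed for one code point.
    static int utf16Units(int codePoint) {
        return new String(Character.toChars(codePoint)).length();
    }

    // Hypothetical helper: number of bytes one code point takes in UTF-8.
    static int utf8Bytes(int codePoint) {
        return new String(Character.toChars(codePoint))
                .getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        int[] samples = {0x41, 0xE9, 0x4E2D, 0x1F600}; // A, e-acute, a CJK character, an emoji
        for (int cp : samples) {
            System.out.printf("U+%04X: %d UTF-8 byte(s), %d char(s)%n",
                    cp, utf8Bytes(cp), utf16Units(cp));
        }
        String emoji = new String(Character.toChars(0x1F600));
        // length() counts chars (code units); codePointCount counts characters.
        System.out.println(emoji.length() + " vs " + emoji.codePointCount(0, emoji.length()));
    }
}
```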

// Takes the code point array plus the start offset and count within it
public String(int[] codePoints, int offset, int count) {
        if (offset < 0) {
            throw new StringIndexOutOfBoundsException(offset);
        }
        if (count <= 0) {
            if (count < 0) {
                throw new StringIndexOutOfBoundsException(count);
            }
            if (offset <= codePoints.length) {
                this.value = "".value;
                return;
            }
        }
        // Note: offset or count might be near -1>>>1.
        if (offset > codePoints.length - count) {
            throw new StringIndexOutOfBoundsException(offset + count);
        }

        final int end = offset + count;

        // Pass 1: Compute precise size of char[]
        int n = count;
        for (int i = offset; i < end; i++) {
            int c = codePoints[i];
            // Check whether the code point is in the Basic Multilingual Plane (BMP).
            // A BMP code point fits in 4 hex digits (16 bits); a code point outside
            // the BMP needs more than 4 hex digits.
            if (Character.isBmpCodePoint(c))
                continue;
            // Check whether the value is a valid Unicode code point
            else if (Character.isValidCodePoint(c))
                n++;
            else throw new IllegalArgumentException(Integer.toString(c));
        }

        // Pass 2: Allocate and fill in char[]
        final char[] v = new char[n];

        for (int i = offset, j = 0; i < end; i++, j++) {
            int c = codePoints[i];
            // A BMP code point is cast to a single char and stored in v
            if (Character.isBmpCodePoint(c))
                v[j] = (char)c;
            // Otherwise it is a valid code point outside the BMP,
            // so it is written as a surrogate pair (two chars)
            else
                Character.toSurrogates(c, v, j++);
        }

        this.value = v;
    }
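
The two passes can be checked from the outside: a supplementary code point makes the resulting String one char longer than the number of code points, and an invalid value is rejected. CodePointCtorDemo, fromCodePoints and rejects are hypothetical names for this sketch.

```java
class CodePointCtorDemo {
    // Hypothetical wrapper around the (int[], offset, count) constructor.
    static String fromCodePoints(int... codePoints) {
        return new String(codePoints, 0, codePoints.length);
    }

    // Hypothetical helper: true if the constructor rejects the value as an invalid code point.
    static boolean rejects(int codePoint) {
        try {
            fromCodePoints(codePoint);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String s = fromCodePoints(0x48, 0x1F600, 0x69); // 'H', an emoji, 'i'
        System.out.println(s.length());                 // 3 code points, but one needs two chars
        System.out.println(Integer.toHexString(s.charAt(1)));       // high surrogate
        System.out.println(Integer.toHexString(s.charAt(2)));       // low surrogate
        System.out.println(Integer.toHexString(s.codePointAt(1)));  // the pair read back as one code point
        System.out.println(rejects(0x110000));          // beyond the Unicode range
    }
}
```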

The next constructors take a byte array.

Decoding converts bytes to characters; encoding converts characters to bytes.

// Takes the byte array, the start offset, the length, and the charset name
public String(byte bytes[], int offset, int length, String charsetName)
            throws UnsupportedEncodingException {
        if (charsetName == null)
            throw new NullPointerException("charsetName");
        // Validate offset and length; throws if they are out of bounds
        checkBounds(bytes, offset, length);
        // Delegate to StringCoding.decode
        this.value = StringCoding.decode(charsetName, bytes, offset, length);
    }
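
A short sketch of how the charset name changes the result of decoding the same bytes. ByteDecodeDemo and decodeOrNull are hypothetical; the helper swallows the checked UnsupportedEncodingException only to keep the example compact.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

class ByteDecodeDemo {
    // Hypothetical helper: decodes all bytes with the named charset,
    // or returns null if the JVM does not support that charset.
    static String decodeOrNull(byte[] bytes, String charsetName) {
        try {
            return new String(bytes, 0, bytes.length, charsetName);
        } catch (UnsupportedEncodingException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String text = "h\u00e9llo";                            // 'e' with an acute accent
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);   // encode: chars to bytes

        System.out.println(utf8.length);                            // the accented char takes 2 bytes
        System.out.println(text.equals(decodeOrNull(utf8, "UTF-8"))); // decoding with the same charset round-trips
        System.out.println(decodeOrNull(utf8, "ISO-8859-1").length()); // wrong charset: one char per byte
        System.out.println(decodeOrNull(utf8, "no-such-charset"));     // unsupported name
    }
}
```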

Structure of the StringCoding class: it has two static nested classes, StringDecoder and StringEncoder.

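StringCoding caches the most recently used decoder and encoder per thread in a ThreadLocal, so no locking is needed, and wraps each cached object in a SoftReference so the GC can reclaim it when memory runs low instead of running into an OOM. ThreadLocal will be covered in a later post.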
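
The caching idea can be sketched with public APIs only. This is not the JDK's code: PerThreadDecoderCache and decoderFor are hypothetical names, and a plain CharsetDecoder stands in for the internal StringDecoder.

```java
import java.lang.ref.SoftReference;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.StandardCharsets;

class PerThreadDecoderCache {
    // One slot per thread; the SoftReference lets the GC drop the decoder under memory pressure.
    private static final ThreadLocal<SoftReference<CharsetDecoder>> CACHE = new ThreadLocal<>();

    // Hypothetical helper: returns this thread's cached decoder if it matches cs,
    // otherwise creates a new one and caches it.
    static CharsetDecoder decoderFor(Charset cs) {
        SoftReference<CharsetDecoder> ref = CACHE.get();
        CharsetDecoder cached = (ref == null) ? null : ref.get(); // may have been cleared by the GC
        if (cached == null || !cached.charset().equals(cs)) {
            cached = cs.newDecoder();
            CACHE.set(new SoftReference<>(cached));
        }
        return cached;
    }

    public static void main(String[] args) {
        CharsetDecoder first = decoderFor(StandardCharsets.UTF_8);
        CharsetDecoder second = decoderFor(StandardCharsets.UTF_8);
        System.out.println(first == second);   // reused while still strongly reachable
        CharsetDecoder other = decoderFor(StandardCharsets.ISO_8859_1);
        System.out.println(first == other);    // a different charset replaces the cached entry
    }
}
```

Because each thread only ever touches its own slot, the decoder (which is stateful and not thread-safe) never needs a lock.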

Fields and constructor of StringDecoder.

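First, deref fetches the StringDecoder cached in the current thread's ThreadLocal. If the current thread has none, or the requested charset name does not match the cached decoder's charset, a new StringDecoder is created and stored for the current thread. Then the overloaded decode method is called.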

static char[] decode(String charsetName, byte[] ba, int off, int len)
        throws UnsupportedEncodingException
    {
        
        StringDecoder sd = deref(decoder);
        // Fall back to ISO-8859-1 when charsetName is null
        String csn = (charsetName == null) ? "ISO-8859-1" : charsetName;
        if ((sd == null) || !(csn.equals(sd.requestedCharsetName())
                              || csn.equals(sd.charsetName()))) {
            sd = null;
            try {
                Charset cs = lookupCharset(csn);
                if (cs != null)
                    sd = new StringDecoder(cs, csn);
            } catch (IllegalCharsetNameException x) {}
            if (sd == null)
                throw new UnsupportedEncodingException(csn);
            set(decoder, sd);
        }
        return sd.decode(ba, off, len);
    }

This is an overload of decode; it is an instance method of StringDecoder.

 char[] decode(byte[] ba, int off, int len) {
            // Upper bound on the output size: len times the decoder's maximum chars per byte
            int en = scale(len, cd.maxCharsPerByte());
            char[] ca = new char[en];
            if (len == 0)
                return ca;
            /* If the decoder implements ArrayDecoder (see its source for details),
             * use its array-based decode directly.
             */
            if (cd instanceof ArrayDecoder) {
                int clen = ((ArrayDecoder)cd).decode(ba, off, len, ca);
                /* safeTrim returns ca itself when clen equals ca.length and the charset
                 * is trusted or no SecurityManager is installed; otherwise it returns
                 * Arrays.copyOf(ca, clen).
                 */
                return safeTrim(ca, clen, cs, isTrusted);
            } else {
                /* Otherwise fall back to the general CharsetDecoder.decode
                 * on a ByteBuffer and a CharBuffer.
                 */
                cd.reset();
                ByteBuffer bb = ByteBuffer.wrap(ba, off, len);
                CharBuffer cb = CharBuffer.wrap(ca);
                try {
                    CoderResult cr = cd.decode(bb, cb, true);
                    if (!cr.isUnderflow())
                        cr.throwException();
                    cr = cd.flush(cb);
                    if (!cr.isUnderflow())
                        cr.throwException();
                } catch (CharacterCodingException x) {
                    // Substitution is always enabled,
                    // so this shouldn't happen
                    throw new Error(x);
                }
                return safeTrim(ca, cb.position(), cs, isTrusted);
            }
        }
    }
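
The general (non-ArrayDecoder) branch can be reproduced with the public java.nio.charset API. The sketch below is an approximation under stated assumptions: DecodeSketch and its decode method are hypothetical, replacement of malformed input is configured explicitly, and the final trim always copies instead of applying safeTrim's trust check.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class DecodeSketch {
    static char[] decode(Charset cs, byte[] ba, int off, int len) {
        CharsetDecoder cd = cs.newDecoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        // Worst-case output size: len bytes times the maximum chars produced per byte.
        int en = (int) Math.ceil(len * (double) cd.maxCharsPerByte());
        char[] ca = new char[en];
        if (len == 0) {
            return ca;
        }
        ByteBuffer bb = ByteBuffer.wrap(ba, off, len);
        CharBuffer cb = CharBuffer.wrap(ca);
        try {
            CoderResult cr = cd.decode(bb, cb, true);  // true: no more input follows
            if (!cr.isUnderflow()) {
                cr.throwException();
            }
            cr = cd.flush(cb);                         // emit any buffered output
            if (!cr.isUnderflow()) {
                cr.throwException();
            }
        } catch (CharacterCodingException x) {
            // With REPLACE configured this should not happen.
            throw new IllegalStateException(x);
        }
        // Trim the worst-case buffer down to the chars actually written.
        return Arrays.copyOf(ca, cb.position());
    }

    public static void main(String[] args) {
        byte[] utf8 = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        char[] chars = decode(StandardCharsets.UTF_8, utf8, 0, utf8.length);
        System.out.println(chars.length + " chars from " + utf8.length + " bytes");
    }
}
```

The buffer is sized for the worst case first and trimmed afterwards, which is why the original code needs safeTrim at the end of both branches.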

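The next constructor works the same way as the one above; the difference is that it takes a Charset object instead of a charset name.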

public String(byte bytes[], int offset, int length, Charset charset) {
        if (charset == null)
            throw new NullPointerException("charset");
        checkBounds(bytes, offset, length);
        this.value =  StringCoding.decode(charset, bytes, offset, length);
    }

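There are several more constructors that take a byte array, but they work the same way, so they are not listed one by one. If no charset is passed, the platform default charset (Charset.defaultCharset()) is used; ISO-8859-1 is only the fallback when that default is not supported.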

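The next constructor takes a StringBuffer. StringBuffer's own methods are synchronized, but the constructor reads both the buffer's array and its length, so it locks the buffer with synchronized to copy a consistent snapshot while other threads may be modifying it.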

public String(StringBuffer buffer) {
        synchronized(buffer) {
            this.value = Arrays.copyOf(buffer.getValue(), buffer.length());
        }
    }

The next constructor takes a StringBuilder. StringBuilder is not designed for concurrent use, so no lock is taken.

 public String(StringBuilder builder) {
        this.value = Arrays.copyOf(builder.getValue(), builder.length());
    }
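
Both constructors copy the builder's content, so the String is a snapshot that later appends do not change. BuilderSnapshotDemo and its two helpers are hypothetical names for this sketch.

```java
class BuilderSnapshotDemo {
    // Hypothetical helper: snapshot a StringBuilder, then keep appending to it.
    static String snapshotThenAppend(StringBuilder builder, String more) {
        String snapshot = new String(builder); // copies the current content
        builder.append(more);                  // does not affect the snapshot
        return snapshot;
    }

    // Same idea for StringBuffer, whose constructor copies while holding the buffer's lock.
    static String snapshotThenAppend(StringBuffer buffer, String more) {
        String snapshot = new String(buffer);
        buffer.append(more);
        return snapshot;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("abc");
        System.out.println(snapshotThenAppend(sb, "def")); // content at the time of the copy
        System.out.println(sb);                             // the builder itself has grown
    }
}
```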

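Finally, a question that has puzzled me for a long time: when I set a breakpoint in the String source, the values of the constructor arguments differ from what I passed in. I would be very grateful if someone could explain this. Thanks!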
