[Java] Fundamentals Review (char and String) && Examples (Part 1)

KingWang_WHU

Article link: https://blog.csdn.net/wk1134314305/article/details/67069688

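Recently, while using replaceAll() in a project to strip the dots from a string, I found that I did not get the result I wanted. I noted the problem down at the time, and since I happen to be free today (no work, happy~), I am writing it up here. The blog has not been updated for more than a week, so it is good to be back!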

Test code first!

The code below has three parts:

(1) The difference between replace and replaceAll
(2) Working with the char type
(3) Understanding the String class

Code:

package test;

public class replaceTest {
    public static void main(String[] args) {
        /* Part 1: replace vs replaceAll (topics: CharSequence, Pattern, Matcher; see the source for the exact differences) */
        String literal_s="\\ab\\.";
        System.out.println(literal_s.replace("\\.", ""));// prints \ab: the two literal characters \. are replaced, not just the dot
        System.out.println(literal_s.replaceAll("\\.", ""));// prints \ab\: only the . is replaced
        String s="192.168.102.1";
        System.out.println(s.replace(".", ""));// prints 1921681021
        System.out.println(s);// prints 192.168.102.1: the previous line returned a new string and left s itself unchanged
        System.out.println(s.replaceAll(".", ""));// prints an empty line: here . is a regex that matches any character, so everything is replaced
        String[] split_arr=s.split(".");// replaceAll(String regex, String replacement) and split(String regex) both take a regex; . is a metacharacter that matches any character, so split_arr is empty
        System.out.println(s.replaceAll("\\.", ""));// prints 1921681021
        /* Part 2: the char type */
        char c='\\';
//      char b='\';// this form does not compile
        char d='/';
        String zhuanyi_s="\\a";// length is 2
        System.out.println(zhuanyi_s.length());// prints 2
        char[] c_arr=zhuanyi_s.toCharArray();// array contents: [\,a]
        /* Part 3: String (topics: stack, heap, constant pool, references, values) */
        String a="a";
        String a_1="a";
        System.out.println(a==a_1);// true: both refer to the same address in the string constant pool
        String b_obj=new String("test");
        String b_obj_copy=new String("test");
        System.out.println(b_obj==b_obj_copy);// false: they refer to different addresses
    }
}

1. The difference between replace and replaceAll

1.1 Source of replace:

  /**
     * Replaces each substring of this string that matches the literal target
     * sequence with the specified literal replacement sequence. The
     * replacement proceeds from the beginning of the string to the end, for
     * example, replacing "aa" with "b" in the string "aaa" will result in
     * "ba" rather than "ab".
     *
     * @param  target The sequence of char values to be replaced
     * @param  replacement The replacement sequence of char values
     * @return  The resulting string
     * @since 1.5
     */
    public String replace(CharSequence target, CharSequence replacement) {
        return Pattern.compile(target.toString(), Pattern.LITERAL).matcher(
                this).replaceAll(Matcher.quoteReplacement(replacement.toString()));
    }

Inside replace, target is treated as a literal string: its characters lose any special meaning they would have in a regular expression. Pattern.LITERAL is documented as follows:

 /**
     * Enables literal parsing of the pattern.
     *
     * <p> When this flag is specified then the input string that specifies
     * the pattern is treated as a sequence of literal characters.
     * Metacharacters or escape sequences in the input sequence will be
     * given no special meaning.
     *
     * <p>The flags CASE_INSENSITIVE and UNICODE_CASE retain their impact on
     * matching when used in conjunction with this flag. The other flags
     * become superfluous.
     *
     * <p> There is no embedded flag character for enabling literal parsing.
     * @since 1.5
     */
    public static final int LITERAL = 0x10;

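In other words, with this flag the target string is taken literally. Some characters have a special meaning in a regular expression: for example, . matches any character, and \. matches a literal dot (written as "\\." in a Java string literal). When \. is passed as the target of replace(CharSequence target, CharSequence replacement), however, it stands for the two characters \. and no longer for a dot. That is why, in the test code above, replace on literal_s prints \ab rather than \ab\, while replaceAll prints \ab\.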
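The effect of Pattern.LITERAL can be checked in isolation. The following is a minimal sketch, not the JDK's own code: the class name LiteralPatternDemo and the helpers stripAsRegex and stripAsLiteral are made up for illustration, and only Pattern.compile, Pattern.LITERAL, Pattern.quote and Matcher.replaceAll come from the standard library.

```java
import java.util.regex.Pattern;

class LiteralPatternDemo {
    // Hypothetical helper: target is compiled as a regular expression,
    // so metacharacters such as "." keep their special meaning.
    static String stripAsRegex(String input, String target) {
        return Pattern.compile(target).matcher(input).replaceAll("");
    }

    // Hypothetical helper: Pattern.LITERAL makes every character of
    // target match only itself.
    static String stripAsLiteral(String input, String target) {
        return Pattern.compile(target, Pattern.LITERAL).matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        String ip = "192.168.102.1";
        System.out.println(stripAsRegex(ip, "."));                 // prints an empty line
        System.out.println(stripAsLiteral(ip, "."));               // prints 1921681021
        // Pattern.quote is another way to get literal matching from a regex API.
        System.out.println(stripAsRegex(ip, Pattern.quote(".")));  // prints 1921681021
    }
}
```

With the same helpers, the literal_s case from the test code gives \ab for the literal variant and \ab\ for the regex variant.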
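Notes:
The metacharacters mentioned above are regular-expression metacharacters. For background, see:
(1) Regular expressions, metacharacters: http://www.runoob.com/regexp/regexp-metachar.html
(2) String, StringBuilder and similar classes implement the CharSequence interface. For more on CharSequence, see:
Differences between String and CharSequence, StringBuilder and StringBuffer: http://www.fengfly.com/plus/view-214077-1.html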
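Because the parameters of replace are declared as CharSequence, any implementation of that interface can be passed as the target. A small sketch, assuming a made-up class CharSequenceTargetDemo and helper removeAll:

```java
class CharSequenceTargetDemo {
    // Hypothetical helper: removes every literal occurrence of target from input.
    static String removeAll(String input, CharSequence target) {
        return input.replace(target, "");
    }

    public static void main(String[] args) {
        CharSequence fromString = ".";                       // String implements CharSequence
        CharSequence fromBuilder = new StringBuilder(".");   // so does StringBuilder
        System.out.println(removeAll("192.168.102.1", fromString));   // prints 1921681021
        System.out.println(removeAll("192.168.102.1", fromBuilder));  // prints 1921681021
    }
}
```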

1.2 Source of replaceAll

This is replaceAll of the String class; do not confuse it with replaceAll of the Matcher class.

/**
     * Replaces each substring of this string that matches the given <a
     * href="../util/regex/Pattern.html#sum">regular expression</a> with the
     * given replacement.
     *
     * <p> An invocation of this method of the form
     * <i>str</i>{@code .replaceAll(}<i>regex</i>{@code ,} <i>repl</i>{@code )}
     * yields exactly the same result as the expression
     *
     * <blockquote>
     * <code>
     * {@link java.util.regex.Pattern}.{@link
     * java.util.regex.Pattern#compile compile}(<i>regex</i>).{@link
     * java.util.regex.Pattern#matcher(java.lang.CharSequence) matcher}(<i>str</i>).{@link
     * java.util.regex.Matcher#replaceAll replaceAll}(<i>repl</i>)
     * </code>
     * </blockquote>
     *
     *<p>
     * Note that backslashes ({@code \}) and dollar signs ({@code $}) in the
     * replacement string may cause the results to be different than if it were
     * being treated as a literal replacement string; see
     * {@link java.util.regex.Matcher#replaceAll Matcher.replaceAll}.
     * Use {@link java.util.regex.Matcher#quoteReplacement} to suppress the special
     * meaning of these characters, if desired.
     *
     * @param   regex
     *          the regular expression to which this string is to be matched
     * @param   replacement
     *          the string to be substituted for each match
     *
     * @return  The resulting {@code String}
     *
     * @throws  PatternSyntaxException
     *          if the regular expression's syntax is invalid
     *
     * @see java.util.regex.Pattern
     *
     * @since 1.4
     * @spec JSR-51
     */
    public String replaceAll(String regex, String replacement) {
        return Pattern.compile(regex).matcher(this).replaceAll(replacement);
    }

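As the source shows, the regex parameter of replaceAll is a regular-expression string, so the argument is processed as a regular expression, whereas String.replace takes its target literally. Comparing this with the test code above makes the difference clear.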
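The same rule applies to split(String regex), which the test code calls with ".". A minimal sketch with a made-up class SplitRegexDemo and helper countParts; the behaviour shown relies only on String.split and Pattern.quote:

```java
import java.util.regex.Pattern;

class SplitRegexDemo {
    // Hypothetical helper: number of pieces produced by splitting on regex.
    static int countParts(String input, String regex) {
        return input.split(regex).length;
    }

    public static void main(String[] args) {
        String ip = "192.168.102.1";
        // "." matches every character, so only empty strings remain,
        // and split drops trailing empty strings.
        System.out.println(countParts(ip, "."));                 // prints 0
        System.out.println(countParts(ip, "\\."));               // prints 4
        System.out.println(countParts(ip, Pattern.quote(".")));  // prints 4
    }
}
```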

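Summary:
In the JDK source quoted here, both methods call replaceAll of the Matcher class internally. The difference is that String.replace compiles target with Pattern.LITERAL and, before calling Matcher.replaceAll, passes the replacement argument through Matcher.quoteReplacement(replacement.toString()). The source of quoteReplacement is shown below: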

/**
     * Returns a literal replacement <code>String</code> for the specified
     * <code>String</code>.
     *
     * This method produces a <code>String</code> that will work
     * as a literal replacement <code>s</code> in the
     * <code>appendReplacement</code> method of the {@link Matcher} class.
     * The <code>String</code> produced will match the sequence of characters
     * in <code>s</code> treated as a literal sequence. Slashes ('\') and
     * dollar signs ('$') will be given no special meaning.
     *
     * @param  s The string to be literalized
     * @return  A literal string replacement
     * @since 1.5
     */
    public static String quoteReplacement(String s) {
        if ((s.indexOf('\\') == -1) && (s.indexOf('$') == -1))
            return s;
        StringBuilder sb = new StringBuilder();
        for (int i=0; i<s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\' || c == '$') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

As the source shows, if replacement contains \ or $ characters, a \ is inserted before each of them as an escape.
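To see what this escaping buys, the sketch below applies Matcher.quoteReplacement before String.replaceAll. The class QuoteReplacementDemo and the helper replaceAllLiterally are illustrative names, not part of the JDK.

```java
import java.util.regex.Matcher;

class QuoteReplacementDemo {
    // Hypothetical helper: regex is still a regular expression, but the
    // replacement is inserted verbatim, including any \ or $ it contains.
    static String replaceAllLiterally(String input, String regex, String replacement) {
        return input.replaceAll(regex, Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        System.out.println(Matcher.quoteReplacement("$5"));             // prints \$5
        System.out.println(replaceAllLiterally("cost: X", "X", "$5"));  // prints cost: $5
        // String.replace quotes the replacement itself, so the result is the same.
        System.out.println("cost: X".replace("X", "$5"));               // prints cost: $5
    }
}
```

Without the quoting step, "cost: X".replaceAll("X", "$5") would treat $5 as a reference to capture group 5 and fail, since the pattern has no such group.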
The source of replaceAll in the Matcher class is as follows:

/**
     * Replaces every subsequence of the input sequence that matches the
     * pattern with the given replacement string.
     *
     * <p> This method first resets this matcher.  It then scans the input
     * sequence looking for matches of the pattern.  Characters that are not
     * part of any match are appended directly to the result string; each match
     * is replaced in the result by the replacement string.  The replacement
     * string may contain references to captured subsequences as in the {@link
     * #appendReplacement appendReplacement} method.
     *
     * <p> Note that backslashes (<tt>\</tt>) and dollar signs (<tt>$</tt>) in
     * the replacement string may cause the results to be different than if it
     * were being treated as a literal replacement string. Dollar signs may be
     * treated as references to captured subsequences as described above, and
     * backslashes are used to escape literal characters in the replacement
     * string.
     *
     * <p> Given the regular expression <tt>a*b</tt>, the input
     * <tt>"aabfooaabfooabfoob"</tt>, and the replacement string
     * <tt>"-"</tt>, an invocation of this method on a matcher for that
     * expression would yield the string <tt>"-foo-foo-foo-"</tt>.
     *
     * <p> Invoking this method changes this matcher's state.  If the matcher
     * is to be used in further matching operations then it should first be
     * reset.  </p>
     *
     * @param  replacement
     *         The replacement string
     *
     * @return  The string constructed by replacing each matching subsequence
     *          by the replacement string, substituting captured subsequences
     *          as needed
     */
    public String replaceAll(String replacement) {
        reset();
        boolean result = find();
        if (result) {
            StringBuffer sb = new StringBuffer();
            do {
                appendReplacement(sb, replacement);
                result = find();
            } while (result);
            appendTail(sb);
            return sb.toString();
        }
        return text.toString();
    }

The following example helps in understanding replaceAll of the Matcher class.

Example code:

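/* Understanding replaceAll in Matcher */
import java.util.regex.Matcher;
import java.util.regex.Pattern;

Pattern p1 = Pattern.compile("cat");
Matcher m1 = p1.matcher("one cat two cats in the yard");
// "\\$2" is the replacement \$2: the backslash escapes $, so a literal $2 is inserted
System.out.println(m1.replaceAll("\\$2"));// prints: one $2 two $2s in the yard
// "$2" refers to capture group 2, which the pattern "cat" does not have
System.out.println(m1.replaceAll("$2"));// fails at runtime with java.lang.IndexOutOfBoundsException: No group 2; the exception is raised by appendReplacement, which replaceAll calls (see its source)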
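For contrast, here is a sketch in which the group reference is valid because the pattern defines a capture group. The class GroupReferenceDemo and its two helpers are hypothetical names introduced only for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupReferenceDemo {
    // Hypothetical helper: wraps every "cat" in brackets by referring to group 1.
    static String bracketCats(String input) {
        Matcher m = Pattern.compile("(cat)").matcher(input);
        return m.replaceAll("[$1]");
    }

    // Hypothetical helper: reports whether "$2" fails for a pattern without group 2.
    static boolean failsOnMissingGroup(String input) {
        try {
            Pattern.compile("cat").matcher(input).replaceAll("$2");
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String text = "one cat two cats in the yard";
        System.out.println(bracketCats(text));          // prints one [cat] two [cat]s in the yard
        System.out.println(failsOnMissingGroup(text));  // prints true
        // With no match, appendReplacement is never called, so nothing is thrown.
        System.out.println(failsOnMissingGroup("no pets here"));  // prints false
    }
}
```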

2. Understanding char

char c='\\';// defines the backslash character (correct form)
//      char b='\';// this form does not compile
char d='/';
String zhuanyi_s="\\a";// length is 2
char[] c_arr=zhuanyi_s.toCharArray();// array contents: [\,a]

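From the above:
(1) The escape sequence \\ denotes a single backslash character, so its length is 1.
(2) Internally, the String class is backed by a char[] array.
The source shows that String has a private char[] member named value (this holds for the JDK version quoted here; since JDK 9 the field is a byte[]). One commonly used constructor is: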

   /**
     * Initializes a newly created {@code String} object so that it represents
     * the same sequence of characters as the argument; in other words, the
     * newly created string is a copy of the argument string. Unless an
     * explicit copy of {@code original} is needed, use of this constructor is
     * unnecessary since Strings are immutable.
     *
     * @param  original
     *         A {@code String}
     */
    public String(String original) {
        this.value = original.value;
        this.hash = original.hash;
    }
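The comment's remark that Strings are immutable can be observed from the outside: toCharArray() hands back a copy, so changing that array never changes the string. A minimal sketch, with CharArrayCopyDemo and replaceFirstChar as made-up names:

```java
class CharArrayCopyDemo {
    // Hypothetical helper: edits the copy returned by toCharArray() and
    // builds a new String from it; the original String is untouched.
    static String replaceFirstChar(String s, char replacement) {
        char[] chars = s.toCharArray();
        if (chars.length > 0) {
            chars[0] = replacement;
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        String original = "\\a";                       // two characters: \ and a
        String changed = replaceFirstChar(original, '/');
        System.out.println(original);                  // prints \a
        System.out.println(changed);                   // prints /a
        System.out.println(original.length());         // prints 2
    }
}
```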

3. Understanding the String class

Test code:

String a="a";
String a_new=new String("a");
System.out.println(a==a_new);// false
String a_1="a";
System.out.println(a==a_1);// true: both refer to the same address in the string constant pool
String b_obj=new String("test");
String b_obj_copy=new String("test");
System.out.println(b_obj==b_obj_copy);// false: they refer to different addresses

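Comparing the references a and a_1 shows that they point to the same address and share one string constant.
b_obj and b_obj_copy show that each new creates a separate object, so the two references point to different addresses. (One open question: do these two objects still share the same underlying string? I will come back and explain once I understand it; leaving it as a to-do for now.)
Conclusions:
(1) Two variables assigned the same string literal directly refer to the same address.
(2) Two variables created through the constructor point to different memory addresses, even when their contents are equal.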
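These two conclusions are about reference identity (==), which is separate from content equality (equals). The sketch below puts the two side by side and also uses String.intern(), which returns the pooled instance for a given content. StringIdentityDemo and sameObject are illustrative names only.

```java
class StringIdentityDemo {
    // Hypothetical helper: true only if both references point to the same object.
    static boolean sameObject(String x, String y) {
        return x == y;
    }

    public static void main(String[] args) {
        String literal = "test";
        String built = new String("test");
        System.out.println(sameObject(literal, built));           // prints false
        System.out.println(literal.equals(built));                // prints true
        // intern() maps the new object back to the pooled literal.
        System.out.println(sameObject(literal, built.intern()));  // prints true
    }
}
```

For comparing string contents, equals is the usual choice; == only answers whether two references point to the same object.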
One open question:
How is the value member of the String class initialized, and where does this char array come from?

Technical discussion is welcome
wkang1993@outlook.com