Primitive types to String conversion and String concatenation

 

 

 

Primitive types to String conversion

 

From time to time you need to build a string from several values, some of which are of primitive types. If two or more primitive values come at the beginning of the concatenation, you must explicitly convert the first of them to a string; otherwise System.out.println( 1 + 'a' ) prints '98' rather than '1a', because the char is promoted to int and the two values are added. Of course, there is a family of String.valueOf methods (and the corresponding wrapper type methods), but who needs them if there is another way that requires less typing?

 

 

 

 

Concatenating an empty string literal with the first primitive value (in our example, "" + 1) is the easiest idea. The result of this expression is a String, so you can safely concatenate any further primitive values to it; the compiler takes care of all implicit conversions to String.
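The difference between arithmetic and concatenation is easy to see in a minimal sketch. The class and method names here are illustrative only, not part of the original article.

```java
final class PlusDemo {
    // Both operands are numeric, so the char is promoted to int: 1 + 97.
    static int numericSum() {
        return 1 + 'a';
    }

    // The leading "" turns every following + into a string concatenation.
    static String viaEmptyString() {
        return "" + 1 + 'a';
    }

    public static void main(String[] args) {
        System.out.println(numericSum());     // prints 98
        System.out.println(viaEmptyString()); // prints 1a
    }
}
```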

 

 

Unfortunately, this is the worst way one can imagine. To understand why, we need to review how the string concatenation operator is translated in Java. Suppose we have a String value (it doesn’t matter whether it is a literal, a variable or a method call) followed by the + operator and an expression of any type:

     String_exp +  any_exp

 

The Java compiler translates it to:

     new StringBuilder().append(String_exp).append(any_exp).toString(); 

 

If the expression contains more than one + operator, you end up with several StringBuilder.append calls before the final toString call.
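As a rough illustration of this translation for the Java 6 and 7 compilers discussed here, the two methods below should produce the same result. The names are hypothetical, and the second method is a hand-written approximation of the generated code, not actual compiler output.

```java
final class ConcatTranslationDemo {
    // What the source code says.
    static String withPlus(String name, int count, char unit) {
        return name + count + unit;
    }

    // Approximately what the compiler generates for the expression above:
    // one StringBuilder, one append call per operand, then toString().
    static String withBuilder(String name, int count, char unit) {
        return new StringBuilder().append(name).append(count).append(unit).toString();
    }

    public static void main(String[] args) {
        System.out.println(withPlus("abc", 42, 'x'));    // prints abc42x
        System.out.println(withBuilder("abc", 42, 'x')); // prints abc42x
    }
}
```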

 

The default StringBuilder() constructor allocates a buffer with room for 16 characters (StringBuilder(String) allocates the length of its argument plus 16). So appending up to 16 characters to that StringBuilder does not require a buffer reallocation, but appending more than 16 characters expands the buffer. Finally, the StringBuilder.toString() call creates a new String object holding a copy of the StringBuilder buffer.
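The buffer sizes can be observed through StringBuilder.capacity(). The sketch below uses illustrative names; the exact capacity after growth is an implementation detail, so it is only checked to be larger than 16.

```java
final class BuilderCapacityDemo {
    // Capacity of a builder created with the no-argument constructor.
    static int defaultCapacity() {
        return new StringBuilder().capacity();
    }

    // Capacity of a builder created from a String: its length plus 16.
    static int capacityFromString(String s) {
        return new StringBuilder(s).capacity();
    }

    // Capacity after appending the given number of characters one by one.
    static int capacityAfterAppending(int chars) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < chars; i++) {
            sb.append('x');
        }
        return sb.capacity();
    }

    public static void main(String[] args) {
        System.out.println(defaultCapacity());          // prints 16
        System.out.println(capacityFromString("abc"));  // prints 19
        System.out.println(capacityAfterAppending(16)); // prints 16
        System.out.println(capacityAfterAppending(17) > 16); // prints true
    }
}
```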

 

 

This means that in the worst case, converting a single primitive value to a String allocates one StringBuilder, one char[ 16 ], one String and one char[] of the size needed to fit the value. Using one of the String.valueOf methods at least avoids creating the StringBuilder.
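For reference, a few of the direct conversions look like this. It is a small sketch with illustrative method names; String.valueOf and the wrapper toString methods are standard JDK calls.

```java
final class ToStringDemo {
    static String fromInt(int v)         { return String.valueOf(v); }
    static String fromLong(long v)       { return Long.toString(v); }
    static String fromDouble(double v)   { return String.valueOf(v); }
    static String fromChar(char v)       { return String.valueOf(v); }
    static String fromBoolean(boolean v) { return Boolean.toString(v); }

    public static void main(String[] args) {
        // Once the first operand is a String, later operands can stay primitive.
        System.out.println(fromInt(1) + 'a'); // prints 1a
    }
}
```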

 

Sometimes you don’t have to convert a primitive value to a String at all. For example, suppose you are parsing a comma-separated input string. The initial version contained a call like this:

 

    final int nextComma = str.indexOf(",");

or even

    final int nextComma = str.indexOf(',');

 

Later the program requirements were extended to support any separator. Of course, a straightforward interpretation of “any” means you need to keep the separator in a String object and use the String.indexOf(String) method. Let’s assume that the preconfigured separator is stored in the m_separator field. In this case your parsing code may look like this:

    private static List<String> split( final String str )
    {
        final List<String> res = new ArrayList<String>( 10 );
        int pos, prev = 0;
        while ( ( pos = str.indexOf( m_separator, prev ) ) != -1 )
        {
            res.add( str.substring( prev, pos ) );
            prev = pos + m_separator.length(); // start from next char after separator
        }
        res.add( str.substring( prev ) );
        return res;
    }

 

But later it turned out that the separator is never longer than a single character. In the initialization code, you replace String m_separator with a char field (m_separatorChar in the code below) and change its setter accordingly. But you may be tempted not to change the parsing method much (why change working code anyway?):

    private static List<String> split2( final String str )
    {
        final List<String> res = new ArrayList<String>( 10 );
        int pos, prev = 0;
        while ( ( pos = str.indexOf( "" + m_separatorChar, prev ) ) != -1 )
        {
            res.add( str.substring( prev, pos ) );
            prev = pos + 1; // start from next char after separator
        }
        res.add( str.substring( prev ) );
        return res;
    }

 

As you can see, the indexOf call was updated, but it still creates a string and uses it. Of course, this is wasteful, because there is an overload of the same method that accepts a char instead of a String. Let’s use it:

    private static List<String> split3( final String str )
    {
        final List<String> res = new ArrayList<String>( 10 );
        int pos, prev = 0;
        while ( ( pos = str.indexOf( m_separatorChar, prev ) ) != -1 )
        {
            res.add( str.substring( prev, pos ) );
            prev = pos + 1; // start from next char after separator
        }
        res.add( str.substring( prev ) );
        return res;
    }
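The method above relies on a field declared elsewhere. A self-contained variant of the same loop might look like the sketch below; the CharSplitter class and its constructor are assumptions made for illustration, not part of the original code.

```java
import java.util.ArrayList;
import java.util.List;

final class CharSplitter {
    private final char separator;

    CharSplitter(char separator) {
        this.separator = separator;
    }

    // Same loop as split3: indexOf(char, int) needs no temporary String.
    List<String> split(String str) {
        List<String> res = new ArrayList<String>();
        int prev = 0;
        int pos;
        while ((pos = str.indexOf(separator, prev)) != -1) {
            res.add(str.substring(prev, pos));
            prev = pos + 1; // continue right after the separator
        }
        res.add(str.substring(prev)); // the tail after the last separator
        return res;
    }

    public static void main(String[] args) {
        System.out.println(new CharSplitter(',').split("abc,def,,ghi"));
        // prints [abc, def, , ghi]
    }
}
```

Empty tokens are preserved, so "a," yields two elements: "a" and an empty string.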

 

For the test, the string "abc,def,ghi,jkl,mno,pqr,stu,vwx,yz" was parsed 10 million times using all 3 methods. Here are the running times on Java 6u41 and 7u15. The Java 7 times are higher because String.substring now has linear complexity: since Java 7u6 it copies the characters instead of sharing the original char[].

              split       split2      split3
    Java 6    4.65 sec    10.34 sec   3.8 sec
    Java 7    6.72 sec    8.29 sec    4.37 sec

As you can see, this simple refactoring has considerably decreased the time spent in splitting ( split/split2 -> split3 ).
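A comparison of this kind can be reproduced with a naive timing loop such as the sketch below. It is not the author's benchmark: the class name, the iteration count and the counting methods are assumptions, there is no warm-up phase, and absolute numbers will vary with the JVM and hardware.

```java
final class IndexOfTiming {
    // Counts separators, building a temporary String on every indexOf call.
    static int countViaConcat(String str, char sep) {
        int count = 0;
        int pos = -1;
        while ((pos = str.indexOf("" + sep, pos + 1)) != -1) {
            count++;
        }
        return count;
    }

    // Counts separators using the char overload directly.
    static int countViaChar(String str, char sep) {
        int count = 0;
        int pos = -1;
        while ((pos = str.indexOf(sep, pos + 1)) != -1) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String input = "abc,def,ghi,jkl,mno,pqr,stu,vwx,yz";
        int iterations = 1000000; // the article used 10 million
        long total = 0;

        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            total += countViaConcat(input, ',');
        }
        long concatNanos = System.nanoTime() - start;

        start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            total += countViaChar(input, ',');
        }
        long charNanos = System.nanoTime() - start;

        // total is printed so the loops cannot be optimized away entirely.
        System.out.println("\"\" + char: " + concatNanos / 1000000 + " ms");
        System.out.println("char:      " + charNanos / 1000000 + " ms");
        System.out.println("separators counted: " + total);
    }
}
```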

 

 

 

 

String concatenation

 

 

This article would not be complete without mentioning the 2 other string concatenation methods. The first one, rather rarely used, is the String.concat method. Internally, it allocates a char[] whose length equals the sum of the lengths of the two strings, copies the string data into it and creates a new String using a private String constructor that does not copy the input char[]. As a result, only two objects are created: the String and its internal char[]. Unfortunately, this method is efficient only when you need to concatenate exactly 2 strings.
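A short sketch of String.concat in use, with illustrative method names. Chaining it for three strings already creates a throwaway intermediate String, which is why it only pays off for exactly two operands.

```java
final class ConcatDemo {
    // One call, one result String: the efficient case.
    static String join2(String a, String b) {
        return a.concat(b);
    }

    // Two calls: the result of a.concat(b) is copied again and then discarded.
    static String join3(String a, String b, String c) {
        return a.concat(b).concat(c);
    }

    // Per the String.concat documentation, an empty argument returns the receiver itself.
    static boolean sameInstanceWhenEmpty(String s) {
        return s.concat("") == s;
    }

    public static void main(String[] args) {
        System.out.println(join2("foo", "bar"));   // prints foobar
        System.out.println(join3("a", "b", "c"));  // prints abc
    }
}
```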

 

 

The third way of string concatenation is the StringBuilder class and its various append methods. This is definitely the fastest way when you need to concatenate many input values. StringBuilder was introduced in Java 5 as a replacement for the StringBuffer class. Their main difference is that StringBuffer is thread-safe, while StringBuilder is not. How often do you build a single string from several threads at once?
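A minimal example of building one string from mixed values with StringBuilder. The method, its parameters and the initial capacity of 64 are illustrative assumptions.

```java
final class BuilderDemo {
    // Each append overload converts its argument directly into the builder's buffer.
    static String describe(String name, int age, double score) {
        StringBuilder sb = new StringBuilder(64); // rough size guess to avoid resizing
        sb.append("name=").append(name)
          .append(", age=").append(age)
          .append(", score=").append(score);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe("Ann", 30, 4.5));
        // prints name=Ann, age=30, score=4.5
    }
}
```

StringBuffer offers the same append methods, so replacing it in single-threaded code is usually a matter of changing the type name.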

 

As a test, all numbers between 0 and 100,000 were concatenated using String.concat, the + operator and StringBuilder, with code like this:

 

    String res = "";
    for ( int i = 0; i < ITERS; ++i )
    {
        final String s = Integer.toString( i );
        res = res.concat( s ); //second option: res += s;
    }

    //third option:
    StringBuilder res = new StringBuilder();
    for ( int i = 0; i < ITERS; ++i )
    {
        final String s = Integer.toString( i );
        res.append( s );
    }

 

 

    String.concat    +             StringBuilder.append
    10.145 sec       42.677 sec    0.012 sec
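The three variants can be packaged into a runnable class as sketched below. The class name, the smaller iteration count and the timing code are assumptions; the loop bodies follow the article's snippet. Timings from such a loop are only indicative.

```java
final class ConcatBenchmark {
    static String viaConcat(int iters) {
        String res = "";
        for (int i = 0; i < iters; ++i) {
            res = res.concat(Integer.toString(i)); // copies all previous characters each time
        }
        return res;
    }

    static String viaPlus(int iters) {
        String res = "";
        for (int i = 0; i < iters; ++i) {
            res += Integer.toString(i); // also copies the whole string on each iteration
        }
        return res;
    }

    static String viaBuilder(int iters) {
        StringBuilder res = new StringBuilder();
        for (int i = 0; i < iters; ++i) {
            res.append(Integer.toString(i)); // amortized constant cost per append
        }
        return res.toString();
    }

    public static void main(String[] args) {
        int iters = 20000; // the article used 100,000

        long start = System.nanoTime();
        int len1 = viaConcat(iters).length();
        long concatMs = (System.nanoTime() - start) / 1000000;

        start = System.nanoTime();
        int len2 = viaPlus(iters).length();
        long plusMs = (System.nanoTime() - start) / 1000000;

        start = System.nanoTime();
        int len3 = viaBuilder(iters).length();
        long builderMs = (System.nanoTime() - start) / 1000000;

        System.out.println("concat:  " + concatMs + " ms, length " + len1);
        System.out.println("plus:    " + plusMs + " ms, length " + len2);
        System.out.println("builder: " + builderMs + " ms, length " + len3);
    }
}
```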

 

 

The results are obvious: an O(n) algorithm is of course much faster than O(n^2) algorithms. But in real life we have a lot of + operators in our programs, because they are more convenient. To deal with this, the -XX:+OptimizeStringConcat option was introduced in Java 6 update 20. It was turned on by default somewhere between Java 7u02 and Java 7u15 (and it is still off by default in Java 6u41), so you may have to turn it on explicitly. Like many other -XX options, it is very poorly documented:

    Optimize String concatenation operations where possible. (Introduced in Java 6 Update 20)
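On a HotSpot JVM the flag can be passed and inspected from the command line as sketched below. The commands assume a JVM that recognizes the flag; other VMs may reject it.

```shell
# Start the JVM with the optimization explicitly enabled
# (-version stands in for your own main class or -jar argument).
java -XX:+OptimizeStringConcat -version

# Print the effective value of the flag for the installed JVM.
java -XX:+PrintFlagsFinal -version | grep OptimizeStringConcat
```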

 

 

 

Let’s just assume that Oracle engineers did their best with this option. Anecdotal evidence suggests that it replaces some of the generated StringBuilder logic with logic similar to the String.concat implementation: it creates a char[] of the appropriate length for all concatenated values and copies them into that output array. After that it creates the result String. Nested concatenations are probably supported as well ( str1 + ( str2 + str3 ) + str4 ). Running our test with this option shows that the time for the + operator becomes very close to that of String.concat:

    String.concat    +             StringBuilder.append
    10.19 sec        10.722 sec    0.013 sec
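The "size once, copy once" idea described above can be imitated in plain Java. This is only an illustration of the technique, not the JVM's actual implementation; the class and method names are made up for the example.

```java
final class SingleBufferConcat {
    // Computes the total length first, then copies every part into one array.
    static String concatAll(String... parts) {
        int total = 0;
        for (String p : parts) {
            total += p.length();
        }
        char[] out = new char[total];
        int offset = 0;
        for (String p : parts) {
            p.getChars(0, p.length(), out, offset);
            offset += p.length();
        }
        // The public constructor copies the array once more; the JVM-internal
        // code path described in the article can avoid that extra copy.
        return new String(out);
    }

    public static void main(String[] args) {
        System.out.println(concatAll("str1", "str2", "str3", "str4"));
        // prints str1str2str3str4
    }
}
```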

 

 

Let’s run one more test for this option. As noted earlier, the default StringBuilder constructor allocates a 16-character buffer, which is expanded when we need to add the 17th character to it. Let’s append each number between 100 and 100,000 to the string "12345678901234". The resulting strings are 17 to 20 characters long, so the default + operator implementation requires a StringBuilder resize. As a counterexample, let’s run another test in which we explicitly create a StringBuilder(21) to ensure that its buffer is never resized:

    // first test: + operator
    final String s = BASE + i;

    // second test: presized StringBuilder
    final String s = new StringBuilder( 21 ).append( BASE ).append( i ).toString();
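The sizes involved can be checked with a small sketch. BASE is set to the 14-character string named in the text; the class and method names are illustrative.

```java
final class PresizedBuilderDemo {
    static final String BASE = "12345678901234"; // 14 characters

    static String withPlus(int i) {
        return BASE + i;
    }

    static String withPresized(int i) {
        return new StringBuilder(21).append(BASE).append(i).toString();
    }

    // Capacity after both appends; a value of 21 means the buffer was never resized.
    static int capacityAfter(int i) {
        return new StringBuilder(21).append(BASE).append(i).capacity();
    }

    public static void main(String[] args) {
        System.out.println(withPlus(100).length());    // prints 17
        System.out.println(withPlus(100000).length()); // prints 20
        System.out.println(capacityAfter(100000));     // prints 21
    }
}
```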

 

Without this option, the + implementation takes roughly 45% longer than the explicit StringBuilder implementation (0.958 sec versus 0.663 sec). Turning the option on makes both results equal. More interestingly, even the explicit StringBuilder implementation gets faster with it!

 

    +, turned off    +, turned on    new StringBuilder(21), turned off    new StringBuilder(21), turned on
    0.958 sec        0.494 sec       0.663 sec                            0.494 sec

 

    

Summary

 

Never use concatenation with an empty string "" as a “to string conversion”. Use the appropriate String.valueOf method or the wrapper type’s toString(value) method instead.

 

Whenever possible, use StringBuilder for string concatenation. Check old code and get rid of StringBuffer if possible.

 

 

Use the -XX:+OptimizeStringConcat option, introduced in Java 6 update 20, to improve string concatenation performance. It is turned on by default in recent Java 7 releases, but it is still turned off in Java 6u41.

 

 

 

 

 

 

 

 

 

 

 

  

 

 

  
