Java Development - String

1.What

A Java string is a sequence of characters represented as an object of the class java.lang.String. Strings are created and manipulated through the String class. Once created, a string is immutable: its value cannot be changed.

2.Why

Strings are among the most frequently used types in Java code. The String class supports comparison, searching, and many other operations, which makes it central to everyday Java programming.

3.How

3.1 Creating a Java string

Below is a simple example of creating a Java string.
String greeting = "Hello world!";

In this example, “Hello world!” is a string literal: a series of characters enclosed in double quotation marks and written directly into the source code of the program. A String object with that value is created for the literal.

Whenever a new string is created, it is stored in memory.

Memory is split into two high-level areas, the stack and the heap. Java stores objects on the heap; references held in local variables are stored on the stack.

Another way to create a string is to use the new keyword, as in the following example.
String s1 = new String("Hello world!");

That code creates a new String object and stores a reference to it in the variable s1.

Java strings are immutable. Once created, a string cannot be changed; any operation that appears to modify a string creates another String instance instead.
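
As a minimal sketch of this behavior, the class below applies two "modifying" operations and returns the original string next to the results. The class and method names (ImmutabilityDemo, transform) are made up for this illustration.

```java
final class ImmutabilityDemo {
    // Returns {original, upperCased, concatenated} to show that the
    // original string is untouched by operations that seem to change it.
    static String[] transform(String original) {
        String upper = original.toUpperCase();      // a new String
        String longer = original.concat(" world!"); // another new String
        return new String[] {original, upper, longer};
    }

    public static void main(String[] args) {
        for (String s : transform("Hello")) {
            System.out.println(s);
        }
    }
}
```

The first element is still "Hello": both method calls produced new objects and left the original alone.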

Basic operations such as comparison are available as methods. The equals() method compares the contents of two strings, as in the following example.

String str = "one";
String st = new String("one");
System.out.println (str.equals(st));

The code above compares the contents of the two strings. They are the same, so the program prints the boolean value true.
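
For contrast, here is a small sketch of how equals() differs from the == operator, which compares references and not contents. The helper names (sameContent, sameObject) are assumptions introduced for this example.

```java
final class StringEqualityDemo {
    // True when both strings hold the same characters.
    static boolean sameContent(String a, String b) {
        return a.equals(b);
    }

    // True only when both variables refer to the very same object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String str = "one";
        String st = new String("one");
        System.out.println(sameContent(str, st)); // true
        System.out.println(sameObject(str, st));  // false: new always creates a distinct object
    }
}
```

This is why equals() is the usual choice when the question is whether two strings contain the same text.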

3.2 String and char

The basic differences

  • char is a primitive data type, whereas String is a class.

  • char represents a single character (one UTF-16 code unit), whereas a String can hold zero or more characters. A String is backed internally by an array, but it is not itself an array.

  • A char literal is written in single quotes ('), whereas a String literal is written in double quotes ("). The double-quote syntax is special language support for String; a String can also be created with the new keyword.

Use String.valueOf(char c) or Character.toString(char c) to convert a char to a String.

public class JavaCharToString {

    public static void main(String[] args) {
	char c = 'X';
	String str = String.valueOf(c);
	String str1 = Character.toString(c);
	System.out.println(c + " char converted to String using String.valueOf(char c) = " + str);
	System.out.println(c + " char converted to String using Character.toString(char c) = " + str1);
    }
}

Because a String is a sequence of char values, it can be converted to a char array.

import java.util.Arrays;

public class JavaStringToCharArray {

    public static void main(String[] args) {
	String str = "journaldev.com";
	// get char at specific index
	char c = str.charAt(0);
	// Character array from String
	char[] charArray = str.toCharArray();
	System.out.println(str + " String index 0 character = " + c);
	System.out.println(str + " String converted to character array = " + Arrays.toString(charArray));
    }
}

3.3 Guaranteeing string immutability

public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence {
    @Stable
    private final char[] value;
    ......
}

The definition of String shows that:

  • The array holding the characters is private and final, and the String class exposes no method that modifies it.

  • The String class is declared final, so it cannot be subclassed. This prevents subclasses from breaking String immutability.

  • When a String is constructed from a StringBuilder or another source, the String class copies the data instead of sharing the caller's array. In other words, the new string holds its own copy of the argument's contents.

  • Some operations that expose the data likewise return a copy (made with System.arraycopy or a similar routine) instead of the internal array.

Since Java 9, the declaration of the field has changed to:

 private final byte[] value;

The newer String implementation supports two internal encodings: Latin-1 and UTF-16. If every character in the string can be represented in Latin-1, Latin-1 is used and each character occupies one byte (8 bits) of the byte array. Compared with the old char array, where each char occupies two bytes (16 bits), this halves the memory used for the character data. If the string contains any character that cannot be represented in Latin-1, UTF-16 is used and the byte array occupies the same space as the char array did.

3.4 Why String is designed to be immutable in Java

  • Requirement of the string pool. See 3.5.

  • Caching the hash code
    The hash code of a string is used frequently in Java, for example in a HashMap. Immutability guarantees that the hash code never changes, so it can be cached without worrying about later changes. There is no need to recalculate the hash code every time it is used, which is more efficient.

  • Facilitating the use of other objects
    To make this concrete, consider the following program:

Set<String> set = new HashSet<String>();
set.add(new String("a"));
set.add(new String("b"));
set.add(new String("c"));

// Hypothetical: this loop does not compile, because String exposes no writable value field.
for (String a : set)
    a.value = "a";

If String were mutable, its value could be changed in this way, which would violate the design of a set (a set contains distinct elements): all three elements would become equal. The example is simplified for illustration; the real String class exposes no writable value field, so this code does not compile.

  • Security
    String is widely used as a parameter type in many Java classes, for example to open network connections or files. If String were mutable, the target of a connection or file operation could be changed after it had been checked, which would be a serious security threat: the method would think it was connecting to one machine but would actually connect to another. Mutable strings could cause security problems in reflection too, because its parameters are strings.
boolean connect(String s) {
    if (!isSecure(s)) {
        throw new SecurityException();
    }
    // If s could be changed through another reference after the check above,
    // the call below would act on a value that was never checked.
    causeProblem(s);
    return true;
}

  • Thread safety
    Because immutable objects cannot be changed, they can be shared freely among multiple threads. This eliminates the need for synchronization.

In summary, String is designed to be immutable for the sake of efficiency and security. These are also the reasons immutable classes are preferred in general.

3.5 Types of strings in Java

Every Java string is an object, but strings created from literals and strings created with new are sometimes distinguished as "primitive" strings and object strings:

  • Primitive strings. These are string literals, or strings obtained without calling a constructor. A constructor is a special method used to initialize objects.

  • Object strings. These are strings created with the new operator. An expression such as new String("one") can involve two objects, whereas a literal involves just one: the literal inside the parentheses is a pooled string, and new creates a second, separate String object on the heap.

The two kinds of strings are stored in memory differently.

When a string literal is evaluated, the Java virtual machine (JVM) checks the string constant pool, a memory area where strings are stored, to see whether a string with that value already exists. If it does, the variable refers to the existing pooled string. If it does not, the JVM creates a new string and adds it to the pool.

The heap provides global access to the data stored in it for the entire life of the Java application. All objects are stored on the heap, including string literals, which are stored in the string constant pool inside the heap. When an object string is created with new, the literal between the double quotes goes into the string constant pool, and new creates a separate String object elsewhere on the heap. A local variable assigned the string is stored on the stack and holds a reference to that object.

  • String objects declared directly with double quotes are stored in the constant pool.

  • A String object that was not declared with double quotes can be added with the String#intern method. intern checks whether an equal string already exists in the string constant pool. If it does, intern returns the pooled string; if it does not, the current string is put into the pool.

JDK 7 changed the intern operation and the constant pool in two main ways:

  • The string constant pool was moved from the PermGen area to the Java heap.

  • When String#intern is called for a string that already exists as an object on the heap but is not yet in the pool, the pool stores a reference to that object directly instead of creating a new copy.

Tips: References held in local variables reside on the stack. The objects themselves reside on the heap. Pooled string values reside in the string pool inside the heap.
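
The pool behavior described above can be observed with reference comparisons. The following is a minimal sketch; the class name StringPoolDemo and the method compare are assumptions made for this example.

```java
final class StringPoolDemo {
    // Returns three reference comparisons that expose the string pool.
    static boolean[] compare() {
        String literal = "pooled";
        String sameLiteral = "pooled";
        String heapCopy = new String("pooled");
        return new boolean[] {
            literal == sameLiteral,      // both variables refer to the pooled string
            literal == heapCopy,         // new created a separate heap object
            literal == heapCopy.intern() // intern() returns the pooled string
        };
    }

    public static void main(String[] args) {
        for (boolean b : compare()) {
            System.out.println(b);
        }
    }
}
```

The comparisons use == on purpose, because the point is object identity; ordinary code should still compare string contents with equals().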

3.6 String and CharSequence

A CharSequence is a readable sequence of char values. This interface provides uniform, read-only access to many different kinds of char sequences. CharSequence does not refine the general contracts of the equals() and hashCode() methods, so there is no guarantee that objects of different classes implementing CharSequence compare equal even if they hold the same underlying sequence.

Note that an interface or abstract class cannot be instantiated with new, but a variable of that type can be assigned an instance of an implementing class:

CharSequence cs = "hello";

But it cannot be created like this:

CharSequence cs = new CharSequence("hello"); // does not compile: CharSequence is an interface

The diagram below shows the classes that implement the CharSequence interface.

The differences between String and CharSequence are as follows:

  • Both reside in the java.lang package, but CharSequence is an interface and String is a concrete class that implements it. Moreover, the String class is immutable.

  • The interface does not imply a built-in comparison strategy, whereas the String class implements the Comparable<String> interface. To compare two CharSequences, convert them to Strings with toString() and then compare the results.
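
A short sketch of the comparison point, assuming a helper named sameText that is not part of any library: equals() across different CharSequence implementations reports false, while comparing the toString() results, or calling String.contentEquals, compares the actual characters.

```java
final class CharSequenceCompareDemo {
    // Compares two char sequences by content, whatever their classes are.
    static boolean sameText(CharSequence a, CharSequence b) {
        return a.toString().equals(b.toString());
    }

    public static void main(String[] args) {
        CharSequence text = "hello";
        CharSequence builder = new StringBuilder("hello");
        System.out.println(text.equals(builder));           // false: different classes
        System.out.println(sameText(text, builder));        // true
        System.out.println("hello".contentEquals(builder)); // true
    }
}
```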

The String, StringBuffer, and StringBuilder classes all implement the CharSequence interface. For mutable strings, use StringBuffer or StringBuilder. The differences between StringBuffer and StringBuilder are listed below:

| No. | StringBuffer | StringBuilder |
|--|--|--|
| 1 | StringBuffer is synchronized, i.e. thread safe. | StringBuilder is not synchronized, i.e. not thread safe. |
| 2 | StringBuffer is less efficient than StringBuilder. | StringBuilder is more efficient than StringBuffer. |
| 3 | StringBuffer was introduced in Java 1.0. | StringBuilder was introduced in Java 1.5. |
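
The two classes share the same append-style API, so switching between them is mostly a matter of changing the type. Below is a minimal sketch; the class and method names are invented for this example.

```java
final class BuilderDemo {
    // Builds "0,1,...,n-1" in a single mutable, unsynchronized buffer.
    static String countTo(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(i);
        }
        return sb.toString();
    }

    // Same logic with the synchronized StringBuffer.
    static String countToSynchronized(int n) {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(i);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(countTo(4));
        System.out.println(countToSynchronized(4));
    }
}
```

When the buffer is confined to one thread, as in a local variable, StringBuilder is usually the natural choice.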

Tips: StringJoiner is a class added in Java 8 in the java.util package. It is useful when you need to join strings, for example the elements of a Stream. Simply put, it joins strings using a delimiter and an optional prefix and suffix.

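// Assumes JUnit 4 on the classpath. PREFIX and SUFFIX were not defined in the
// original snippet; the values below are placeholders.
import static org.junit.Assert.assertEquals;

import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;
import java.util.stream.Collectors;
import org.junit.Test;

public class StringJoinerTest {
    private static final String PREFIX = "[";
    private static final String SUFFIX = "]";

    @Test
    public void whenEmptyJoinerWithPrefixSuffixAndEmptyValue_thenDefaultValue() {
        StringJoiner joiner = new StringJoiner(",", PREFIX, SUFFIX);
        joiner.setEmptyValue("default");

        assertEquals("default", joiner.toString());
    }

    @Test
    public void whenUsedWithinCollectors_thenJoined() {
        List<String> rgbList = Arrays.asList("Red", "Green", "Blue");
        String commaSeparatedRGB = rgbList.stream()
          .collect(Collectors.joining(","));

        assertEquals("Red,Green,Blue", commaSeparatedRGB);
    }
}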

Collectors.joining() internally uses StringJoiner to perform the joining operation.

3.7 String and main character sets

Before digging deeper, let’s quickly review three terms: encoding, charset, and code point.

3.7.1 Encoding

Computers can only understand binary representations made of 1s and 0s. Processing anything else requires some kind of mapping from real-world text to its binary representation. This mapping is known as character encoding, or simply encoding. For example, the letter “T” encodes to “01010100” in US-ASCII.

3.7.2 Charsets

Mappings of characters to binary representations vary greatly in the characters they include. The number of characters in a mapping can range from only a few to all the characters in practical use. The set of characters included in a mapping definition is formally called a charset. For example, ASCII has a charset of 128 characters.

3.7.3 Code point

A code point is an abstraction that separates a character from its actual encoding. A code point is an integer that refers to a particular character.

The integer can be written in plain decimal or in another base such as hexadecimal or octal. Other bases make large numbers easier to refer to. For example, the letter T has the Unicode code point “U+0054” (84 in decimal).
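
In Java, code points can be inspected through methods on String and Character. The sketch below is illustrative only; the helper name describeFirstCodePoint is an assumption. It also shows that a code point above U+FFFF takes two char values in a Java string.

```java
import java.util.Locale;

final class CodePointDemo {
    // Formats the first code point of s as "U+XXXX (decimal)".
    static String describeFirstCodePoint(String s) {
        int cp = s.codePointAt(0);
        return String.format(Locale.ROOT, "U+%04X (%d)", cp, cp);
    }

    public static void main(String[] args) {
        System.out.println(describeFirstCodePoint("T")); // U+0054 (84)

        // U+1F600 lies outside the Basic Multilingual Plane.
        String supplementary = new String(Character.toChars(0x1F600));
        System.out.println(supplementary.length());                                 // 2 char values
        System.out.println(supplementary.codePointCount(0, supplementary.length())); // 1 code point
    }
}
```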

3.7.4 Common character sets

A character encoding can take various forms depending on the number of characters it encodes. Let’s go through some of the encoding schemes that are popular in practice.

  • Single-byte encoding
    ASCII’s 128-character set covers the English alphabet in lower and upper case, digits, and some special and control characters. The original ASCII left the most significant bit of every byte unused. One of the more popular ASCII extensions is ISO-8859-1, also referred to as “ISO Latin 1”.

  • Multi-byte encoding
    Unicode is a standard that aims to define a code point for every character in the world's writing systems.

    UTF-32 is an encoding scheme for Unicode that uses four bytes to represent every code point defined by Unicode. Using four bytes for every character is clearly space inefficient.

    assertEquals(convertToBinary("T", "UTF-32"), "00000000 00000000 00000000 01010100");

    UTF-16 is a variable-length character encoding. It encodes each code point in either 2 bytes or 4 bytes.

    UTF-8 is another encoding scheme for Unicode that uses a variable number of bytes (one to four) per code point. Owing to its space efficiency, it is the most common encoding on the web.

    assertEquals(convertToBinary("語", "UTF-8"), "11101000 10101010 10011110");

    GB 2312 uses two bytes to represent each graphic character and is used to encode Chinese.
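
The assertions in the list above call a convertToBinary helper that this article does not define. One possible version is sketched below under that assumption: it encodes the input with the named charset and prints each byte as eight bits.

```java
import java.nio.charset.Charset;
import java.util.StringJoiner;

final class BinaryEncodingDemo {
    // Hypothetical helper: encodes input with the given charset and returns
    // the bytes as space-separated groups of eight bits.
    static String convertToBinary(String input, String charsetName) {
        byte[] bytes = input.getBytes(Charset.forName(charsetName));
        StringJoiner joiner = new StringJoiner(" ");
        for (byte b : bytes) {
            // Mask to 0..255 so negative bytes do not produce 32-bit strings.
            String bits = Integer.toBinaryString(b & 0xFF);
            // Left-pad with zeros to a width of eight.
            joiner.add(String.format("%8s", bits).replace(' ', '0'));
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        System.out.println(convertToBinary("T", "US-ASCII"));
        System.out.println(convertToBinary("\u8A9E", "UTF-8")); // \u8A9E is the character used in the UTF-8 example
    }
}
```

Charset.forName throws an exception for unknown charset names, so the sketch only works with charsets that the running JVM supports.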

Here are the differences between UTF-8 and UTF-16:

  • UTF-8 uses at least one byte to encode a character, while UTF-16 uses at least two bytes.
  • A UTF-8 encoded file tends to be smaller than a UTF-16 encoded file when the text is mostly ASCII.
  • UTF-8 is compatible with ASCII, while UTF-16 is not.
  • UTF-8 is byte oriented, while UTF-16 is not.
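
The size difference can be checked by encoding the same text both ways. This is a minimal sketch with invented names; UTF_16BE is used so that no byte-order mark is added to the count.

```java
import java.nio.charset.StandardCharsets;

final class EncodedLengthDemo {
    // Number of bytes the text occupies in UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes the text occupies in UTF-16 (big-endian, no BOM).
    static int utf16Length(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        String[] samples = {"T", "\u8A9E", "\uD83D\uDE00"};
        for (String sample : samples) {
            System.out.println(utf8Length(sample) + " vs " + utf16Length(sample));
        }
    }
}
```

The ASCII letter is smaller in UTF-8, while the CJK character \u8A9E takes three bytes in UTF-8 but only two in UTF-16, so which encoding is smaller depends on the text.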

The Java platform depends heavily on a property called the default charset. The Java Virtual Machine (JVM) determines the default charset during start-up. Before JDK 18, it depends on the locale and the charset of the underlying operating system on which the JVM is running. For example, on macOS the default charset is UTF-8.

3.7.5 JDK 18 and the default charset

Before JDK 18, the default charset depended heavily on the operating system. Starting with JDK 18, the default charset is always UTF-8 unless explicitly configured otherwise. Points to know about this change:

  • What is the default charset?
    A charset is always involved when a sequence of bytes is converted to a sequence of characters (char in Java) or back. When you write Java code like the following:

     FileWriter fw = new FileWriter("data.txt");
     Scanner sc = new Scanner(new File("data.txt"));
    

    you are implicitly using the default charset.

  • Why make UTF-8 the default charset?
    With this change, APIs that depend on the default charset behave consistently across all implementations, operating systems, locales, and configurations. This makes Java programs more predictable and portable.

  • What needs to be done to adapt to this change?
    On JDK 18 the default charset is UTF-8, and this is also reflected in the file.encoding system property, while the native.encoding system property reflects the “native” encoding of the operating system. If you use JDK 18 and want to revert to the old behavior, there is a simple, documented way: set the file.encoding system property to COMPAT when you launch the application:

    > java -Dfile.encoding=COMPAT CharsetInfo

    With this setting, the default charset is chosen as in JDK 17 and earlier versions.
    JDK 18 changed the default charset not only at run time but also at compile time: javac now assumes UTF-8 source files by default. This may cause compatibility problems at the source level. If so, use the javac -encoding option to set the source encoding.
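
Regardless of the JDK version, code can avoid depending on the default charset by naming the charset explicitly. The sketch below writes and reads a temporary file as UTF-8; the class name, method name, and temporary file prefix are assumptions made for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class ExplicitCharsetDemo {
    // Writes text to a temporary file as UTF-8 and reads it back as UTF-8,
    // so the result does not depend on the platform default charset.
    static String roundTrip(String text) {
        try {
            Path file = Files.createTempFile("charset-demo", ".txt");
            try {
                Files.write(file, text.getBytes(StandardCharsets.UTF_8));
                return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The value printed here varies with the JDK version and configuration.
        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println(roundTrip("\u8A9E").equals("\u8A9E"));
    }
}
```

Passing the charset explicitly keeps the behavior the same on JDK 17 and JDK 18, which can make the COMPAT setting unnecessary for that code path.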

4.Samples

4.1 Java string object example

The following example builds a string from the character values in the char array ch.

char[] ch = {'t','e','c','h','t','a','r','g','e','t'};
String s2 = new String (ch);

This produces a string with the same content as the example below.
String s2 = "techtarget";

4.2 Commonly used string library class example

// Java program to demonstrate use of
// StringJoiner class over StringBuilder class

import java.util.StringJoiner;

public class Test
{
	public static void main(String[] args)
	{
		// given string array
		String str[] = {"George","Sally","Fred"};
			
		// By using StringJoiner class
			
		// initializing StringJoiner instance with
		// required delimiter, prefix and suffix
		StringJoiner sj = new StringJoiner(":", "[", "]");
			
		// concatenating strings
		sj.add("George").add("Sally").add("Fred");
			
		// converting StringJoiner to String
		String desiredString = sj.toString();
			
		System.out.println(desiredString);
			
		// By using StringBuilder class
			
		// declaring empty stringbuilder
		StringBuilder sb = new StringBuilder();
			
		// appending prefix
		sb.append("[");
			
		// checking for empty string array
		if(str.length>0)
		{
			// appending first element
			sb.append(str[0]);
				
			// iterating through string array
			// and appending required delimiter
			for (int i = 1; i < str.length; i++)
			{
				sb.append(":").append(str[i]);
			}
		}
			
		// finally appending suffix
		sb.append("]");
			
		// converting StringBuilder to String
		String desiredString1 = sb.toString();
			
		System.out.println(desiredString1);
	}
}

Output:

[George:Sally:Fred]
[George:Sally:Fred]

4.3 String function example

     String cde = "cde";
     System.out.println("abc" + cde);   // prints abccde
     String c = "abc".substring(2, 3);  // "c"
     String d = cde.substring(1, 2);    // "d"

4.4 String split performance

String.split accepts a regular expression as its argument, which may cause serious performance problems. In the Java 6 implementation, every call to String.split creates a new Pattern object to compile the regular expression and then splits the string. The Pattern is not cached, so performance is poor when split is called frequently. The Java 7 implementation therefore optimizes splitting on a single character: instead of going through the regular expression engine, it uses indexOf to locate the split positions quickly. For complex splitting with regular expressions, you can also use third-party components such as Guava’s Splitter API.
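
Another option that stays within the JDK is to compile the regular expression once and reuse the Pattern. This is not taken from the implementation notes above; it is a sketch of a common workaround, with an invented class name and an example delimiter pattern.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

final class SplitDemo {
    // Compiled once and reused; a comma with optional surrounding whitespace.
    private static final Pattern COMMA = Pattern.compile("\\s*,\\s*");

    // Splits a line into fields without recompiling the pattern on each call.
    static String[] splitFields(String line) {
        return COMMA.split(line);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitFields("a , b,c")));
    }
}
```

Whether this is measurably faster depends on the JDK version and on how often split is called, so it is worth measuring before relying on it.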
