Java, Three Hundred Lines a Day: DAY19 String Class

Article link: https://blog.csdn.net/qq_69515036/article/details/124503726

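This article shows how to build a custom string class that implements the basic string operations: construction, conversion to String, string matching, and substring extraction. Matching uses the naive method, with $O(mn)$ time complexity. A more efficient algorithm, KMP, exists, but the naive method is still common in practice because patterns are usually short. The article also provides test cases to verify the class.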

0. Topic

Today we implement some features of a string class, including string matching and substring extraction.

1. Program

1. Fields

	/**
	 * The maximal length.
	 */
	public static final int MAX_LENGTH = 10;
	
	/**
	 * The actual length.
	 */
	int length;
	
	/**
	 * The data.
	 */
	char[ ] data;

2. Constructors
Two constructors are implemented: the first creates an empty string, and the second builds a MyString from a given String.

	/**
	 *********************
	 * Construct an empty char array.
	 *********************
	 */
	public MyString( ) {
		length = 0;
		data = new char[ MAX_LENGTH ];
	} // Of the first constructor
	
	/**
	 *********************
	 * Constructor using a system defined string.
	 * 
	 * @param paraString The given string. Its length should not exceed MAX_LENGTH.
	 *********************
	 */
	public MyString( String paraString ) {
		data = new char[ MAX_LENGTH ];
		length = paraString.length( );
		
		//Copy data.
		for( int i = 0; i < length; i++ ) {
			data[ i ] = paraString.charAt( i );
		} // Of for i
	} // Of the second constructor

3. toString

	/**
	 *********************
	 * Overrides the method declared in Object, the superclass of every class.
	 *********************
	 */
	@Override
	public String toString( ) {
		String resultString = "";
		
		for( int i = 0; i < length; i++ ) {
			resultString += data[ i ];
		} // Of for i
		
		return resultString;
	} // Of toString

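4. locate
This is string matching: each candidate substring of the main string is compared with the pattern in turn, and the position of the first match in the main string is returned.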

	/**
	 *********************
	 * Locate the position of a substring.
	 * 
	 * @param paraMyString The given substring.
	 * @return The first position. -1 for no matching.
	 *********************
	 */
	public int locate( MyString paraMyString ) {
		boolean tempMatch = false;
		for( int i = 0; i < length - paraMyString.length + 1; i++ ) {
			//Initialize.
			tempMatch = true;
			for( int j = 0; j < paraMyString.length; j++ ) {
				if( data[ i + j ] != paraMyString.data[ j ] ) {
					tempMatch = false;
					break;
				} // Of if
			} // Of for j
			
			if( tempMatch ) {
				return i;
			} // Of if
		} // Of for i
		return -1;
	} // Of locate

Time complexity: $O(mn)$ in the worst case, where $m$ and $n$ are the lengths of the main string and the pattern, respectively.
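
To make the bound concrete, here is a minimal sketch that runs the same naive scan on plain String values and counts character comparisons. The class NaiveMatchCount and the method countComparisons are hypothetical names introduced only for this illustration; they are not part of MyString.

```java
// Illustrative sketch only: NaiveMatchCount and countComparisons are
// hypothetical names, not part of the MyString class in this article.
class NaiveMatchCount {
	// Runs the same naive scan as locate() and returns the number of
	// character comparisons performed up to the first match (or to the end).
	static int countComparisons(String text, String pattern) {
		int comparisons = 0;
		for (int i = 0; i < text.length() - pattern.length() + 1; i++) {
			boolean match = true;
			for (int j = 0; j < pattern.length(); j++) {
				comparisons++;
				if (text.charAt(i + j) != pattern.charAt(j)) {
					match = false;
					break;
				}
			}
			if (match) {
				break; // locate() would return i at this point.
			}
		}
		return comparisons;
	}

	public static void main(String[] args) {
		// Mismatches show up on the first character: prints 5.
		System.out.println(countComparisons("I like it.", "ik"));
		// Mismatches show up only on the last pattern character: prints 12.
		System.out.println(countComparisons("aaaaab", "aab"));
	}
}
```

In the second call every start position costs a full pass over the pattern, which is the behaviour the $O(mn)$ bound describes. In the first call most positions are rejected after a single comparison.
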
5. substring
Returns the substring of the main string that starts at the given position and has the given length.

	/**
	 *********************
	 * Get a substring.
	 * 
	 * @param paraStartPosition The start position in the original string.
	 * @param paraLength The length of the new string.
	 * @return The substring, or null if the requested range is out of bounds.
	 *********************
	 */
	public MyString substring( int paraStartPosition, int paraLength ) {
		if( paraStartPosition < 0 || paraLength < 0 || paraStartPosition + paraLength > length ) {
			System.out.println("The bound is exceeded.");
			return null;
		} // Of if
		
		MyString resultMyString = new MyString( );
		resultMyString.length = paraLength;
		for( int i = 0; i < paraLength; i++ ) {
			resultMyString.data[ i ] = data[ paraStartPosition + i ];
		} // Of for i
		
		return resultMyString;
	} // Of substring
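
An alternative to printing a message and returning null is to throw an exception, so that the caller cannot silently go on with a null result. The following is a minimal sketch under that assumption. It works on a plain char array, and the class SubstringCheck with its static substring helper is a hypothetical name for illustration, not the article's method.

```java
// Illustrative sketch only: SubstringCheck is a hypothetical helper class,
// not part of the MyString class in this article.
class SubstringCheck {
	// Copies paraLength characters starting at paraStart out of the first
	// length characters of data. Invalid ranges raise an exception.
	static char[] substring(char[] data, int length, int paraStart, int paraLength) {
		// Comparing against length - paraLength avoids int overflow in the sum.
		if (paraStart < 0 || paraLength < 0 || paraStart > length - paraLength) {
			throw new IndexOutOfBoundsException(
					"start " + paraStart + ", length " + paraLength + ", string length " + length);
		}
		char[] result = new char[paraLength];
		System.arraycopy(data, paraStart, result, 0, paraLength);
		return result;
	}

	public static void main(String[] args) {
		char[] data = "I like it.".toCharArray();
		// A valid range: prints "e it." (without the quotes).
		System.out.println(new String(substring(data, data.length, 5, 5)));
		try {
			substring(data, data.length, 5, 6); // 5 + 6 exceeds the length 10.
		} catch (IndexOutOfBoundsException e) {
			System.out.println("Rejected: " + e.getMessage());
		}
	}
}
```

Which style to prefer is a design choice: returning null keeps the test program running, while an exception makes the misuse visible at the call site.
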

6. Testing
The test code is as follows:

	/**
	 *********************
	 * The entrance of the program.
	 * 
	 * @param args Not used now.
	 *********************
	 */
	public static void main( String args[ ] ) {
		MyString tempFirstString = new MyString("I like it.");
		MyString tempSecondString = new MyString("ik");
		int tempPosition = tempFirstString.locate( tempSecondString );
		System.out.println("The position of \"" + tempSecondString + "\" in \"" + tempFirstString + "\" is: " + tempPosition);
		
		MyString tempThirdString = new MyString("ki");
		tempPosition = tempFirstString.locate(tempThirdString);
		System.out.println("The position of \"" + tempThirdString + "\" in \"" + tempFirstString
				+ "\" is: " + tempPosition);

		tempThirdString = tempFirstString.substring(1, 2);
		System.out.println("The substring is: \"" + tempThirdString + "\"");

		tempThirdString = tempFirstString.substring(5, 5);
		System.out.println("The substring is: \"" + tempThirdString + "\"");

		tempThirdString = tempFirstString.substring(5, 6);
		System.out.println("The substring is: \"" + tempThirdString + "\"");
	} // Of main

The result is as follows:
[Image: screenshot of the program output]

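2. Reflections

In the substring method, checking the bounds before copying is essential.
Naive string matching takes $O(mn)$ time in the worst case, while another string matching algorithm, KMP, takes $O(m + n)$. That looks like a big gap, but the textbook says the two differ little in practice, so naive matching is still widely used. My guess is that this is because patterns in practice tend to be short: most searches are for a few words, or a sentence or two, so the $O(mn)$ bound behaves less badly than it looks. Another likely factor is that on ordinary text a mismatch usually shows up within the first character or two, so the inner loop rarely runs over the whole pattern.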
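
For comparison, here is a minimal sketch of KMP on plain String values. It is a common textbook formulation, not code from this series, and the names KmpSketch, buildFailure and indexOf are hypothetical. The idea is that after a mismatch the scan never moves backwards in the text; it reuses a precomputed table to decide how much of the pattern is still matched.

```java
// Illustrative sketch only: KmpSketch, buildFailure and indexOf are
// hypothetical names, not part of the MyString class in this article.
class KmpSketch {
	// fail[j] is the length of the longest proper prefix of
	// pattern[0..j] that is also a suffix of pattern[0..j].
	static int[] buildFailure(String pattern) {
		int[] fail = new int[pattern.length()];
		int k = 0; // Length of the currently matched prefix.
		for (int j = 1; j < pattern.length(); j++) {
			while (k > 0 && pattern.charAt(j) != pattern.charAt(k)) {
				k = fail[k - 1];
			}
			if (pattern.charAt(j) == pattern.charAt(k)) {
				k++;
			}
			fail[j] = k;
		}
		return fail;
	}

	// Returns the first position of pattern in text, or -1 if there is none.
	static int indexOf(String text, String pattern) {
		if (pattern.isEmpty()) {
			return 0; // Same result as locate() for an empty pattern.
		}
		int[] fail = buildFailure(pattern);
		int k = 0; // Number of pattern characters matched so far.
		for (int i = 0; i < text.length(); i++) {
			while (k > 0 && text.charAt(i) != pattern.charAt(k)) {
				k = fail[k - 1]; // Fall back in the pattern, never in the text.
			}
			if (text.charAt(i) == pattern.charAt(k)) {
				k++;
			}
			if (k == pattern.length()) {
				return i - k + 1;
			}
		}
		return -1;
	}

	public static void main(String[] args) {
		System.out.println(indexOf("I like it.", "ik")); // Prints 3.
		System.out.println(indexOf("I like it.", "ki")); // Prints -1.
	}
}
```

The extra table and bookkeeping are the price of the $O(m + n)$ bound, which is one reason the simpler naive scan can remain attractive when patterns are short.
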