Splitting a String by Bytes (Part 1)

wavefly_liu

Category: J2SE. Tags: string, byte

Article link: https://blog.csdn.net/liu251/article/details/2594858


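I once ran into this written-test question:

Write a function that truncates a string. The input is a string and a byte count, and the output is the string truncated to that many bytes, with the guarantee that a Chinese character is never cut in half. For example, given “我ABC” and 4 the result should be “我AB”; given “我ABC汉DEF” and 6 the result should be “我ABC”, not “我ABC” plus half of “汉”. (The examples count each Chinese character as 2 bytes, as in GBK.)

An answer I found online does not meet the requirement: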

    public void split(String srcStr, int length)
    {
        // Number of pieces, rounding up when the length is not an exact multiple.
        int loopCount;
        loopCount = (srcStr.length() % length == 0) ? (srcStr.length() / length) : (srcStr.length() / length + 1);
        System.out.println("loopCount is: " + loopCount);
        System.out.println("Str will be split into " + loopCount + " pieces");
        for (int i = 1; i <= loopCount; i++) {
            if (i == loopCount)
            {
                // Last piece: take whatever remains.
                System.out.println(srcStr.substring((i - 1) * length, srcStr.length()));
            }
            else {
                System.out.println(srcStr.substring((i - 1) * length, (i * length)));
            }
        }
    }

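This code cuts the string directly with the substring method of the String class, which indexes by char rather than by byte, so it cannot possibly satisfy the requirement in the question.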
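To make the char-versus-byte gap concrete, here is a minimal sketch. The class name `CharVsByteDemo` and the helper `byteLength` are made up for illustration, and the use of GBK in `main` is an assumption based on the 2-bytes-per-Chinese-character counting in the question (GBK ships with standard JDK builds but is not one of the charsets every Java runtime must provide). The string literal uses the escape `\u6211` for “我”.

```java
import java.nio.charset.Charset;

final class CharVsByteDemo {
    // Hypothetical helper: number of bytes the string occupies in the given charset.
    static int byteLength(String s, Charset charset) {
        return s.getBytes(charset).length;
    }

    public static void main(String[] args) {
        // Assumption: the exercise counts bytes as GBK does.
        Charset gbk = Charset.forName("GBK");
        String src = "\u6211ABC";        // one Chinese character followed by "ABC"
        String cut = src.substring(0, 4); // takes 4 chars, not 4 bytes
        System.out.println("chars: " + cut.length());
        System.out.println("bytes: " + byteLength(cut, gbk));
    }
}
```

Under GBK the 4-char substring is the whole input and occupies 5 bytes, one more than the 4 the question asked for.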

After looking things up again, I put together two methods, both of which satisfy the requirement:

Method 1:

    /**
     * Split a String by byte count.
     * @param srcStr the source String to truncate
     * @param length the maximum byte length of the result
     * @return the truncated String
     */
    public static String splitByByte(String srcStr, int length)
    {
        StringBuilder sb = new StringBuilder();
        int srcLength = srcStr.length(); // source string length in chars
        int tempLength = 0; // running byte length
        for (int i = 0; i < srcLength; i++) {
            String tempStr = String.valueOf(srcStr.charAt(i)); // one-char string
            // getBytes() uses the platform default charset: a Chinese character
            // is 2 bytes under GBK but 3 bytes under UTF-8.
            byte[] b = tempStr.getBytes();
            tempLength += b.length;
            if (length >= tempLength)
                sb.append(tempStr);
            else
                break;
        }
        return sb.toString();
    }

This method walks through the characters of the string one at a time, adds each character's byte length to a running total, and compares the total with length. As long as the total does not exceed length, the character is appended to the new string.
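Because the result of `getBytes()` depends on the platform default charset, a variant that takes the charset as a parameter may be easier to reason about. The following is only a sketch under stated assumptions: the class `ByteTruncate` and method `truncate` are hypothetical names, the loop walks code points instead of chars so a surrogate pair is never split, and the charset is assumed to have no byte-order mark (GBK or UTF-8, for example; the JDK's UTF-16 encoder would prepend a mark to every per-character encoding and distort the count).

```java
import java.nio.charset.Charset;

final class ByteTruncate {
    // Hypothetical helper: longest prefix of src whose encoding fits in maxBytes.
    static String truncate(String src, int maxBytes, Charset charset) {
        StringBuilder sb = new StringBuilder();
        int used = 0; // bytes consumed so far
        for (int i = 0; i < src.length(); ) {
            int cp = src.codePointAt(i);
            String unit = new String(Character.toChars(cp));
            int width = unit.getBytes(charset).length;
            if (used + width > maxBytes) {
                break; // the next character would not fit completely
            }
            sb.append(unit);
            used += width;
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumption: GBK is available in this runtime.
        Charset gbk = Charset.forName("GBK");
        System.out.println(truncate("\u6211ABC", 4, gbk));
        System.out.println(truncate("\u6211ABC\u6C49DEF", 6, gbk));
    }
}
```

With GBK, the two calls in `main` correspond to the examples from the question and give “我AB” and “我ABC”. With UTF-8, where each of these Chinese characters takes 3 bytes, the first call would give “我A” instead.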

Method 2:

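    public static String splitByByte(String str, int len) {
        String result = "";
        char temp;
        // number of bytes taken so far
        int counter = 0;
        int i = 0;
        // number of Chinese characters
        int han = 0;
        // The i < str.length() guard avoids running past the end of a short input.
        while (counter < len && i < str.length()) {
            temp = str.charAt(i);
            if (Character.getNumericValue(temp) != -1) {
                // letter or digit: counts as 1 byte
                result = result + temp;
                counter++;
                i++;
            } else {
                // Treated as a Chinese character (2 bytes). The code relies on
                // getNumericValue returning -1 for Chinese characters; it also
                // returns -1 for punctuation and spaces, which are then counted as 2 bytes.
                result = result + temp;
                counter = counter + 2;
                i++;
                han = han + 1;
            }
        }

        if (counter > len) {
            if (len == 1) {
                result = "";
            } else {
                // counter - han equals result.length(), so this drops the last character
                result = result.substring(0, counter - (han + 1));
            }
        }
        return result;
    }

I found this method online. The main idea is to take characters until the byte count reaches the requested length; if the byte length of the result fits exactly, it is returned as is. Otherwise this statement trims it: result = result.substring(0,counter-(han+1)); Each Chinese character counts as 2 bytes and each Latin character as 1, so counter-han turns the byte count back into the number of characters in the string, and subtracting 1 more gives the position of the last character, which is the one to drop.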
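The index arithmetic can be checked with a small sketch of the same overshoot-then-trim idea. Note the substitutions: `OvershootTrim` and `splitByWidth` are hypothetical names, and instead of `Character.getNumericValue` the width rule here is the simpler assumption that ASCII chars take 1 byte and every other char takes 2 (which matches GBK for common Chinese characters but is not a real encoder).

```java
final class OvershootTrim {
    // Assumption: ASCII char = 1 byte, any other char = 2 bytes.
    static String splitByWidth(String str, int len) {
        StringBuilder result = new StringBuilder();
        int counter = 0; // bytes taken so far
        int han = 0;     // number of 2-byte chars taken so far
        for (int i = 0; i < str.length() && counter < len; i++) {
            char c = str.charAt(i);
            result.append(c);
            if (c <= 0x7F) {
                counter++;
            } else {
                counter += 2;
                han++;
            }
        }
        if (counter > len) {
            // Overshoot can only come from a 2-byte char taken last.
            // counter - han == result.length(), so this drops exactly that char.
            result.setLength(counter - (han + 1));
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(splitByWidth("\u6211ABC\u6C49DEF", 6));
    }
}
```

Tracing “我ABC汉DEF” with 6: counter goes 2, 3, 4, 5, 7 while han ends at 2 and the result holds 5 chars, so counter - (han + 1) is 4 and the trimmed result is “我ABC”.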