Truncating a String by Byte Length

吃饱饭长的快

Last modified 2022-03-31 11:17:06

Tags: java eclipse intellij-idea

First published 2022-03-31 10:55:17

Article link: https://blog.csdn.net/qq_29942637/article/details/123865717

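Applicable scenario: garbled output when truncating a string by byte count. A string may mix Chinese and English text, punctuation, and so on, so when you need to truncate by byte length, calling getBytes() and cutting the resulting byte[] directly can split a multi-byte character, and the last character of the restored string comes out garbled.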
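To make the problem concrete, here is a minimal sketch of the naive byte[] cut. It assumes UTF-8 (the original post relies on the platform default charset), and the class and method names are made up for illustration. The Chinese characters are written as Unicode escapes so the source compiles regardless of file encoding.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class NaiveByteCutDemo {
    // Cuts the encoded bytes directly and decodes whatever is left (the problematic approach).
    static String naiveCut(String text, int maxBytes) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        byte[] head = Arrays.copyOf(bytes, Math.min(maxBytes, bytes.length));
        return new String(head, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "ab" is 2 bytes and U+4E2D is 3 bytes in UTF-8, so a 3-byte cut lands inside U+4E2D.
        String cut = naiveCut("ab\u4E2D\u6587", 3);
        // The incomplete trailing sequence is decoded as U+FFFD, the replacement character.
        System.out.println(cut.endsWith("\uFFFD"));
    }
}
```

The trailing replacement character is the "garbled last character" described above.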

[Let's get started]

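As is well known, the number of bytes a character occupies depends on the encoding. A Chinese character may take 2, 3, or 4 bytes (for example 2 in GBK, 3 for most characters in UTF-8, and 4 for rarer characters outside the Basic Multilingual Plane), while an English letter normally takes 1 byte. Punctuation comes in half-width and full-width forms: half-width marks take 1 byte, and full-width marks take 2 bytes in GBK or 3 bytes in UTF-8.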
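The byte widths are easy to check. The sketch below sticks to charsets from StandardCharsets, since GBK is not guaranteed to be present on every Java runtime; UTF-16BE stands in here as an encoding where a common Chinese character takes 2 bytes. The class and helper names are illustrative only.

```java
import java.nio.charset.StandardCharsets;

final class ByteWidthDemo {
    static int utf8Len(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    static int utf16Len(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        String latin = "a";
        String han = "\u4E2D";            // U+4E2D, a common Chinese character
        String fullWidthComma = "\uFF0C"; // U+FF0C, full-width comma
        String rareHan = "\uD840\uDC00";  // U+20000, outside the Basic Multilingual Plane

        System.out.println(utf8Len(latin) + " / " + utf16Len(latin));                   // 1 / 2
        System.out.println(utf8Len(han) + " / " + utf16Len(han));                       // 3 / 2
        System.out.println(utf8Len(fullWidthComma) + " / " + utf16Len(fullWidthComma)); // 3 / 2
        System.out.println(utf8Len(rareHan) + " / " + utf16Len(rareHan));               // 4 / 4
    }
}
```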

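This method borrows the idea of the `squeeze theorem` and the implementation principle of the `red-black tree`. Hehe, just showing off: all it really does is close in on the cut position by stepping half the remaining distance each time, much like a bisection search. The code is as follows.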

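/**
 * Finds where to cut a string so that it fits within a byte limit,
 * without leaving a garbled trailing character.
 *
 * @param info   the string
 * @param length the maximum number of bytes
 * @return the index at which to cut the string
 */
private int splitByte(String info, int length) {
    // Note: getBytes() with no argument uses the platform default charset,
    // so the result depends on the runtime's encoding.
    if (length <= 0) {
        return 0;
    }
    // No truncation needed
    if (info.getBytes().length <= length) {
        return info.length();
    }
    // Byte count equals character count (holds for all-English text)
    if (info.getBytes().length == info.length()) {
        return length;
    }
    // Cut pointer: start at half the byte limit, clamped to the string length
    int half = Math.min(length / 2, info.length());
    int nowLength = info.substring(0, half).getBytes().length;
    // Still too short: advance by half the remaining byte gap
    while (nowLength < length) {
        if ((length - nowLength) / 2 > 1) {
            half += ((length - nowLength) / 2);
        } else {
            half++;
        }
        // Never step past the end of the string
        half = Math.min(half, info.length());
        nowLength = info.substring(0, half).getBytes().length;
    }
    // Too long: back off by half the excess
    while (nowLength > length) {
        if ((nowLength - length) / 2 > 1) {
            half -= ((nowLength - length) / 2);
        } else {
            half--;
        }
        nowLength = info.substring(0, half).getBytes().length;
    }
    // Backing off can overshoot, so advance one character at a time while it still fits
    while (half < info.length()
            && info.substring(0, half + 1).getBytes().length <= length) {
        half++;
    }
    return half;
}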
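The same halving idea can also be written as a plain binary search over the character index, because the encoded length of a prefix never shrinks as the prefix grows. This is a sketch of that alternative, not the method above: it takes an explicit Charset instead of the platform default, assumes a stateless encoding such as UTF-8, and the names ByteCutSearch and cutIndex are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class ByteCutSearch {
    // Returns the largest index i such that info.substring(0, i)
    // encodes to at most maxBytes bytes in the given charset.
    static int cutIndex(String info, int maxBytes, Charset charset) {
        int lo = 0;             // the empty prefix is always the fallback
        int hi = info.length(); // the whole string is the upper bound
        while (lo < hi) {
            int mid = lo + (hi - lo + 1) / 2; // upper midpoint, so lo always advances
            if (info.substring(0, mid).getBytes(charset).length <= maxBytes) {
                lo = mid;       // prefix fits, try a longer one
            } else {
                hi = mid - 1;   // prefix too long, try a shorter one
            }
        }
        // Do not cut between the two halves of a surrogate pair.
        if (lo > 0 && lo < info.length()
                && Character.isHighSurrogate(info.charAt(lo - 1))
                && Character.isLowSurrogate(info.charAt(lo))) {
            lo--;
        }
        return lo;
    }

    public static void main(String[] args) {
        String text = "ab\u4E2D\u6587"; // 2 ASCII letters followed by 2 Chinese characters
        int index = cutIndex(text, 4, StandardCharsets.UTF_8);
        System.out.println(index); // 2: the first Chinese character would need 5 bytes in total
    }
}
```

Each probe re-encodes a prefix, so this does roughly log2(n) encodings for a string of n characters.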

Test

// Test
private void test() {
    int limitLength = 20;
    String body = "文件 maven-repo.zip 接收完成，保存路径:此处理hellohellohellohellohellohellohellohello1234abcd方法需限制处理时间，当处理时间超过5分钟，程序会强制终止";
    String splitStr;
    if (body.getBytes().length > limitLength) {
        // Find the cut position, then cut by character index
        int splitIndex = splitByte(body, limitLength);
        splitStr = body.substring(0, splitIndex);
    } else {
        splitStr = body;
    }
    System.out.println(splitStr);
}
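For checking results like the one printed by the test, a straightforward reference implementation can help: walk the string one code point at a time and stop before the byte budget is exceeded. This is only a sketch under stated assumptions. It uses an explicit UTF-8 charset rather than the platform default, it assumes that encoding each code point separately adds up to the encoding of the whole string (true for UTF-8), and the names ByteCutScan and truncate are invented for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class ByteCutScan {
    // Keeps whole code points from the start of text while they fit in maxBytes.
    static String truncate(String text, int maxBytes, Charset charset) {
        int used = 0; // bytes consumed so far
        int i = 0;    // char index of the next code point
        while (i < text.length()) {
            int width = Character.charCount(text.codePointAt(i)); // 1, or 2 for a surrogate pair
            int size = text.substring(i, i + width).getBytes(charset).length;
            if (used + size > maxBytes) {
                break; // the next code point would exceed the budget
            }
            used += size;
            i += width;
        }
        return text.substring(0, i);
    }

    public static void main(String[] args) {
        // The first part of the test string: two Chinese characters (U+6587, U+4EF6),
        // a space, then "maven-repo.zip". In UTF-8 that is 3 + 3 + 1 + 14 = 21 bytes.
        String body = "\u6587\u4EF6 maven-repo.zip";
        String cut = truncate(body, 20, StandardCharsets.UTF_8);
        // With a 20-byte limit only the final "p" is dropped.
        System.out.println(cut.getBytes(StandardCharsets.UTF_8).length); // 20
    }
}
```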

This is the second post I have shared, so corrections are welcome. If you know a better approach, please share it with me.