Truncating a String by Bytes in Java

java小酱油啊

Category: javaSE. Tags: java, console, utf-8, byte truncation

Article link: https://blog.csdn.net/luckey_zh/article/details/43342027

A task that comes up often in Java is truncating a string.

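public class SubstringDemo {
	public static void main(String[] args) {
		String str = "我爱java";
		// substring works on char indices, so this keeps the first two characters
		System.out.println(str.substring(0, 2));
	}
}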

Running this code prints: 我爱

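Sometimes this way of truncating is not good enough. For example, in an Oracle database a column declared as varchar2(10) can, under the default byte length semantics, hold a string of at most 10 bytes. If we truncate by characters as below and then try to store the result, all we get is an error.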

public class SubstringTooLong {
	public static void main(String[] args) {
		String str = "测试小酱油啊，我是来测试的。";
		// Keeps 10 characters, which is far more than 10 bytes
		System.out.println(str.substring(0, 10));
	}
}

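In GBK a Chinese character takes two bytes, and in UTF-8 a common Chinese character takes three bytes, so here the truncation has to be done by bytes rather than by characters.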
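To make the sizes concrete, the following is a small sketch that measures encoded lengths. The class name `ByteLengthDemo` and the helper `byteLength` are made up for this illustration. The GBK part is guarded because a JVM is not required to ship that charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ByteLengthDemo {
    // Number of bytes the string occupies once encoded with the given charset.
    static int byteLength(String s, Charset charset) {
        return s.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String str = "测试小酱油啊，我是来测试的。";
        String cut = str.substring(0, 10); // 10 characters, not 10 bytes

        System.out.println(byteLength("汉", StandardCharsets.UTF_8)); // 3
        System.out.println(byteLength(cut, StandardCharsets.UTF_8)); // 30

        // GBK is an optional charset, so check that it is available first.
        if (Charset.isSupported("GBK")) {
            Charset gbk = Charset.forName("GBK");
            System.out.println(byteLength("汉", gbk)); // 2
            System.out.println(byteLength(cut, gbk));  // 20
        }
    }
}
```

Either way, the 10-character substring is well over the 10 bytes that a varchar2(10) column accepts.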

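That raises another problem. With three bytes per Chinese character, four characters need 12 bytes, so if we keep only 10 bytes the cut lands in the middle of the fourth character: only part of it is kept, and the result is garbled. The truncation therefore has to handle this case.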
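The garbling is easy to reproduce. Below is a minimal sketch, with the hypothetical names `NaiveByteCut` and `naiveCut`, that simply decodes the first `len` bytes without checking where a character ends. It assumes `len` is non-negative.

```java
import java.nio.charset.StandardCharsets;

public class NaiveByteCut {
    // Decodes the first len bytes of the UTF-8 encoding with no boundary check.
    static String naiveCut(String s, int len) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, 0, Math.min(len, bytes.length), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String cut = naiveCut("测试小酱油啊，我是来测试的。", 10);
        // The 10th byte is only the first byte of 酱, so the decoder
        // substitutes the replacement character U+FFFD for it.
        System.out.println(cut.startsWith("测试小")); // true
        System.out.println(cut.endsWith("\uFFFD"));   // true
    }
}
```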

See the code below. P.S. I found this code online; take it as a reference. My writing is not great, so please bear with me.

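package cn.ztz.test;

import java.nio.charset.StandardCharsets;

public class Test {
	public static void main(String[] args) {
		String str = "测试小酱油啊，我是来测试的";
		System.out.println(subStringByByte(str, 10));
	}

	// Returns the longest prefix of str whose UTF-8 encoding fits in len bytes.
	private static String subStringByByte(String str, int len) {
		String result = null;
		if (str != null) {
			byte[] a = str.getBytes(StandardCharsets.UTF_8);
			if (a.length <= len) {
				result = str;
			} else if (len > 0) {
				// Decode with the same charset used to encode, not the platform default.
				result = new String(a, 0, len, StandardCharsets.UTF_8);
				int length = result.length();
				// A character cut in half decodes to U+FFFD, so the last
				// character no longer matches the original at that position.
				if (str.charAt(length - 1) != result.charAt(length - 1)) {
					if (length < 2) {
						// Not even one whole character fits.
						result = null;
					} else {
						result = result.substring(0, length - 1);
					}
				}
			}
		}
		return result;
	}
}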

Running this code prints: 测试小

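In UTF-8, three Chinese characters are exactly 9 bytes; the tenth byte is only part of the next character, so it is dropped.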
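As an alternative to decoding and then checking the last character, the byte budget can be counted one code point at a time, so a partial character is never produced in the first place. This is only a sketch under stated assumptions: the names `CodePointByteCut`, `truncateUtf8` and `utf8Size` are made up here, the input is assumed to contain no unpaired surrogates, and the method returns an empty string (rather than null) when nothing fits.

```java
public class CodePointByteCut {
    // Returns the longest prefix of s whose UTF-8 encoding fits in maxBytes.
    static String truncateUtf8(String s, int maxBytes) {
        if (s == null || maxBytes <= 0) {
            return "";
        }
        int used = 0; // bytes consumed so far
        int i = 0;    // char index of the next code point
        while (i < s.length()) {
            int cp = s.codePointAt(i);
            int size = utf8Size(cp);
            if (used + size > maxBytes) {
                break; // the next character would not fit completely
            }
            used += size;
            i += Character.charCount(cp); // 2 for supplementary characters
        }
        return s.substring(0, i);
    }

    // Size in bytes of one code point when encoded as UTF-8.
    static int utf8Size(int cp) {
        if (cp < 0x80) return 1;
        if (cp < 0x800) return 2;
        if (cp < 0x10000) return 3;
        return 4;
    }

    public static void main(String[] args) {
        System.out.println(truncateUtf8("测试小酱油啊，我是来测试的", 10)); // 测试小
        System.out.println(truncateUtf8("我爱java", 8));                   // 我爱ja
    }
}
```

Counting by code point also keeps four-byte characters (stored as two chars in a Java String) intact, which a comparison on single chars does not guarantee.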