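I. Encoding ranges
1. GBK (GB2312/GB18030)
\x00-\xff  byte range covered by GBK double-byte encoding
\x20-\x7f  ASCII
\xa1-\xff  Chinese (the high-byte range GB2312 uses)
\x80-\xff  Chinese (any high byte)
2. UTF-8 (Unicode)
\u4e00-\u9fa5 (Chinese)
\u3130-\u318f (Korean)
\uac00-\ud7a3 (Korean)
\u0800-\u4e00 (Japanese)
PS: Korean syllables (\uac00-\ud7a3) are characters above [\u9fa5]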
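Regex examples:
preg_replace("/([\x80-\xff])/","",$str); // GBK: strip every high byte
preg_replace('/([\x{4e00}-\x{9fa5}])/u',"",$str); // UTF-8: PCRE needs \x{...} with the u modifier, not \u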
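The UTF-8 range above can be tried out with a small sketch like the following. It assumes the input string is UTF-8 encoded; the function name strip_chinese_utf8 is hypothetical and not part of the original text. The sample string is written with byte escapes so the result does not depend on how the source file itself is saved.

```php
<?php
// Hypothetical helper (an illustration, not from the original text):
// removes characters in the range U+4E00-U+9FA5 from a UTF-8 string.
function strip_chinese_utf8($str) {
    // The u modifier makes PCRE treat both pattern and subject as UTF-8,
    // so \x{4e00} denotes a code point rather than a single byte.
    return preg_replace('/[\x{4e00}-\x{9fa5}]/u', '', $str);
}

// "\xe4\xb8\xad\xe6\x96\x87" is the UTF-8 byte sequence for the two characters U+4E2D U+6587.
echo strip_chinese_utf8("abc\xe4\xb8\xad\xe6\x96\x87def"), "\n"; // prints "abcdef"
```

Without the u modifier the pattern would be matched byte by byte, which is what the GBK-style [\x80-\xff] class relies on.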
II. Code examples
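// Check whether the content contains Chinese characters - GBK (PHP)
function check_is_chinese($s){
    // A high byte followed by any other byte is treated as a double-byte character
    return preg_match('/[\x80-\xff]./', $s);
}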
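The pattern above accepts any high byte followed by any byte. A somewhat stricter variant, sketched here under the assumption that the input is GBK encoded, checks the lead byte (0x81-0xFE) and the trail byte (0x40-0x7E or 0x80-0xFE) separately. The function name contains_gbk_double_byte is hypothetical, and GB18030 four-byte sequences are not handled.

```php
<?php
// Hypothetical helper (an illustration, not from the original text):
// returns true if the string contains at least one byte pair that is a
// well-formed GBK double-byte sequence.
function contains_gbk_double_byte($s) {
    // No u modifier here: the pattern is matched byte by byte.
    return preg_match('/[\x81-\xfe][\x40-\x7e\x80-\xfe]/', $s) === 1;
}

var_dump(contains_gbk_double_byte("abc"));        // bool(false)
var_dump(contains_gbk_double_byte("a\xd6\xd0b")); // bool(true), 0xD6 0xD0 is a GBK double-byte character
```

This still cannot tell Chinese characters apart from other GBK double-byte symbols; it only validates the byte structure.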
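// Get string length - GBK (PHP)
function gb_strlen($str){
    $count = 0;
    for($i=0; $i<strlen($str); $i++){
        $s = substr($str, $i, 1);
        // A high byte starts a double-byte character, so skip its trail byte
        if (preg_match("/[\x80-\xff]/", $s)) ++$i;
        ++$count;
    }
    return $count;
}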
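As a point of comparison, the same count can be obtained from the mbstring extension when it is installed. The sketch below is an assumption-laden illustration: gbk_length is a hypothetical name, it assumes well-formed GBK input, and it falls back to a plain byte walk when mb_strlen is unavailable.

```php
<?php
// Hypothetical wrapper (an illustration, not from the original text):
// counts characters in a GBK-encoded string.
function gbk_length($str) {
    // Prefer mbstring when the extension is loaded.
    if (function_exists('mb_strlen')) {
        return mb_strlen($str, 'GBK');
    }
    // Fallback: a lead byte (0x81-0xFE) plus the byte after it count as one character.
    $count = 0;
    $len = strlen($str);
    for ($i = 0; $i < $len; $i++) {
        $byte = ord($str[$i]);
        if ($byte >= 0x81 && $byte <= 0xfe && $i + 1 < $len) {
            $i++; // skip the trail byte
        }
        $count++;
    }
    return $count;
}

// One ASCII letter, one double-byte character (0xD6 0xD0), one ASCII letter.
echo gbk_length("a\xd6\xd0b"), "\n"; // prints 3
```

For malformed input the two code paths may disagree, so the comparison only holds for valid GBK strings.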
// Extract a substring - GBK (PHP)
function gb_substr($