PHP: mb_convert_encoding - Manual

Latest recommended article published on 2021-07-16 13:36:32

yueyuz


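Shares PHP code for converting Latin characters with diacritics (such as à, ç, é, î) to their standard 7-bit forms (a, c, e, i). The function also accounts for language-specific conventions, such as German ß and Dutch ÿ. The updated version fixes a problem caused by non-alphanumeric characters by replacing them with underscores.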


I'd like to share some code to convert Latin diacritics to their traditional 7bit representation, for example:

- à,ç,é,î,... to a,c,e,i,...

- ß to ss

- ä,Ä,... to ae,Ae,...

- ë,... to e,...

(Converting with mb_convert_encoding to "7bit" would simply delete any offending characters.)

I might have missed your country's typographic conventions; correct me if so.

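<?php
/**
 * @args string $text line of encoded text
 *       string $from_enc (encoding type of $text, e.g. UTF-8, ISO-8859-1)
 *
 * @returns 7bit representation
 */
function to7bit($text,$from_enc) {
    // Turn every non-ASCII character into an HTML entity (e.g. é becomes &eacute;)
    $text = mb_convert_encoding($text,'HTML-ENTITIES',$from_enc);
    $text = preg_replace(
        // sharp s, ligatures, German umlauts, then any remaining entity
        array('/&szlig;/','/&(..)lig;/',
              '/&([aouAOU])uml;/','/&(.)[^;]*;/'),
        // a remaining entity is reduced to the first character of its name
        array('ss','$1','${1}e','$1'),
        $text);
    return $text;
}
?>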
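
To see what the two steps do, here is a minimal sketch that runs them inline on a sample string. The sample text and the variable names are only illustrative. Note that PHP 8.2 deprecated the 'HTML-ENTITIES' target in mbstring, so newer versions emit a deprecation notice for the first call.

```php
<?php
// Sample input in UTF-8, written with escapes so the file encoding does not
// matter. It spells "Straße déjà vu München".
$input = "Stra\u{DF}e d\u{E9}j\u{E0} vu M\u{FC}nchen";

// Step 1: every non-ASCII character becomes an HTML entity.
$entities = mb_convert_encoding($input, 'HTML-ENTITIES', 'UTF-8');

// Step 2: the same pattern list as in to7bit() reduces each entity to ASCII.
$result = preg_replace(
    array('/&szlig;/', '/&(..)lig;/', '/&([aouAOU])uml;/', '/&(.)[^;]*;/'),
    array('ss', '$1', '${1}e', '$1'),
    $entities
);

echo $entities, "\n"; // the intermediate entity form
echo $result, "\n";   // Strasse deja vu Muenchen
```

The order of the patterns matters: the specific rules (sharp s, ligatures, umlauts) have to run before the catch-all at the end, which would otherwise turn "&uuml;" into a plain "u".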

Enjoy :-)

Johannes

[EDIT BY danbrown AT php DOT net: Author provided the following update on 27-FEB-2012.]

An addendum to my "to7bit" function referenced below in the notes.

The function is supposed to solve the problem that some languages require a different 7bit rendering of special (umlauted) characters for sorting or other applications. For example, the German ß ligature is usually written "ss" in 7bit context. Dutch ÿ is typically rendered "ij" (not "y").

The original function works well with word (alphabet) character entities and I've seen it used in many places. But non-word entities cause funny results:

E.g., "&copy;" is rendered as "c", "&shy;" as "s" and "&rquo;" as "r".
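
The effect is easy to reproduce on strings that are already in entity form, without any encoding conversion. A small sketch, in which the sample entities `&eacute;`, `&copy;`, `&shy;` and `&rquo;` stand in for the characters discussed above:

```php
<?php
// The pattern list from the original to7bit(). The last pattern keeps the
// first character of any remaining entity name, whatever the entity means.
$patterns     = array('/&szlig;/', '/&(..)lig;/', '/&([aouAOU])uml;/', '/&(.)[^;]*;/');
$replacements = array('ss', '$1', '${1}e', '$1');

foreach (array('&eacute;', '&copy;', '&shy;', '&rquo;') as $entity) {
    echo $entity, ' => ', preg_replace($patterns, $replacements, $entity), "\n";
}
// &eacute; => e
// &copy; => c
// &shy; => s
// &rquo; => r
```

The catch-all cannot tell a letter entity from a symbol entity, which is why "&eacute;" and "&copy;" are treated alike.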

The following version fixes this by converting non-alphanumeric characters (also chains thereof) to '_'.

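<?php
/**
 * @args string $text line of encoded text
 *       string $from_enc (encoding type of $text, e.g. UTF-8, ISO-8859-1)
 *
 * @returns 7bit representation
 */
function to7bit($text,$from_enc) {
    // Collapse each run of non-word characters into a single '_'.
    // Caveat: without the u modifier, \W works on bytes, so in the default
    // C locale the bytes of non-ASCII characters also count as non-word.
    $text = preg_replace('/\W+/','_',$text);
    $text = mb_convert_encoding($text,'HTML-ENTITIES',$from_enc);
    $text = preg_replace(
        // sharp s, ligatures, German umlauts, Dutch ij, then any remaining entity
        array('/&szlig;/','/&(..)lig;/',
              '/&([aouAOU])uml;/','/&yuml;/','/&(.)[^;]*;/'),
        array('ss','$1','${1}e','ij','$1'),
        $text);
    return $text;
}
?>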
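
One possible way to reach the same goal after the entity conversion, so that accented letters are never touched by the underscore rule: fold only those entities whose names look like a letter plus an accent, and turn every other entity into '_'. This is a sketch, not the author's code. The function name `entities_to7bit_strict` and the list of accent suffixes are assumptions made for this example, and the input is expected to be in entity form already.

```php
<?php
// Hypothetical helper (not from the original note). It expects text that is
// already in HTML-entity form, e.g. the output of the conversion step.
function entities_to7bit_strict($text) {
    return preg_replace(
        array(
            '/&szlig;/',          // German sharp s
            '/&(..)lig;/',        // ligatures such as &aelig;
            '/&([aouAOU])uml;/',  // German umlauts
            '/&yuml;/',           // Dutch rendering of y with diaeresis
            // letter + accent name: keep only the base letter
            '/&([A-Za-z])(?:acute|grave|circ|tilde|uml|cedil|ring|slash|caron);/',
            // any other entity, including chains of them
            '/(?:&[^;\s]*;)+/',
        ),
        array('ss', '$1', '${1}e', 'ij', '$1', '_'),
        $text
    );
}

echo entities_to7bit_strict('caf&eacute; &copy; 2012 &laquo;&raquo; &yuml;'), "\n";
// cafe _ 2012 _ ij
```

Unlike the version above, this sketch leaves spaces and ASCII punctuation alone. Whether that is desirable depends on what the result is used for (sorting keys, file names, and so on).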

Enjoy again,

Johannes
