PHP: mb_convert_encoding - Manual

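Sharing PHP code that converts Latin characters with diacritics (such as à, ç, é, î) to their plain 7-bit forms (a, c, e, i). The function also accounts for language-specific conventions, such as the German ß and the Dutch ÿ. The updated version fixes a problem caused by non-alphanumeric characters by replacing them with underscores.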

I'd like to share some code to convert Latin diacritics to their traditional 7-bit representation, for example:

- à,ç,é,î,... to a,c,e,i,...
- ß to ss
- ä,Ä,... to ae,Ae,...
- ë,... to e,...

(Converting to "7bit" with mb_convert_encoding would simply delete any offending characters.)

I might have missed your country's typographic conventions; correct me if so.

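<?php
/**
 * Convert text with Latin diacritics to a 7-bit (ASCII) representation.
 *
 * @param string $text     line of encoded text
 * @param string $from_enc encoding of $text, e.g. UTF-8, ISO-8859-1
 *
 * @return string 7-bit representation
 */
function to7bit($text, $from_enc) {
    // Turn every non-ASCII character into an HTML entity (é becomes &eacute;).
    $text = mb_convert_encoding($text, 'HTML-ENTITIES', $from_enc);
    // Patterns are applied in order; the last one is a catch-all that keeps
    // only the first character of any remaining entity name.
    $text = preg_replace(
        array('/&szlig;/', '/&(..)lig;/',
              '/&([aouAOU])uml;/', '/&(.)[^;]*;/'),
        array('ss', "$1", "$1".'e', "$1"),
        $text);
    return $text;
}
?>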
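To make the order of the replacements easier to follow, here is a small sketch that runs only the regex step. The input string is hand-written and is an assumption about what the entity conversion yields for "Straße, Æther, Müller, café"; the variable names are illustrative only.

```php
<?php
// Hand-written, entity-encoded input (assumed output of the conversion step).
$encoded = 'Stra&szlig;e, &AElig;ther, M&uuml;ller, caf&eacute;';

// Patterns and replacements follow to7bit(); ß is spelled as the entity &szlig;.
// preg_replace() applies the pattern/replacement pairs in array order.
$ascii = preg_replace(
    array('/&szlig;/', '/&(..)lig;/', '/&([aouAOU])uml;/', '/&(.)[^;]*;/'),
    array('ss', '$1', '$1e', '$1'),
    $encoded
);

echo $ascii, "\n"; // Strasse, AEther, Mueller, cafe
```

The umlaut rule has to come before the catch-all, otherwise "&uuml;" would already have been reduced to "u" and the trailing "e" would never be added.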

Enjoy :-)

Johannes

==

[EDIT BY danbrown AT php DOT net: Author provided the following update on 27-FEB-2012.]

==

An addendum to my "to7bit" function referenced below in the notes.

The function is supposed to solve the problem that some languages require a different 7bit rendering of special (umlauted) characters for sorting or other applications. For example, the German ß ligature is usually written "ss" in 7bit context. Dutch ÿ is typically rendered "ij" (not "y").

The original function works well with word (alphabet) character entities and I've seen it used in many places. But non-word entities cause funny results:

E.g., "&copy;" is rendered as "c", "&shy;" as "s" and "&rquo;" as "r".
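The effect can be reproduced with the catch-all pattern on its own. This is only a sketch on hand-written entity strings; the variable names are made up for the illustration.

```php
<?php
// The catch-all pattern from to7bit(): keep the first character of the
// entity name and drop the rest.
$catchAll = '/&(.)[^;]*;/';

// Letter entities come out as intended ...
$letters = preg_replace($catchAll, '$1', 'caf&eacute; cr&egrave;me');
// ... but symbol entities are reduced to a stray letter as well.
$symbols = preg_replace($catchAll, '$1', '&copy; 2012 A&shy;B');

echo $letters, "\n"; // cafe creme
echo $symbols, "\n"; // c 2012 AsB
```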

The following version fixes this by converting non-alphanumeric characters (also chains thereof) to '_'.

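<?php
/**
 * Convert text with Latin diacritics to a 7-bit (ASCII) representation.
 *
 * @param string $text     line of encoded text
 * @param string $from_enc encoding of $text, e.g. UTF-8, ISO-8859-1
 *
 * @return string 7-bit representation
 */
function to7bit($text, $from_enc) {
    // Work in UTF-8 so the /u pattern below sees whole characters, not bytes.
    $text = mb_convert_encoding($text, 'UTF-8', $from_enc);
    // Replace each run of non-alphanumeric characters with a single '_'.
    $text = preg_replace('/[^\p{L}\p{N}]+/u', '_', $text);
    // Turn the remaining non-ASCII letters into HTML entities.
    $text = mb_convert_encoding($text, 'HTML-ENTITIES', 'UTF-8');
    $text = preg_replace(
        array('/&szlig;/', '/&(..)lig;/',
              '/&([aouAOU])uml;/', '/&yuml;/', '/&(.)[^;]*;/'),
        array('ss', "$1", "$1".'e', 'ij', "$1"),
        $text);
    return $text;
}
?>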
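The non-alphanumeric replacement is sensitive to whether the pattern runs in byte mode or in UTF-8 mode. Here is a minimal sketch of the difference, assuming UTF-8 input and the default C locale; the sample string and variable names are illustrative.

```php
<?php
// "café au lait" in UTF-8; \u{E9} is the precomposed é (PHP 7+ escape syntax).
$text = "caf\u{E9} au lait";

// Byte mode: the two bytes of é are not ASCII word characters, so they are
// swallowed together with the space that follows them.
$byteMode = preg_replace('/\W+/', '_', $text);

// UTF-8 mode with Unicode properties: letters and digits of any script survive.
$utf8Mode = preg_replace('/[^\p{L}\p{N}]+/u', '_', $text);

echo $byteMode, "\n"; // caf_au_lait
echo $utf8Mode, "\n"; // café_au_lait
```

Only the second form leaves accented letters in place for the later entity-based transliteration.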

Enjoy again,

Johannes
