PHP mb_encode_mimeheader()

mb_encode_mimeheader()

(PHP 4 >= 4.0.6, PHP 5, PHP 7)

Encode string for MIME header

Description

mb_encode_mimeheader ( string $str [, string $charset = determined by mb_language() [, string $transfer_encoding = "B" [, string $linefeed = "\r\n" [, int $indent = 0 ]]]] ) : string

Encodes the given string $str using the MIME header encoding scheme.

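Parameters

$str
The string to encode. Its encoding should be the same as mb_internal_encoding().

$charset
$charset specifies the name of the character set in which $str is represented. The default value is determined by the current NLS setting (mbstring.language).

$transfer_encoding
$transfer_encoding specifies the MIME encoding scheme. It should be either "B" (Base64) or "Q" (Quoted-Printable). Falls back to "B" if not given.

$linefeed
$linefeed specifies the EOL (end-of-line) marker with which mb_encode_mimeheader() performs line folding (an RFC term: breaking a line that exceeds a certain length into multiple lines; the length is currently hard-coded to 74 characters). Falls back to "\r\n" (CRLF) if not given.

$indent
Indentation of the first line (the number of characters in the header before $str).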
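As a rough illustration of how these parameters fit together, the sketch below encodes one string with both transfer encodings and once more with an $indent that accounts for a "Subject: " prefix. The sample text is arbitrary, and the sketch assumes the mbstring extension is loaded and the script file is saved as UTF-8. The exact output can differ between PHP versions, so none is shown.

```php
<?php
// Assumption: mbstring is loaded and this file is saved as UTF-8.
mb_internal_encoding('UTF-8');

$subject = 'Grüße aus Köln'; // arbitrary sample text

// The same text with the Base64 ("B") and Quoted-Printable ("Q") schemes.
$b = mb_encode_mimeheader($subject, 'UTF-8', 'B');
$q = mb_encode_mimeheader($subject, 'UTF-8', 'Q');

// $indent tells the function how many characters already precede $str
// on the first line, here the length of the header name "Subject: ".
$indented = mb_encode_mimeheader($subject, 'UTF-8', 'B', "\r\n", strlen('Subject: '));

echo $b, "\n";
echo $q, "\n";
echo 'Subject: ', $indented, "\n";
```

Whichever scheme is chosen, the result should contain only ASCII characters, which is what makes it safe to place in a mail header.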

Return Values

A converted version of the string represented in ASCII.

Examples

Example: mb_encode_mimeheader()

<?php
$name = "太郎"; // kanji
$mbox = "kru";
$doma = "gtinn.mon";
$addr = mb_encode_mimeheader($name, "UTF-7", "Q") . " <" . $mbox . "@" . $doma . ">";
echo $addr;
?>

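Notes

Note:

This function is not designed to break lines at higher-level contextual break points (word boundaries, etc.). As a result, unexpected spaces may appear and make the original string look cluttered.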
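To see where a long value actually gets broken, one can split the result on the linefeed and look at each physical line. This is only a minimal sketch: the sample subject is arbitrary, and it assumes the mbstring extension is loaded and the script file is saved as UTF-8. The line contents depend on the PHP version, so no output is shown.

```php
<?php
// Assumption: mbstring is loaded and this file is saved as UTF-8.
mb_internal_encoding('UTF-8');

// Arbitrary sample text, long enough that the encoded form spans several lines.
$subject = rtrim(str_repeat('日本語のテキスト ', 8));

$encoded = mb_encode_mimeheader($subject, 'UTF-8', 'B', "\r\n");

// Each element is one physical line of the folded header value.
$lines = explode("\r\n", $encoded);

foreach ($lines as $n => $line) {
    printf("%2d (%3d bytes): %s\n", $n + 1, strlen($line), $line);
}
```

Comparing the decoded pieces with the original text shows whether a break fell in the middle of a word.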

See Also

mb_decode_mimeheader()

A solution if you use national characters and have problems with UTF-8, for example in a mail subject: before you use mb_encode_mimeheader with UTF-8, call mb_internal_encoding('UTF-8').

Read this FIRST: http://bugs.php.net/bug.php?id=23192 because mb_encode_mimeheader is BUGGY!
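The advice to set the internal encoding first can be sketched as follows. This is only an illustration: the sample subject is arbitrary, and it assumes the mbstring extension is loaded and the script file is saved as UTF-8. Saving and restoring the previous setting is a design choice of this sketch, not something the function requires.

```php
<?php
// Assumption: mbstring is loaded and this file is saved as UTF-8,
// so $subject holds UTF-8 bytes.
$subject = 'Zażółć gęślą jaźń'; // arbitrary sample text

// Remember the current setting so it can be restored afterwards.
$previous = mb_internal_encoding();

// Tell mbstring which encoding $subject is really in before encoding it.
mb_internal_encoding('UTF-8');
$header = 'Subject: ' . mb_encode_mimeheader($subject, 'UTF-8', 'B');

// Put the old internal encoding back for the rest of the script.
mb_internal_encoding($previous);

echo $header, "\n";
```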

A workaround for ISO-2022-JP subjects that are too long and come out with broken multibyte characters:

$pos = 0;
$split = 36; // after 36 single-byte characters, a following multibyte character gets broken
$_string = '';
while ($pos < mb_strlen($string, $encoding))
{
    $output = mb_strimwidth($string, $pos, $split, "", $encoding);
    $pos += mb_strlen($output, $encoding);
    $_string .= (($_string) ? ' ' : '') . mb_encode_mimeheader($output, $encoding);
}
$string = $_string;

It is not the best, but it works.

mb_encode_mimeheader() depends on a correct mbstring.internal_encoding setting. It tries to convert $str from the internal encoding to $charset. If you ignore the mbstring internal encoding, the function might encode strings incorrectly even when the character set of $str matches $charset.

My first post was around 2003, and mb_encode_mimeheader is still broken. It is *NOT* usable with longer subjects, and mostly unusable with anything other than Japanese.

The version from iwakura at junx dot org does not work for me either; it also produces some garbage.

I updated my old function (the one I posted in 2003) and tested it with overlong subjects in UTF-8, ISO-2022-JP (Japanese), GB2312 (Simplified Chinese) and EUC-KR (Korean), and I got readable results in Thunderbird, Mail.app, Outlook, etc.

<?php
function _mb_mime_encode($string, $encoding)
{
    $pos = 0;
    // after 36 single-byte characters, a following multibyte character gets broken,
    // but I trimmed it down to 24 to stay 100% < 76 chars per line
    $split = 24;
    $_string = ''; // initialise to avoid an undefined-variable notice
    while ($pos < mb_strlen($string, $encoding))
    {
        $output = mb_strimwidth($string, $pos, $split, "", $encoding);
        $pos += mb_strlen($output, $encoding);
        $_string_encoded = "=?" . $encoding . "?B?" . base64_encode($output) . "?=";
        if ($_string)
            $_string .= "\r\n "; // a folded header line must continue with whitespace
        $_string .= $_string_encoded;
    }
    return $_string;
}
?>

I could not find a PHP function to MIME-encode the name part of an email address.

Input = "Karl Müller"

Output = "Karl%20M%FCller"

I wrote it on my own:

// required to encode names in email addresses

// replace " " with "%20"

// replace "ü" with "%FC"

// replace "%" with "%25" etc....

// Use "%" as Delimiter for MIME

// Use "=" as Delimiter for Quoted Printable

// Input string must be UTF8 encoded

<?php
function EncodeMime($Text, $Delimiter)
{
    // Convert UTF-8 to ISO-8859-1; characters outside Latin-1 become "?"
    // with the default mbstring substitute character
    $Text = mb_convert_encoding($Text, 'ISO-8859-1', 'UTF-8');
    $Len = strlen($Text);
    $Out = "";
    for ($i = 0; $i < $Len; $i++)
    {
        $Chr = substr($Text, $i, 1);
        $Asc = ord($Chr);
        // escape spaces, the delimiter itself and every non-ASCII byte
        if ($Chr == " " || $Chr == $Delimiter || $Asc > 127)
        {
            $Out .= $Delimiter . strtoupper(bin2hex($Chr));
        }
        else $Out .= $Chr;
    }
    return $Out;
}
?>

True, the function is broken (PHP 5.1, encoding from UTF-8 with the pl_PL locale). Below is a version of the proposed _mb_mime_encode that is about 15% faster. Its signature is also more like the other mb_* functions, and it doesn't trigger any errors, warnings or notices.

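<?php
function mb_mime_header($string, $encoding = null, $linefeed = "\r\n") {
    if (!$encoding) $encoding = mb_internal_encoding();
    $chunks = array();
    // pass $encoding to mb_strlen() so the length matches what mb_substr() sees
    while ($length = mb_strlen($string, $encoding)) {
        $chunks[] = "=?$encoding?B?"
            . base64_encode(mb_substr($string, 0, 24, $encoding))
            . "?=";
        $string = mb_substr($string, 24, $length, $encoding);
    }
    // fold with $linefeed plus a space; no trailing linefeed after the last chunk
    return implode($linefeed . ' ', $chunks);
}
?>

If the mb_ version doesn't work for you in MIME "B" mode: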

function encode_mimeheader($string, $charset = null, $linefeed = "\r\n") {
    if (!$charset)
        $charset = mb_internal_encoding();
    if ($string === '')
        return ''; // nothing to encode; also avoids dividing by zero below

    $start = "=?$charset?B?";
    $end = "?=";
    $encoded = '';

    /* Each line must have length <= 75, including $start and $end */
    $length = 75 - strlen($start) - strlen($end);
    /* Average multi-byte ratio */
    $ratio = mb_strlen($string, $charset) / strlen($string);
    /* Base64 has a 4:3 ratio */
    $magic = $avglength = (int) floor(3 * $length * $ratio / 4);

    for ($i = 0; $i <= mb_strlen($string, $charset); $i += $magic) {
        $magic = $avglength;
        $offset = 0;
        /* Recalculate magic for each line to be 100% sure */
        do {
            $magic -= $offset;
            $chunk = mb_substr($string, $i, $magic, $charset);
            $chunk = base64_encode($chunk);
            $offset++;
        } while (strlen($chunk) > $length);
        if ($chunk)
            $encoded .= ' ' . $start . $chunk . $end . $linefeed;
    }
    /* Chomp the first space and the last linefeed */
    $encoded = substr($encoded, 1, -strlen($linefeed));
    return $encoded;
}

In countries that use non-US-ASCII characters, this is a very good example for sending mail:

mb_internal_encoding('iso-8859-2');
setlocale(LC_CTYPE, 'hu_HU');

function encode($str, $charset) {
    $str = mb_encode_mimeheader(trim($str), $charset, 'Q', "\n\t");
    return $str;
}

print encode('the text with spec. chars: ő Ű Ő ű, ', 'iso-8859-2');

It creates a 7-bit string.

I think mb_encode_mimeheader still has a bug. Here is sample code:

function mb_encode_mimeheader2($string, $encoding = "ISO-2022-JP") {
    $string_array = array();
    $pos = 0;
    $row = 0;
    $mode = 0;
    while ($pos < mb_strlen($string)) {
        $word = mb_strimwidth($string, $pos, 1);
        // compare against '' so that the character "0" is not mistaken for "no result"
        if ($word === '') {
            $word = mb_strimwidth($string, $pos, 2);
        }
        if (mb_ereg_match("[ -~]", $word)) { // ascii
            if ($mode != 1) {
                $row++;
                $mode = 1;
                $string_array[$row] = NULL;
            }
        } else { // multibyte
            if ($mode != 2) {
                $row++;
                $mode = 2;
                $string_array[$row] = NULL;
            }
        }
        $string_array[$row] .= $word;
        $pos++;
    }
    //print_r($string_array);
    foreach ($string_array as $key => $value) {
        $value = mb_convert_encoding($value, $encoding);
        $string_array[$key] = mb_encode_mimeheader($value, $encoding);
    }
    //print_r($string_array);
    return implode("", $string_array);
}

It is not the best, but it works.

At least for Q encoding, this function is unsafe and does not encode correctly. Raw character sequences that look like RFC 2047 encoded-words are simply left as is.

Ex:

mb_encode_mimeheader( '=?iso-8859-1?q?this=20is=20some=20text?=' );

returns '=?iso-8859-1?q?this=20is=20some=20text?='

This is the exact same string, which is obviously not the encoding of the source string. In other words, mb_encode_mimeheader does not do any kind of escaping.

Consequently, the following condition is not always true:

mb_decode_mimeheader( mb_encode_mimeheader( $text ) ) == $text

I found a bad function.

<?php
function encodeHeader($input, $charset = 'ISO-8859-2')
{
    preg_match_all('/(\\w*[\\x80-\\xFF]+\\w*)/', $input, $matches);
    foreach ($matches[1] as $value) {
        // note: the /e modifier was removed in PHP 7.0, so this no longer runs there
        $replacement = preg_replace('/([\\x80-\\xFF])/e', '"=" . strtoupper(dechex(ord("\\1")))', $value);
        $input = str_replace($value, '=?' . $charset . '?Q?' . $replacement . '?=', $input);
    }
    return $input;
}
?>

This function should be used:

<?php
function encodeHeader($input, $charset = 'ISO-8859-2')
{
    // only encode when the input contains at least one non-ASCII byte
    $m = preg_match_all('/(\w*[\x80-\xFF]+\w*)/', $input, $matches);
    if ($m) $input = mb_encode_mimeheader($input, $charset, 'Q');
    return $input;
}
?>

The second parameter, 'charset', is the character encoding name, but the default must be UTF-8 on PHP 4.3.1.
