Character encoding in web pages (Unicode entity encoding in HTML)

1. Encoding conversion (to Unicode)

(The code below was collected from the web.)

 

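JavaScript version

<script>
      // Convert each UTF-16 code unit of the string to a \uXXXX escape.
      var test = "你好abc";
      var str = "";
      for (var i = 0; i < test.length; i++)
      {
       // charCodeAt returns the code unit (0..65535); format it as hex.
       var temp = test.charCodeAt(i).toString(16);
       // Left-pad with zeros to four hex digits.
       str += "\\u" + new Array(5 - temp.length).join("0") + temp;
      }
      document.write(str); // writes \u4f60\u597d\u0061\u0062\u0063
</script>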

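The loop above only goes one way. As a minimal sketch, the same conversion can be wrapped in a function and paired with a decoder. The names toUnicodeEscapes and fromUnicodeEscapes are made up for this illustration and do not come from the original code, and padStart assumes an ES2017 or newer runtime.

```javascript
// Hypothetical helpers; the names are illustrative and not from the original post.

// Encode every UTF-16 code unit as a \uXXXX escape (same output as the loop above).
function toUnicodeEscapes(text) {
  let out = "";
  for (let i = 0; i < text.length; i++) {
    out += "\\u" + text.charCodeAt(i).toString(16).padStart(4, "0");
  }
  return out;
}

// Decode \uXXXX escapes back to characters; any other text passes through unchanged.
function fromUnicodeEscapes(escaped) {
  return escaped.replace(/\\u([0-9a-fA-F]{4})/g, (match, hex) =>
    String.fromCharCode(parseInt(hex, 16))
  );
}

console.log(toUnicodeEscapes("你好abc"));                             // \u4f60\u597d\u0061\u0062\u0063
console.log(fromUnicodeEscapes("\\u4f60\\u597d\\u0061\\u0062\\u0063")); // 你好abc
```

Because both functions work on code units, a character outside the Basic Multilingual Plane is written as two escapes (its surrogate pair) and is reassembled correctly when decoded.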

VBScript version

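Function Unicode(str1)
      ' Return str1 with every character written as a \uXXXX escape.
      Dim str, temp, i
      str = ""
      For i = 1 To Len(str1)
       ' AscW gives the UTF-16 code unit; Hex formats it without leading zeros.
       temp = Hex(AscW(Mid(str1, i, 1)))
       ' Left-pad with zeros and keep the last four hex digits.
       temp = Right("0000" & temp, 4)
       str = str & "\u" & temp
      Next
      Unicode = str
End Function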


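Function htmlentities(str)
      ' Replace every non-ASCII character with a decimal entity (&#N;).
      Dim i, char, code
      htmlentities = ""
      For i = 1 To Len(str)
          char = Mid(str, i, 1)
          code = AscW(char)
          ' AscW returns a signed 16-bit value; map negatives back to 32768..65535.
          If code < 0 Then code = code + 65536
          If code > 127 Then
              htmlentities = htmlentities & "&#" & code & ";"
          Else
              htmlentities = htmlentities & char
          End If
      Next
End Function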

 

ColdFusion version

 

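function nochaoscode(str)
{
      // Replace every non-ASCII character with a decimal entity (&#N;).
      var new_str = "";
      var i = 1;
      for(i=1; i lte len(str); i=i+1){
          if(asc(mid(str,i,1)) lt 128){
              new_str = new_str & mid(str,i,1);
          }else{
              // "##" is an escaped "#", so this appends "&#", the code, and the closing ";".
              new_str = new_str & "&##" & asc(mid(str,i,1)) & ";";
          }
      }
      return new_str;
}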

 


 

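Appendix:

In PHP, the mbstring function mb_convert_encoding can perform this conversion in both directions. Note that the HTML-ENTITIES pseudo-encoding is deprecated as of PHP 8.2. For example: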

 

mb_convert_encoding("你好", "HTML-ENTITIES", "gb2312"); // returns: &#20320;&#22909;
mb_convert_encoding("&#20320;&#22909;", "gb2312", "HTML-ENTITIES"); // returns: 你好
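
For comparison, the same two-way conversion can be sketched in JavaScript. The function names below are hypothetical and chosen only for this illustration. The sketch handles decimal numeric entities only and assumes a runtime with codePointAt and String.fromCodePoint (ES2015 or newer).

```javascript
// Hypothetical helpers; the names are illustrative and not from the original post.

// Replace every non-ASCII character with a decimal numeric entity (&#N;).
// for...of walks code points, so a character outside the Basic Multilingual
// Plane produces one entity instead of two surrogate halves.
function toNumericEntities(text) {
  let out = "";
  for (const ch of text) {
    const codePoint = ch.codePointAt(0);
    out += codePoint > 127 ? "&#" + codePoint + ";" : ch;
  }
  return out;
}

// Turn decimal numeric entities back into characters.
// Named entities such as &amp; are deliberately left untouched.
function fromNumericEntities(html) {
  return html.replace(/&#(\d+);/g, (match, digits) => {
    const codePoint = Number(digits);
    // Keep out-of-range values as they are instead of throwing a RangeError.
    return codePoint <= 0x10ffff ? String.fromCodePoint(codePoint) : match;
  });
}

console.log(toNumericEntities("你好abc"));            // &#20320;&#22909;abc
console.log(fromNumericEntities("&#20320;&#22909;")); // 你好
```

Unlike the PHP call, this sketch never needs to know the page encoding, because JavaScript strings are already Unicode.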

 

To convert an entire page, add these three lines at the top of the PHP file:

 

mb_internal_encoding("gb2312"); // gb2312 here stands for your site's original encoding
mb_http_output("HTML-ENTITIES");
ob_start('mb_output_handler');


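If the mbstring extension is not enabled, see these two articles on coolcode.cn:
A method for displaying web pages correctly in any character set
A method for displaying web pages correctly in any character set (continued)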


 

2. HTML entities

 

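HTML 4.01 supports the ISO 8859-1 (Latin-1) character set.

Tip: entity names are case-sensitive.

Note: the same symbol can be referenced in two ways, by entity name or by entity number. Entity names are easier to remember, but not every browser is guaranteed to recognize all of them. Entity numbers do not have that problem, but they are hard to remember.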
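
To make the name and number relationship concrete, here is a small sketch that rewrites a few named entities as their numeric equivalents. The lookup table is a hypothetical excerpt built from entries listed in this article, not a complete list, and the function name is made up for the example.

```javascript
// Hypothetical lookup table with a few name/number pairs from this article.
const NAMED_TO_NUMBER = {
  quot: 34, amp: 38, lt: 60, gt: 62,
  nbsp: 160, copy: 169, reg: 174, euro: 8364,
};

// Rewrite known named entities as numeric ones; unknown names are left as they are.
// The lookup is case-sensitive, just like entity names themselves.
function namedToNumeric(html) {
  return html.replace(/&([A-Za-z][A-Za-z0-9]*);/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(NAMED_TO_NUMBER, name)
      ? "&#" + NAMED_TO_NUMBER[name] + ";"
      : match
  );
}

console.log(namedToNumeric("&copy; 2012 &amp; &unknown;"));
// &#169; 2012 &#38; &unknown;
```

Emitting the numeric form is the conservative choice when the set of names a target browser recognizes is unknown.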


Entity names for some ASCII characters

Display  Description     Entity name                   Entity number
"        quotation mark  &quot;                        &#34;
'        apostrophe      &apos; (does not work in IE)  &#39;
&        ampersand       &amp;                         &#38;
<        less-than       &lt;                          &#60;
>        greater-than    &gt;                          &#62;

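These five characters are the ones that usually need escaping before text is placed into HTML. The following is a minimal sketch of such an escaper, using the entities from the table above. The name escapeHtml and the lookup table are assumptions made for this example, and the apostrophe uses the numeric form because of the IE limitation noted in the table.

```javascript
// Hypothetical helper; the name and the lookup table are illustrative.
const ASCII_ENTITIES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;", // numeric form, because &apos; does not work in IE
};

// Escape the five markup-significant ASCII characters in a single pass,
// so an ampersand produced by one replacement is never escaped twice.
function escapeHtml(text) {
  return text.replace(/[&<>"']/g, (ch) => ASCII_ENTITIES[ch]);
}

console.log(escapeHtml('<a href="x">Tom & Jerry\'s</a>'));
// &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&#39;s&lt;/a&gt;
```
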
ISO 8859-1 symbol entities

Display  Description                   Entity name  Entity number
         non-breaking space            &nbsp;       &#160;
¡        inverted exclamation mark     &iexcl;      &#161;
¤        currency                      &curren;     &#164;
¢        cent                          &cent;       &#162;
£        pound                         &pound;      &#163;
¥        yen                           &yen;        &#165;
¦        broken vertical bar           &brvbar;     &#166;
§        section                       &sect;       &#167;
¨        spacing diaeresis             &uml;        &#168;
©        copyright                     &copy;       &#169;
ª        feminine ordinal indicator    &ordf;       &#170;
«        angle quotation mark (left)   &laquo;      &#171;
¬        negation                      &not;        &#172;
-        soft hyphen                   &shy;        &#173;
®        registered trademark          &reg;        &#174;
™        trademark                     &trade;      &#8482;
¯        spacing macron                &macr;       &#175;
°        degree                        &deg;        &#176;
±        plus-or-minus                 &plusmn;     &#177;
²        superscript 2                 &sup2;       &#178;
³        superscript 3                 &sup3;       &#179;
´        spacing acute                 &acute;      &#180;
µ        micro                         &micro;      &#181;
¶        paragraph                     &para;       &#182;
·        middle dot                    &middot;     &#183;
¸        spacing cedilla               &cedil;      &#184;
¹        superscript 1                 &sup1;       &#185;
º        masculine ordinal indicator   &ordm;       &#186;
»        angle quotation mark (right)  &raquo;      &#187;
¼        fraction 1/4                  &frac14;     &#188;
½        fraction 1/2                  &frac12;     &#189;
¾        fraction 3/4                  &frac34;     &#190;
¿        inverted question mark        &iquest;     &#191;
×        multiplication                &times;      &#215;
÷        division                      &divide;     &#247;

ISO 8859-1 character entities

Display  Description                   Entity name  Entity number
À        capital a, grave accent       &Agrave;     &#192;
Á        capital a, acute accent       &Aacute;     &#193;
Â        capital a, circumflex accent  &Acirc;      &#194;
Ã        capital a, tilde              &Atilde;     &#195;
Ä        capital a, umlaut mark        &Auml;       &#196;
Å        capital a, ring               &Aring;      &#197;
Æ        capital ae                    &AElig;      &#198;
Ç        capital c, cedilla            &Ccedil;     &#199;
È        capital e, grave accent       &Egrave;     &#200;
É        capital e, acute accent       &Eacute;     &#201;
Ê        capital e, circumflex accent  &Ecirc;      &#202;
Ë        capital e, umlaut mark        &Euml;       &#203;
Ì        capital i, grave accent       &Igrave;     &#204;
Í        capital i, acute accent       &Iacute;     &#205;
Î        capital i, circumflex accent  &Icirc;      &#206;
Ï        capital i, umlaut mark        &Iuml;       &#207;
Ð        capital eth, Icelandic        &ETH;        &#208;
Ñ        capital n, tilde              &Ntilde;     &#209;
Ò        capital o, grave accent       &Ograve;     &#210;
Ó        capital o, acute accent       &Oacute;     &#211;
Ô        capital o, circumflex accent  &Ocirc;      &#212;
Õ        capital o, tilde              &Otilde;     &#213;
Ö        capital o, umlaut mark        &Ouml;       &#214;
Ø        capital o, slash              &Oslash;     &#216;
Ù        capital u, grave accent       &Ugrave;     &#217;
Ú        capital u, acute accent       &Uacute;     &#218;
Û        capital u, circumflex accent  &Ucirc;      &#219;
Ü        capital u, umlaut mark        &Uuml;       &#220;
Ý        capital y, acute accent       &Yacute;     &#221;
Þ        capital THORN, Icelandic      &THORN;      &#222;
ß        small sharp s, German         &szlig;      &#223;
à        small a, grave accent         &agrave;     &#224;
á        small a, acute accent         &aacute;     &#225;
â        small a, circumflex accent    &acirc;      &#226;
ã        small a, tilde                &atilde;     &#227;
ä        small a, umlaut mark          &auml;       &#228;
å        small a, ring                 &aring;      &#229;
æ        small ae                      &aelig;      &#230;
ç        small c, cedilla              &ccedil;     &#231;
è        small e, grave accent         &egrave;     &#232;
é        small e, acute accent         &eacute;     &#233;
ê        small e, circumflex accent    &ecirc;      &#234;
ë        small e, umlaut mark          &euml;       &#235;
ì        small i, grave accent         &igrave;     &#236;
í        small i, acute accent         &iacute;     &#237;
î        small i, circumflex accent    &icirc;      &#238;
ï        small i, umlaut mark          &iuml;       &#239;
ð        small eth, Icelandic          &eth;        &#240;
ñ        small n, tilde                &ntilde;     &#241;
ò        small o, grave accent         &ograve;     &#242;
ó        small o, acute accent         &oacute;     &#243;
ô        small o, circumflex accent    &ocirc;      &#244;
õ        small o, tilde                &otilde;     &#245;
ö        small o, umlaut mark          &ouml;       &#246;
ø        small o, slash                &oslash;     &#248;
ù        small u, grave accent         &ugrave;     &#249;
ú        small u, acute accent         &uacute;     &#250;
û        small u, circumflex accent    &ucirc;      &#251;
ü        small u, umlaut mark          &uuml;       &#252;
ý        small y, acute accent         &yacute;     &#253;
þ        small thorn, Icelandic        &thorn;      &#254;
ÿ        small y, umlaut mark          &yuml;       &#255;

Other entities supported by HTML

Display  Description                            Entity name  Entity number
Œ        capital ligature OE                    &OElig;      &#338;
œ        small ligature oe                      &oelig;      &#339;
Š        capital S with caron                   &Scaron;     &#352;
š        small s with caron                     &scaron;     &#353;
Ÿ        capital Y with diaeresis               &Yuml;       &#376;
ˆ        modifier letter circumflex accent      &circ;       &#710;
˜        small tilde                            &tilde;      &#732;
         en space                               &ensp;       &#8194;
         em space                               &emsp;       &#8195;
         thin space                             &thinsp;     &#8201;
         zero width non-joiner                  &zwnj;       &#8204;
         zero width joiner                      &zwj;        &#8205;
         left-to-right mark                     &lrm;        &#8206;
         right-to-left mark                     &rlm;        &#8207;
–        en dash                                &ndash;      &#8211;
—        em dash                                &mdash;      &#8212;
‘        left single quotation mark             &lsquo;      &#8216;
’        right single quotation mark            &rsquo;      &#8217;
‚        single low-9 quotation mark            &sbquo;      &#8218;
“        left double quotation mark             &ldquo;      &#8220;
”        right double quotation mark            &rdquo;      &#8221;
„        double low-9 quotation mark            &bdquo;      &#8222;
†        dagger                                 &dagger;     &#8224;
‡        double dagger                          &Dagger;     &#8225;
…        horizontal ellipsis                    &hellip;     &#8230;
‰        per mille                              &permil;     &#8240;
‹        single left-pointing angle quotation   &lsaquo;     &#8249;
›        single right-pointing angle quotation  &rsaquo;     &#8250;
€        euro                                   &euro;       &#8364;

Reposted from: https://www.cnblogs.com/zccee/archive/2012/02/04/2338515.html
