Differences Between the Several HtmlEncode Methods

一个大猴子

Question:

What is the difference between HttpUtility.HtmlDecode / HttpUtility.HtmlEncode, Server.HtmlDecode / Server.HtmlEncode, and HttpServerUtility.HtmlDecode / HttpServerUtility.HtmlEncode?

And how do they differ from the kind of hand-written code shown below?

public static string htmlencode(string str)
{
    if (string.IsNullOrEmpty(str))
        return "";
    str = str.Replace("&", "&amp;");     // added: must run first so later entities are not double-encoded
    str = str.Replace(">", "&gt;");
    str = str.Replace("<", "&lt;");
    str = str.Replace(" ", "&nbsp;");
    str = str.Replace("  ", " &nbsp;");  // has no effect: the previous line leaves no spaces behind
    str = str.Replace("\"", "&quot;");
    str = str.Replace("'", "&#39;");
    str = str.Replace("\n", "<br/>");
    return str;
}

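Answer:

HtmlEncode: encodes characters that may not appear literally in HTML source, typically "<", ">", "&" and so on.

HtmlDecode: the exact opposite of HtmlEncode; it decodes the entities back into the original characters.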

The HtmlEncode method of the HttpServerUtility class is a convenient way to access the System.Web.HttpUtility.HtmlEncode method at run time from an ASP.NET web application. Internally, it uses System.Web.HttpUtility.HtmlEncode to encode the string.

Server.HtmlEncode is simply the HtmlEncode method of the HttpServerUtility instance exposed by the System.Web.UI.Page class, which has this property: public HttpServerUtility Server { get; }

So we can consider:

Server.HtmlDecode = the HtmlDecode method of the HttpServerUtility instance = HttpUtility.HtmlDecode;

Server.HtmlEncode = the HtmlEncode method of the HttpServerUtility instance = HttpUtility.HtmlEncode;

They are just wrappers that make the call more convenient.

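In classic ASP, the characters converted by the Server.HTMLEncode method are described as follows:

If the string is not DBCS-encoded, the method converts the following characters: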

less-than character (<)	&lt;
greater-than character (>)	&gt;
ampersand character (&)	&amp;
double-quote character (")	&quot;
Any ASCII code character whose code is greater-than or equal to 0x80	&#<number>, where <number> is the ASCII character value.

If the string is DBCS-encoded:

All extended characters are converted.
Any ASCII code character whose code is greater-than or equal to 0x80 is converted to &#<number>, where <number> is the ASCII character value.
Half-width Katakana characters in the Japanese code page are not converted.

Related material:

Server.HTMLEncode Method

http://msdn.microsoft.com/en-us/library/ms525347.aspx

The situation in ASP.NET is similar.

Below is a simple replacement test; the result for each character is given in the trailing comment:

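protected void Page_Load(object sender, EventArgs e)
{
    TestChar("<");      // less-than sign: replaced with &lt;
    TestChar(">");      // greater-than sign: replaced with &gt;
    TestChar("'");      // single quote: replaced with &#39;
    TestChar(" ");      // half-width (ASCII) space: not replaced
    TestChar("\u3000"); // full-width (Chinese) space: not replaced
    TestChar("&");      // ampersand: replaced with &amp;
    TestChar("\"");     // double quote: replaced with &quot;
    TestChar("\n");     // line feed: not replaced
    TestChar("\r");     // carriage return: not replaced
    TestChar("\r\n");   // carriage return + line feed: not replaced
}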


public void TestChar(string t)
{
    Response.Write(Server.HtmlEncode(t));
    Response.Write("__");
    Response.Write(HttpUtility.HtmlEncode(t));
    Response.Write("<br />");
}

So the common hand-written replacement mentioned earlier is still quite useful: it also handles some replacements that HttpUtility.HtmlEncode does not perform.

public static string htmlencode(string str)
{
    if (string.IsNullOrEmpty(str))
        return "";
    str = str.Replace("&", "&amp;");     // added: must run first so later entities are not double-encoded
    str = str.Replace(">", "&gt;");
    str = str.Replace("<", "&lt;");
    str = str.Replace(" ", "&nbsp;");    // HttpUtility.HtmlEncode does not perform this replacement
    str = str.Replace("  ", " &nbsp;");  // HttpUtility.HtmlEncode does not perform this replacement (no effect here: no spaces remain)
    str = str.Replace("\"", "&quot;");
    str = str.Replace("'", "&#39;");
    str = str.Replace("\n", "<br/>");    // HttpUtility.HtmlEncode does not perform this replacement
    return str;
}

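Looking at the implementation of HttpUtility.HtmlEncode with Reflector, we can see that it gives special treatment to only five characters (&, ', ", < and >) and writes characters from U+00A0 to U+00FF as numeric references. Spaces and line breaks are not handled.

The most important part of the implementation, as decompiled by Reflector, is: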

public static unsafe void HtmlEncode(string value, TextWriter output)
{
    if (value != null)
    {
        if (output == null)
        {
            throw new ArgumentNullException("output");
        }
        // Index of the first character that needs encoding, or -1 if there is none.
        int num = IndexOfHtmlEncodingChars(value, 0);
        if (num == -1)
        {
            output.Write(value);
        }
        else
        {
            int num2 = value.Length - num;
            fixed (char* str = value)
            {
                char* chPtr = str;
                // Copy the leading characters that need no encoding.
                while (num-- > 0)
                {
                    output.Write(*chPtr++);
                }
                // Encode the remainder one character at a time.
                while (num2-- > 0)
                {
                    char ch = *chPtr++;
                    if (ch <= '>')
                    {
                        switch (ch)
                        {
                            case '&': output.Write("&amp;"); continue;
                            case '\'': output.Write("&#39;"); continue;
                            case '"': output.Write("&quot;"); continue;
                            case '<': output.Write("&lt;"); continue;
                            case '>': output.Write("&gt;"); continue;
                        }
                        output.Write(ch);
                        continue;
                    }
                    // U+00A0..U+00FF are written as numeric character references.
                    if ((ch >= '\u00a0') && (ch < '\u0100'))
                    {
                        output.Write("&#");
                        output.Write(((int) ch).ToString(NumberFormatInfo.InvariantInfo));
                        output.Write(';');
                    }
                    else
                    {
                        output.Write(ch);
                    }
                }
            }
        }
    }
}
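To make the rules in the decompiled code easier to experiment with, here is a rough JavaScript sketch of the same logic: five special characters become entities, characters from U+00A0 to U+00FF become numeric references, and everything else (including spaces and line breaks) passes through unchanged. The name htmlEncodeSketch is hypothetical; this is an illustrative port, not a .NET or browser API.

```javascript
// Hypothetical helper mirroring the decompiled logic above; not a real API.
const NAMED_ENTITIES = new Map([
  ["&", "&amp;"],
  ["'", "&#39;"],
  ['"', "&quot;"],
  ["<", "&lt;"],
  [">", "&gt;"],
]);

function htmlEncodeSketch(value) {
  if (value == null) return "";
  let out = "";
  for (let i = 0; i < value.length; i++) {
    const ch = value[i];
    const code = value.charCodeAt(i);
    if (NAMED_ENTITIES.has(ch)) {
      out += NAMED_ENTITIES.get(ch);      // one of the five special characters
    } else if (code >= 0xa0 && code < 0x100) {
      out += "&#" + code + ";";           // numeric character reference
    } else {
      out += ch;                          // spaces, \r, \n, CJK text, etc. are untouched
    }
  }
  return out;
}

console.log(htmlEncodeSketch("<b>Tom & 'Jerry'</b>"));
console.log(JSON.stringify(htmlEncodeSketch("a b\r\n")));
```

The second call shows that spaces and line breaks come back unchanged, which is why the hand-written function adds its own &nbsp; and <br/> replacements.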

II. Encoding and decoding in JavaScript

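1. escape/unescape
escape: returns a string value (in Unicode format) that contains the contents of charstring. All spaces, punctuation, accented characters, and any other non-ASCII characters are replaced with a %xx encoding, where xx is the hexadecimal number representing the character. Characters with a code above 255 are written in %uxxxx form.
unescape: returns the decoded string from a String object that was encoded with the escape method.
Characters left unencoded: @ * / + (as well as _ - . and the ASCII letters and digits)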
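As a quick check of the escape/unescape behaviour described above, the sketch below encodes a few sample strings and decodes them again. The helper name escapeRoundTrip and the sample strings are made up for illustration; escape and unescape themselves are legacy globals that modern code usually avoids.

```javascript
// Hypothetical helper: encode with escape(), then decode with unescape().
function escapeRoundTrip(text) {
  const encoded = escape(text);
  return { encoded, decoded: unescape(encoded) };
}

for (const sample of ["a b", "@*_+-./", "é", "中文"]) {
  const { encoded, decoded } = escapeRoundTrip(sample);
  console.log(JSON.stringify(sample), "->", encoded, "->", JSON.stringify(decoded));
}
// "a b"  becomes a%20b        (space encoded as %xx)
// "é"    becomes %E9          (code <= 255, %xx form)
// "中文" becomes %u4E2D%u6587 (code > 255, %uxxxx form)
```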
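2. encodeURI/decodeURI
encodeURI: returns an encoded URI. If the encoded result is passed to decodeURI, the original string is returned. encodeURI does not encode the following characters: ":", "/", ";" and "?". Use encodeURIComponent to encode these characters.
decodeURI: returns the decoded string from a String object that was encoded with the encodeURI method.
Characters left unencoded: ! @ # $ & * ( ) = : / ; ? + ' (as well as , - _ . ~ and the ASCII letters and digits)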
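The following sketch illustrates the point above: encodeURI is meant for a whole URI, so it leaves the structural characters (":", "/", "?", "&", "#" and so on) alone and only encodes what cannot appear in a URI, such as spaces and non-ASCII text. The example URL is made up for illustration.

```javascript
// A made-up URL containing spaces and non-ASCII characters.
const rawUrl = "http://example.com/my docs/中文.html?name=a b&tag=c#top";

const encodedUrl = encodeURI(rawUrl);
console.log(encodedUrl);
// http://example.com/my%20docs/%E4%B8%AD%E6%96%87.html?name=a%20b&tag=c#top

// decodeURI restores the original string.
const decodedUrl = decodeURI(encodedUrl);
console.log(decodedUrl === rawUrl); // true

// decodeURI does not decode escapes of reserved characters such as "/" (%2F).
console.log(decodeURI("%2F"));      // %2F
```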
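3. encodeURIComponent/decodeURIComponent
encodeURIComponent: returns an encoded URI component. If the encoded result is passed to decodeURIComponent, the original string is returned. Because encodeURIComponent encodes every character except the exceptions listed below, including "/", "?", "&" and "=", use it for a single component such as a query-string value rather than for a whole URI or path.
decodeURIComponent: returns the decoded string from a String object that was encoded with the encodeURIComponent method.
Characters left unencoded: ! * ( ) ' (as well as - _ . ~ and the ASCII letters and digits)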
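A typical use of encodeURIComponent is building a query string from individual values, where characters such as "&", "=" and "/" inside a value must not be mistaken for URI structure. The sketch below assumes a hypothetical helper named buildUrl and a made-up base URL; it is one simple way to do this, not the only one.

```javascript
// Hypothetical helper: append encoded key=value pairs to a base URL.
function buildUrl(base, params) {
  const query = Object.entries(params)
    .map(([key, value]) => encodeURIComponent(key) + "=" + encodeURIComponent(value))
    .join("&");
  return query ? base + "?" + query : base;
}

const searchUrl = buildUrl("http://example.com/search", { q: "a&b=c/d", lang: "中文" });
console.log(searchUrl);
// http://example.com/search?q=a%26b%3Dc%2Fd&lang=%E4%B8%AD%E6%96%87

// Each value survives a round trip through decodeURIComponent.
console.log(decodeURIComponent("a%26b%3Dc%2Fd")); // a&b=c/d
```

Using encodeURI on the values instead would leave "&", "=" and "/" unencoded, so the value "a&b=c/d" would be read as two separate parameters.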