Two Methods for Filtering HTML Strings in ASP.NET

xfth200810

Column: 牛腩的搏客收藏学习. Tags: ASP.net, .net, HTML, ASP, programming

Article link: https://blog.csdn.net/xfth200810/article/details/83638119

Category: .NET programming
Noting these down for future reference.


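using System.Text.RegularExpressions;  // Regex, RegexOptions
using System.Web;                      // HttpContext

/// <summary>
/// Strips HTML markup from a string.
/// </summary>
/// <param name="Htmlstring">Source text that contains HTML.</param>
/// <returns>The text with the markup removed.</returns>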
public static string GetNoHTMLString(string Htmlstring)  
{  
    // Remove script blocks; Singleline lets ".*?" span line breaks inside a script
    Htmlstring = Regex.Replace(Htmlstring, @"<script[^>]*?>.*?</script>", "", RegexOptions.IgnoreCase | RegexOptions.Singleline);
    // Remove HTML tags, line breaks followed by whitespace, and leftover comment markers
    Htmlstring = Regex.Replace(Htmlstring, @"<(.[^>]*)>", "", RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring, @"([\r\n])[\s]+", "", RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring, @"-->", "", RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring, @"<!--.*", "", RegexOptions.IgnoreCase);


    // Decode common character entities (named form or numeric form)
    Htmlstring = Regex.Replace(Htmlstring, @"&(quot|#34);", "\"", RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring, @"&(amp|#38);", "&", RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring, @"&(lt|#60);", "<", RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring, @"&(gt|#62);", ">", RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring, @"&(nbsp|#160);", "   ", RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring, @"&(iexcl|#161);", "\xa1", RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring, @"&(cent|#162);", "\xa2", RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring, @"&(pound|#163);", "\xa3", RegexOptions.IgnoreCase);
    Htmlstring = Regex.Replace(Htmlstring, @"&(copy|#169);", "\xa9", RegexOptions.IgnoreCase);
    // Drop any other numeric entity
    Htmlstring = Regex.Replace(Htmlstring, @"&#(\d+);", "", RegexOptions.IgnoreCase);


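    // String.Replace returns a new string, so the result has to be assigned back
    Htmlstring = Htmlstring.Replace("<", "");
    Htmlstring = Htmlstring.Replace(">", "");
    Htmlstring = Htmlstring.Replace("\r\n", "");
    // Encode what is left so it can be written into a page safely
    Htmlstring = HttpContext.Current.Server.HtmlEncode(Htmlstring).Trim();

    return Htmlstring;
}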
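As a quick illustration of what the method above is meant to produce, here is a minimal console sketch. It is a stand-in under stated assumptions, not the author's code: `StripHtmlDemo` and `StripTags` are hypothetical names, only the tag stripping and a few of the entity replacements are reproduced, and `System.Net.WebUtility.HtmlEncode` is swapped in for `HttpContext.Current.Server.HtmlEncode` so the sample can run outside an ASP.NET request.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class StripHtmlDemo
{
    // Hypothetical helper: a reduced version of GetNoHTMLString for console use.
    public static string StripTags(string html)
    {
        const RegexOptions opts = RegexOptions.IgnoreCase | RegexOptions.Singleline;

        // Drop script blocks together with their content.
        html = Regex.Replace(html, @"<script[^>]*?>.*?</script>", "", opts);
        // Drop every remaining tag.
        html = Regex.Replace(html, @"<[^>]+>", "", opts);
        // Decode a few common entities.
        html = Regex.Replace(html, @"&(quot|#34);", "\"", opts);
        html = Regex.Replace(html, @"&(amp|#38);", "&", opts);
        html = Regex.Replace(html, @"&(nbsp|#160);", " ", opts);
        // Encode the plain text again (assumption: WebUtility instead of HttpContext).
        return WebUtility.HtmlEncode(html).Trim();
    }

    public static void Main()
    {
        string input = "<p>Tom &amp; Jerry</p><script>alert(1)</script>";
        // prints: Tom &amp; Jerry
        Console.WriteLine(StripTags(input));
    }
}
```

The final encoding step is why the ampersand comes back as `&amp;`: the entities are decoded to plain text first, and the plain text is then encoded once for output.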


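/// <summary>
/// Returns a string for display that may still contain HTML tags,
/// with dangerous elements such as iframe and script filtered out.
/// </summary>
/// <param name="str">The unprocessed string.</param>
/// <returns>The string with the dangerous elements removed.</returns>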
public static string GetSafeHTMLString(string str)
{
    // Singleline lets ".*?" span line breaks, so elements written over several lines are matched too
    const RegexOptions opts = RegexOptions.IgnoreCase | RegexOptions.Singleline;

    str = Regex.Replace(str, @"<applet[^>]*?>.*?</applet>", "", opts);
    str = Regex.Replace(str, @"<body[^>]*?>.*?</body>", "", opts);
    str = Regex.Replace(str, @"<embed[^>]*?>.*?</embed>", "", opts);
    str = Regex.Replace(str, @"<frame[^>]*?>.*?</frame>", "", opts);
    str = Regex.Replace(str, @"<script[^>]*?>.*?</script>", "", opts);
    str = Regex.Replace(str, @"<frameset[^>]*?>.*?</frameset>", "", opts);
    str = Regex.Replace(str, @"<html[^>]*?>.*?</html>", "", opts);
    str = Regex.Replace(str, @"<iframe[^>]*?>.*?</iframe>", "", opts);
    str = Regex.Replace(str, @"<style[^>]*?>.*?</style>", "", opts);
    str = Regex.Replace(str, @"<layer[^>]*?>.*?</layer>", "", opts);
    str = Regex.Replace(str, @"<link[^>]*?>.*?</link>", "", opts);
    str = Regex.Replace(str, @"<ilayer[^>]*?>.*?</ilayer>", "", opts);
    str = Regex.Replace(str, @"<meta[^>]*?>.*?</meta>", "", opts);
    str = Regex.Replace(str, @"<object[^>]*?>.*?</object>", "", opts);
    return str;
}
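The fourteen near-identical calls above can also be driven from a list of tag names. The sketch below is one hypothetical way to do that, not the author's implementation: `SafeHtmlDemo`, `RemoveDangerousTags`, and the `DangerousTags` array are assumed names. It also removes a listed tag that appears without its partner, which is how `meta` and `link` are normally written; that second pass is an addition to the original behaviour.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SafeHtmlDemo
{
    // Same element names as in GetSafeHTMLString.
    private static readonly string[] DangerousTags =
    {
        "applet", "body", "embed", "frame", "script", "frameset", "html",
        "iframe", "style", "layer", "link", "ilayer", "meta", "object"
    };

    // Hypothetical helper: table-driven variant of the filter.
    public static string RemoveDangerousTags(string html)
    {
        const RegexOptions opts = RegexOptions.IgnoreCase | RegexOptions.Singleline;
        foreach (string tag in DangerousTags)
        {
            // Paired form: <tag ...> ... </tag>, including the content in between.
            // \b keeps "frame" from matching "frameset".
            html = Regex.Replace(html, "<" + tag + @"\b[^>]*>.*?</" + tag + @"\s*>", "", opts);
            // Unpaired leftovers: a lone <tag ...> or </tag> (added assumption).
            html = Regex.Replace(html, "</?" + tag + @"\b[^>]*>", "", opts);
        }
        return html;
    }

    public static void Main()
    {
        string input = "<b>hi</b><meta charset=\"utf-8\"><iframe src=\"x\">\n</iframe>";
        // prints: <b>hi</b>
        Console.WriteLine(RemoveDangerousTags(input));
    }
}
```

Keeping the names in one array means a new element can be blocked by adding a single string, and harmless tags such as `<b>` pass through unchanged.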
