A Solution for Page Titles and Filtered Output

feihu_guest

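First, one thing worth mentioning: Response.Filter
A filter lets you intercept the final HTML output. If your program needs to do some processing just before the output is sent, this is a convenient place to do it.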

Where and how to use Response.Filter
Here it is handled globally in Global.asax: the HTML is intercepted by installing the filter in the Application_BeginRequest event of Global.asax.

Event code

protected void Application_BeginRequest(object sender, EventArgs e)
{
    HttpContext.Current.Response.Filter = new HttpResponseFilter(HttpContext.Current.Response.Filter,new ReplaceTextList());
}

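The HttpResponseFilter class used in the code
What it does: this class takes the place of the default Filter and substitutes a custom one, so that you can do your own processing.
Where it comes from: Response.Filter is a Stream, so the new HttpResponseFilter class has to inherit from Stream and override the Write method to implement the custom behavior.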

public override void Write(byte[] buffer, int offset, int count)
{
            // Read the text that is being written
            byte[] data = new byte[count];
            Buffer.BlockCopy(buffer, offset, data, 0, count);
            string inputText = Encoding.UTF8.GetString(data);

            // Apply the replacements
            if (replaceTextList != null && replaceTextList.Count > 0)
            {
                foreach (KeyValuePair<string, string> values in replaceTextList)
                {
                    inputText = Regex.Replace(inputText, values.Key, values.Value, RegexOptions.Singleline);
                }
                replaceTextList.Clear();
            }
            // The list is dropped after the first call, so only the first chunk written is rewritten
            replaceTextList = null;

            // Write the replaced text to the response
            byte[] newdata = Encoding.UTF8.GetBytes(inputText);
            filterStream.Write(newdata, 0, newdata.Length);
}

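Code walkthrough
There are three steps:
1: read the original text
2: replace it with your own content
3: write it back to the output
Note: check whether the site's encoding is UTF8 or GB2312, and decode with the matching encoding.
The key point: I extended the replacement step by using a Dictionary<string, string>
and looping over it to do the replacements. Regular expressions are supported, so each entry's two strings are the pattern to find and the text to replace it with.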
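
The article shows only the Write method, so here is a minimal sketch of what a complete wrapper could look like. Stream is abstract, which means the remaining members have to be overridden as well. The class name ReplacingFilterStream, its constructor parameters, and the choice to make the stream write-only are assumptions for illustration, not the author's actual HttpResponseFilter.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical stand-in for the article's HttpResponseFilter (its full source is not shown).
public class ReplacingFilterStream : Stream
{
    private readonly Stream inner;                    // the stream being wrapped
    private readonly Encoding encoding;               // must match the site's encoding
    private Dictionary<string, string> replacements;  // regex pattern -> replacement text

    public ReplacingFilterStream(Stream inner, Dictionary<string, string> replacements, Encoding encoding)
    {
        this.inner = inner ?? throw new ArgumentNullException(nameof(inner));
        this.replacements = replacements;
        this.encoding = encoding ?? Encoding.UTF8;
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        // Step 1: decode the bytes that are being written.
        string text = encoding.GetString(buffer, offset, count);

        // Step 2: apply every pattern/replacement pair.
        if (replacements != null)
        {
            foreach (KeyValuePair<string, string> pair in replacements)
            {
                text = Regex.Replace(text, pair.Key, pair.Value, RegexOptions.Singleline);
            }
            replacements = null; // as in the article, only the first chunk is rewritten
        }

        // Step 3: encode the result and pass it on to the wrapped stream.
        byte[] output = encoding.GetBytes(text);
        inner.Write(output, 0, output.Length);
    }

    // The filter is write-only, so reading and seeking are rejected.
    public override bool CanRead => false;
    public override bool CanSeek => false;
    public override bool CanWrite => true;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }
    public override void Flush() => inner.Flush();
    public override int Read(byte[] buffer, int offset, int count) => throw new NotSupportedException();
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
}
```

Because the sketch depends only on Stream, it can be tried outside ASP.NET by wrapping a MemoryStream and writing a UTF8-encoded HTML string into it. Note that a pattern can only match when the whole tag arrives in a single Write call.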

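For extensibility, I defined an abstract class that starts with three regular expressions for matching the title, the description, and the keywords.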

public abstract class ReplaceTextListBase
    {
        /// <summary>
        /// The collection of replacement texts that will be returned
        /// </summary>
        public Dictionary<string, string> replaceTextList = new Dictionary<string, string>();
        /// <summary>
        /// Gets the URL of the page currently being requested
        /// </summary>
        public Uri PageUrl { get { return HttpContext.Current.Request.Url; } }
        /// <summary>
        /// Regex that matches the HTML title
        /// </summary>
        public string TitleRegex { get { return "<title[^>]*>.*?</title>"; } }
        public string TitleFormat(string titleText)
        {
            return "<title>" + titleText + "</title>";
        }
        /// <summary>
        /// Regex that matches the HTML description meta tag
        /// </summary>
        public string DescriptionRegex { get { return "<meta[^<>]+name=[\"\']description[^<>]*[/]>"; } }
        public string DescriptionFormat(string descriptionText)
        {
            return "<meta id=\"description\" name=\"description\" content=\"" + descriptionText + "\" />";
        }
        /// <summary>
        /// Regex that matches the HTML keywords meta tag
        /// </summary>
        public string KeywordRegex { get { return "<meta[^<>]+name=[\"\']keywords[^<>]*[/]>"; } }
        public string KeywordFormat(string keywordText)
        {
            return "<meta id=\"keywords\" name=\"keywords\" content=\"" + keywordText + "\" />";
        }
        /// <summary>
        /// Override this method: call replaceTextList.Add(), then return replaceTextList;
        /// </summary>
        /// <returns></returns>
        public virtual Dictionary<string, string> GetReplaceTextList()
        {
            return replaceTextList;
        }
    }

The abstract class leaves one virtual method, GetReplaceTextList(); this is the key point.
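
To see what the patterns do on their own, the following sketch applies them to an HTML string with Regex.Replace. The description and keywords patterns are copied from the base class; the title pattern is a non-greedy variant, and the HeadRewriteDemo class, its Rewrite method, and the Escape helper are hypothetical names introduced only for this illustration. Escape is there because "$" has a special meaning in a Regex.Replace replacement string.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; it does not appear in the article.
public static class HeadRewriteDemo
{
    // Replaces the title, description and keywords tags of an HTML string.
    public static string Rewrite(string html, string title, string description, string keywords)
    {
        // Non-greedy title pattern (an assumption of this sketch).
        html = Regex.Replace(html, "<title[^>]*>.*?</title>",
            "<title>" + Escape(title) + "</title>", RegexOptions.Singleline);

        // Pattern as in the base class: only matches self-closing meta tags ("/>").
        html = Regex.Replace(html, "<meta[^<>]+name=[\"']description[^<>]*[/]>",
            "<meta id=\"description\" name=\"description\" content=\"" + Escape(description) + "\" />",
            RegexOptions.Singleline);

        html = Regex.Replace(html, "<meta[^<>]+name=[\"']keywords[^<>]*[/]>",
            "<meta id=\"keywords\" name=\"keywords\" content=\"" + Escape(keywords) + "\" />",
            RegexOptions.Singleline);

        return html;
    }

    // In a replacement string "$" starts a substitution such as $1,
    // so a literal dollar sign has to be written as "$$".
    private static string Escape(string text)
    {
        return text.Replace("$", "$$");
    }
}
```

The patterns are case-sensitive and the two meta patterns require the tag to end in "/>", so markup such as an upper-case TITLE tag or a meta tag closed with a plain ">" would be left unchanged.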

Now look at the subclass in my example, which inherits from the abstract class and overrides the virtual method:

public class ReplaceTextList:ReplaceTextListBase
{
        public override System.Collections.Generic.Dictionary<string, string> GetReplaceTextList()
        {
            replaceTextList.Add(TitleRegex,TitleFormat("TitleRegex"));
            replaceTextList.Add(DescriptionRegex,DescriptionFormat("descriptionttest"));
            replaceTextList.Add(KeywordRegex,KeywordFormat("keywordadfdfdf"));
            return replaceTextList;
        }
}

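The subclass in the example is very simple: it only overrides the one virtual method, and the page's final title output becomes: TitleRegex. The other two work the same way.
To replace or filter out other content, just add a few more Add calls for the text you want replaced; this can also be combined with a database or other file operations.
Also:
The example hard-codes the title output as TitleRegex; you can replace it with any string to suit your needs.
Tip: the abstract class also exposes PageUrl, so you can look up the title, description, and keywords by URL to build your own extension.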
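
One possible shape for that URL-based extension is sketched below. It is an assumption-laden illustration: the PageMeta and PageMetaLookup types, the in-memory table (standing in for a database or file), and the example paths are all hypothetical. A subclass of ReplaceTextListBase could call something like BuildReplacements(PageUrl) from its GetReplaceTextList() override.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical record holding the per-page values.
public class PageMeta
{
    public string Title;
    public string Description;
    public string Keywords;
}

// Hypothetical lookup helper; the table stands in for a database query.
public static class PageMetaLookup
{
    private static readonly Dictionary<string, PageMeta> ByPath =
        new Dictionary<string, PageMeta>(StringComparer.OrdinalIgnoreCase)
        {
            { "/", new PageMeta { Title = "Home", Description = "Start page", Keywords = "home" } },
            { "/about.aspx", new PageMeta { Title = "About", Description = "About us", Keywords = "about,team" } }
        };

    // Returns pattern -> replacement pairs for the given page, or an empty
    // dictionary when the page is unknown (the output is then left as it is).
    public static Dictionary<string, string> BuildReplacements(Uri pageUrl)
    {
        var result = new Dictionary<string, string>();
        PageMeta meta;
        if (!ByPath.TryGetValue(pageUrl.AbsolutePath, out meta))
        {
            return result;
        }

        result.Add("<title[^>]*>.*?</title>",
            "<title>" + meta.Title + "</title>");
        result.Add("<meta[^<>]+name=[\"']description[^<>]*[/]>",
            "<meta id=\"description\" name=\"description\" content=\"" + meta.Description + "\" />");
        result.Add("<meta[^<>]+name=[\"']keywords[^<>]*[/]>",
            "<meta id=\"keywords\" name=\"keywords\" content=\"" + meta.Keywords + "\" />");
        return result;
    }
}
```

Looking up by AbsolutePath ignores the query string, so /about.aspx?x=1 and /about.aspx share one entry; whether that is the right key depends on how the site's pages are addressed.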

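Here are also some ideas from my earlier implementation:

Create a database table, manage the URL host headers by category, define your own replacement strings, and then query and replace.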