Aggregated Search (Part 5)

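1.5  Implementing Aggregated Search
The previous sections covered the server-side base classes. This section combines them to implement the aggregated search feature. Search.aspx is the main page of the system. It is a frameset: the top frame is Top.htm, where the user picks a search engine and submits a query; the bottom frame is S.ashx, a page generated by an HTTP handler that carries out the actual query. Search.xml stores the settings for the six supported search engines, and result.xsl is the stylesheet that formats the search results. Each file is described below.
1.5.1  The main page Search.aspx
Search.aspx is the main page of the aggregated search. It is a frameset made up of two pages, Top.htm and S.ashx, and its only job is to pass the query parameters to those two pages. Its code is as follows:
// Code for Search.aspx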
<%@ Page Language="C#" %>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<script runat="server">
      public string key = "";                         // current search keyword
      public string no = "";                          // number of the current search engine
      public string str = "var location='';";         // script emitted to the client
      protected void Page_Load(object sender, EventArgs e)
      {
           key = Server.UrlEncode(Server.UrlEncode(Tools.getPostItem("key")));
           no = Tools.getPostItem("no");
           str = "";
           if (no == "") no = "9";                     // no defaults to 9 (Google)
           if (key == "") key = "趣查";
           str += "var s_key='" + key + "';var s_no=" + no + ";";
      }
</script>
<html xmlns="http://www.w3.org/1999/xhtml">
<head id="Head1" runat="server">
      <title>聚合搜索</title>
      <meta http-equiv="Content-Type" content="text/html; charset=gb2312">
      <script type="text/javascript">
      <%=str %>
      </script>
</head>
<frameset name="topframe" border="0" rows="130px,*" frameborder="0" framespacing="0"
     runat="server">
     <frame name="search_engine" scrolling="no" src="<%=Tools.getApplicationPath() %>Top.htm?keyword=<%=Tools.getPostItem("key") %>&NO=<%= no%>">
     <frame name="right" id="other" src="S.ashx?Times=1&key=<%= key %>&no=<%= no%>" noresize
          frameborder="0" marginwidth="0" marginheight="0" scrolling="auto">
</frameset>
<noframes>
     此 HTML 框架集显示多个 Web 页。若要查看此框架集,请使用支持 HTML 4.0 及更高版本的 Web 浏览器。
</noframes>
</html>
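1.5.2  The search page Top.htm
Top.htm is the page where the user runs a query. It contains a text box, a submit button, and a list of search engines. When the page initializes, it first builds an array of search engines (which mirrors the Search.xml file on the server) and uses it to render the engine list. When the user clicks the Search button, the form data is submitted to Search.aspx (the form uses the GET method), which then starts the actual search. The code of Top.htm is as follows: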
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
     <title>聚合搜索</title>
     <meta http-equiv="Content-Type" content="text/html; charset=gb2312">
     <script type="text/javascript" src="xmlhttp.js"></script>
     <style type="text/css">
.currentSearch{color:Red;font-weight:bold;}
.innerBody{width:98%;text-align:center; font-size:12px}
</style>
</head>
<body>
    <div class="innerBody">
         <h3>
             <span style="color: #ff0033">聚合搜索</span></h3>
         <form action="search.aspx" method="get" id="fram2" target="_top" name="my_search">
             <input name="key" type="text" id="s_soutext" style="font-size: 16px"
                  value="趣查" size="40" />
             <input name="no" type="hidden" id="s_souvalue" value="11" />
             <input type="button" class="button" style="font-size: 14px; padding-top: 2px"
                  onclick="s_souclick3();" value=" 搜  索" /></form>
         <div id="searchListCon">
         </div>
    </div>
    <script type="text/javascript">
    // Search engine array. Each entry holds the value (engine number) and the text (display name).
    var s_sou = new Array();
    s_sou[0] = new Array(9,"谷歌(google)");
    s_sou[1] = new Array(11,"搜狗(sogou)");
    s_sou[2] = new Array(8,"百度(baidu)");
    s_sou[3] = new Array(13,"雅虎(yahoo)");
    s_sou[4] = new Array(10,"爱问(iask)");
    s_sou[5] = new Array(14,"中搜(zhongsou)");
    function search_init()                   // initialize the page
    {
        var url = window.location.search;    // query string of the current URL
        var zid = 9;                         // search engine number, defaults to 9 (Google)
        if (url)
        {
            var mt = url.match(/keyword=([^&]+)/i);
            if (mt)                          // a keyword was passed in: show it in the search box
            {
                setHtmlElementValue($('s_soutext'), mt[1]);
            }
            var mb = url.match(/no=(\d+)/i); // search engine number
            if (mb)
            {
                zid = mb[1];
            }
        }
        var searchStr = new Array();         // HTML links for all search engines
        for (var i = 0; i < s_sou.length; i++)  // loop over the array and build one link per engine
        {
            var ta = s_sou[i];
            if (ta[0] == zid)                // the currently selected engine
                searchStr.push('<a class="currentSearch" href="S.ashx?Times=1&key=' + $("s_soutext").value + '&no=' + ta[0] + '" target="right" id=a' + ta[0] + ' onclick="searchClick(this);">' + ta[1] + '</a>');
            else
                searchStr.push('<a href="S.ashx?Times=1&key=' + $("s_soutext").value + '&no=' + ta[0] + '" target="right" id=a' + ta[0] + ' onclick="searchClick(this);">' + ta[1] + '</a>');
        }
        $("s_souvalue").value = zid;
        // display the search engine list
        $("searchListCon").innerHTML = "选择搜索引擎:" + searchStr.join("&nbsp;");
    }
    function searchClick(obj)                // after an engine link is clicked, update the link styles
    {
        var obj2 = $("searchListCon");
        var temp = obj2.getElementsByTagName("a");
        for (var i = 0; i < temp.length; i++)   // loop over the engine links
        {
            if (temp[i] == obj)              // the engine that was clicked
                temp[i].className = 'currentSearch';
            else
                temp[i].className = '';
        }
        return true;
    }
    function s_souclick3()                   // the Search button was clicked
    {
        var key = $("s_soutext").value;
        if (key == "")                       // the search keyword is empty
        {
            alert("请输入搜索内容。");
            return false;
        }
        $("fram2").submit();
    }
    search_init();                           // initialize the display
          </script>
</body>
</html>
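The query-string handling in search_init() can be isolated into a small function that is easier to test. The following is a minimal sketch, not part of the original system: the name parseSearchQuery is hypothetical, and the decoding step assumes the keyword arrives percent-encoded as UTF-8, which the original page does not guarantee.

```javascript
// Hypothetical helper: extracts the keyword and the engine number from a
// query string such as "?keyword=abc&NO=11".
function parseSearchQuery(query, defaultNo) {
    var result = { keyword: "", no: defaultNo };
    var mt = query.match(/[?&]keyword=([^&]*)/i);
    if (mt) {
        // decodeURIComponent throws on malformed escapes, so fall back to the raw text
        try {
            result.keyword = decodeURIComponent(mt[1]);
        } catch (e) {
            result.keyword = mt[1];
        }
    }
    // \d+ matches the digits of the engine number; the i flag accepts "NO=" as well
    var mb = query.match(/[?&]no=(\d+)/i);
    if (mb) {
        result.no = parseInt(mb[1], 10);
    }
    return result;
}

console.log(parseSearchQuery("?keyword=test&NO=11", 9));
```

Anchoring both patterns on `?` or `&` keeps a parameter whose name merely ends in "no" from being mistaken for the engine number.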
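1.5.3  The search settings document Search.xml
Search.xml stores the information about each search engine, such as the query URL, the format of the query string, and the search class to use. When a search runs, the program first reads this information and builds the query URL and query string from it. The engines are distinguished by their id attribute. The code of Search.xml is as follows: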
<?xml version="1.0" encoding="utf-8" ?>
<Search>
  <item id="8" name="百度网页">
      <url><![CDATA[http://www.baidu.com/s?]]></url>
      <reg><![CDATA[ie=gb2312&bs={*|key@word|*}&sr=&z=&cl=3&f=8&wd={*|key@word|*}&ct=0]]></reg>
      <provide type="Baidu">
      </provide>
  </item>
  <item id="9" name="谷歌网页">
      <url><![CDATA[http://www.google.com/search?]]></url>
      <reg><![CDATA[hl=zh-CN&q={*|key@word|*}&btnG=Google+%E6%90%9C%E7%B4%A2&lr=]]></reg>
      <provide type="Google">
      </provide>
  </item>
  <item id="10" name="爱问网页">
      <url><![CDATA[http://www.iask.com/s?]]></url>
      <reg><![CDATA[tag=n&k={*|key@word|*}]]></reg>
      <provide type="Iask">
      </provide>
  </item>
  <item id="11" name="搜狗网页">
      <url><![CDATA[http://www.sogou.com/web?]]></url>
      <reg><![CDATA[query={*|key@word|*}]]></reg>
      <provide type="Sogou">
      </provide>
  </item>
  <item id="13" name="雅虎网页">
      <url><![CDATA[http://search.cn.yahoo.com/search?]]></url>
      <reg><![CDATA[p={*|key@word|*}&ei=gb2312&source=ysearch_web_hp_button&z=&meta=]]></reg>
      <provide type="Yahoo">
      </provide>
  </item>
  <item id="14" name="中搜网页">
      <url><![CDATA[http://p.zhongsou.com/p?]]></url>
      <reg><![CDATA[w={*|key@word|*}&pt=1&k=&rt=o]]></reg>
      <provide type="Zhongsou">
      </provide>
  </item>
</Search>
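To illustrate how a url and reg pair from Search.xml becomes a full query URL, here is a minimal sketch in JavaScript. The function name buildSearchUrl is hypothetical, and encodeURIComponent always produces UTF-8 escapes, so for engines whose template declares gb2312 this only approximates what the server-side classes do.

```javascript
// The placeholder used in the reg templates of Search.xml.
var PLACEHOLDER = "{*|key@word|*}";

// Hypothetical helper: fills every placeholder in the template with the
// percent-encoded keyword and appends the result to the base URL.
function buildSearchUrl(url, reg, key) {
    // split/join replaces all occurrences without needing regex escaping
    return url + reg.split(PLACEHOLDER).join(encodeURIComponent(key));
}

// Values copied from the Sogou and Baidu entries of Search.xml.
var sogou = { url: "http://www.sogou.com/web?", reg: "query={*|key@word|*}" };
var baidu = {
    url: "http://www.baidu.com/s?",
    reg: "ie=gb2312&bs={*|key@word|*}&sr=&z=&cl=3&f=8&wd={*|key@word|*}&ct=0"
};

console.log(buildSearchUrl(sogou.url, sogou.reg, "a b"));
console.log(buildSearchUrl(baidu.url, baidu.reg, "test"));
```

The Baidu template contains the placeholder twice, which is why the sketch replaces every occurrence rather than only the first.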
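1.5.4  The result formatting document Result.xsl
Result.xsl is the stylesheet that formats the search results. After a search class has parsed the response, the results are converted into XML data. Result.xsl formats this XML and outputs it to the client for display. The advertisement that ships with the system is also written in this stylesheet; advertisements could instead be stored in a database. The full code of Result.xsl is as follows: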
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <xsl:text disable-output-escaping="yes"><![CDATA[<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">]]></xsl:text>
    <html>
      <head>
        <xsl:text disable-output-escaping="yes">
                        <![CDATA[
         <script type="text/javascript" src="xmlhttp.js"></script>
              ]]>
                    </xsl:text>
        <xsl:text disable-output-escaping="yes">
                        <![CDATA[
                        <style type='text/css'>
                        img
                        {
                              border:0px;
                        }
        body{font-size:12px;margin-left:20px;width:95%;}
        .bodyinner{width:95%;margin-left:20px;}
        .text-left{float:left; width:65%}
        .text-right{float:right;margin-right:20px; width:15%}
                        </style>
                        ]]>
                  </xsl:text>
        <xsl:value-of select="search/head" disable-output-escaping="yes"/>
      </head>
      <body >
        <div class="bodyinner">
          <xsl:choose>
            <xsl:when test="search/body">
              <xsl:apply-templates  select="search/body"/>
            </xsl:when>
            <xsl:otherwise>
              <div>
                <xsl:text>对不起!没有你要查找的内容!</xsl:text>
              </div>
            </xsl:otherwise>
          </xsl:choose>
          <div  class="text-right">
            <br/>
            <a href="http://www.qucha.net" target="_blank"> 趣查网---精神休闲家园,让
                 查询成为乐趣!</a>
            <br/><br/>
          </div>
        </div>
      </body>
    </html>
  </xsl:template>
  <xsl:template match="search/body">
    <div class="text-left">
      <xsl:apply-templates select="item"/>
      <xsl:value-of select="pageSite" disable-output-escaping="yes"/>
    </div>
  </xsl:template>
  <xsl:template match="item">
    <xsl:value-of select="begin" disable-output-escaping="yes"/>
    <xsl:apply-templates select="sour"/>
    <xsl:value-of select="end" disable-output-escaping="yes"/>
    <br/>
  </xsl:template>
  <xsl:template match="sour">
    <div>
      <xsl:value-of select="." disable-output-escaping="yes"/>
      <xsl:if test="position()=1">
        <xsl:text>    </xsl:text>
      </xsl:if>
    </div>
  </xsl:template>
</xsl:stylesheet>
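The stylesheet expects a document rooted at search, with an optional head and a body that holds item elements (each with begin, sour, and end children) followed by pageSite. The sketch below builds such a document as a string so the expected shape is easier to see. It is an illustration only: buildResultXml and escapeXml are hypothetical names, and the real XML is produced by the C# search classes.

```javascript
// Escapes text so that HTML fragments can be stored as XML text nodes.
// The stylesheet writes them back with disable-output-escaping="yes".
function escapeXml(text) {
    return String(text)
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;");
}

// Hypothetical helper: items is an array of results, each result being an
// array of HTML fragments that become <sour> elements.
function buildResultXml(items, pageSite) {
    if (items.length === 0) {
        // With no <body>, the stylesheet takes its xsl:otherwise branch
        return "<search/>";
    }
    var parts = ["<search><head></head><body>"];
    for (var i = 0; i < items.length; i++) {
        parts.push("<item><begin></begin>");
        for (var j = 0; j < items[i].length; j++) {
            parts.push("<sour>" + escapeXml(items[i][j]) + "</sour>");
        }
        parts.push("<end></end></item>");
    }
    parts.push("<pageSite>" + escapeXml(pageSite) + "</pageSite></body></search>");
    return parts.join("");
}

console.log(buildResultXml([["<a href='http://example.com'>title</a>", "summary"]], ""));
```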
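1.5.5  The query page S.ashx
S.ashx is the page that actually runs the query. It reads the data in the configuration file Search.xml, prepares the query parameters, and passes them to the class dedicated to the selected search engine. That class runs its Search() method, and the result returned by the search engine is formatted with result.xsl and sent to the client. The code is as follows: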
<%@ WebHandler Language="C#" Class="S" %>
using System;
using System.Collections.Generic;
using System.Text;
using System.Web;
using System.Xml;
using System.Web.Caching;
public class S : IHttpHandler
{
      // Get the settings of one search engine from the configuration file
      public XmlNode getSearchContent(string id)
      {
           Object config = HttpContext.Current.Cache["search"];
           XmlDocument doc = config as XmlDocument;
           if (doc == null)
           {
                 // Path of the search engine configuration file
                 string path = HttpContext.Current.Server.MapPath(
                      Tools.getApplicationPath() + "Search.xml");
                 doc = new XmlDocument();
                 doc.Load(path);                             // load the file
                 config = doc;
                 // Cache the configuration so later requests can reuse it
                 HttpContext.Current.Cache.Insert("search", config,
                      new System.Web.Caching.CacheDependency(path), DateTime.MaxValue,
                      TimeSpan.Zero, CacheItemPriority.AboveNormal, null);
           }
           return doc.DocumentElement.SelectSingleNode(
                string.Format("item[@id='{0}']", id));
      }
      public void ProcessRequest(HttpContext context)
      {
           XmlDocument document = null;
           string id = Tools.getPostItem("no");                // number of the selected search engine
           string key = Tools.getPostItem("key");          // search keyword
           string style = "result.xsl";                        // stylesheet for the search results
           if (id == "")                                   // no engine number supplied
           {
                  id = Tools.getPostItem("itemtype");
                  if (id == "")
                  {
                         id = "9";
                  }
           }
           if (id == "")
           {
           }
           else                                            // an engine number is available
           {
                XmlNode config = getSearchContent(id);       // settings of the search engine
                if (config != null)
                {
                       // base URL of the search engine
                       string url = config.SelectSingleNode("url").InnerText;
                       // query-string template (the reg element) with the keyword placeholder
                       string reg = config.SelectSingleNode("reg").InnerText;
                       if (config.Attributes["style"] != null
                            && config.Attributes["style"].Value != "")
                       {
                             // use the engine-specific stylesheet if one is specified
                             style = config.Attributes["style"].Value;
                       }
                       // class needed to run the search
                       XmlNode provide = config.SelectSingleNode("provide");
                       if (provide != null)            // create the matching search class
                       {
                               ISearch search = null;
                               // class name
                               string type = provide.Attributes["type"].InnerText;
                               switch (type)
                               {
                                     case "Google": search = new Google(); break;
                                     case "Baidu": search = new Baidu(); break;
                                     case "Iask": search = new Iask(); break;
                                     case "Sogou": search = new Sogou(); break;
                                     case "Yahoo": search = new Yahoo(); break;
                                     case "Zhongsou": search = new Zhongsou(); break;
                                     default: search = new Baidu(); break;
                               }
                               if (search != null)
                               {
                                     if (key == "")        // no search keyword: read the query string again
                                     {
                                            string get_url = SearchQuery.get_Nav().ToString();
                                            // pattern matching: strip the matched text
                                            get_url = Tools.Replace(get_url, "(no)|(key)[^&#]+", "");
                                            search.URL = url + get_url;      // target URL
                                     }
                                     else
                                     {
                                            search.URL = url + reg.Replace("{*|key@word|*}", key);
                                     }
                                     if (provide.Attributes["en"] != null) // character encoding
                                     {
                                            search.EnCode = provide.Attributes["en"].Value;
                                     }
                                     search.ItemType = id;
                                     document = search.Search();           // run the search and get the result
                               }
                       }
                       else                                    // no provide node: redirect to the search engine
                       {
                            string desculr = url + reg.Replace("{*|key@word|*}",
                                 HttpContext.Current.Server.UrlEncode(key));
                            context.Response.Redirect(desculr, true);
                            return;
                       }
                }
                else
                {
                     document = new XmlDocument();
                     document.LoadXml("<search/>");           // empty document
                }
           }
           if (document != null)
           {
                  // transform the XML into a string using the stylesheet
                  string xmlstr = Tools.XmlToString(document, style);
                  context.Response.ContentType = "text/html";
                  context.Response.Write(xmlstr);            // write the output
                  context.Response.End();
           }
      }
      public bool IsReusable                              // the handler is not reused
      {
           get
           {
                return false;
           }
      }
}
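The switch statement in ProcessRequest maps the type attribute of the provide node to a search class and falls back to Baidu for unknown names. The same idea can be expressed as a lookup table, sketched here in JavaScript. The constructors below are stand-ins written for this illustration; they are not the real C# search classes, and createSearch is a hypothetical name.

```javascript
// Stand-in constructors that only record their name.
function Baidu() { this.name = "Baidu"; }
function Google() { this.name = "Google"; }
function Sogou() { this.name = "Sogou"; }

// Lookup table from the type attribute to a constructor.
var providers = { Baidu: Baidu, Google: Google, Sogou: Sogou };

function createSearch(type) {
    // hasOwnProperty guards against inherited keys such as "toString"
    var Ctor = Object.prototype.hasOwnProperty.call(providers, type)
        ? providers[type]
        : Baidu; // same fallback as the default branch of the switch
    return new Ctor();
}

console.log(createSearch("Google").name);
console.log(createSearch("Unknown").name);
```

With a table, supporting a new engine means adding one entry instead of another case label, at the cost of one extra level of indirection.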
 
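1.6  Summary
This article has described in detail how an aggregated search engine is implemented. It is a practical project: querying several search engines from a single page can make searching more efficient and help the user find more accurate results. The system addresses only the technical side of aggregated search and does not consider any legal issues that may arise from it, so take care before using this example for commercial purposes.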