<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/"><channel><title>Column of 草屋主人(孙立) - Search Engines</title><link>http://blog.csdn.net/cao5/category/150541.aspx</link><description>Search Engines</description><dc:language>zh-CN</dc:language><lastUpdateTime>Tue, 26 Feb 2008 09:43:00 GMT</lastUpdateTime><ttl>60</ttl><item><dc:creator>草屋主人</dc:creator><title>Book Search Engine</title><link>http://blog.csdn.net/cao5/archive/2005/10/24/515235.aspx</link><pubDate>Mon, 24 Oct 2005 21:24:00 GMT</pubDate><guid>http://blog.csdn.net/cao5/archive/2005/10/24/515235.aspx</guid><wfw:comment>http://blog.csdn.net/cao5/comments/515235.aspx</wfw:comment><comments>http://blog.csdn.net/cao5/archive/2005/10/24/515235.aspx#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://blog.csdn.net/cao5/comments/commentRss/515235.aspx</wfw:commentRss><trackback:ping>http://tb.blog.csdn.net/TrackBack.aspx?PostId=515235</trackback:ping><description>Developed using a variety of methods.
http://book.ku6.cn/

 The server is my own home computer on an ADSL line.&lt;img src ="http://blog.csdn.net/cao5/aggbug/515235.aspx" width = "1" height = "1" /&gt;</description></item><item><dc:creator>草屋主人</dc:creator><title>Planning to build an e-book resource search</title><link>http://blog.csdn.net/cao5/archive/2005/10/24/514670.aspx</link><pubDate>Mon, 24 Oct 2005 12:42:00 GMT</pubDate><guid>http://blog.csdn.net/cao5/archive/2005/10/24/514670.aspx</guid><wfw:comment>http://blog.csdn.net/cao5/comments/514670.aspx</wfw:comment><comments>http://blog.csdn.net/cao5/archive/2005/10/24/514670.aspx#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://blog.csdn.net/cao5/comments/commentRss/514670.aspx</wfw:commentRss><trackback:ping>http://tb.blog.csdn.net/TrackBack.aspx?PostId=514670</trackback:ping><description>Last night I indexed more than 3,000 video tutorials and e-books. The software that crawls the data is written in C#.&lt;img src ="http://blog.csdn.net/cao5/aggbug/514670.aspx" width = "1" height = "1" /&gt;</description></item><item><dc:creator>草屋主人</dc:creator><title>Another way for a spider to analyze document content</title><link>http://blog.csdn.net/cao5/archive/2005/10/05/495656.aspx</link><pubDate>Wed, 05 Oct 2005 22:51:00 GMT</pubDate><guid>http://blog.csdn.net/cao5/archive/2005/10/05/495656.aspx</guid><wfw:comment>http://blog.csdn.net/cao5/comments/495656.aspx</wfw:comment><comments>http://blog.csdn.net/cao5/archive/2005/10/05/495656.aspx#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://blog.csdn.net/cao5/comments/commentRss/495656.aspx</wfw:commentRss><trackback:ping>http://tb.blog.csdn.net/TrackBack.aspx?PostId=495656</trackback:ping><description>   Analyzing the content of web documents
   Analysis of page content can generally be divided into content extraction, the title tag, keywords, and so on.
 In fact, our spider can also analyze the innertext of the link on the source page where it found the URL.&lt;img src ="http://blog.csdn.net/cao5/aggbug/495656.aspx" width = "1" height = "1" /&gt;</description></item></channel></rss>