E-book website:
Analyzing the page shows the following structure:
<div class="index_toplist mright mbottom">
<div class="toptab" id="top_all_1">
<span>玄幻奇幻排行</span>
<div class="index_toplist mright mbottom">
<div class="toptab" id="top_all_2">
<span>武侠仙侠排行</span>
<div class="index_toplist mbottom">
<div class="toptab" id="top_all_4">
<span>历史军事排行</span>
<div class="index_toplist mbottom">
<div class="toptab" id="top_all_8">
<span>完本小说排行</span>
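Looking closely, 历史军事 (history and military) and 完本小说 (completed novels) share one wrapper class, "index_toplist mbottom", while the other categories share a different one, "index_toplist mright mbottom", so the two groups are handled separately.
Each category is wrapped in a div of this form: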
<div class="index_toplist mright mbottom">
A site with such a clear, well-organized structure makes writing our crawler much easier.
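As a rough illustration of how these wrapper divs could be used, here is a minimal sketch that groups category titles by the class of their enclosing div. It uses only Python's standard-library html.parser. The SAMPLE_HTML string is an abbreviated stand-in modeled on the excerpt above (closing tags were added so it parses cleanly), and the names CategoryParser and group_categories are hypothetical helpers, not part of the original crawler.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Abbreviated sample modeled on the excerpt above (assumption: closing tags
# added here so the snippet is well-formed; the real page has more content).
SAMPLE_HTML = """
<div class="index_toplist mright mbottom">
<div class="toptab" id="top_all_1"><span>玄幻奇幻排行</span></div>
</div>
<div class="index_toplist mright mbottom">
<div class="toptab" id="top_all_2"><span>武侠仙侠排行</span></div>
</div>
<div class="index_toplist mbottom">
<div class="toptab" id="top_all_4"><span>历史军事排行</span></div>
</div>
<div class="index_toplist mbottom">
<div class="toptab" id="top_all_8"><span>完本小说排行</span></div>
</div>
"""


class CategoryParser(HTMLParser):
    """Collects (wrapper class, category title) pairs from the page."""

    def __init__(self):
        super().__init__()
        self.wrapper_class = None  # class string of the enclosing index_toplist div
        self.in_toptab = False     # True while inside <div class="toptab">
        self.in_title = False      # True while inside the title <span>
        self.categories = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "div" and "index_toplist" in classes:
            # Matches both variants, since both carry the index_toplist token.
            self.wrapper_class = attrs["class"]
        elif tag == "div" and "toptab" in classes:
            self.in_toptab = True
        elif tag == "span" and self.in_toptab:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "span" and self.in_title:
            self.in_title = False
            self.in_toptab = False  # only the first span in toptab is the title

    def handle_data(self, data):
        if self.in_title and data.strip():
            self.categories.append((self.wrapper_class, data.strip()))


def group_categories(html):
    """Return {wrapper class: [category titles]} for the given HTML."""
    parser = CategoryParser()
    parser.feed(html)
    grouped = defaultdict(list)
    for wrapper_class, title in parser.categories:
        grouped[wrapper_class].append(title)
    return dict(grouped)


if __name__ == "__main__":
    for wrapper_class, titles in group_categories(SAMPLE_HTML).items():
        print(wrapper_class, titles)
```

Matching on the shared index_toplist token finds every category block, while keeping the full class string as the key preserves the distinction between the two layouts so each group can still be processed on its own.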
Within a single category block, every novel on that ranking list appears:
<div class="index_toplist mright mbottom">
<div class="toptab" id="top_all_1">
<span>玄幻奇幻排行</span><div>
<div class="topbooks" id="con_o1g_1" style="display: block;">
<ul>
<li><span class="hits">05-06</span><span class="num">1.</span><a href="/book/168/" title="择天记" target="_blank">择天记</a></li>
<li><span class="hits">05-06</span><span class="num">2.</span><a href="/book/176/" title="大主宰" target="_blank">大主宰</a></li>
<!-- many entries omitted here -->
<li><span class="hits">05-06</span><span class="num">3.</span><a href="/book/4140/" title="太古神王" target="_blank">太古神王</a></li>
<li><span class="hits">05-06</span><span class="num">4.</span><a href="/book/5094/"