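Task description
1. Crawl the title and link of every post under the phone number category on 58.com (58同城) and store them in a database.
2. Design a second spider (spider 2) that crawls the detail pages and stores the phone number sellers' information in the database.
3. Skills used: locating web page elements, writing to the database, and reading from the database.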
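The three skills listed in step 3 can be sketched in a small, dependency-free form. This is only an illustration under stated assumptions: the sample HTML, the `class="t"` selector, the `example.com` URLs, and the names `PostLinkParser` and `extract_posts` are all hypothetical, the standard library's `html.parser` stands in for BeautifulSoup, and a plain Python list stands in for a MongoDB collection.

```python
from html.parser import HTMLParser

# Hypothetical listing markup; the real page structure is not shown here.
SAMPLE_HTML = """
<ul>
  <li><a class="t" href="http://example.com/post/1">Number A</a></li>
  <li><a class="t" href="http://example.com/post/2">Number B</a></li>
  <li><a class="nav" href="http://example.com/about">About</a></li>
</ul>
"""


class PostLinkParser(HTMLParser):
    """Collect title/url pairs from <a class="t"> elements (assumed selector)."""

    def __init__(self):
        super().__init__()
        self.posts = []
        self._href = None  # href of the post link currently open, if any
        self._text = []    # text fragments seen inside that link

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("class") == "t":
            self._href = attrs.get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = "".join(self._text).strip()
            self.posts.append({"title": title, "url": self._href})
            self._href = None


def extract_posts(html):
    """Locate the post links in a listing page and return them as dicts."""
    parser = PostLinkParser()
    parser.feed(html)
    return parser.posts


# "Store": a list stands in for the database collection.
url_list = []
url_list.extend(extract_posts(SAMPLE_HTML))

# "Read": spider 2 would read the stored links back to visit each detail page.
stored_urls = [doc["url"] for doc in url_list]
```

With a real MongoDB collection, the store and read steps would presumably map to `insert_one` and `find` on the collection object rather than list operations.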
My code
from bs4 import BeautifulSoup
import requests
import time
import pymongo
# Browser-like User-Agent sent with every request
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36',
}
# Local MongoDB instance with one collection per spider
client = pymongo.MongoClient('localhost', 27017)
test_58 = client['test_58']
url_list_phone_number = test_58['url_list_phone_number']    # post titles and links
item_info_phone_number = test_58['item_info_phone_number']  # seller details
channel = 'http://bj.58.com/shoujihao/'
# spider 1
def get_links_from(channel, pages):
    # Build the listing-page URL, e.g. http://bj.58.com/shoujihao/pn2/ for page 2
    list_view = '{}pn{}/'.format(channel, str(pages))
    web_data