Saving scraped results to MongoDB: how to configure settings.py and pipelines.py
1. settings.py
# settings.py
MONGODB_SERVER = 'localhost'
MONGODB_PORT = 27017
MONGODB_DB = 'freebuf_db'
MONGODB_COLLECTION = 'wenzhang'
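In settings.py, I originally wrote the MONGODB_PORT value with single quotes around 27017. Running the spider then failed with the error "MONGODB_PORT must be int", so note that this value must be an integer, not a string.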
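The port problem above can also be caught early, before the value reaches the MongoDB client. The following is only a minimal sketch: coerce_port is a hypothetical helper, not part of Scrapy or pymongo, and it assumes the port may arrive either as an int or as a numeric string (for example from an environment variable).

```python
def coerce_port(value):
    """Return `value` as an int TCP port, or raise ValueError.

    Hypothetical helper for illustration; not a Scrapy or pymongo API.
    """
    try:
        port = int(value)  # accepts 27017 as well as '27017'
    except (TypeError, ValueError):
        raise ValueError("MONGODB_PORT is not a valid integer: %r" % (value,))
    if not 0 < port < 65536:
        raise ValueError("MONGODB_PORT is out of range: %d" % port)
    return port


# In settings.py the quoted form would then be normalized to an int.
MONGODB_PORT = coerce_port('27017')
```

Writing the literal 27017 without quotes, as in the settings shown above, remains the simplest fix; a helper like this mainly helps when the value comes from outside the file.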
2. pipelines.py
# pipelines.py
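import pymongo
# Note: scrapy.conf is deprecated and absent from recent Scrapy releases,
# where settings are read from crawler.settings instead.
from scrapy.conf import settings
#from scrapy.exceptions import DropItem
#from scrapy import log


class MongoPipeline(object):
    def __init__(self):
        # Read the connection parameters defined in settings.py
        self.server = settings['MONGODB_SERVER']
        self.port = settings['MONGODB_PORT']  # must be an int, not a string
        self.db = settings['MONGODB_DB']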