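When building a custom crawler framework (based on Scrapy), users should be able to register their own middlewares, pipelines, and spiders just by listing them in the configuration file setting.py, without manually importing each module in main.py. This note shows one way to do that.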
Code example:
The user's configuration file setting.py lists the custom spiders, pipelines, and middlewares as follows:
SPIDERS = [
    "spiders.baidu.BaiduSpider",
    "spiders.douban.DoubanSpider",
]
PIPELINES = [
    "pipelines.BaiduPipeline",
    "pipelines.BaiduPipeline2",
    "pipelines.DoubanPipeline",
    "pipelines.DoubanPipeline2",
]
SPIDERS_MIDDLEWARES = [
    "middlewares.SpiderMiddleware",
    "middlewares.SpiderMiddleware2",
]
DOWNLOADER_MIDDLEWARES = [
    "middlewares.DownloaderMiddleware",
    "middlewares.DownloaderMiddleware2",
]
The framework's engine.py imports these modules dynamically with the following code:
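import importlib

def _auto_import_module(module_list):
    # Import each "package.module.ClassName" path and return one instance per class.
    instances = []
    for module in module_list:
        # Split at the last dot: module path on the left, class name on the right.
        path_name, class_name = module.rsplit(".", 1)
        path = importlib.import_module(path_name)  # import the module by name
        cls = getattr(path, class_name)            # look up the class on the module
        instances.append(cls())                    # instantiate with no arguments
    return instances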
Here cls is the class object that was looked up on the imported module; calling cls() creates the instance.
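The same lookup can be sketched as a standalone script that also reports a bad entry in the configuration clearly. This is only an illustration: the helper name load_class and the list DEMO_COMPONENTS are hypothetical, and standard-library classes stand in for the project's spiders and pipelines so the sketch runs on its own.

```python
import importlib


def load_class(dotted_path):
    """Resolve a dotted path such as 'package.module.ClassName' to the class object."""
    # rpartition splits at the last dot; module_path is empty if there is no dot.
    module_path, _, class_name = dotted_path.rpartition(".")
    if not module_path:
        raise ValueError(f"{dotted_path!r} is not of the form 'module.ClassName'")
    try:
        module = importlib.import_module(module_path)
    except ImportError as exc:
        raise ImportError(f"cannot import module {module_path!r}") from exc
    try:
        return getattr(module, class_name)
    except AttributeError as exc:
        raise ImportError(
            f"module {module_path!r} has no attribute {class_name!r}"
        ) from exc


# Hypothetical stand-in for one of the lists in setting.py. Standard-library
# classes are used only so that this sketch runs without the project's modules.
DEMO_COMPONENTS = [
    "collections.OrderedDict",
    "collections.Counter",
]

# Instantiate each configured class with no arguments, as the engine does.
instances = [load_class(path)() for path in DEMO_COMPONENTS]
print([type(obj).__name__ for obj in instances])
```

In the framework itself, the engine would presumably pass each list from setting.py through a loader like this once at startup, so a misspelled path fails early with a message naming the offending module or class.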