Search engine keyword ranking crawler: scraping and retrieval

Latest recommended article published on 2022-12-06 14:52:56

学良


Categories: 易语言 (Easy Programming Language), Python, seo. Tags: search engine, Baidu keywords, Python, crawler, scraping

Article link: https://blog.csdn.net/li947011039/article/details/102486246


# coding=utf-8
import queue
import threading
import time

# Helper module that provides getOrder(); it is not shown in this post
import work.baidu_pc as bd_pc


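# Load the keywords (the first comma-separated field of each line) into the queue
urlQueue = queue.Queue()
lines = []
with open('seo.txt', encoding='UTF-8') as f:
    for line in f:
        line = line.strip()
        if line:  # skip blank lines so no empty keyword is queued
            lines.append(line.split(','))

for wd in lines:
    urlQueue.put(wd[0])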

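def fetchUrl(urlQueue):
    while True:
        try:
            # Read from the queue without blocking
            wd = urlQueue.get_nowait()
        except queue.Empty:
            # The queue is drained, so this worker is done
            break
        print('Current Thread Name %s, wd: %s ' % (threading.current_thread().name, wd))
        # Process the keyword taken from the queue
        try:
            result = bd_pc.getOrder(wd, 'www.51seo.net')
            # Uncomment to add a delay and make the effect easier to see
            # time.sleep(1)
            print(result)
        except Exception as e:
            # Report the failure instead of silently dropping the keyword
            print('Failed for wd %s: %s' % (wd, e))
            continue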

if __name__ == '__main__':
    startTime = time.time()
    threads = []
    # Adjust the number of threads to control the crawl speed
    threadNum = 10
    for i in range(threadNum):
        t = threading.Thread(target=fetchUrl, args=(urlQueue,))
        threads.append(t)

    for t in threads:
        t.start()
    for t in threads:
        # Join every thread in turn so the main thread exits last;
        # all workers are already running, so they do not block one another
        t.join()
    endTime = time.time()
    print('Done, Time cost: %s ' % (endTime - startTime))
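
The script above calls `bd_pc.getOrder(wd, 'www.51seo.net')`, but the `work.baidu_pc` module is not included in the post, so what it does internally is unknown. As a rough illustration only, the ranking step could look something like the sketch below. It assumes the result URLs for a keyword have already been collected into a list; `find_rank` and `sample_results` are hypothetical names, not part of the author's code, and the sketch does not fetch or parse any search page.

```python
from urllib.parse import urlparse


def find_rank(result_urls, domain):
    """Return the 1-based position of the first URL hosted on `domain`, or None.

    Hypothetical helper: assumes `result_urls` is already in result-page order.
    """
    target = domain.lower()
    for position, url in enumerate(result_urls, start=1):
        host = (urlparse(url).hostname or "").lower()
        # Match the domain itself or any of its subdomains
        if host == target or host.endswith("." + target):
            return position
    return None


if __name__ == "__main__":
    # Made-up sample data for illustration, not real search results
    sample_results = [
        "https://example.com/a",
        "https://www.51seo.net/page",
        "https://example.org/b",
    ]
    print(find_rank(sample_results, "www.51seo.net"))
```

Returning `None` when the domain is absent lets the caller tell "not ranked in the results that were checked" apart from a real position.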

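A keyword ranking scraper written in Python and 易语言 (Easy Programming Language), with multithreaded batch fetching. Send me a private message if you need it.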
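
For comparison, the same multithreaded batch pattern can be sketched with the standard library's `concurrent.futures.ThreadPoolExecutor`, which manages the task queue and the joins itself. This is an alternative to the hand-rolled `queue` plus `threading` approach, not the author's method. `fake_get_order` is a hypothetical stand-in for `bd_pc.getOrder` that returns a dummy value, and `rank_all` is likewise an illustrative name.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fake_get_order(keyword, domain):
    """Hypothetical stand-in for bd_pc.getOrder: dummy value, no network I/O."""
    return len(keyword)


def rank_all(keywords, domain, lookup=fake_get_order, max_workers=10):
    """Run lookup(keyword, domain) for every keyword on a thread pool.

    Returns two dicts keyed by keyword: successful results and raised exceptions.
    """
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to the keyword it was submitted for
        futures = {pool.submit(lookup, kw, domain): kw for kw in keywords}
        for future in as_completed(futures):
            kw = futures[future]
            try:
                results[kw] = future.result()
            except Exception as exc:
                # Keep the failure so the keyword can be retried later
                errors[kw] = exc
    return results, errors


if __name__ == "__main__":
    results, errors = rank_all(["seo", "python crawler"], "www.51seo.net")
    print(results, errors)
```

As with `threadNum` in the script above, `max_workers` controls how many lookups run at once. Collecting exceptions per keyword makes it possible to see which lookups failed rather than losing them.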
