Python: automatically post a comment when a blogger publishes a new article, with the latest Weibo hot searches fetched in real time

Article link: https://blog.csdn.net/weixin_49065061/article/details/136348958

Result

[Image: screenshot of the result]

Introduction

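Comment content: the script can pull in the Weibo hot search list, the number of days from today to the next holiday or weekend, personal blog information, and similar material.
It fetches the article list from the specified blogger's profile page, takes the latest article, and checks whether it was published today. If so, it posts a comment; if not, it keeps listening.
The script is started by the Task Scheduler that ships with Windows.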
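The introduction mentions counting the days to the next weekend or holiday, but the article does not show that part. The following is a minimal sketch of one way to do it, not the author's code. The function names and the `HOLIDAYS` table are hypothetical and only serve as an illustration; replace the table with the dates you care about.

```python
from datetime import date

# Hypothetical holiday table for illustration only; replace with your own dates.
HOLIDAYS = {
    "Qingming Festival": date(2024, 4, 4),
    "Labour Day": date(2024, 5, 1),
}


def days_until_weekend(today: date) -> int:
    """Return the number of days until the next Saturday (0 on a weekend)."""
    # date.weekday(): Monday is 0, Saturday is 5, Sunday is 6.
    if today.weekday() >= 5:
        return 0
    return 5 - today.weekday()


def days_until_holidays(today: date, holidays: dict) -> dict:
    """Map each holiday that has not passed yet to the days remaining."""
    return {
        name: (day - today).days
        for name, day in holidays.items()
        if day >= today
    }


def countdown_text(today: date, holidays: dict = HOLIDAYS) -> str:
    """Build a short countdown message that could be used in a comment."""
    lines = [f"Days until the weekend: {days_until_weekend(today)}"]
    for name, remaining in days_until_holidays(today, holidays).items():
        lines.append(f"Days until {name}: {remaining}")
    return "\n".join(lines)


print(countdown_text(date.today()))
```

The text returned by `countdown_text` could be appended to the comment body alongside the hot search lines.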

Fetching the Weibo hot search list

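import requests
from bs4 import BeautifulSoup


def journalism():
    # List that collects the hot search entries
    news = []
    # URL of the hot search list
    hot_url = 'https://s.weibo.com/top/summary/'
    headers = {
        'cookie': 'your Weibo cookie',
        'user-agent': 'your browser user-agent'
    }
    # Send a GET request to fetch the page
    r = requests.get(hot_url, headers=headers)
    # Parse the page
    soup = BeautifulSoup(r.text, 'lxml')
    urls_titles = soup.select('#pl_top_realtimehot > table > tbody > tr > td.td-02 > a')
    hotness = soup.select('#pl_top_realtimehot > table > tbody > tr > td.td-02 > span')

    # The first <a> is skipped (i + 1) so that hotness[i] lines up with link i + 1;
    # presumably the first row is a pinned entry without a hotness value
    for i in range(min(len(urls_titles) - 1, len(hotness))):
        # Store each entry in a dictionary
        hot_news = {}
        # get_text() returns the text of the <a> tag
        hot_news['title'] = urls_titles[i + 1].get_text()
        # ['href'] returns the link of the same <a> tag; prepend the site prefix
        hot_news['url'] = "https://s.weibo.com" + urls_titles[i + 1]['href']
        # Hotness text
        hot_news['hotness'] = hotness[i].get_text()
        # Append the dictionary to the list
        news.append(hot_news)
    return [f"{i['title']}:{i['url']}\n" for i in news[:3]]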

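Note: only the top three hot searches are used here. You have to get the cookie yourself by opening the site and pressing F12 (browser developer tools).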
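`journalism()` returns a list of lines, while the comment is later sent as a single form field. A small helper that joins the entries into one string may be convenient. This is only a sketch: `format_hot_searches` is a hypothetical name, and the sample entries are made up to mirror the dictionaries built in `journalism()`.

```python
def format_hot_searches(news, limit=3):
    """Join the first `limit` entries into one newline-separated string."""
    lines = [f"{item['title']}: {item['url']}" for item in news[:limit]]
    return "\n".join(lines)


# Made-up sample entries, shaped like the dictionaries built in journalism().
sample = [
    {"title": "Topic A", "url": "https://s.weibo.com/weibo?q=A", "hotness": "300"},
    {"title": "Topic B", "url": "https://s.weibo.com/weibo?q=B", "hotness": "200"},
    {"title": "Topic C", "url": "https://s.weibo.com/weibo?q=C", "hotness": "100"},
    {"title": "Topic D", "url": "https://s.weibo.com/weibo?q=D", "hotness": "50"},
]

print(format_hot_searches(sample))
```

Passing a single string matters because `requests` encodes a list value in `data` as several fields with the same name rather than as one piece of text.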

Posting the comment

import re
from datetime import datetime

import requests
from bs4 import BeautifulSoup
from lxml import etree

# Placeholders: fill in the blogger's profile URL and your own request headers
blogger_home_url = 'URL of the profile page of the blogger you want to comment on'
headers = {
    'cookie': 'your blog site cookie',
    'user-agent': 'your blog site user-agent'
}


def myfunction():
    res = requests.get(url=blogger_home_url, headers=headers).text
    tree = etree.HTML(res)
    # Article links on the profile page; only the first (latest) one is used
    hrefs = tree.xpath('//article[@class="blog-list-box"]/a/@href')
    if not hrefs:
        print("No article found!")
        return False
    hre = hrefs[0]

    # Check whether the article was published today
    response = requests.get(hre, headers=headers)
    # Parse the response
    soup = BeautifulSoup(response.text, 'html.parser')
    # Find the span that holds the publish time
    spans = soup.find_all('span', class_='time')
    curdata = spans[0].get_text(strip=True) if spans else ''
    # The label wraps the timestamp in text such as "最新推荐文章于 ... 发布",
    # so extract the timestamp itself instead of matching one fixed label
    match = re.search(r'\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}', curdata)
    if match is None:
        print("Could not read the publish time!")
        return False
    post_date = datetime.strptime(match.group(0), '%Y-%m-%d %H:%M:%S')
    # Compare the two dates to see whether they fall on the same day
    if post_date.date() != datetime.now().date():
        print("No article published yet!")
        return False

    url1 = 'https://blog.csdn.net/phoenix/web/v1/comment/submit'
    headers1 = {
        'cookie': 'your blog site cookie',
        'referer': '{0}?spm=your blog site referer'.format(hre),
        'user-agent': 'your blog site user-agent'
    }
    # The article ID is the number after /article/details/ in the URL
    hr = re.search(r'/article/details/(\d+)', hre).group(1)
    data1 = {
        'commentId': '',
        # journalism() returns a list of lines; join them into one string
        'content': ''.join(journalism()),
        'articleId': hr
    }
    try:
        resp = requests.post(url=url1, headers=headers1, data=data1)
        # Treat HTTP error status codes as failures
        resp.raise_for_status()
        print("Success!")
        return True
    except requests.RequestException:
        print("Failed!")
        return False

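This analysis is based on a CSDN profile page. For any other site you have to inspect its HTML yourself, especially for the same-day check. The script only comments on the latest article, and only if it was published today.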
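The same-day check can be isolated into a small function so that it can be tried without any network access. This is a sketch under the assumption that the time label always contains a timestamp of the form `YYYY-MM-DD HH:MM:SS`, as in the label format "最新推荐文章于 %Y-%m-%d %H:%M:%S 发布" used in the script. The names `parse_post_time` and `is_posted_today` are hypothetical.

```python
import re
from datetime import datetime

# Matches a timestamp such as 2024-02-28 09:15:00 anywhere in the label text.
TIMESTAMP_RE = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")


def parse_post_time(label):
    """Return the first timestamp found in `label`, or None if there is none."""
    match = TIMESTAMP_RE.search(label)
    if match is None:
        return None
    return datetime.strptime(match.group(0), "%Y-%m-%d %H:%M:%S")


def is_posted_today(label, now=None):
    """True if the timestamp in `label` is on the same calendar day as `now`."""
    now = now or datetime.now()
    post_time = parse_post_time(label)
    return post_time is not None and post_time.date() == now.date()


print(is_posted_today("最新推荐文章于 2024-02-28 09:15:00 发布"))
```

Searching for the timestamp instead of parsing the whole label means the check does not depend on the exact wording around the date, which may differ between articles.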
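It is worth explaining where url1 comes from. With the browser developer tools open (F12), you can see that whenever a comment is submitted, the CSDN site sends a POST request to this URL with headers like these. So we take the URL of the article we want to comment on, fill in the corresponding fields, and send the same request to post our comment.
You need to find the values for headers yourself.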
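As a small illustration of how the request fields are assembled, the sketch below extracts the article ID from an article URL and builds the form data. The field names `commentId`, `content`, and `articleId` and the submit URL are the ones used in the script above; the helper names are hypothetical, and the sketch assumes the article ID is the run of digits after `/article/details/`.

```python
import re

COMMENT_URL = "https://blog.csdn.net/phoenix/web/v1/comment/submit"


def extract_article_id(article_url):
    """Return the ID that follows /article/details/ in a CSDN article URL."""
    match = re.search(r"/article/details/(\d+)", article_url)
    if match is None:
        raise ValueError(f"no article ID found in {article_url!r}")
    return match.group(1)


def build_comment_payload(article_url, content):
    """Assemble the form fields sent with the comment request."""
    return {
        "commentId": "",
        "content": content,
        "articleId": extract_article_id(article_url),
    }


payload = build_comment_payload(
    "https://blog.csdn.net/weixin_49065061/article/details/136348958",
    "Nice article!",
)
print(payload)
```

The resulting dictionary can be passed as `data` to `requests.post` together with `COMMENT_URL` and your own headers.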

Periodically checking for a new article

import time


def condition_check():
    result = myfunction()
    return result


attempts = 0
max_attempts = 15
wait_time = 10 * 60  # 10 minutes in seconds

while attempts < max_attempts:
    if condition_check():
        print("Condition met, exiting.")
        break
    else:
        attempts += 1
        print(f"Condition not met, waiting for the next check... attempt {attempts}/{max_attempts}")
        if attempts < max_attempts:
            time.sleep(wait_time)
if attempts == max_attempts:
    print("Maximum number of attempts reached, exiting.")