DataWhale MetaGPT Study Notes (4)

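Subscription Agents

What Is a Subscription Agent

An Agent can fetch the information we care about, process it, and then send the result to the user in some form. Such an Agent is called a subscription agent.

Here the Agent's Role is to gather news for us, and its Actions are to fetch information from the outside world and process it. On top of that we can build extra features, such as running on a schedule and delivering results through a chosen channel.

Implementing a Subscription Agent with MetaGPT

MetaGPT provides a SubscriptionRunner class, which offers a way to run a Role: we can trigger a Role on a schedule and then notify the user of the Role's output.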

import asyncio
from metagpt.subscription import SubscriptionRunner
from metagpt.roles import Searcher
from metagpt.schema import Message
from duckduckgo_search import DDGS
​
​
async def trigger():
    while True:
        yield Message("the latest news about OpenAI")
        await asyncio.sleep(3600 * 24)
​
async def callback(msg: Message):
     print(msg.content)
​
async def main():
    pb = SubscriptionRunner()
    await pb.subscribe(Searcher(), trigger(), callback)
    await pb.run()
​
asyncio.run(main())

Note: the duckduckgo_search package must be installed first

pip install duckduckgo_search

Output

2024-05-19 13:55:45.259 | INFO | metagpt.roles.searcher:act_sp:52 - Alice(Smart Assistant): to do SearchAndSummarize(SearchAndSummarize)
OpenAI has recently launched a new AI model named GPT-4o, which offers advancements in usability and integration, especially highlighting its voice conversation capabilities and multimodal interactions (handling both text and image inputs). Notably, this model matches the previous version's abilities in English and coding but has improved upon other functionalities. OpenAI is also introducing a new desktop app for ChatGPT aimed at both free and paid users, designed to offer seamless integration with macOS through simple keyboard shortcuts, enhancing user interaction directly on computers source.
2024-05-19 13:56:06.846 | WARNING | metagpt.provider.openai_api:calc_usage:258 - usage calculation failed: HTTPSConnectionPool(host='openaipublic.blob.core.windows.net', port=443): Max retries exceeded with url: /encodings/cl100k_base.tiktoken (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:1122)')))
OpenAI has recently launched a new AI model named GPT-4o, which offers advancements in usability and integration, especially highlighting its voice conversation capabilities and multimodal interactions (handling both text and image inputs). Notably, this model matches the previous version's abilities in English and coding but has improved upon other functionalities. OpenAI is also introducing a new desktop app for ChatGPT aimed at both free and paid users, designed to offer seamless integration with macOS through simple keyboard shortcuts, enhancing user interaction directly on computers source.

From this simple example we can see that a subscription agent needs three elements

  • The agent

  • The trigger

  • The data callback
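
How the three elements fit together can be sketched without MetaGPT at all. The following is only an illustration under stated assumptions: a plain async function stands in for the Role, and `trigger`, `fake_role`, `run_subscription`, and `collect` are hypothetical names for this sketch, not MetaGPT APIs.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable


async def trigger(topics: list[str]) -> AsyncIterator[str]:
    # Trigger: decides WHEN the agent runs and WHAT input it receives.
    for topic in topics:
        yield topic
        await asyncio.sleep(0)  # a real trigger would sleep until the next run


async def fake_role(request: str) -> str:
    # Agent: turns the request into a result (a real Role would call an LLM).
    return f"summary of: {request}"


async def run_subscription(
    role: Callable[[str], Awaitable[str]],
    source: AsyncIterator[str],
    callback: Callable[[str], Awaitable[None]],
) -> None:
    # Runner: for every triggered input, run the agent and pass its output
    # to the callback.
    async for request in source:
        await callback(await role(request))


received: list[str] = []


async def collect(result: str) -> None:
    # Callback: decides WHERE the output goes (here, an in-memory list).
    received.append(result)


asyncio.run(run_subscription(fake_role, trigger(["news A", "news B"]), collect))
print(received)
```

Swapping any one of the three pieces (a different schedule, a different Role, a different delivery channel) leaves the other two untouched, which is the main appeal of this split.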

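Note: although there is no restriction on which Role can act as a subscription agent, not every Role is a good fit. Choose from the application's point of view and pick a Role whose output is time-sensitive. Put simply, asking the same question at different times should yield different answers, as with news and public opinion, technology developments, or stock information.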

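Implementing the OSSWatcher Role

When implementing a Role, we first need to be clear about which Actions it requires.

For example, to get information from a web page, we can split the work into two Actions: one crawls the information, and the other analyzes and processes it.

Taking GitHub Trending as the example, we first crawl it and then analyze it.

Crawling GitHub Trending

Here we simply copy the page's HTML and save it. After running it through a script, we end up with a much smaller HTML file.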

from bs4 import BeautifulSoup
​
with open("github-trending-raw.html") as f:
    html = f.read()
​
soup = BeautifulSoup(html, "html.parser")
for i in soup.find_all(True):
    for name in list(i.attrs):
        if i[name] and name not in ["class"]:
            del i[name]
​
for i in soup.find_all(["svg", "img", "video", "audio"]):
    i.decompose()
​
with open("github-trending-slim.html", "w") as f:
    f.write(str(soup))

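We copied the HTML of the Trending page's elements from GitHub.

Next we modify the file-handling code to open the files as UTF-8 explicitly, so that it does not depend on the platform's default encoding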

from bs4 import BeautifulSoup

# Open the file with 'utf-8' encoding
with open("github-trending-raw.html", encoding="utf-8") as f:
    html = f.read()

soup = BeautifulSoup(html, "html.parser")

# Strip every attribute except "class"
for i in soup.find_all(True):
    for name in list(i.attrs):
        if i[name] and name not in ["class"]:
            del i[name]

# Remove tags we do not need
for i in soup.find_all(["svg", "img", "video", "audio"]):
    i.decompose()

# Write the result with 'utf-8' encoding
with open("github-trending-slim.html", "w", encoding="utf-8") as f:
    f.write(str(soup))
​

This gives us a slimmed-down version of the page.

We keep only the part we need and delete the rest.

From it we arrive at the following code

import aiohttp
import asyncio
from bs4 import BeautifulSoup
​
async def fetch_html(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            return await response.text()
​
async def parse_github_trending(html):
    soup = BeautifulSoup(html, 'html.parser')
​
    repositories = []
​
    for article in soup.select('article.Box-row'):
        repo_info = {}
​
        repo_info['name'] = article.select_one('h2 a').text.strip()
        repo_info['url'] = article.select_one('h2 a')['href'].strip()
​
        # Description
        description_element = article.select_one('p')
        repo_info['description'] = description_element.text.strip() if description_element else None
​
        # Language
        language_element = article.select_one('span[itemprop="programmingLanguage"]')
        repo_info['language'] = language_element.text.strip() if language_element else None
​
        # Stars and Forks
        stars_element = article.select('a.Link--muted')[0]
        forks_element = article.select('a.Link--muted')[1]
        repo_info['stars'] = stars_element.text.strip()
        repo_info['forks'] = forks_element.text.strip()
​
        # Today's Stars
        today_stars_element = article.select_one('span.d-inline-block.float-sm-right')
        repo_info['today_stars'] = today_stars_element.text.strip() if today_stars_element else None
​
        repositories.append(repo_info)
​
    return repositories
​
async def main():
    url = 'https://github.com/trending'
    html = await fetch_html(url)
    repositories = await parse_github_trending(html)
​
    for repo in repositories:
        print(f"Name: {repo['name']}")
        print(f"URL: https://github.com{repo['url']}")
        print(f"Description: {repo['description']}")
        print(f"Language: {repo['language']}")
        print(f"Stars: {repo['stars']}")
        print(f"Forks: {repo['forks']}")
        print(f"Today's Stars: {repo['today_stars']}")
        print()


asyncio.run(main())

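We turn it into an Action class

  • Use async with to create a ClientSession object for sending HTTP requests

  • Call client.get() to send a GET request to the given URL

  • Pass proxy=CONFIG.global_proxy to set the proxy

  • Call response.raise_for_status() to check the response status; it raises an exception for error statuses (4xx and 5xx)

  • Use await response.text() to read the response body and store it in html

  • The HTML is then parsed and the result is returned to the caller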

    async def run(self, url: str = "https://github.com/trending"):
        async with aiohttp.ClientSession() as client:
            async with client.get(url, proxy=CONFIG.global_proxy) as response:
                response.raise_for_status()
                html = await response.text()

The complete code is as follows

import aiohttp
from bs4 import BeautifulSoup
from metagpt.actions.action import Action
from metagpt.config import CONFIG
​
class CrawlOSSTrending(Action):
​
    async def run(self, url: str = "https://github.com/trending"):
        async with aiohttp.ClientSession() as client:
            async with client.get(url, proxy=CONFIG.global_proxy) as response:
                response.raise_for_status()
                html = await response.text()
 
        soup = BeautifulSoup(html, 'html.parser')
    
        repositories = []
    
        for article in soup.select('article.Box-row'):
            repo_info = {}
            
            repo_info['name'] = article.select_one('h2 a').text.strip().replace("\n", "").replace(" ", "")
            repo_info['url'] = "https://github.com" + article.select_one('h2 a')['href'].strip()
    
            # Description
            description_element = article.select_one('p')
            repo_info['description'] = description_element.text.strip() if description_element else None
    
            # Language
            language_element = article.select_one('span[itemprop="programmingLanguage"]')
            repo_info['language'] = language_element.text.strip() if language_element else None
    
            # Stars and Forks
            stars_element = article.select('a.Link--muted')[0]
            forks_element = article.select('a.Link--muted')[1]
            repo_info['stars'] = stars_element.text.strip()
            repo_info['forks'] = forks_element.text.strip()
    
            # Today's Stars
            today_stars_element = article.select_one('span.d-inline-block.float-sm-right')
            repo_info['today_stars'] = today_stars_element.text.strip() if today_stars_element else None
    
            repositories.append(repo_info)
    
        return repositories

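With the crawling Action implemented, the analysis Action comes next; for that one we only need to write a prompt.

We then put both Actions into one file (this version imports the configuration from metagpt.config2, as newer MetaGPT releases appear to do)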

from typing import Any

import aiohttp
from bs4 import BeautifulSoup
from metagpt.actions.action import Action
from metagpt.config2 import config


TRENDING_ANALYSIS_PROMPT = """# Requirements
You are a GitHub Trending Analyst, aiming to provide users with insightful and personalized recommendations based on the latest
GitHub Trends. Based on the context, fill in the following missing information, generate engaging and informative titles, 
ensuring users discover repositories aligned with their interests.

# The title about Today's GitHub Trending
## Today's Trends: Uncover the Hottest GitHub Projects Today! Explore the trending programming languages and discover key domains capturing developers' attention. From ** to **, witness the top projects like never before.
## The Trends Categories: Dive into Today's GitHub Trending Domains! Explore featured projects in domains such as ** and **. Get a quick overview of each project, including programming languages, stars, and more.
## Highlights of the List: Spotlight noteworthy projects on GitHub Trending, including new tools, innovative projects, and rapidly gaining popularity, focusing on delivering distinctive and attention-grabbing content for users.
---
# Format Example

```
# [Title]

## Today's Trends
Today, ** and ** continue to dominate as the most popular programming languages. Key areas of interest include **, ** and **.
The top popular projects are Project1 and Project2.

## The Trends Categories
1. Generative AI
    - [Project1](https://github/xx/project1): [detail of the project, such as star total and today, language, ...]
    - [Project2](https://github/xx/project2): ...
...

## Highlights of the List
1. [Project1](https://github/xx/project1): [provide specific reasons why this project is recommended].
...
```

---
# Github Trending
{trending}
"""

class AnalysisOSSTrending(Action):

    async def run(
            self,
            trending: Any
    ):
        return await self._aask(TRENDING_ANALYSIS_PROMPT.format(trending=trending))


class CrawlOSSTrending(Action):

    async def run(self, url: str = "https://github.com/trending"):
        async with aiohttp.ClientSession() as client:
            # Use the module-level config instance (as the complete code later does), not the Config class
            async with client.get(url, proxy=config.global_proxy) as response:
                response.raise_for_status()
                html = await response.text()
​
        soup = BeautifulSoup(html, 'html.parser')
​
        repositories = []
​
        for article in soup.select('article.Box-row'):
            repo_info = {}
​
            repo_info['name'] = article.select_one('h2 a').text.strip().replace("\n", "").replace(" ", "")
            repo_info['url'] = "https://github.com" + article.select_one('h2 a')['href'].strip()
​
            # Description
            description_element = article.select_one('p')
            repo_info['description'] = description_element.text.strip() if description_element else None
​
            # Language
            language_element = article.select_one('span[itemprop="programmingLanguage"]')
            repo_info['language'] = language_element.text.strip() if language_element else None
​
            # Stars and Forks
            stars_element = article.select('a.Link--muted')[0]
            forks_element = article.select('a.Link--muted')[1]
            repo_info['stars'] = stars_element.text.strip()
            repo_info['forks'] = forks_element.text.strip()
​
            # Today's Stars
            today_stars_element = article.select_one('span.d-inline-block.float-sm-right')
            repo_info['today_stars'] = today_stars_element.text.strip() if today_stars_element else None
​
            repositories.append(repo_info)
​
        return repositories
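
It can help to look at what the analysis prompt actually contains once the crawl result is substituted in. The sketch below is an illustration only: `PROMPT_TEMPLATE` is a shortened stand-in for TRENDING_ANALYSIS_PROMPT, and `sample_repos` is made-up data in the same shape that CrawlOSSTrending.run() returns.

```python
import json

# Shortened stand-in for TRENDING_ANALYSIS_PROMPT; only the placeholder matters here.
PROMPT_TEMPLATE = """# Requirements
Summarize the repositories below.

# Github Trending
{trending}
"""

# Hypothetical sample in the same shape CrawlOSSTrending.run() returns.
sample_repos = [
    {
        "name": "example-org/example-repo",
        "url": "https://github.com/example-org/example-repo",
        "description": "An example project",
        "language": "Python",
        "stars": "1,234",
        "forks": "56",
        "today_stars": "78 stars today",
    }
]

# str.format() calls str() on the list, so the model sees a Python repr.
as_repr = PROMPT_TEMPLATE.format(trending=sample_repos)

# Serializing to JSON first is one alternative that keeps non-ASCII text readable.
as_json = PROMPT_TEMPLATE.format(
    trending=json.dumps(sample_repos, ensure_ascii=False, indent=2)
)

print(as_repr)
print(as_json)
```

Either form gives the model the same fields; which one reads better to a given LLM is something to check empirically rather than assume.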

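Implementing the Trigger

The Trigger determines when the OSSWatcher role runs, and the simplest approach is to trigger on a timer. GitHub Trending is not updated in real time. According to "Do you know when trending repositories on GitHub are usually updated? · community · Discussion #64295 · GitHub", it updates at around 10:00 AM UTC, although in practice the update time is not very punctual. We can therefore trigger once a day at a push time that suits us, for example 9 o'clock every morning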

import asyncio
import time
​
from datetime import datetime, timedelta
from metagpt.schema import Message
from pydantic import BaseModel, Field
​
​
class OssInfo(BaseModel):
    url: str
    timestamp: float = Field(default_factory=time.time)
​
​
async def oss_trigger(hour: int, minute: int, second: int = 0, url: str = "https://github.com/trending"):
    while True:
        now = datetime.now()
        next_time = datetime(now.year, now.month, now.day, hour, minute, second)
        if next_time < now:
            next_time = next_time + timedelta(1)
        wait = next_time - now
        print(wait.total_seconds())
        await asyncio.sleep(wait.total_seconds())
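        # Keyword arguments keep this working on versions where Message only accepts content positionally
        yield Message(content=url, instruct_content=OssInfo(url=url))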

The timing here can be implemented directly with asyncio.sleep
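
The wait-time calculation inside oss_trigger can be pulled out into a pure function, which makes it easy to check by hand. This is a minimal sketch; `seconds_until` is a hypothetical helper name, and it takes `now` as a parameter so the result does not depend on the real clock.

```python
from datetime import datetime, timedelta


def seconds_until(hour: int, minute: int, now: datetime, second: int = 0) -> float:
    """Seconds from `now` until the next occurrence of hour:minute:second."""
    next_time = now.replace(hour=hour, minute=minute, second=second, microsecond=0)
    if next_time <= now:
        # Today's slot has already passed, so schedule for tomorrow.
        next_time += timedelta(days=1)
    return (next_time - now).total_seconds()


# Before 09:00: wait until 09:00 the same day.
print(seconds_until(9, 0, datetime(2024, 5, 19, 8, 30)))  # 1800.0
# After 09:00: wait until 09:00 the next day.
print(seconds_until(9, 0, datetime(2024, 5, 19, 10, 0)))  # 82800.0
```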

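The yield statement makes oss_trigger an asynchronous generator that produces messages. Each time the consumer asks for the next value, the function sleeps until the scheduled time, yields a message, and resumes from that point on the following request. We also predefine an OssInfo structure that includes a timestamp, and pass an instance of it as the instruct_content attribute of the Message the trigger generates. The reason is that in earlier versions a role deduplicates the Messages it receives: if every Message carried only the URL, the role would not accept the new Message on the second run. With the timestamp added, no two Messages from the trigger are equal, so the role accepts each one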
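
The effect of the timestamp on deduplication can be illustrated with simplified stand-ins. This is not MetaGPT's implementation: `MessageSketch` and `OssInfoSketch` are hypothetical dataclasses, and a Python set plays the part of a memory that drops equal messages.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OssInfoSketch:
    url: str
    timestamp: float


@dataclass(frozen=True)
class MessageSketch:
    content: str
    instruct_content: Optional[OssInfoSketch] = None


url = "https://github.com/trending"

# Two triggers that carry only the URL produce equal messages,
# so a set-based memory keeps just one of them.
plain = {MessageSketch(url), MessageSketch(url)}

# A per-trigger timestamp makes each message distinct.
stamped = {
    MessageSketch(url, OssInfoSketch(url, timestamp=1716100000.0)),
    MessageSketch(url, OssInfoSketch(url, timestamp=1716186400.0)),
}

print(len(plain), len(stamped))  # 1 2
```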

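The simple example above can trigger on a daily schedule, but finer control would require more work on this function. Third-party packages can help here: crontab is a very common way to schedule jobs, and Python has an asynchronous cron tool, aiocron, which lets us define schedules directly in cron syntax. Above we implemented the timed Trigger as an async generator function; next we combine aiocron with a class to implement the timed Trigger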

import time
from aiocron import crontab
from typing import Optional
from pytz import BaseTzInfo
from pydantic import BaseModel, Field
from metagpt.schema import Message
​
class GithubTrendingCronTrigger:
    def __init__(
        self,
        spec: str,
        tz: Optional[BaseTzInfo] = None,
        url: str = "https://github.com/trending",
    ) -> None:
        self.crontab = crontab(spec, tz=tz)
        self.url = url
​
    def __aiter__(self):
        return self
​
    async def __anext__(self):
        await self.crontab.next()
        return Message(content=self.url)
​

With aiocron we write much less code and get more power, because cron syntax lets us configure the schedule very flexibly

If we want to trigger at 10:00 AM UTC

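# Create a GithubTrendingCronTrigger that fires every day at 10:00 AM UTC.
# tz is passed explicitly; without it the schedule is expected to follow the machine's local time zone.
from pytz import utc
cron_trigger = GithubTrendingCronTrigger("0 10 * * *", tz=utc)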

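To trigger the task at 8:00 AM Beijing time, two things are needed:

  1. Set the correct cron expression.

  2. Make sure the time zone is set correctly.

Beijing time is UTC+8, so pass the corresponding time zone through the tz parameter. A cron expression follows a fixed format, usually: minute, hour, day of month, month, day of week.

For 8:00 every morning, the cron expression is "0 8 * * *", meaning minute 0 of hour 8 every day.

The initialization of our GithubTrendingCronTrigger should therefore look like this: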

from pytz import timezone
beijing_tz = timezone('Asia/Shanghai')  # time zone for Beijing time
cron_trigger = GithubTrendingCronTrigger("0 8 * * *", tz=beijing_tz)
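
When picking a push time, it is worth converting between Beijing time and UTC to see where it falls relative to the approximate 10:00 UTC refresh mentioned earlier. The sketch below uses only the standard library, with a fixed UTC+8 offset as a stand-in for Asia/Shanghai (an assumption that holds because China does not currently observe daylight saving time).

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC+8 offset used as a stand-in for Asia/Shanghai.
BEIJING = timezone(timedelta(hours=8), name="UTC+8")

# 08:00 Beijing time expressed in UTC.
beijing_8am = datetime(2024, 5, 20, 8, 0, tzinfo=BEIJING)
as_utc = beijing_8am.astimezone(timezone.utc)
print(as_utc.isoformat())  # 2024-05-20T00:00:00+00:00

# The reported refresh time of about 10:00 UTC expressed in Beijing time.
refresh_utc = datetime(2024, 5, 20, 10, 0, tzinfo=timezone.utc)
refresh_beijing = refresh_utc.astimezone(BEIJING).strftime("%H:%M")
print(refresh_beijing)  # 18:00
```

If the refresh time is roughly right, a push at 08:00 Beijing time runs about ten hours before that day's refresh, so it would report the list from the previous refresh.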

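Exercise 1: Push Only After the List Has Updated

To push only when the list has changed, we fetch the current list on every scheduled trigger and compare it with the list fetched last time. If they differ, we push the update.

The steps are:

  1. Fetch the list on a schedule: keep using the existing timed trigger.

  2. Save the previous list: store the last fetched list in a variable or in persistent storage.

  3. Compare the lists: after each fetch, compare the new list with the previous one.

  4. Push the update: if the list has changed, generate and push a message.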

import aiohttp
import asyncio
import time
from aiocron import crontab
from bs4 import BeautifulSoup
from typing import Optional
from pytz import timezone
from pydantic import BaseModel, Field
from metagpt.schema import Message
​
​
class OssInfo(BaseModel):
    url: str
    timestamp: float = Field(default_factory=time.time)
​
​
class GithubTrendingCronTrigger:
    def __init__(
        self,
        spec: str,
        tz: Optional[timezone] = None,
        url: str = "https://github.com/trending",
    ) -> None:
        self.crontab = crontab(spec, tz=tz)
        self.url = url
        self.last_trending = None
​
    def __aiter__(self):
        return self
​
    async def __anext__(self):
        await self.crontab.next()
        current_trending = await self.fetch_github_trending()
        if self.has_trending_changed(current_trending):
            self.last_trending = current_trending
            return Message(content=self.url, instruct_content=OssInfo(url=self.url))
        return None
​
    async def fetch_github_trending(self):
        async with aiohttp.ClientSession() as session:
            async with session.get(self.url) as response:
                if response.status == 200:
                    return await response.text()
                else:
                    print(f"Failed to fetch {self.url}: {response.status}")
                    return None
​
    def has_trending_changed(self, current_trending):
        if self.last_trending is None:
            return True
        return current_trending != self.last_trending
​
​
async def main():
    beijing_tz = timezone('Asia/Shanghai')
    cron_trigger = GithubTrendingCronTrigger("0 8 * * *", tz=beijing_tz)
​
    async for message in cron_trigger:
        if message:
            print(f"New trending list detected: {message.content}")
            # Handle the message (e.g., send a notification)
​
​
if __name__ == "__main__":
    asyncio.run(main())
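
Comparing raw HTML may report a change even when the ranking itself is the same, since a page can differ between requests for unrelated reasons (for example, updated star counts). One alternative is to compare a fingerprint of the ordered repository URLs instead. This is a sketch under that assumption; `trending_fingerprint` and `TrendingChangeDetector` are hypothetical names, and the URL lists would come from a parser such as CrawlOSSTrending.

```python
import hashlib
from typing import Optional


def trending_fingerprint(repo_urls: list[str]) -> str:
    """Hash the ordered list of repository URLs on the trending page."""
    joined = "\n".join(repo_urls)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


class TrendingChangeDetector:
    def __init__(self) -> None:
        self.last_fingerprint: Optional[str] = None

    def has_changed(self, repo_urls: list[str]) -> bool:
        # Compare with the previous fingerprint, then remember the new one.
        fingerprint = trending_fingerprint(repo_urls)
        changed = fingerprint != self.last_fingerprint
        self.last_fingerprint = fingerprint
        return changed


detector = TrendingChangeDetector()
day1 = ["https://github.com/a/one", "https://github.com/b/two"]
day2 = ["https://github.com/b/two", "https://github.com/c/three"]

results = [detector.has_changed(day1), detector.has_changed(day1), detector.has_changed(day2)]
print(results)  # [True, False, True]
```

Because the URLs are joined in page order, a reshuffled ranking also counts as a change; sorting the list first would ignore pure reordering.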
 
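Exercise 2: Debugging Crontab Scheduling

Waiting for a scheduled task to fire just to see its effect is tedious while debugging. A few approaches make this easier:

  1. Shorten the wait: change the cron expression to a shorter interval, such as every minute or every five minutes.

  2. Trigger manually: during debugging, provide a manual trigger so the developer can run the task at any time.

  3. Simulate time: use a simulated clock and advance it quickly to fire the task.

The modified example below adds a manual trigger: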

import aiohttp
import asyncio
import time
from aiocron import crontab
from bs4 import BeautifulSoup
from typing import Optional
from pytz import timezone
from pydantic import BaseModel, Field
from metagpt.schema import Message
​
​
class OssInfo(BaseModel):
    url: str
    timestamp: float = Field(default_factory=time.time)
​
​
class GithubTrendingCronTrigger:
    def __init__(
        self,
        spec: str,
        tz: Optional[timezone] = None,
        url: str = "https://github.com/trending",
    ) -> None:
        self.crontab = crontab(spec, tz=tz)
        self.url = url
        self.last_trending = None
​
    def __aiter__(self):
        return self
​
    async def __anext__(self):
        await self.crontab.next()
        current_trending = await self.fetch_github_trending()
        if self.has_trending_changed(current_trending):
            self.last_trending = current_trending
            return Message(content=self.url, instruct_content=OssInfo(url=self.url))
        return None
​
    async def fetch_github_trending(self):
        async with aiohttp.ClientSession() as session:
            async with session.get(self.url) as response:
                if response.status == 200:
                    return await response.text()
                else:
                    print(f"Failed to fetch {self.url}: {response.status}")
                    return None
​
    def has_trending_changed(self, current_trending):
        if self.last_trending is None:
            return True
        return current_trending != self.last_trending
​
    async def trigger_now(self):
        current_trending = await self.fetch_github_trending()
        if self.has_trending_changed(current_trending):
            self.last_trending = current_trending
            return Message(content=self.url, instruct_content=OssInfo(url=self.url))
        return None
​
​
async def main():
    beijing_tz = timezone('Asia/Shanghai')
    cron_trigger = GithubTrendingCronTrigger("0 8 * * *", tz=beijing_tz)
​
    # For debugging, manually trigger
    message = await cron_trigger.trigger_now()
    if message:
        print(f"Manually triggered: {message.content}")
​
    async for message in cron_trigger:
        if message:
            print(f"New trending list detected: {message.content}")
            # Handle the message (e.g., send a notification)
​
​
if __name__ == "__main__":
    asyncio.run(main())

This way the task can be triggered manually at any time while debugging, making it quick to check that the code behaves as expected.
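
The third approach, simulating time, can be sketched by injecting the clock and the sleep function into the trigger. Everything here is hypothetical illustration code rather than part of aiocron or MetaGPT: `daily_trigger` is a simplified daily trigger and `FakeClock` advances instantly instead of really sleeping.

```python
import asyncio
from datetime import datetime, timedelta


async def daily_trigger(hour, now_fn, sleep_fn, url="https://github.com/trending"):
    # Same idea as a timed trigger, but the clock and the sleep are injected.
    while True:
        now = now_fn()
        next_time = now.replace(hour=hour, minute=0, second=0, microsecond=0)
        if next_time <= now:
            next_time += timedelta(days=1)
        await sleep_fn((next_time - now).total_seconds())
        yield url


class FakeClock:
    """A clock whose sleep() returns immediately and just moves time forward."""

    def __init__(self, start: datetime) -> None:
        self.current = start
        self.sleeps: list[float] = []

    def now_fn(self) -> datetime:
        return self.current

    async def sleep(self, seconds: float) -> None:
        self.sleeps.append(seconds)
        self.current += timedelta(seconds=seconds)


async def demo():
    clock = FakeClock(datetime(2024, 5, 19, 7, 30))
    trigger = daily_trigger(8, clock.now_fn, clock.sleep)
    fired = []
    async for _ in trigger:
        fired.append(clock.current)
        if len(fired) == 3:
            break
    await trigger.aclose()  # close the generator explicitly after breaking out
    return fired, clock.sleeps


fired, sleeps = asyncio.run(demo())
print(fired[-1], sleeps)
```

Three simulated days pass instantly, so the schedule logic can be checked in a unit test; in production the same trigger would receive `datetime.now` and `asyncio.sleep`.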

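Callback Design

The callback defines how the information generated by the agent is handled. The key question is how to deliver it to the applications we use every day. Here we take WeChat as the example and send the generated report to WeChat.

Sending Messages to WeChat

For delivery we use a third-party official-account push service, such as wxpusher

First we implement an asynchronous client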

import os
from typing import Optional
import aiohttp
​
​
class WxPusherClient:
    def __init__(self, token: Optional[str] = None, base_url: str = "http://wxpusher.zjiecode.com"):
        self.base_url = base_url
        self.token = token or os.environ["WXPUSHER_TOKEN"]
​
    async def send_message(
        self,
        content,
        summary: Optional[str] = None,
        content_type: int = 1,
        topic_ids: Optional[list[int]] = None,
        uids: Optional[list[str]] = None,  # WxPusher UIDs are strings such as "UID_..."
        verify: bool = False,
        url: Optional[str] = None,
    ):
        payload = {
            "appToken": self.token,
            "content": content,
            "summary": summary,
            "contentType": content_type,
            "topicIds": topic_ids or [],
            "uids": uids or os.environ["WXPUSHER_UIDS"].split(","),
            "verifyPay": verify,
            "url": url,
        }
        url = f"{self.base_url}/api/send/message"
        return await self._request("POST", url, json=payload)
​
    async def _request(self, method, url, **kwargs):
        async with aiohttp.ClientSession() as session:
            async with session.request(method, url, **kwargs) as response:
                response.raise_for_status()
                return await response.json()


Then we write a callback

async def wxpusher_callback(msg: Message):
    client = WxPusherClient()
    await client.send_message(msg.content, content_type=3)
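
Since a callback is just an async function that takes the message, several of them can be combined into one and run concurrently. The sketch below shows the idea with asyncio.gather; `combine_callbacks`, `to_console`, and `to_wechat` are hypothetical names, plain strings stand in for Message objects, and no network call is made.

```python
import asyncio
from typing import Awaitable, Callable

Callback = Callable[[str], Awaitable[None]]


def combine_callbacks(*callbacks: Callback) -> Callback:
    """Return one callback that forwards the message to all given callbacks."""

    async def combined(msg: str) -> None:
        await asyncio.gather(*(cb(msg) for cb in callbacks))

    return combined


log: list[str] = []


async def to_console(msg: str) -> None:
    log.append(f"console: {msg}")


async def to_wechat(msg: str) -> None:
    # Stand-in for wxpusher_callback; it only records the message.
    log.append(f"wechat: {msg}")


asyncio.run(combine_callbacks(to_console, to_wechat)("daily report"))
print(sorted(log))
```

With gather's default behaviour, an exception in one callback propagates to the caller; passing return_exceptions=True is an option when one failing channel should not block the others.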

After obtaining the API token, we also need the UIDs of the users who follow our application

Then we write the complete code

import asyncio
import os
from typing import Any, AsyncGenerator, Awaitable, Callable, Dict, Optional
​
import aiohttp
import discord
from aiocron import crontab
from bs4 import BeautifulSoup
from pydantic import BaseModel, Field
from pytz import BaseTzInfo
​
from metagpt.actions.action import Action
from metagpt.config2 import config
from metagpt.logs import logger
from metagpt.roles import Role
from metagpt.schema import Message
​
# fix SubscriptionRunner not fully defined
from metagpt.environment import Environment as _  # noqa: F401
​
​
# Subscription module. It can be imported with `from metagpt.subscription import SubscriptionRunner`; the code is reproduced here for reference
class SubscriptionRunner(BaseModel):
    """A simple wrapper to manage subscription tasks for different roles using asyncio.
    Example:
        >>> import asyncio
        >>> from metagpt.subscription import SubscriptionRunner
        >>> from metagpt.roles import Searcher
        >>> from metagpt.schema import Message
        >>> async def trigger():
        ...     while True:
        ...         yield Message("the latest news about OpenAI")
        ...         await asyncio.sleep(3600 * 24)
        >>> async def callback(msg: Message):
        ...     print(msg.content)
        >>> async def main():
        ...     pb = SubscriptionRunner()
        ...     await pb.subscribe(Searcher(), trigger(), callback)
        ...     await pb.run()
        >>> asyncio.run(main())
    """
​
    tasks: Dict[Role, asyncio.Task] = Field(default_factory=dict)
​
    class Config:
        arbitrary_types_allowed = True
​
    async def subscribe(
            self,
            role: Role,
            trigger: AsyncGenerator[Message, None],
            callback: Callable[
                [
                    Message,
                ],
                Awaitable[None],
            ],
    ):
        """Subscribes a role to a trigger and sets up a callback to be called with the role's response.
        Args:
            role: The role to subscribe.
            trigger: An asynchronous generator that yields Messages to be processed by the role.
            callback: An asynchronous function to be called with the response from the role.
        """
        loop = asyncio.get_running_loop()
​
        async def _start_role():
            async for msg in trigger:
                resp = await role.run(msg)
                await callback(resp)
​
        self.tasks[role] = loop.create_task(_start_role(), name=f"Subscription-{role}")
​
    async def unsubscribe(self, role: Role):
        """Unsubscribes a role from its trigger and cancels the associated task.
        Args:
            role: The role to unsubscribe.
        """
        task = self.tasks.pop(role)
        task.cancel()
​
    async def run(self, raise_exception: bool = True):
        """Runs all subscribed tasks and handles their completion or exception.
        Args:
            raise_exception: _description_. Defaults to True.
        Raises:
            task.exception: _description_
        """
        while True:
            for role, task in self.tasks.items():
                if task.done():
                    if task.exception():
                        if raise_exception:
                            raise task.exception()
                        logger.opt(exception=task.exception()).error(
                            f"Task {task.get_name()} run error"
                        )
                    else:
                        logger.warning(
                            f"Task {task.get_name()} has completed. "
                            "If this is unexpected behavior, please check the trigger function."
                        )
                    self.tasks.pop(role)
                    break
            else:
                await asyncio.sleep(1)
​
​
# Actions
TRENDING_ANALYSIS_PROMPT = """# Requirements
You are a GitHub Trending Analyst, aiming to provide users with insightful and personalized recommendations based on the latest
GitHub Trends. Based on the context, fill in the following missing information, generate engaging and informative titles, 
ensuring users discover repositories aligned with their interests.
​
# The title about Today's GitHub Trending
## Today's Trends: Uncover the Hottest GitHub Projects Today! Explore the trending programming languages and discover key domains capturing developers' attention. From ** to **, witness the top projects like never before.
## The Trends Categories: Dive into Today's GitHub Trending Domains! Explore featured projects in domains such as ** and **. Get a quick overview of each project, including programming languages, stars, and more.
## Highlights of the List: Spotlight noteworthy projects on GitHub Trending, including new tools, innovative projects, and rapidly gaining popularity, focusing on delivering distinctive and attention-grabbing content for users.
---
# Format Example
​
```
# [Title]
​
## Today's Trends
Today, ** and ** continue to dominate as the most popular programming languages. Key areas of interest include **, ** and **.
The top popular projects are Project1 and Project2.
​
## The Trends Categories
1. Generative AI
    - [Project1](https://github/xx/project1): [detail of the project, such as star total and today, language, ...]
    - [Project2](https://github/xx/project2): ...
...
​
## Highlights of the List
1. [Project1](https://github/xx/project1): [provide specific reasons why this project is recommended].
...
```
​
---
# Github Trending
{trending}
"""
​
​
class CrawlOSSTrending(Action):
    async def run(self, url: str = "https://github.com/trending"):
        async with aiohttp.ClientSession() as client:
            async with client.get(url, proxy=config.global_proxy) as response:
                response.raise_for_status()
                html = await response.text()
​
        soup = BeautifulSoup(html, "html.parser")
​
        repositories = []
​
        for article in soup.select("article.Box-row"):
            repo_info = {}
​
            repo_info["name"] = (
                article.select_one("h2 a")
                .text.strip()
                .replace("\n", "")
                .replace(" ", "")
            )
            repo_info["url"] = (
                    "https://github.com" + article.select_one("h2 a")["href"].strip()
            )
​
            # Description
            description_element = article.select_one("p")
            repo_info["description"] = (
                description_element.text.strip() if description_element else None
            )
​
            # Language
            language_element = article.select_one(
                'span[itemprop="programmingLanguage"]'
            )
            repo_info["language"] = (
                language_element.text.strip() if language_element else None
            )
​
            # Stars and Forks
            stars_element = article.select("a.Link--muted")[0]
            forks_element = article.select("a.Link--muted")[1]
            repo_info["stars"] = stars_element.text.strip()
            repo_info["forks"] = forks_element.text.strip()
​
            # Today's Stars
            today_stars_element = article.select_one(
                "span.d-inline-block.float-sm-right"
            )
            repo_info["today_stars"] = (
                today_stars_element.text.strip() if today_stars_element else None
            )
​
            repositories.append(repo_info)
​
        return repositories
​
​
class AnalysisOSSTrending(Action):
    async def run(self, trending: Any):
        return await self._aask(TRENDING_ANALYSIS_PROMPT.format(trending=trending))
​
​
# Role
class OssWatcher(Role):
    def __init__(
            self,
            name="Codey",
            profile="OssWatcher",
            goal="Generate an insightful GitHub Trending analysis report.",
            constraints="Only analyze based on the provided GitHub Trending data.",
    ):
        super().__init__(name=name, profile=profile, goal=goal, constraints=constraints)
        self.actions = [CrawlOSSTrending(), AnalysisOSSTrending()]
        self._set_react_mode(react_mode="by_order")
​
    async def _act(self) -> Message:
        logger.info(f"{self._setting}: ready to {self.rc.todo}")
        # Actions are chosen in order under the hood:
        # todo will be CrawlOSSTrending() first, then AnalysisOSSTrending()
        todo = self.rc.todo
​
        msg = self.get_memories(k=1)[0]  # find the most k recent messages
        result = await todo.run(msg.content)
​
        msg = Message(content=str(result), role=self.profile, cause_by=type(todo))
        self.rc.memory.add(msg)
        return msg
​
# Trigger
class GithubTrendingCronTrigger:
    def __init__(
            self,
            spec: str,
            tz: Optional[BaseTzInfo] = None,
            url: str = "https://github.com/trending",
    ) -> None:
        self.crontab = crontab(spec, tz=tz)
        self.url = url
​
    def __aiter__(self):
        return self
​
    async def __anext__(self):
        await self.crontab.next()
        return Message(content=self.url)
​
​
# callback
async def discord_callback(msg: Message):
    intents = discord.Intents.default()
    intents.message_content = True
    client = discord.Client(intents=intents, proxy=config.global_proxy)
    token = os.environ["DISCORD_TOKEN"]
    channel_id = int(os.environ["DISCORD_CHANNEL_ID"])
    async with client:
        await client.login(token)
        channel = await client.fetch_channel(channel_id)
        lines = []
        for i in msg.content.splitlines():
            if i.startswith(("# ", "## ", "### ")):
                if lines:
                    await channel.send("\n".join(lines))
                    lines = []
            lines.append(i)
​
        if lines:
            await channel.send("\n".join(lines))
​
​
class WxPusherClient:
    def __init__(
            self,
            token: Optional[str] = None,
            base_url: str = "http://wxpusher.zjiecode.com",
    ):
        self.base_url = base_url
        self.token = token or os.environ["WXPUSHER_TOKEN"]
​
    async def send_message(
            self,
            content,
            summary: Optional[str] = None,
            content_type: int = 1,
            topic_ids: Optional[list[int]] = None,
            uids: Optional[list[str]] = None,  # WxPusher UIDs are strings such as "UID_..."
            verify: bool = False,
            url: Optional[str] = None,
    ):
        payload = {
            "appToken": self.token,
            "content": content,
            "summary": summary,
            "contentType": content_type,
            "topicIds": topic_ids or [],
            "uids": uids or os.environ["WXPUSHER_UIDS"].split(","),
            "verifyPay": verify,
            "url": url,
        }
        url = f"{self.base_url}/api/send/message"
        return await self._request("POST", url, json=payload)
​
    async def _request(self, method, url, **kwargs):
        async with aiohttp.ClientSession() as session:
            async with session.request(method, url, **kwargs) as response:
                response.raise_for_status()
                return await response.json()
​
​
async def wxpusher_callback(msg: Message):
    client = WxPusherClient()
    await client.send_message(msg.content, content_type=3)
​
​
# Entry point
async def main(spec: str = "0 9 * * *", discord: bool = True, wxpusher: bool = True):
    callbacks = []
    if discord:
        callbacks.append(discord_callback)
​
    if wxpusher:
        callbacks.append(wxpusher_callback)
​
    if not callbacks:
​
        async def _print(msg: Message):
            print(msg.content)
​
        callbacks.append(_print)
​
    async def callback(msg):
        await asyncio.gather(*(call(msg) for call in callbacks))
​
    runner = SubscriptionRunner()
    await runner.subscribe(OssWatcher(), GithubTrendingCronTrigger(spec), callback)
    await runner.run()
​
​
if __name__ == "__main__":
    import fire
​
    fire.Fire(main)
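
The Discord callback above does not send the report in one piece; it starts a new message at every Markdown heading, presumably to keep each message within Discord's per-message length limit. The splitting logic can be isolated and checked on its own. `split_by_heading` is a hypothetical helper that mirrors the loop in discord_callback without any Discord dependency.

```python
def split_by_heading(markdown: str) -> list[str]:
    """Split Markdown text into chunks that each start at a #, ## or ### heading."""
    chunks: list[str] = []
    lines: list[str] = []
    for line in markdown.splitlines():
        if line.startswith(("# ", "## ", "### ")) and lines:
            # A new heading begins: flush what has been collected so far.
            chunks.append("\n".join(lines))
            lines = []
        lines.append(line)
    if lines:
        chunks.append("\n".join(lines))
    return chunks


report = "# Title\nintro\n## Today's Trends\nPython leads.\n## Highlights\n1. Project1"
chunks = split_by_heading(report)
for chunk in chunks:
    print(repr(chunk))
```

A single section longer than the limit would still be sent as one message, so a further length-based split may be needed for very long reports.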

Then set the environment variables on the command line (the set syntax below is for the Windows command prompt)

set WXPUSHER_TOKEN=AT_V9j0GICEd5EYULoRGNHvYXbzPpZFmHZV
set WXPUSHER_UIDS=UID_CPIWGzB962vxr0DtC4cO88jd5ePW
​

On Linux, use export instead
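
Stray whitespace in these variables (for example a space after the equals sign) ends up inside the value, so it can be worth validating them once at startup. The following is a minimal sketch; `load_wxpusher_settings` is a hypothetical helper, and the values passed in the example are placeholders.

```python
import os
from typing import Mapping


def load_wxpusher_settings(env: Mapping[str, str] = os.environ) -> tuple[str, list[str]]:
    """Read the WxPusher token and UID list, trimming stray whitespace."""
    token = env.get("WXPUSHER_TOKEN", "").strip()
    uids = [uid.strip() for uid in env.get("WXPUSHER_UIDS", "").split(",") if uid.strip()]
    if not token:
        raise RuntimeError("WXPUSHER_TOKEN is not set")
    if not uids:
        raise RuntimeError("WXPUSHER_UIDS is not set")
    return token, uids


# Placeholder values for illustration only.
token, uids = load_wxpusher_settings(
    {"WXPUSHER_TOKEN": "AT_example", "WXPUSHER_UIDS": " UID_one, UID_two"}
)
print(token, uids)  # AT_example ['UID_one', 'UID_two']
```

Failing early with a clear message is easier to diagnose than a KeyError raised inside a callback hours after the program started.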
