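First, a look at the result
Features:
How it works
GitHub lets you create a repository with the same name as your username; the `README.md` of that repository is shown automatically on your profile page.
GitHub Actions can run Python scripts and Linux commands when an event fires. The trigger can be a schedule, a star, a push, and so on.
Once the event fires, a Python crawler scrapes the Zhihu data and rewrites `README.md`.
Linux commands then commit and push the change, which updates the status on the profile.
In short, that is the whole flow.
Detailed steps:
Step 1: create the repository
Create a repository whose name is identical to your GitHub username. For example, my GitHub username is guofei9987, so the repository is also named guofei9987.
Add a new file `README.md` to the repository with the following content:
The text on this line will later be overwritten with the "Zhihu status"
Step 2: the Python script
It has two parts:
1) A crawler that scrapes your own Zhihu stats (upvotes / collections / likes)
2) Code that writes the scraped values into `README.md`
The crawler part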
from bs4 import BeautifulSoup
import requests
import re
import sys
handle = sys.argv[1]
token = sys.argv[2]
readmePath = sys.argv[3]
# %%
# https://github.com/egrcc/zhihu-python
headers = {
    'User-Agent': "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36",
    'Host': "www.zhihu.com",
    'Origin': "http://www.zhihu.com",
    'Pragma': "no-cache",
    'Referer': "http://www.zhihu.com/"
}
url = 'https://www.zhihu.com/people/guo-fei-16-12'
r = requests.get(url, headers=headers, verify=False)
soup = BeautifulSoup(r.content, "lxml")
# %%
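# The class names and indices below depend on Zhihu's page layout and may change.
stars = soup.find_all(name='div', attrs={'class': 'css-vurnku'})
print(stars[1].text)  # upvotes, likes, collections
follows = soup.find_all(name='strong', attrs={'class': 'NumberBoard-itemValue'})
print(follows[1].text)  # follower count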
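The index-based lookups above depend on Zhihu's current page layout, so it can help to validate the scraped text before using it. The sketch below is one possible way to do that. `parse_counts` and the sample string are hypothetical and are not part of the original script.

```python
import re


def parse_counts(text, expected):
    """Extract integers such as '1,234' from scraped text.

    Raises ValueError when the number of values found differs from
    `expected`, which usually means the page layout has changed.
    """
    found = re.findall(r"\d[\d,]*", text)
    if len(found) != expected:
        raise ValueError(
            f"expected {expected} numbers, found {len(found)}: {text!r}"
        )
    # Drop thousands separators before converting to int.
    return [int(value.replace(",", "")) for value in found]


# Hypothetical sample of the kind of text stars[1].text might contain.
sample = "获得 1,234 次赞同 获得 56 次喜欢,789 次收藏"
agree, like, collection = parse_counts(sample, expected=3)
print(agree, like, collection)
```

Failing loudly here means a layout change stops the workflow instead of silently writing a wrong status line into `README.md`.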
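Writing to `README.md`
agree, like, collection = re.findall('[0-9,]+', stars[1].text)
# Status line shown on the profile:
# "Received {} upvotes, {} likes, {} collections, {} followers"
zhihu = '获得{}次赞同,{}次喜欢,{}次收藏,{}个关注'.format(agree, like, collection, follows[1].text)

with open(readmePath, "r", encoding="utf-8") as readme:
    content = readme.read()

# NOTE: the marker strings inside the two lookarounds appear to have been lost
# from the published text. Fill in the marker pair that wraps the status line
# in README.md; with empty lookarounds the pattern matches the whole file.
newContent = re.sub(r"(?<=)[\s\S]*(?=)",
                    f"\n{zhihu}\n", content)

with open(readmePath, "w", encoding="utf-8") as readme:
    readme.write(newContent)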
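The substitution above is meant to replace only the region of `README.md` that sits between two marker strings. A minimal sketch of that idea follows, assuming a pair of HTML comment markers. The marker names `START` and `END` and the helper `replace_between_markers` are hypothetical; the markers actually used by the original script are not shown in the text.

```python
import re

# Hypothetical marker comments; use whatever pair you placed in README.md.
START = "<!--START_SECTION:zhihu-->"
END = "<!--END_SECTION:zhihu-->"


def replace_between_markers(content, new_text, start=START, end=END):
    """Replace only the text between the two markers, keeping the markers."""
    pattern = re.compile(
        f"(?<={re.escape(start)})[\\s\\S]*?(?={re.escape(end)})"
    )
    if not pattern.search(content):
        raise ValueError("marker pair not found in README content")
    # A function replacement avoids backslash-escape processing of new_text.
    return pattern.sub(lambda _match: f"\n{new_text}\n", content, count=1)


readme = f"# Hello\n{START}\nold status\n{END}\nfooter\n"
print(replace_between_markers(readme, "new status"))
```

Because the markers themselves are preserved, the function can run on every scheduled trigger and always finds the same region to overwrite.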
Step 3: GitHub Action
name: Get Top Followers

on:
  schedule:
    - cron: '0 20 * * *'
  watch:
    types: started

jobs:
  top-followers:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Setup python
        uses: actions/setup-python@v2
        with:
          python-version: 3.8
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
      - name: Update README
        run: |
          python src/getTopFollowers.py ${{ github.repository_owner }} ${{ secrets.GITHUB_TOKEN }} README.md
          python src/GetZhihuData.py ${{ github.repository_owner }} ${{ secrets.GITHUB_TOKEN }} README.md
      - name: Commit changes
        run: |
          git config --local user.email "41898282+github-actions[bot]@users.noreply.github.com"
          git config --local user.name "github-actions[bot]"
          git add -A
          git diff-index --quiet HEAD || git commit -m "Update top followers"
      - name: Pull changes
        run: git pull -r
      - name: Push changes
        uses: ad-m/github-push-action@master
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
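Notes
The schedule here fires every day at 20:00. GitHub Actions interprets cron schedules in UTC, so this means 20:00 UTC rather than a US local time. Other events, such as star or push, can also be used as triggers.
The push uses `secrets.GITHUB_TOKEN`, so the token never appears in plain text.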
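Since GitHub Actions evaluates `schedule` cron expressions in UTC, it is worth checking what the chosen hour means locally. The sketch below converts a daily UTC hour to China Standard Time using a fixed UTC+8 offset. The helper name `utc_hour_to_local` is hypothetical, and the reference date is arbitrary.

```python
from datetime import datetime, timedelta, timezone

# China Standard Time is a fixed UTC+8 offset with no daylight saving.
CST = timezone(timedelta(hours=8), name="CST")


def utc_hour_to_local(hour_utc, tz=CST):
    """Return (local_hour, day_shift) for a daily job scheduled at hour_utc."""
    reference = datetime(2024, 1, 1, hour_utc, tzinfo=timezone.utc)
    local = reference.astimezone(tz)
    day_shift = (local.date() - reference.date()).days
    return local.hour, day_shift


hour, shift = utc_hour_to_local(20)
print(f"20:00 UTC is {hour:02d}:00 local time, day offset {shift:+d}")
```

For the cron expression `'0 20 * * *'` this gives 04:00 on the following day in Beijing, so the update lands in the early morning local time.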
That completes all the steps. At the scheduled time, the Zhihu status is updated automatically.
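Extensions:
You can use HTML to make that line of text look much fancier.
You can scrape more content, such as individual answers.
You can even store the data incrementally instead of overwriting it, and then plot a trend chart.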
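The incremental-storage idea could look roughly like the following: each run appends one dated row to a CSV file kept in the repository, and a later step can read the rows back to plot a trend. This is only a sketch under assumptions; the file name `zhihu_history.csv`, the column names, and the helpers `append_snapshot` and `load_history` are hypothetical, and the numbers are made up.

```python
import csv
import os
import tempfile
from datetime import date

FIELDS = ["date", "agree", "like", "collection", "followers"]


def append_snapshot(path, day, agree, like, collection, followers):
    """Append one row per run; write the header only when the file is new."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({
            "date": day.isoformat(),
            "agree": agree,
            "like": like,
            "collection": collection,
            "followers": followers,
        })


def load_history(path):
    """Read all stored snapshots back as a list of dicts (values are strings)."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


# Demo with made-up numbers in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    history_path = os.path.join(tmp, "zhihu_history.csv")
    append_snapshot(history_path, date(2024, 1, 1), 100, 20, 30, 40)
    append_snapshot(history_path, date(2024, 1, 2), 105, 21, 30, 42)
    rows = load_history(history_path)
    print(len(rows), rows[-1]["agree"])
```

Because the workflow already runs `git add -A`, a history file written inside the repository would be committed together with `README.md`.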