Challenge Description
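Every GitHub repository has an Issues page by default. Issues act as the repository's tracking system: developers can file feature requests there, and users can report the bugs they find. For example, the well-known data analysis library Pandas is hosted on GitHub at the following addresses: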
- Repository: https://github.com/pandas-dev/pandas
- Issues page: https://github.com/pandas-dev/pandas/issues
- Issues API: https://api.github.com/repos/pandas-dev/pandas/issues
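The Issues API returns data in JSON format. By default, a single request returns one page holding at most the 30 most recent entries from the Issues page. A single entry looks like this: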
{
  "url": "https://api.github.com/repos/pandas-dev/pandas/issues/22658",
  "repository_url": "https://api.github.com/repos/pandas-dev/pandas",
  "labels_url": "https://api.github.com/repos/pandas-dev/pandas/issues/22658/labels{/name}",
  "comments_url": "https://api.github.com/repos/pandas-dev/pandas/issues/22658/comments",
  "events_url": "https://api.github.com/repos/pandas-dev/pandas/issues/22658/events",
  "html_url": "https://github.com/pandas-dev/pandas/pull/22658",
  "id": 358602608,
  "node_id": "MDExOlB1bGxSZXF1ZXN0MjE0Mjk4MzQ0",
  "number": 22658,
  "title": "DOC iteritems docstring update and examples",
  "user": {
    "login": "Ecboxer",
    "id": 20912214,
    "node_id": "MDQ6VXNlcjIwOTEyMjE0",
    "avatar_url": "https://avatars3.githubusercontent.com/u/20912214?v=4",
    "gravatar_id": "",
    "url": "https://api.github.com/users/Ecboxer",
    "html_url": "https://github.com/Ecboxer",
    "followers_url": "https://api.github.com/users/Ecboxer/followers",
    "following_url": "https://api.github.com/users/Ecboxer/following{/other_user}",
    "gists_url": "https://api.github.com/users/Ecboxer/gists{/gist_id}",
    "starred_url": "https://api.github.com/users/Ecboxer/starred{/owner}{/repo}",
    "subscriptions_url": "https://api.github.com/users/Ecboxer/subscriptions",
    "organizations_url": "https://api.github.com/users/Ecboxer/orgs",
    "repos_url": "https://api.github.com/users/Ecboxer/repos",
    "events_url": "https://api.github.com/users/Ecboxer/events{/privacy}",
    "received_events_url": "https://api.github.com/users/Ecboxer/received_events",
    "type": "User",
    "site_admin": false
  },
  "labels": [],
  "state": "open",
  "locked": false,
  "assignee": null,
  "assignees": [],
  "milestone": null,
  "comments": 2,
  "created_at": "2018-09-10T12:34:25Z",
  "updated_at": "2018-09-10T17:10:02Z",
  "closed_at": null,
  "author_association": "NONE",
  "pull_request": {
    "url": "https://api.github.com/repos/pandas-dev/pandas/pulls/22658",
    "html_url": "https://github.com/pandas-dev/pandas/pull/22658",
    "diff_url": "https://github.com/pandas-dev/pandas/pull/22658.diff",
    "patch_url": "https://github.com/pandas-dev/pandas/pull/22658.patch"
  },
  "body": "Updated iteritems docstring to start with an infinitive and added a short example\r\n\r\n- [ ] closes #xxxx\r\n- [ ] tests added / passed\r\n- [ ] passes `git diff upstream/master -u -- \"*.py\" | flake8 --diff`\r\n- [ ] whatsnew entry\r\n"
},
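The 30-entry limit mentioned above is the API's default page size. As a minimal sketch, the request URL can be built with GitHub's `per_page` (up to 100) and `page` query parameters. The helper name `issues_url` and the constant `API_ROOT` are hypothetical and not part of the challenge; the challenge itself only needs the plain Issues API address.

```python
from urllib.parse import urlencode

API_ROOT = "https://api.github.com"  # hypothetical constant, for illustration only


def issues_url(repo, per_page=30, page=1, state="open"):
    """Build the Issues API URL for a repository such as "pandas-dev/pandas".

    per_page and page control pagination; state selects open/closed/all issues.
    """
    query = urlencode({"state": state, "per_page": per_page, "page": page})
    return f"{API_ROOT}/repos/{repo}/issues?{query}"


# Ask for a larger page than the default 30 entries.
print(issues_url("pandas-dev/pandas", per_page=100))
```

Calling the helper with only the repository name reproduces the default behaviour: the first page of up to 30 open entries.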
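In this challenge, you need to write a function `issues` in the file `~/Code/github_data.py`. The `issues` function takes one parameter, `repo`, which specifies the name of the repository (for example, the name of the Pandas repository is `pandas-dev/pandas`).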
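Complete the `issues` function so that it fetches the most recent issue entries of the named repository. The number of entries is whatever the Issues API returns and is not necessarily 30. Then convert the JSON into a DataFrame and return it from `issues` as `issues_df`. The DataFrame must have the following layout (first 3 rows shown):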
| | number | title | user_name |
|---|---|---|---|
| 0 | 22658 | DOC iteritems docstring update and examples | Ecboxer |
| 1 | 22657 | DOC: Follows ISO 639-1 code | KangYoosam |
| 2 | 22655 | BUG: Column Offset with to_html(index=False) w... | simonjayhawkins |
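Where:
- `number`: the issue number, taken from the `number` field of the sample JSON.
- `title`: the issue title, taken from the `title` field of the sample JSON.
- `user_name`: the username of the person who submitted the issue, taken from the `user.login` field of the sample JSON.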
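The field mapping above can be sketched as follows. The helper names `to_rows` and `to_frame` and the `COLUMNS` constant are illustrative assumptions, and the two sample items are trimmed down from the sample JSON and the table above. Passing `columns=` to `pd.DataFrame` pins the column order.

```python
COLUMNS = ["number", "title", "user_name"]  # required column order


def to_rows(items):
    """Pick the three required fields from each issue dict (hypothetical helper)."""
    return [
        {
            "number": item["number"],
            "title": item["title"],
            "user_name": item["user"]["login"],  # nested field user.login
        }
        for item in items
    ]


def to_frame(rows):
    """Build the DataFrame; pandas is imported here so to_rows needs no dependencies."""
    import pandas as pd

    return pd.DataFrame(rows, columns=COLUMNS)


# Trimmed sample items: only the fields the mapping reads.
sample_items = [
    {"number": 22658, "title": "DOC iteritems docstring update and examples",
     "user": {"login": "Ecboxer"}},
    {"number": 22657, "title": "DOC: Follows ISO 639-1 code",
     "user": {"login": "KangYoosam"}},
]
rows = to_rows(sample_items)
print(rows[0]["user_name"])  # Ecboxer
```

Calling `to_frame(rows)` afterwards yields a DataFrame whose columns are `number`, `title`, and `user_name`, in that order.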
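Challenge Requirements
- The code must be written to the file `~/Code/github_data.py`.
- The function must be named `issues` and must return `issues_df`.
- When testing, run `github_data.py` with `/home/shiyanlou/anaconda3/bin/python` to avoid missing-module errors.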
Reference Solution
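import requests
import pandas as pd


def issues(repo):
    """Return the latest issues of `repo` (e.g. "pandas-dev/pandas") as a DataFrame."""
    url = "https://api.github.com/repos/{}/issues".format(repo)
    # The timeout keeps the call from hanging; raise_for_status surfaces HTTP errors.
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    # Keep only the three required fields of each returned entry.
    issues_list = []
    for issue in response.json():
        issues_dict = {'number': issue['number'],
                       'title': issue['title'],
                       'user_name': issue['user']['login']}
        issues_list.append(issues_dict)
    # Pin the column order to the required layout.
    issues_df = pd.DataFrame(issues_list, columns=['number', 'title', 'user_name'])
    return issues_df


if __name__ == "__main__":
    print(issues("numpy/numpy"))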
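When a request fails (for example, because the repository name is wrong), GitHub responds with a JSON object containing a `message` field rather than a list of entries. A guard on the parsed payload makes that case explicit. This is a minimal sketch; the helper name `ensure_issue_list` is hypothetical and not part of the challenge.

```python
def ensure_issue_list(payload):
    """Return payload unchanged if it is a list of issues, otherwise raise ValueError."""
    if isinstance(payload, list):
        return payload
    # Error responses are JSON objects that carry a human-readable "message".
    if isinstance(payload, dict):
        message = payload.get("message", "unexpected response")
    else:
        message = "unexpected response"
    raise ValueError("GitHub API did not return a list of issues: " + str(message))


# A well-formed payload passes through untouched.
print(len(ensure_issue_list([{"number": 1}])))

# An error payload is reported with GitHub's own message.
try:
    ensure_issue_list({"message": "Not Found"})
except ValueError as err:
    print(err)
```

Inside `issues`, such a guard would wrap the parsed JSON before the loop that extracts the fields.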