Wikipedia API Python Tutorial

culing2941

Original article: https://www.thecrazyprogrammer.com/2018/05/wikipedia-api-python-tutorial.html

In this tutorial, I'll show you how to use the wikipedia package in Python to fetch information from a Wikipedia article. Let's see how to do it.

First, we have to install the wikipedia package. To install it, open your command prompt or terminal and type this command.

pip install wikipedia

That's all we have to do. Now we can fetch data from Wikipedia with a few lines of code.

To Get the Summary of an Article

import wikipedia
print(wikipedia.summary("google"))

This fetches the summary of the Google article from Wikipedia and prints it on the screen.
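A title does not always resolve to exactly one article, so it can help to guard the call. The following is a minimal sketch, assuming the package raises wikipedia.exceptions.DisambiguationError (carrying an options list) for ambiguous titles and wikipedia.exceptions.PageError for missing ones. The helper name safe_summary and its client parameter are hypothetical; client exists only so the function can be tried with a stand-in object instead of the real package.

```python
def safe_summary(title, sentences=0, client=None):
    """Return the summary text, or None if the title is ambiguous or missing.

    `client` is a hypothetical hook: pass nothing to use the real
    wikipedia package, or pass a stand-in object when experimenting.
    """
    if client is None:
        import wikipedia as client  # third-party package installed with pip

    try:
        return client.summary(title, sentences=sentences)
    except client.exceptions.DisambiguationError as err:
        # Assumption: err.options holds the candidate article titles.
        print("Ambiguous title; some candidates:", err.options[:5])
    except client.exceptions.PageError:
        print("No page found for:", title)
    return None


# Usage (needs network access and the wikipedia package):
#   text = safe_summary("google", sentences=1)
```

Returning None keeps the calling code simple: it only has to check whether a summary came back before printing it.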

To Get a Given Number of Sentences From the Summary of an Article

import wikipedia
print(wikipedia.summary("google", sentences=1))

Output:

Google LLC is an American multinational technology company that specializes in Internet-related services and products, which include online advertising technologies, search engine, cloud computing, software, and hardware.

In the same way, you can pass any number as the sentences parameter to get the number of sentences you want.

To Change the Language of the Article

import wikipedia
wikipedia.set_lang("fr")
print(wikipedia.summary("google", sentences=1))

Output:

Google (prononcé [ˈguːgəl]) est une entreprise américaine de services technologiques fondée en 1998 dans la Silicon Valley, en Californie, par Larry Page et Sergueï Brin, créateurs du moteur de recherche Google.

Here fr stands for French. You can use any other language code instead of fr to get the information in another language, provided Wikipedia has that article in the language you want.

To see the codes for other languages, open this link: https://www.loc.gov/standards/iso639-2/php/code_list.php. Wikipedia language prefixes are usually the two-letter codes (the ISO 639-1 column on that page), such as fr, rather than the three-letter ISO 639-2 codes.
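Before switching languages, it can be useful to check that the code is one the package knows about. This is a small sketch, assuming wikipedia.languages() returns a dictionary that maps language prefixes to language names. The helper name set_language_if_supported and its client parameter are hypothetical and only there to make the function easy to try with a stand-in object.

```python
def set_language_if_supported(code, client=None):
    """Switch the article language only if `code` is a known prefix.

    Returns True when the language was changed, False otherwise.
    `client` is a hypothetical hook for passing a stand-in object.
    """
    if client is None:
        import wikipedia as client  # third-party package installed with pip

    # Assumption: languages() returns a dict such as {"fr": "français", ...}.
    supported = client.languages()
    if code not in supported:
        return False
    client.set_lang(code)
    return True


# Usage (needs network access and the wikipedia package):
#   if set_language_if_supported("fr"):
#       print(wikipedia.summary("google", sentences=1))
```

Note that a supported language does not guarantee that a particular article exists in that language, so the lookup itself can still fail.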

Search to Get the Titles of the Articles

import wikipedia
print(wikipedia.search("google"))

Output:

['Google', 'Google+', 'Google Maps', 'Google Search', 'Google Translate', 'Google Chrome', '.google', 'Google Earth', 'Gmail', 'Google Scholar']

The search() method returns a list of the titles of articles that match the query; any of these titles can then be opened.

To Get the URL of the Article

import wikipedia
page = wikipedia.page("google")
print(page.url)

Output:

https://en.wikipedia.org/wiki/Google

First, wikipedia.page() fetches the article and returns a page object, which we store in the variable page. Then we can use its url property to get the link to the page.
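Searching and opening a page can be combined, so that the page is opened with a title Wikipedia actually returned. The sketch below assumes search() accepts a results keyword that limits the number of titles and that page() accepts auto_suggest=False to use the title as given. The helper name open_first_hit and its client parameter are hypothetical.

```python
def open_first_hit(query, limit=5, client=None):
    """Search for `query` and open the first matching article, if any.

    `client` is a hypothetical hook for passing a stand-in object;
    by default the real wikipedia package is used.
    """
    if client is None:
        import wikipedia as client  # third-party package installed with pip

    titles = client.search(query, results=limit)
    if not titles:
        return None
    # Use the exact title from the search results instead of letting
    # the library suggest a different one.
    return client.page(titles[0], auto_suggest=False)


# Usage (needs network access and the wikipedia package):
#   page = open_first_hit("google")
#   if page is not None:
#       print(page.title, page.url)
```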

To Get the Title of the Article

import wikipedia
page = wikipedia.page("google")
print(page.title)

Output:

Google

To Get the Complete Article

import wikipedia
page = wikipedia.page("google")
print(page.content)

The complete article, from start to end, will be printed on the screen.
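Because the full text can be very long, it may be handy to list its section headings first. This is only a sketch under an assumption: that page.content is plain text in which headings appear on their own lines as "== Title ==" (with more equals signs for deeper levels). The function name list_headings is hypothetical and works on any string, so it can be tried without a network connection.

```python
import re

# Assumption: headings look like "== History ==" or "=== Early years ===".
HEADING = re.compile(r"^(={2,})[ \t]*(.+?)[ \t]*\1[ \t]*$", re.MULTILINE)


def list_headings(content):
    """Return (level, title) pairs for every heading line in `content`.

    Level 1 corresponds to "== Title ==", level 2 to "=== Title ===", etc.
    """
    return [(len(m.group(1)) - 1, m.group(2)) for m in HEADING.finditer(content)]


sample = "Intro text.\n\n== History ==\nFounded.\n\n=== Early years ===\nMore.\n"
print(list_headings(sample))
```

With the real package, the same function could be called as list_headings(page.content).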

To Get the Images Included in the Article

import wikipedia
page = wikipedia.page("google")
print(page.images[0])

Output:

https://upload.wikimedia.org/wikipedia/commons/1/1d/20_colleges_with_the_most_alumni_at_Google.png

This returns the URL of the image at index 0. To fetch another image, use index 1, 2, 3, and so on, depending on how many images the article contains.

If you want to download the image to a local directory instead of printing its URL, you can use urllib. Here's a program that downloads an image from the link.

import urllib.request
import wikipedia

page = wikipedia.page("Google")
image_link = page.images[0]  # URL of the image at index 0
# Download the image and save it under the given local file name
urllib.request.urlretrieve(image_link, "local-filename.jpg")

The image at index 0 will be saved as local-filename.jpg in the current working directory, which is the program's own directory if you run it from there. The above program works with Python 3.x; if you're using Python 2.x, see the program below.

import urllib
import wikipedia

page = wikipedia.page("Google")
image_link = page.images[0]  # URL of the image at index 0
# In Python 2.x, urlretrieve lives directly in the urllib module
urllib.urlretrieve(image_link, "local-filename.jpg")
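The programs above always save the file as local-filename.jpg, even though the image at index 0 in the earlier output is a .png file. One way to keep the original name and extension is to derive the file name from the URL. The following Python 3 sketch uses only the standard library; the function names filename_from_url and download_image are hypothetical helpers, not part of the wikipedia package.

```python
import os
import posixpath
from urllib.parse import unquote, urlparse
from urllib.request import urlretrieve


def filename_from_url(url, default="image"):
    """Return the last path component of `url`, or `default` if it is empty."""
    path = urlparse(url).path
    name = unquote(posixpath.basename(path))
    return name or default


def download_image(url, directory="."):
    """Download `url` into `directory`, keeping the file name from the URL."""
    target = os.path.join(directory, filename_from_url(url))
    urlretrieve(url, target)  # needs network access
    return target


link = "https://upload.wikimedia.org/wikipedia/commons/1/1d/20_colleges_with_the_most_alumni_at_Google.png"
print(filename_from_url(link))
```

With the wikipedia package, download_image(page.images[0]) would then save the file under its own name in the current working directory.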

That's all for this article. For more information, please visit https://pypi.org/project/wikipedia/

If you have any problems or suggestions related to the Wikipedia Python API, please comment below.
