An Easy Python Wrapper for YouTube Data API 3.0

YouTube is one of the main sources of education, entertainment, advertisement, and much more. It holds a huge amount of data that a data scientist can use to run interesting projects or build products. Whether you are a novice or an expert data scientist, you have definitely heard about sentiment analysis, one of the main applications of natural language processing. Sentiment analysis is used, for example, to monitor social media or customer reviews.

When you search online, you can easily find several datasets for sentiment analysis built from Amazon product reviews or IMDB movie reviews. However, there are not many API services that let you work with online data. Several weeks ago, I decided to run a sentiment analysis project on YouTube video comments.

The YouTube Data API service can be a bit confusing for inexperienced data scientists. That is why I decided to write a user-friendly Python wrapper to expedite development in the data science community.

I configured everything and conducted my experiments. Fortunately, Google has introduced a powerful API to search for YouTube videos matching specific search criteria. Nevertheless, I found that their data service can be a bit confusing for inexperienced data scientists. That is why I decided to write a user-friendly Python wrapper named youtube-easy-api for the YouTube Data API. This module helps the community run more interesting data science projects faster. In this article, I want to show you how to use this module. I hope you enjoy it.

Get Your Google Credentials

First, you must set up your credentials before you can use this module. If you already have a Google API key, you can skip this section; otherwise, check out the video below. You must pass the API_KEY when you initialize the youtube-easy-api module.
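To keep the key out of your source code, one option is to read it from an environment variable. The snippet below is only a sketch of that idea: the helper name load_api_key and the variable name YOUTUBE_API_KEY are my own assumptions for illustration and are not part of youtube-easy-api.

```python
import os


def load_api_key(env_var="YOUTUBE_API_KEY"):
    """Return the API key stored in an environment variable.

    Both this helper and the default variable name are hypothetical
    choices for this sketch; they are not part of youtube-easy-api.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return api_key
```

With this in place, API_KEY = load_api_key() gives you the value to pass at the initialization step.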

Install the module from PyPI server

First, install the prerequisite libraries developed by Google from the PyPI server: google-api-python-client, google-auth-oauthlib, and google. Then, install the youtube-easy-api module with the pip command, just like the libraries above.

pip3 install youtube-easy-api
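After installing, a quick way to confirm that everything is importable is to look up the modules without actually importing them. This is a small sketch, not an official check. I am assuming the import names googleapiclient and google_auth_oauthlib for the two Google libraries, and youtube_easy_api for this module (the last one matches the import used later in this article).

```python
from importlib.util import find_spec

# Import names (not PyPI names) that the installed packages are expected
# to provide. Treat this list as an assumption made for this sketch.
REQUIRED_MODULES = ("googleapiclient", "google_auth_oauthlib", "youtube_easy_api")


def missing_modules(names=REQUIRED_MODULES):
    """Return the import names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    print(missing_modules() or "All required modules are importable.")
```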

Now, you are ready to use the module …

The youtube-easy-api module currently supports two methods: search_videos and get_metadata.

- How to search among YouTube videos

You can search all YouTube videos for a keyword using the search_videos method. This method takes a search_keyword and returns an ordered list of dictionaries, each of which contains video_id, title, and channel. You can find an example below. The API_KEY must be passed at the initialization step.

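from youtube_easy_api.easy_wrapper import YoutubeEasyWrapper
API_KEY = 'YOUR_API_KEY'  # placeholder: use your own Google API key
easy_wrapper = YoutubeEasyWrapper()
easy_wrapper.initialize(api_key=API_KEY)
# Search for videos matching the keyword, sorted by relevance
results = easy_wrapper.search_videos(search_keyword='python',
                                     order='relevance')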

The order parameter specifies the sorting method used in the API response. The default value is relevance (i.e., sorted by relevance to the search query). According to the original API documentation, the other acceptable values are as follows.

  • date: sorted in reverse chronological order (by date of creation)

  • rating: sorted from highest to lowest rating

  • viewCount: sorted from highest to lowest number of views
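Since these values are case-sensitive strings, a typo such as viewcount is easy to make. The helper below is a hypothetical convenience, not part of youtube-easy-api, that checks a value against the options named in this article before you spend an API call on it. The underlying API may accept further values that are not listed here.

```python
# Values named in this article; the underlying API may accept others.
ALLOWED_ORDERS = ("relevance", "date", "rating", "viewCount")


def check_order(order="relevance"):
    """Return `order` unchanged if it is one of the values listed above."""
    if order not in ALLOWED_ORDERS:
        raise ValueError(
            f"Unknown order {order!r}; expected one of {', '.join(ALLOWED_ORDERS)}"
        )
    return order
```

You could then call search_videos with order=check_order('viewCount').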

As mentioned, the search_videos method returns a list of dictionaries. With the call above, the first element of results is as follows.

results[0]['video_id'] = 'rfscVS0vtbw'
results[0]['title'] = 'Learn Python - Full Course for Beginners ...[Tutorial]'
results[0]['channel'] = 'freeCodeCamp.org'
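Because each result carries a video_id, you can turn the list into clickable links. This is a minimal sketch that assumes every dictionary has the video_id and title keys shown above and uses the standard watch URL pattern. The function name to_watch_urls is my own and is not part of the module.

```python
def to_watch_urls(results):
    """Map search results to dictionaries holding a title and a watch URL."""
    return [
        {
            "title": item["title"],
            "url": f"https://www.youtube.com/watch?v={item['video_id']}",
        }
        for item in results
    ]


if __name__ == "__main__":
    # Sample data shaped like the search result shown in this article.
    sample = [{"video_id": "rfscVS0vtbw",
               "title": "Learn Python - Full Course for Beginners ...[Tutorial]",
               "channel": "freeCodeCamp.org"}]
    print(to_watch_urls(sample)[0]["url"])
```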

- How to extract the metadata of a YouTube video

When you have the video_id, you can extract all the relevant metadata, including title, comments, and stats. Note that the video_id also appears in the video URL. So, you can retrieve the video_id using the search_videos method, a web scraping tool, or manual selection. You can find an example below. The API_KEY must be passed at the initialization step.
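If you start from a URL rather than a search result, the ID can be pulled out with the standard library. The sketch below handles the two URL shapes I am confident about (youtube.com/watch?v=... and youtu.be/...) and returns None for anything else. The function name extract_video_id is hypothetical and is not part of youtube-easy-api.

```python
from urllib.parse import parse_qs, urlparse


def extract_video_id(url):
    """Return the video ID from a YouTube URL, or None if none is found."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links keep the ID in the path: https://youtu.be/<video_id>
        return parsed.path.lstrip("/") or None
    if host == "youtube.com" or host.endswith(".youtube.com"):
        # Watch links keep the ID in the "v" query parameter.
        values = parse_qs(parsed.query).get("v")
        return values[0] if values else None
    return None
```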

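from youtube_easy_api.easy_wrapper import YoutubeEasyWrapper
API_KEY = 'YOUR_API_KEY'  # placeholder: use your own Google API key
VIDEO_ID = 'rfscVS0vtbw'  # example ID taken from the search result above
easy_wrapper = YoutubeEasyWrapper()
easy_wrapper.initialize(api_key=API_KEY)
metadata = easy_wrapper.get_metadata(video_id=VIDEO_ID)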

In the end, you can extract all the YouTube video metadata, such as title, description, statistics, contentDetails, and comments, from the metadata dictionary, which is the output of the get_metadata method. You can find an example below.

metadata = easy_wrapper.get_metadata(video_id='f3lUEnMaiAU')
print(metadata['comments'][0])
# Output:
# Jack ma is like your drunk uncle trying to teach you a life lesson. Elon musk is like a robot trying to imitate a human
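Before feeding comments into a sentiment model, it usually helps to tidy them up. The sketch below assumes metadata['comments'] is a list of strings, as the example above suggests. It collapses whitespace, drops empty entries, and removes exact duplicates. The helper clean_comments is my own illustration and not part of the module.

```python
def clean_comments(metadata):
    """Return unique, non-empty comment strings in their original order."""
    seen = set()
    cleaned = []
    for comment in metadata.get("comments", []):
        text = " ".join(str(comment).split())  # collapse runs of whitespace
        if text and text not in seen:
            seen.add(text)
            cleaned.append(text)
    return cleaned
```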

Now, you have access to all the data that you need for a sentiment analysis project. One more thing: Google sets a daily limit on calls to this service. If you exceed that limit, you will encounter an error. You must then wait until the next day or subscribe to increase your daily limit.
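One way to stretch the daily limit is to avoid fetching the same data twice. The sketch below caches the result of any fetch function in a local JSON file. It assumes the returned data is JSON-serializable, which I have not verified for every field that get_metadata returns. The helper cached_call and the file name in the usage note are hypothetical.

```python
import json
from pathlib import Path


def cached_call(cache_path, fetch):
    """Return JSON-cached data if present; otherwise call fetch() and cache it."""
    path = Path(cache_path)
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    data = fetch()  # may raise an error if the daily limit is exhausted
    path.write_text(json.dumps(data, ensure_ascii=False), encoding="utf-8")
    return data
```

For example, cached_call('f3lUEnMaiAU.json', lambda: easy_wrapper.get_metadata(video_id='f3lUEnMaiAU')) only calls the API the first time.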

To build a machine learning (ML) product, you must train a large number of models with different parameter configurations. A data scientist or machine learning engineer must manage this complex process. If you want to learn more about how to manage it, I would recommend reading this article. Kudos 😊

Translated from: https://towardsdatascience.com/an-easy-python-wrapper-for-youtube-data-api-3-0-a0f1b9f4c964
