[Lecture 3.3] Requesting Data from the Internet · Project

You’ll be getting and using data from two different APIs, processing that data step by step to produce a new result. You will need to read and understand the documentation provided with each API; the goal is to extract from it the information you need to make requests to each API.

Questions

This project takes you through the process of combining data from two different APIs to make movie recommendations.

  1. The TasteDive API lets you provide a movie (or bands, TV shows, etc.) as a query input, and returns a set of related items.
  2. The OMDB API lets you provide a movie title as a query input and get back data about the movie, including scores from various review sites (Rotten Tomatoes, IMDB, etc.).

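You will put those two together. You will use TasteDive to get related movies for a whole list of titles. You will combine the resulting lists of related movies and sort them according to their Rotten Tomatoes scores (which will require making calls to the OMDB API).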

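*To avoid problems with rate limits and site accessibility, we have provided a cache file with results for all the queries you need to make to both OMDB and TasteDive.* Just use requests_with_caching.get() rather than requests.get().

If you are having trouble, you may not be formatting your queries properly, or you may be asking for data that does not exist in our cache. We will try to provide as much information as we can to help guide you toward queries for which data exists in the cache.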

Your first task will be to fetch data from TasteDive. The documentation for the API is at https://tastedive.com/read/api.

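Define a function called get_movies_from_tastedive(). It should take one input parameter, a string that is the name of a movie or music artist. The function should return the 5 TasteDive results that are associated with that string; be sure to only get movies, not other kinds of media. It will be a Python dictionary with just one key, ‘Similar’.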

Try invoking your function with the input “Black Panther”.

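Hint: be sure to include only q, type, and limit as parameters in order to extract data from the cache. If any other parameters are included, the function will not be able to recognize the data that you are attempting to pull from the cache. Remember, you will not need an API key to complete the project, because all data will be found in the cache.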
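To see why the exact set of parameters matters, it can help to look at how a parameter dictionary turns into a query string. The sketch below uses the standard library's urllib.parse.urlencode as a stand-in for what happens to the params argument; the helper name build_query_url is hypothetical and exists only for this illustration.

```python
from urllib.parse import urlencode

def build_query_url(baseurl, params):
    """Join a base URL and a parameter dict into one request URL (illustrative helper)."""
    return baseurl + "?" + urlencode(params)

# Only q, type, and limit are included, as the hint requires.
params = {"q": "Black Panther", "type": "movies", "limit": 5}
url = build_query_url("https://tastedive.com/api/similar", params)
print(url)
```

Adding any extra key to params changes the resulting URL, which is one plausible reason a cache keyed on the request would no longer find a match.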

The cache contains data for the following queries:

[Images: three screenshots listing the cached queries; the image upload failed and they are not available here]

The returned object contains, under the Similar key, the item(s) that were searched for (a list in the Info key) and the recommended items (a list in the Results key). Each item in a list has the Name and Type keys. The type can be music, movie, show, book, author or game.
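As a rough illustration of that structure, the snippet below walks a small hand-written dictionary shaped like the description above. The sample data is an assumption made up for this sketch, not a live API response.

```python
# Hand-written sample in the shape described above (not fetched from the API).
sample = {
    "Similar": {
        "Info": [{"Name": "Black Panther", "Type": "movie"}],
        "Results": [
            {"Name": "Captain Marvel", "Type": "movie"},
            {"Name": "Deadpool 2", "Type": "movie"},
        ],
    }
}

# The item(s) that were searched for live under the Info key.
searched = [item["Name"] for item in sample["Similar"]["Info"]]

# The recommendations live under the Results key; keep only movies.
recommended = [
    item["Name"]
    for item in sample["Similar"]["Results"]
    if item["Type"] == "movie"
]

print(searched)
print(recommended)
```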

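In this case the structure is quite simple: a JSON-formatted dictionary whose lists contain dictionaries, each of which describes one item that satisfies the constraints in the query.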

Task 1


# some invocations that we use in the automated tests; uncomment these if you are getting errors and want better error messages
# get_movies_from_tastedive("Bridesmaids")
# get_movies_from_tastedive("Black Panther")


import requests_with_caching
import json

def get_movies_from_tastedive(movie_string):
    baseurl = 'https://tastedive.com/api/similar'
    
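    # Request parameters
    params_dict = {}
    params_dict['q'] = movie_string
    params_dict['type'] = 'movies'
    params_dict['limit'] = 5

    # The API returns JSON by default
    # The object returned by get() is a response object
    tastedive_resp = requests_with_caching.get(baseurl, params = params_dict)
    # Per the API documentation, the body can be parsed into a JSON object
    # print(type(tastedive_resp.json())) shows that it is a dict
    # print(tastedive_resp.url) shows the full URL; paste it into a browser address bar to see the dictionary (see below)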
    return tastedive_resp.json()
    
get_movies_from_tastedive("Black Panther")
    
https://tastedive.com/api/similar?q=Black+Panther&type=movies&limit=5

{"Similar": {"Info": [{"Name": "Black Panther", "Type": "movie"}], "Results": [{"Name": "Captain Marvel", "Type": "movie"}, {"Name": "Avengers: Infinity War", "Type": "movie"}, {"Name": "Ant-Man And The Wasp", "Type": "movie"}, {"Name": "Deadpool 2", "Type": "movie"}, {"Name": "Jumanji: Welcome To The Jungle", "Type": "movie"}]}}

Task 2

Please copy the completed function from above into this active code window. Next, you will need to write a function that extracts just the list of movie titles from a dictionary returned by get_movies_from_tastedive. Call it extract_movie_titles.

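Before writing it, I first checked the format of the returned JSON data (a dict), looked up its keys (there is only one), and printed the contents. I printed everything here; in general, if the data is large, it is better to print only the first 100 or so characters. Then I located the data I needed and confirmed the indexing.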
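The inspection steps above could be sketched roughly as follows. The function name summarize and the sample response are assumptions for illustration; the 100-character limit simply mirrors the rule of thumb mentioned above.

```python
import json

def summarize(data, max_chars=100):
    """Return the type name, the top-level keys, and a truncated JSON preview."""
    text = json.dumps(data)
    return type(data).__name__, list(data.keys()), text[:max_chars]

# Hand-written sample shaped like a TasteDive response (not fetched from the API).
response = {
    "Similar": {
        "Info": [{"Name": "Black Panther", "Type": "movie"}],
        "Results": [{"Name": "Captain Marvel", "Type": "movie"}],
    }
}

kind, keys, preview = summarize(response)
print(kind)     # type of the top-level object
print(keys)     # top-level keys
print(preview)  # at most max_chars characters of the JSON text
```

Once the top-level key is known, the same check can be repeated one level down (for example on response["Similar"]) until the list of titles is found.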


# some invocations that we use in the automated tests; uncomment these if you are getting errors and want better error messages
# extract_movie_titles(get_movies_from_tastedive("Tony Bennett"))
# extract_movie_titles(get_movies_from_tastedive("Black Panther"))
import requests_with_caching
import json

def get_movies_from_tastedive(movie_string):
    baseurl = 'https://tastedive.com/api/similar'
    
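    # Request parameters
    params_dict = {}
    params_dict['q'] = movie_string
    params_dict['type'] = 'movies'
    params_dict['limit'] = 5

    # The API returns JSON by default
    # The object returned by get() is a response object
    tastedive_resp = requests_with_caching.get(baseurl, params = params_dict)
    # Per the API documentation, the body can be parsed into a JSON object
    # print(type(tastedive_resp.json())) shows that it is a dict
    # print(tastedive_resp.url) shows the full URL; paste it into a browser address bar to see the dictionary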
    return tastedive_resp.json()

def extract_movie_titles(getmovie_respon):
    mov_lst = getmovie_respon['Similar']['Results']
    resp = [similar_mov['Name'] for similar_mov in mov_lst]
    #for similar_mov in mov_lst:
    #    similar_mov['Name']
    return resp


Task 3

Please copy the completed functions from the two code windows above into this active code window. Next, you’ll write a function, called get_related_titles. It takes a list of movie titles as input. It gets five related movies for each from TasteDive, extracts the titles for all of them, and combines them all into a single list. Don’t include the same movie twice.

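q: a series of (at least one) bands, movies, TV shows, podcasts, books, authors and/or games, separated by commas. Sometimes it is useful to specify the type of a resource in the query (for example, when a movie and a book share the same title). You can do this with the “band:”, “movie:”, “show:”, “podcast:”, “book:”, “author:” or “game:” operators, for example: “band:underworld, movie:harry potter, book:trainspotting”.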

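Tip: how do you remove duplicate elements from a list?

Convert the list to a set, then convert the set back to a list.

In Python 3, a set is an unordered collection of unique elements. You can create a set with braces { } or with the set() function. Note that an empty set must be created with set() rather than { }, because { } creates an empty dictionary.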
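A small sketch of the deduplication trick follows. The title list is made up for illustration. Because a set is unordered, the order of list(set(...)) is not guaranteed; the dict.fromkeys variant is shown as an alternative for cases where first-seen order matters, and is not something the assignment requires.

```python
titles = ["Venom", "Deadpool 2", "Venom", "Ant-Man And The Wasp", "Deadpool 2"]

# Set round trip: duplicates are dropped, but the order may change.
unique_unordered = list(set(titles))

# Alternative: dict keys are unique and keep insertion order.
unique_ordered = list(dict.fromkeys(titles))

# An empty set needs set(); empty braces create a dictionary.
empty_set = set()
empty_dict = {}

print(sorted(unique_unordered))
print(unique_ordered)
print(type(empty_set).__name__, type(empty_dict).__name__)
```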

Reference (append vs. extend): https://www.geeksforgeeks.org/append-extend-python/

def get_related_titles(movie_list):
    li = []
    for movie in movie_list:
        li.extend(extract_movie_titles(get_movies_from_tastedive(movie)))
    return list(set(li))

# some invocations that we use in the automated tests; uncomment these if you are getting errors and want better error messages
# get_related_titles(["Black Panther", "Captain Marvel"])
# get_related_titles([])

import requests_with_caching
import json

def get_movies_from_tastedive(movie_string):
    baseurl = 'https://tastedive.com/api/similar'
    
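    # Request parameters
    params_dict = {}
    params_dict['q'] = movie_string
    params_dict['type'] = 'movies'
    params_dict['limit'] = 5

    # The API returns JSON by default
    # The object returned by get() is a response object
    tastedive_resp = requests_with_caching.get(baseurl, params = params_dict)
    # Per the API documentation, the body can be parsed into a JSON object
    # print(type(tastedive_resp.json())) shows that it is a dict
    # print(tastedive_resp.url) shows the full URL; paste it into a browser address bar to see the dictionary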
    return tastedive_resp.json()

def extract_movie_titles(getmovie_respon):
    mov_lst = getmovie_respon['Similar']['Results']
    resp = [similar_mov['Name'] for similar_mov in mov_lst]
    #for similar_mov in mov_lst:
    #    similar_mov['Name']
    return resp

def get_related_titles(movie_list):
    lst = []
    for mov_tlt in movie_list:
        lst = lst + extract_movie_titles(get_movies_from_tastedive(mov_tlt))
    return list(set(lst))
    
get_related_titles(["Black Panther", "Captain Marvel"])

Task 4

Your next task will be to fetch data from OMDB. The documentation for the API is at https://www.omdbapi.com/

Define a function called get_movie_data.

It takes in one parameter which is a string that should represent the title of a movie you want to search. The function should return a dictionary with information about that movie.

Again, use requests_with_caching.get(). For the queries on movies that are already in the cache, you won’t need an api key.

You will need to provide the following keys: t and r. As with the TasteDive cache, be sure to only include those two parameters in order to extract existing data from the cache.


# some invocations that we use in the automated tests; uncomment these if you are getting errors and want better error messages
# get_movie_data("Venom")
# get_movie_data("Baby Mama")
import json
import requests_with_caching

def get_movie_data(mov_tit):
    baseurl = 'http://www.omdbapi.com/'
    params_dict = {}
    params_dict['t'] = mov_tit   # Movie title to search for.
    params_dict['r'] = 'json'
    
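    # The object returned by get() is a response object
    info_dict_resp = requests_with_caching.get(baseurl, params = params_dict)
    # The body can be parsed into a JSON object
    # print(type(info_dict_resp.json())) shows that it is a dict
    # print(info_dict_resp.url) shows the full URL; paste it into a browser address bar to see the dictionary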
    return info_dict_resp.json()  


    

Task 5

Please copy the completed function from above into this active code window.

Now write a function called get_movie_rating.

It takes an OMDB dictionary result for one movie and extracts the Rotten Tomatoes rating as an integer.

For example, if given the OMDB dictionary for “Black Panther”, it would return 97. If there is no Rotten Tomatoes rating, return 0.

Inspecting the returned dictionary gives its type and keys:

<class 'dict'>
['Type', 'Title', 'Year', 'Rated', 'Released', 'Runtime', 'Genre', 'Director', 'Writer', 'Actors', 'Plot', 'Language', 'Country', 'Awards', 'Poster', 'Ratings', 'Metascore', 'imdbRating', 'imdbVotes', 'imdbID', 'DVD', 'BoxOffice', 'Production', 'Website', 'Response']

Looking at the Ratings key gives:

[{'Source': 'Internet Movie Database', 'Value': '6.9/10'}, {'Source': 'Metacritic', 'Value': '35/100'}]

How to convert a percentage string to a number: https://www.jianshu.com/p/159cf6dc5b86

Handling a numeric string that ends with a percent sign:

>>> s='12%'
>>> float(s.rstrip('%'))/100
0.12
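For this project the rating is wanted as an integer (97 rather than 0.97), so the division step is not needed. A minimal sketch, assuming the value always looks like digits followed by a percent sign; the helper name percent_to_int is hypothetical.

```python
def percent_to_int(text):
    """Convert a string such as '97%' to the integer 97."""
    return int(text.rstrip('%'))

print(percent_to_int('97%'))
print(percent_to_int('83%'))
```

Slicing off the last character with text[:-1] gives the same result for these inputs; rstrip('%') has the small advantage of also accepting a string that has no percent sign.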

# some invocations that we use in the automated tests; uncomment these if you are getting errors and want better error messages
# get_movie_rating(get_movie_data("Deadpool 2"))

# some invocations that we use in the automated tests; uncomment these if you are getting errors and want better error messages
# get_movie_data("Venom")
# get_movie_data("Baby Mama")
import json
import requests_with_caching

def get_movie_data(mov_tit):
    baseurl = 'http://www.omdbapi.com/'
    params_dict = {}
    params_dict['t'] = mov_tit   # Movie title to search for.
    params_dict['r'] = 'json'
    
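    # The object returned by get() is a response object
    info_dict_resp = requests_with_caching.get(baseurl, params = params_dict)
    # The body can be parsed into a JSON object
    # print(type(info_dict_resp.json())) shows that it is a dict
    # print(info_dict_resp.url) shows the full URL; paste it into a browser address bar to see the dictionary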
    return info_dict_resp.json()  

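# Extract the rating from the info dictionary returned by get_movie_data() above
# [A mistake I got tangled up in] The input to this function is already a dictionary
# (the one returned by the previous function), so I can use it directly
def get_movie_rating(mov_data_dict):
    # mov_data_dict = get_movie_data(mov_tit)
    # explore dict['Ratings'][1]['Value']
    # NOTE: this assumes Rotten Tomatoes is the second entry, and it fails
    # when 'Ratings' has fewer than two items
    mov_OMDB = mov_data_dict['Ratings'][1]
    if mov_OMDB['Source'] == 'Rotten Tomatoes':
        mov_OMDB_rating = int(mov_OMDB['Value'].rstrip('%'))
        print(mov_OMDB_rating) 
    else:
        mov_OMDB_rating = 0
        print(mov_OMDB_rating)
    return mov_OMDB_rating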


# get_movie_rating(??): when the autograder checks this, the input is an already-returned dictionary

Task 6

Now, put it all together. Don’t forget to copy all of the functions that you have previously defined into this code window. Define a function get_sorted_recommendations.

It takes a list of movie titles as an input.

It returns a sorted list of related movie titles as output, up to five related movies for each input movie title.

The movies should be sorted in descending order by their Rotten Tomatoes rating, as returned by the get_movie_rating function.

Break ties in reverse alphabetic order, so that ‘Yahşi Batı’ comes before ‘Eyyvah Eyvah’.

[Gotcha]

Calling the first function, the one that looks up Rotten Tomatoes data, on the movie ‘Yahşi Batı’:

import json
import requests_with_caching

def get_movie_data(mov_tit):
    baseurl = 'http://www.omdbapi.com/'
    params_dict = {}
    params_dict['t'] = mov_tit   # Movie title to search for.
    params_dict['r'] = 'json'
    
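    # The object returned by get() is a response object
    info_dict_resp = requests_with_caching.get(baseurl, params = params_dict)
    # The body can be parsed into a JSON object
    ratings = info_dict_resp.json()['Ratings']
    print(ratings) # a list of dictionaries
    # print(info_dict_resp.url) shows the full URL; paste it into a browser address bar to see the dictionary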
    return info_dict_resp.json()  

get_movie_data("Yahşi Batı")
[{'Source': 'Internet Movie Database', 'Value': '7.3/10'}]

As a result, indexing position 1 of the Ratings list below goes out of range, so a check is needed.

# [{'Source': 'Internet Movie Database', 'Value': '7.8/10'}, {'Source': 'Rotten Tomatoes', 'Value': '83%'}, {'Source': 'Metacritic', 'Value': '66/100'}]
def get_movie_rating(data): 
    rating = 0 
    for i in data['Ratings']: 
        if i['Source'] == 'Rotten Tomatoes': 
            rating = int(i['Value'][:-1]) 
    #print(rating) 
    return rating
---
def get_movie_rating(mov_data_dict):
    for i in mov_data_dict['Ratings']:
        if i['Source'] == 'Rotten Tomatoes':
            # mov_OMDB_rating = int(i['Value'].rstrip('%'))  (marked wrong at the time; it gives the same result here)
            mov_OMDB_rating = int(i['Value'][:-1])
        else:
            mov_OMDB_rating = 0
    return mov_OMDB_rating
---
def get_movie_rating(mov_data_dict):
    for i in mov_data_dict['Ratings']:
        if i['Source'] == 'Rotten Tomatoes':
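            # mov_OMDB_rating = int(i['Value'].rstrip('%'))  (marked wrong at the time; it gives the same result here)
            mov_OMDB_rating = int(i['Value'][:-1])
            break # Exit the loop; otherwise, if a later entry's Source is not 'Rotten Tomatoes',
            # the rating would be reset to zero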
        else:
            mov_OMDB_rating = 0
    return mov_OMDB_rating

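The problem is the following.

The logic of the two approaches differs. The first sets rating to 0 up front and overwrites it only if a Rotten Tomatoes score exists.

The second iterates over every source, assigning the score when the source is Rotten Tomatoes and assigning 0 otherwise. Without a break, any entry that comes after Rotten Tomatoes resets the rating to 0.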
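The difference can be reproduced without any network access. The sketch below runs both approaches on the three-source ratings list quoted in the comment above; the function names are made up for this comparison.

```python
ratings = [
    {"Source": "Internet Movie Database", "Value": "7.8/10"},
    {"Source": "Rotten Tomatoes", "Value": "83%"},
    {"Source": "Metacritic", "Value": "66/100"},
]

def rating_default_first(ratings):
    """First approach: start at 0, overwrite only on a Rotten Tomatoes entry."""
    rating = 0
    for entry in ratings:
        if entry["Source"] == "Rotten Tomatoes":
            rating = int(entry["Value"][:-1])
    return rating

def rating_else_reset(ratings):
    """Second approach without break: every non-matching entry resets to 0."""
    rating = 0
    for entry in ratings:
        if entry["Source"] == "Rotten Tomatoes":
            rating = int(entry["Value"][:-1])
        else:
            rating = 0
    return rating

print(rating_default_first(ratings))  # 83
print(rating_else_reset(ratings))     # 0, because Metacritic comes after Rotten Tomatoes
```

Setting the default before the loop also covers a movie whose Ratings list is empty, a case in which a version that assigns only inside the loop never defines the variable at all.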


# some invocations that we use in the automated tests; uncomment these if you are getting errors and want better error messages
# get_sorted_recommendations(["Bridesmaids", "Sherlock Holmes"])

# It takes a list of movie titles as an input.
import requests_with_caching
import json
# 1
def get_related_titles(movie_list):
    lst = []
    for mov_tlt in movie_list:
        lst = lst + extract_movie_titles(get_movies_from_tastedive(mov_tlt)) #2 3
    return list(set(lst))

# 2 3
def get_movies_from_tastedive(movie_string):
    baseurl = 'https://tastedive.com/api/similar'
    
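    # Request parameters
    params_dict = {}
    params_dict['q'] = movie_string
    params_dict['type'] = 'movies'
    params_dict['limit'] = 5

    # The API returns JSON by default
    # The object returned by get() is a response object
    tastedive_resp = requests_with_caching.get(baseurl, params = params_dict)
    # Per the API documentation, the body can be parsed into a JSON object
    # print(type(tastedive_resp.json())) shows that it is a dict
    # print(tastedive_resp.url) shows the full URL; paste it into a browser address bar to see the dictionary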
    return tastedive_resp.json()

def extract_movie_titles(getmovie_respon):
    mov_lst = getmovie_respon['Similar']['Results']
    resp = [similar_mov['Name'] for similar_mov in mov_lst]
    #for similar_mov in mov_lst:
    #    similar_mov['Name']
    return resp

# 4 5
def get_movie_data(mov_tit):
    baseurl = 'http://www.omdbapi.com/'
    params_dict = {}
    params_dict['t'] = mov_tit   # Movie title to search for.
    params_dict['r'] = 'json'
    
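    # The object returned by get() is a response object
    info_dict_resp = requests_with_caching.get(baseurl, params = params_dict)
    # The body can be parsed into a JSON object
    # print(type(info_dict_resp.json())) shows that it is a dict
    # print(info_dict_resp.url) shows the full URL; paste it into a browser address bar to see the dictionary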
    return info_dict_resp.json()  


# Extract the rating from the info dictionary returned by get_movie_data() above
def get_movie_rating(data):
    rating = 0  # default when there is no Rotten Tomatoes entry, or no ratings at all
    for i in data['Ratings']:
        if i['Source'] == 'Rotten Tomatoes':
            rating = int(i['Value'][:-1])
            break
    return rating

def get_sorted_recommendations(mov_lst):
    rel_movlist = get_related_titles(mov_lst) # 1: returns a list of movies related to the given list
    # print(rel_movlist) looks correct
    # Look up the Rotten Tomatoes rating of each movie in the list and store it in a dictionary {movie: rating}
    mov_val_set = {}
    for mov in rel_movlist:
        mov_rating = get_movie_rating(get_movie_data(mov)) # 4, 5
        # print(mov_rating) looks fine
        mov_val_set[mov] = mov_rating  # [earlier mistake] mov_val_set['mov'] = mov_rating
    # print(mov_val_set) looks fine
    # Sort this dictionary by value; for equal values, sort titles from last to first alphabetically
    # using the sorted() function
    #sorted_dict = sorted(mov_val_set.keys(),key=lambda k: mov_val_set[k], reverse = True)
    #print(sorted_dict)
    return [i[0] for i in sorted(mov_val_set.items(), key=lambda item: (item[1], item[0]), reverse=True)]
    
       
get_sorted_recommendations(["Bridesmaids", "Sherlock Holmes"])

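I did not fully understand the sort on the last line, and I did not write it myself. The requirement is to sort by value from largest to smallest, breaking ties by the key in reverse alphabetical order.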
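One way to read that line: sorted() compares the key tuples (rating, title) element by element, so the rating decides first and the title only matters when ratings are equal. With reverse=True both comparisons are flipped, which gives highest rating first and, within a tie, reverse alphabetical titles. The ratings in the sketch below are made-up values for illustration, not real scores.

```python
# Hypothetical ratings, chosen only to show the tie-break.
ratings_by_title = {
    "Eyyvah Eyvah": 0,
    "Yahşi Batı": 0,
    "Sherlock Holmes": 70,
    "Date Night": 66,
}

# Each item is (title, rating); the key swaps it to (rating, title).
pairs = sorted(
    ratings_by_title.items(),
    key=lambda item: (item[1], item[0]),
    reverse=True,
)
ordered = [title for title, _ in pairs]
print(ordered)
# ['Sherlock Holmes', 'Date Night', 'Yahşi Batı', 'Eyyvah Eyvah']
```

The two titles rated 0 end up with ‘Yahşi Batı’ ahead of ‘Eyyvah Eyvah’ because "Y" sorts after "E" and the order is reversed.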
