Scraping Movie Playback URLs from OK Resource Network (ok资源网) with a Python Crawler

Latest recommended article published on 2023-11-20 15:43:45

林林木林林L

Tags: python, xpath, html, crawler, search engine, java, crawler program, crawler search, keyword search

Article link: https://blog.csdn.net/qq_42348956/article/details/107061401

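This article walks through a Python web crawler that parses HTML with XPath to collect movie playback URLs from the OK Resource Network site. It covers Python basics, use of an HTTP request library, XPath selectors in practice, and planning a crawl strategy, and is aimed at beginners learning web scraping.

Summary generated by CSDN using automated tools.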
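Before the script itself, here is a minimal sketch of how the entry-point URL patterns listed in the script's comments could be built programmatically. The function names (`search_url`, `type_page_url`, `index_page_url`, `index_page_urls`) are hypothetical helpers and not part of the original script. The page range 1 to 1110 and the type range 1 to 34 are taken from those comments and may no longer match the live site.

```python
from urllib.parse import urlencode

# Hosts copied from the script's comments; the helpers below are illustrative only.
SEARCH_BASE = "http://okzy.co/index.php"
HOST = "http://www.okzy.co"


def search_url(keyword: str) -> str:
    """Entry point 1: keyword search. urlencode percent-encodes non-ASCII keywords."""
    query = urlencode({"m": "vod-search", "wd": keyword, "submit": "search"})
    return f"{SEARCH_BASE}?{query}"


def type_page_url(type_id: int) -> str:
    """Entry point 2: listing for one category (ids 1-34 per the comments)."""
    return f"{HOST}/?m=vod-type-id-{type_id}.html"


def index_page_url(page: int) -> str:
    """Entry point 2: one page of the full index (pages 1-1110 per the comments)."""
    return f"{HOST}/?m=vod-index-pg-{page}.html"


def index_page_urls(first: int = 1, last: int = 1110) -> list[str]:
    """All index page URLs from first to last, inclusive."""
    return [index_page_url(page) for page in range(first, last + 1)]


if __name__ == "__main__":
    print(search_url("电影"))
    for url in index_page_urls(1, 3):
        print(url)
```

Using `range(first, last + 1)` keeps the page numbers 1-based and inclusive, which avoids requesting a nonexistent page 0 and skipping the final page.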

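# Scrape movie playback URLs from OK Resource Network (ok资源网)

# www.okzy.co
# Entry point 1: http://okzy.co/index.php?m=vod-search&wd={keyword}&submit=search
# Entry point 2: http://www.okzy.co/?m=vod-type-id-{1-34}.html
#                http://www.okzy.co/?m=vod-index-pg-{1-1110}.html

# for x in range(1, 1111):  # pages are numbered 1 through 1110
#     print("http://www.okzy.co/?m=vod-index-pg-{}.html".format(x))

# Request, response, parse, save
# Target flow: home page -> list -> detail page -> content (playback URLs and their names) -> save (under the movie title)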

import requests
from lxml import etree
# Table-rendering module
# Install with: pip install prettytable
from prettytable import PrettyTable

host = "http://www.okzy.co"
rooturl = "/?m=vod-index-pg-{}.html".format(1)

# Request the entry page
response = requests.get(host+rooturl)
# Decode as UTF-8, then optionally print the page content (HTML)
response.encoding = 'utf-8'
# print(response.text)
if response