How to scrape a mini-game website with Python and save the games you like (source code included)


weixin_39904268


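Introduction:

Python is an easy-to-learn yet powerful programming language. It needs little setup: once you know the basic syntax and the common library functions, you can build your own programs on top of the large body of existing packages and automate batch tasks, which can greatly improve efficiency at work and in study. A Python crawler can fetch data from web pages in bulk.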

Python environment setup

1. Code editor: PyCharm Community

2. Interpreter: Python 3.7.6

3. Create a project in PyCharm and configure its Python environment

4. Two ways to install packages
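
Whichever installation route is used, a quick check from Python can confirm that the packages this article relies on are importable. The following is a minimal sketch: the helper name missing_packages is hypothetical, and the names checked are the import names requests and bs4 (BeautifulSoup is distributed on PyPI as beautifulsoup4).

```python
import importlib.util


def missing_packages(names):
    """Return the import names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Import names used by the crawler later in this article.
    missing = missing_packages(["requests", "bs4"])
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All required packages are installed")
```

An empty result means both packages can be imported from the interpreter configured for the project.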

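Hands-on: a crawler for 4399 mini-games

1. Basic steps of a crawler

Download the web page with requests

Parse the content downloaded by requests into a DOM (Document Object Model) with BeautifulSoup

Extract the required data from the DOM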
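
The three steps can be sketched as follows. This is a minimal illustration rather than the article's crawler: the HTML snippet, the selector ul.games > li > a, and the function name extract_links are made up for the example, and the page is parsed from an inline string so that the sketch runs without network access. In a real crawler the string would come from requests.get(url).content.

```python
from bs4 import BeautifulSoup

# Stand-in for step 1: in practice this would be requests.get(url).content.
SAMPLE_HTML = """
<html><body>
  <ul class="games">
    <li><a href="/flash/1.htm"><b>Game A</b></a></li>
    <li><a href="/flash/2.htm"><b>Game B</b></a></li>
  </ul>
</body></html>
"""


def extract_links(html):
    """Parse HTML into a DOM and return (name, href) pairs for each game link."""
    # Step 2: build the DOM with Python's built-in parser.
    dom = BeautifulSoup(html, "html.parser")
    # Step 3: query the DOM with a CSS selector and read text and attributes.
    links = dom.select("ul.games > li > a")
    return [(a.get_text(strip=True), a["href"]) for a in links]


if __name__ == "__main__":
    for name, href in extract_links(SAMPLE_HTML):
        print(name, href)
```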

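2. Running 4399 mini-games locally

Games that can be downloaded for local play: games with the .swf extension

The absolute address can be read from the src attribute on the game's main page

Example of an absolute game address: http://sxiao.4399.com/4399swf/upload_swf/ftp29/liuxinyu/20190731/7/main.swf

The relative address can be obtained from the game info page; the full code in this article reads it from the text that follows _strGamePath=" in the page source

Example of a relative game address: /upload_swf/ftp29/liuxinyu/20190731/7/main.swf

Required software: iQIYI Universal Player (爱奇艺万能播放器, since renamed 万能联播), GeePlayer.exe, PC version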
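
The relationship between the two kinds of address can be sketched as below. The base address http://sxiao.4399.com/4399swf and the regular expression are the ones used in the article's full code; the sample page fragment and the function name build_swf_url are assumptions made for illustration, and the real page markup may differ.

```python
import re

# Base address taken from the article's full code.
GAME_BASE_URL = "http://sxiao.4399.com/4399swf"

# Hypothetical fragment of a game info page; the real markup may differ.
SAMPLE_PAGE = 'var _strGamePath="/upload_swf/ftp29/liuxinyu/20190731/7/main.swf";'


def build_swf_url(page_text):
    """Return the absolute .swf address found in page_text, or None."""
    # Lookbehind: match only what follows `_strGamePath="`, up to the first `.swf`.
    match = re.search(r'(?<=_strGamePath=").+?\.swf', page_text)
    if match is None:
        return None  # No .swf path: the game cannot be downloaded this way.
    return GAME_BASE_URL + match.group(0)


if __name__ == "__main__":
    print(build_swf_url(SAMPLE_PAGE))
```

With the sample fragment, the result is the same absolute address as the example given above.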


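3. Implementation approach for the 4399 mini-game crawler

Fetch the 4399 "fun mini-games" page (http://www.4399.com/flash/gamehw.htm) and parse it into a DOM to collect all the game links

Iterate over the game links. For each one, start a thread that downloads the linked page and checks whether the game can be downloaded for local play; if it can, build the download address and start a game download thread

Game download thread: download the .swf file from the download address and save it locally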
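
The approach above starts one thread per game. As a variation, the same fan-out can be expressed with a bounded thread pool so that the number of simultaneous requests stays limited. The sketch below swaps in concurrent.futures.ThreadPoolExecutor for raw threading.Thread calls; the function check_and_download is a stand-in that does no network I/O, the swfPath key is hypothetical, and max_workers=8 is an arbitrary choice.

```python
from concurrent.futures import ThreadPoolExecutor


def check_and_download(game):
    """Stand-in for the per-game work: fetch the info page, then the .swf.

    Here it only reports whether the (hypothetical) game dict carries a
    .swf path, so the sketch runs without network access.
    """
    path = game.get("swfPath")
    return game["gameName"], bool(path and path.endswith(".swf"))


def process_all(games, max_workers=8):
    """Run check_and_download for every game on a bounded thread pool."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order and waits for all tasks to finish.
        return list(pool.map(check_and_download, games))


if __name__ == "__main__":
    games = [
        {"gameName": "Game A", "swfPath": "/upload_swf/a/main.swf"},
        {"gameName": "Game B", "swfPath": None},
    ]
    for name, downloadable in process_all(games):
        print(name, "downloadable" if downloadable else "skipped")
```

A pool also makes it easy to wait for every task to finish before the program exits, which the one-thread-per-game approach leaves implicit.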

Full code

import os
import re
import threading

from bs4 import BeautifulSoup as bs
import requests

# Site root, prepended to the relative links found on the listing page
indexUrl = 'http://www.4399.com'
# Base address, prepended to the relative .swf path of each game
gameBaseUrl = 'http://sxiao.4399.com/4399swf'


def getAllGameUrl():
    """
    Collect the name and the info-page link of every game on the listing page.
    :return: list of dicts with the keys 'gameName' and 'gameInfoUrl'
    """
    gameUrlList = []
    response = requests.get('http://www.4399.com/flash/gamehw.htm')
    dom = bs(response.content, 'html.parser')
    gameLiList = dom.select('#skinbody > div:nth-child(6) > ul > li')
    for i in gameLiList:
        # Game name
        gameName = i.select_one('a > b').get_text()
        # Link to the game info page,
        # e.g. 'http://www.4399.com/flash/212103.htm'
        gameInfoUrl = indexUrl + i.select_one('a')['href']
        gameUrlList.append({'gameName': gameName, 'gameInfoUrl': gameInfoUrl})
    return gameUrlList


def downloadIfAvailable(game):
    """
    Check whether a game supports local download and, if so, start downloading it.
    :param game: dict with the keys 'gameName' and 'gameInfoUrl'
    :return:
    """
    response = requests.get(game['gameInfoUrl'])
    plainText = response.text
    # The relative .swf path follows `_strGamePath="` in the page source
    relativeUrlList = re.findall(r'(?<=_strGamePath=").+?\.swf', plainText)
    if len(relativeUrlList) != 0:
        gameUrl = gameBaseUrl + relativeUrlList[0]
        game['gameUrl'] = gameUrl
        threading.Thread(target=downloadAGame, args=(game,)).start()


def downloadAGame(game):
    """
    Download a game from its download link and save it as a .swf file.
    :param game: dict with the keys 'gameName' and 'gameUrl'
    :return:
    """
    downloadPath = 'games/'
    # exist_ok avoids an error when several threads create the directory at once
    os.makedirs(downloadPath, exist_ok=True)
    with open(downloadPath + game['gameName'] + '.swf', 'wb') as file:
        file.write(requests.get(game['gameUrl']).content)
    print(game['gameName'] + ' download complete')


if __name__ == '__main__':
    gameUrlList = getAllGameUrl()
    for i in gameUrlList:
        threading.Thread(target=downloadIfAvailable, args=(i,)).start()
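
One practical caveat: the code above uses the game name directly as a file name, which fails if a name contains characters the file system does not allow. A possible safeguard, shown here as a sketch and not part of the original code, is to clean the name first. The helper name safe_filename, the replacement character, and the fallback name are assumptions.

```python
import re


def safe_filename(name, replacement="_"):
    """Replace characters that are invalid in Windows/Unix file names."""
    # Characters forbidden on Windows (including both path separators)
    # and ASCII control characters.
    cleaned = re.sub(r'[\\/:*?"<>|\x00-\x1f]', replacement, name)
    # Windows also rejects trailing dots and spaces.
    cleaned = cleaned.strip().rstrip(". ")
    return cleaned or "unnamed"


if __name__ == "__main__":
    print(safe_filename("Tank: Battle/2?") + ".swf")
```

In downloadAGame, the cleaned name would replace game['gameName'] when the output path is built.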

Original article: https://www.cnblogs.com/python0921/p/12869725.html
