A Simple Python Example: Scraping String Content from a Web Page and Saving It

朴瑞祥

Category: python. Tags: regex, python, crawler

Article link: https://blog.csdn.net/u010571211/article/details/51440216

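I recently wanted to try out Python's crawling libraries, so I picked a web page that contains nothing but a string to scrape. The URL is: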

http://mobilecdn.kugou.com/api/v3/special/song?plat=0&page=1&pagesize=-1&version=7993&with_res_tag=1&specialid=26430

Opening it shows song names, hashes, and other information. The goal is to store them in a file in hash|filename form. Here is the code first:

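#coding=utf-8
# Python 2 code: in Python 3, urllib.urlopen became urllib.request.urlopen.
import urllib
import re
import os

def getHtml(url):
    # Download the page and return its raw contents as a string.
    page = urllib.urlopen(url)
    html = page.read()
    return html

def getHash(html):
    # Capture the value of every "hash" field in the response.
    reg = r'"hash":"(.+?)",'
    has = re.compile(reg)

    hashlist = re.findall(has, html)
    with open('1.txt', 'w') as f: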
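
The listing breaks off at the with open(...) line, so the part that writes the file is not shown. As a rough sketch only, and not the author's original code, here is one way the whole task might look in Python 3. It assumes that each song record also carries a "filename":"..." field that can be matched the same way as "hash", and that hashes and filenames appear in the same order in the response. The names get_html, extract_records, save_records, and main are made up for illustration.

```python
import re
import urllib.request

# URL taken from the article.
URL = ("http://mobilecdn.kugou.com/api/v3/special/song"
       "?plat=0&page=1&pagesize=-1&version=7993&with_res_tag=1&specialid=26430")

# Assumption: every record has "hash":"..." and "filename":"..." string fields.
# Only the "hash" pattern appears in the article; the "filename" key is a guess.
HASH_RE = re.compile(r'"hash":"([^"]*)"')
FILENAME_RE = re.compile(r'"filename":"([^"]*)"')


def get_html(url):
    """Download the page and decode it as UTF-8 text."""
    with urllib.request.urlopen(url) as page:
        return page.read().decode("utf-8", errors="replace")


def extract_records(html):
    """Return (hash, filename) pairs, matched up by order of appearance."""
    hashes = HASH_RE.findall(html)
    filenames = FILENAME_RE.findall(html)
    return list(zip(hashes, filenames))


def save_records(records, path):
    """Write one hash|filename line per record."""
    with open(path, "w", encoding="utf-8") as f:
        for song_hash, filename in records:
            f.write(f"{song_hash}|{filename}\n")


def main():
    """Fetch the page and save the extracted pairs to 1.txt."""
    save_records(extract_records(get_html(URL)), "1.txt")
```

Calling main() would download the page and write 1.txt. Pairing two separate regex result lists is fragile if a record lacks one of the fields; if the response turns out to be JSON, parsing it with the json module would likely be sturdier.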
