Scraping practice exercises from a website with Python

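Scraping web resources with Python (written for one specific site)
I have only just started learning Python, so the code has flaws; please bear with me.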

Code:

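import urllib.request

def url_open(url):
    # Open the URL and return the HTTP response object.
    return urllib.request.urlopen(url)

def get_page(url):
    # Download the page and decode it as UTF-8 text.
    response = url_open(url)
    return response.read().decode('utf-8')

def get_liti(url):
    # Return the text between the '题目' ("problem") marker and the
    # ' 程序分析:' ("program analysis") marker on the page.
    html = get_page(url)
    begin = html.find('题目')
    end = html.find(' 程序分析:')
    if begin == -1 or end == -1:
        return ''  # str.find returns -1 when a marker is missing
    return html[begin + 2:end]  # +2 skips the two characters of '题目'

def download_liti(total):
    liti = []
    for number in range(1, total + 1):
        if number == 2:
            continue  # exercise 2 is skipped, as in the original loop
        url = "https://www.runoob.com/python/python-exercise-example" + str(number) + ".html"
        # '例题' means "exercise"; it labels each entry in the output file.
        liti.append("例题" + str(number) + get_liti(url))
    # Pass encoding explicitly so the output does not depend on the system default.
    # The with statement closes the file, so no file.close() is needed.
    with open("E://timu.txt", "w+", encoding="utf-8") as file:
        for each in liti:
            file.write(each + "\n")

download_liti(100)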
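
The marker-based extraction used in get_liti can be tried offline, without any network access. The following is only a minimal sketch: the helper name extract_between and the sample string are made up for illustration, and the real page layout on the target site may differ.

```python
def extract_between(text, start_marker, end_marker):
    """Return the text between two markers, or None if either is missing."""
    start = text.find(start_marker)
    if start == -1:
        return None
    start += len(start_marker)          # move past the start marker itself
    end = text.find(end_marker, start)  # search only after the start marker
    if end == -1:
        return None
    return text[start:end].strip()

# Hypothetical sample text imitating the page structure (not real page content).
sample = "intro 题目:print the numbers 1 to 3 程序分析:use a loop"
print(extract_between(sample, "题目:", "程序分析:"))
```

Using len(start_marker) instead of a hard-coded offset keeps the slice correct if the marker text changes, and returning None makes a missing marker easy to detect.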

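The screenshot shows the cause of the error: on Windows, open() writes text files with the system default encoding, which is GBK on a Chinese-locale system.
with open("E://timu.txt", "w+") as file:
    for each in liti:
        file.write(each + "\n")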
Solution: set the encoding to UTF-8 explicitly.
with open("E://timu.txt", "w+", encoding="utf-8") as file:
    for each in liti:
        file.write(each + "\n")
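
The difference between the two encodings can be reproduced on any system by naming the encoding explicitly. This is a small sketch under assumptions: the sample text and the helper try_write are invented for illustration, and the character that failed in the original run is not known. Here a non-breaking space (\xa0) is used, which often appears in scraped HTML and cannot be encoded in GBK.

```python
import os
import tempfile

# Hypothetical scraped text; \xa0 is a non-breaking space.
text = "price:\xa0100"

def try_write(path, encoding):
    """Write the text with the given encoding; return True on success."""
    try:
        with open(path, "w", encoding=encoding) as f:
            f.write(text + "\n")
        return True
    except UnicodeEncodeError:
        return False

# Write into a temporary directory so no real files are left behind.
with tempfile.TemporaryDirectory() as tmp:
    gbk_ok = try_write(os.path.join(tmp, "gbk.txt"), "gbk")
    utf8_ok = try_write(os.path.join(tmp, "utf8.txt"), "utf-8")

print(gbk_ok, utf8_ok)
```

The GBK write fails with UnicodeEncodeError while the UTF-8 write succeeds, because UTF-8 can represent every Unicode character.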
