Homework 2

Homework day 2:
1. Crawl Guba (股吧) forum pages:
url: http://guba.eastmoney.com/
Requirements:
1. Crawl 10 pages of content and save them to the guba folder

Method 1

import os
import requests

base_url = 'http://guba.eastmoney.com/'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.79 Safari/537.36',
}
kw = 'guba02'
# Pages are saved under ./guba02/guba02/
dirname = './guba02/' + kw
# makedirs also creates the parent folder and does not fail if it already exists
os.makedirs(dirname, exist_ok=True)
for i in range(10):
    # kw/ie/pn follow the Baidu Tieba query format; whether this site honors them is not verified here
    params = {
        'kw': kw,
        'ie': 'utf-8',
        'pn': str(i * 50),
    }
    response = requests.get(base_url, headers=headers, params=params)
    # Files are numbered 1.html to 10.html
    with open(dirname + '/{}.html'.format(i + 1), 'w', encoding='utf-8') as fp:
        fp.write(response.text)
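
The loop above fetches and writes in one step, which makes the file-naming logic hard to check without hitting the site. A minimal sketch that separates the two, assuming a hypothetical helper `save_pages` and a caller-supplied `fetch` function (neither is part of the original script):

```python
import os


def save_pages(fetch, out_dir, page_count):
    """Save pages 1..page_count as out_dir/1.html, out_dir/2.html, ...

    `fetch` is a hypothetical callable that takes a 1-based page number
    and returns that page's HTML as a string.
    """
    os.makedirs(out_dir, exist_ok=True)  # creates parent folders too
    saved = []
    for page in range(1, page_count + 1):
        path = os.path.join(out_dir, '{}.html'.format(page))
        with open(path, 'w', encoding='utf-8') as fp:
            fp.write(fetch(page))
        saved.append(path)
    return saved
```

With requests, `fetch` could be something like `lambda page: requests.get(base_url, headers=headers, params={'pn': str((page - 1) * 50)}).text`, reusing the names from the script above.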


Method 2

# Adapted from the textbook's batch crawler for Baidu Tieba
# Batch-crawl the Shanghai Composite Index (上证指数) board
import os
import requests


def tieba4(base_url, kw, start, end):
    # Save pages start..end (inclusive) under ./guba/<kw>/
    dir_name = './guba/' + kw + '/'
    os.makedirs(dir_name, exist_ok=True)
    payload = {'kw': kw, 'ie': 'utf-8'}
    for i in range(int(start), int(end) + 1):
        # Page i maps to offset (i - 1) * 50
        pn = (i - 1) * 50
        payload['pn'] = str(pn)
        response = requests.get(base_url, params=payload)
        html = response.content.decode('utf-8')
        with open(dir_name + str(i) + '.html', 'w', encoding='utf-8') as f:
            f.write(html)


if __name__ == '__main__':
    # base_url is passed in explicitly instead of being read as a global
    base_url = 'http://guba.eastmoney.com/list,zssh000001.html'
    kw = '上证指数'
    start = 1
    end = 10
    tieba4(base_url, kw, start, end)
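
The page-to-offset arithmetic in `tieba4` (page `i` maps to offset `(i-1)*50`) can be checked on its own. A small sketch, assuming a hypothetical helper `page_payloads` and the step of 50 used above:

```python
def page_payloads(kw, start, end, step=50):
    """Yield one query-parameter dict per page, from start to end inclusive.

    Page 1 maps to offset 0, page 2 to `step`, and so on. The step of 50
    is copied from the script above, not confirmed against the site.
    """
    for page in range(int(start), int(end) + 1):
        yield {'kw': kw, 'ie': 'utf-8', 'pn': str((page - 1) * step)}


offsets = [p['pn'] for p in page_payloads('上证指数', 1, 10)]
print(offsets[0], offsets[-1])  # 0 450
```

Yielding a fresh dict per page avoids mutating one shared `payload`, which matters only if the dicts are kept around after the loop.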

2. iCIBA (金山词霸): http://www.iciba.com/
Make it behave like the Youdao translator example.
Version 1.0

import requests

headers = {
    # 'User-Agent':'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.79 Safari/537.36',
}
kw = 'kill'
# Form field 'w' carries the word to translate
data = {'w': kw}
base_url = 'http://fy.iciba.com/ajax.php?a=fy'
response = requests.post(base_url, headers=headers, data=data)
# Parse the JSON body of the response
json_data = response.json()
# Each entry of content.word_mean is one line of meanings
result = ''
for i in json_data['content']['word_mean']:
    result += i + '\n'
print(result)

E:\project\python.exe C:/Users/Administrator/Desktop/四阶xpat爬虫系列/Requests/Requests01/Requests01/作业系列/fanyi_jscb.py
vt.& vi. 杀死…;
vt. 使停止[结束,失败];破坏,减弱,抵消;使痛苦,使受折磨;使笑得前仰后合,使笑死了;
n. 杀死;猎;被捕杀的动物;猎物;
adj. 致命的;

Process finished with exit code 0
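
Passing a dict as `data=` makes requests send it as a form-encoded body. A stdlib-only sketch of what that encoding looks like for the `w` field used above, meant as an illustration and not a trace of the actual request:

```python
from urllib.parse import urlencode, parse_qs

# Same form field as the script above: 'w' carries the word to look up.
body = urlencode({'w': 'kill'})
print(body)  # w=kill

# Non-ASCII input is percent-encoded as UTF-8 bytes and decodes back intact.
encoded = urlencode({'w': '杀'})
print(encoded)
print(parse_qs(encoded))
```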

Version 1.01: adds an input() prompt

import requests

url = 'http://fy.iciba.com/ajax.php?a=fy'
headers = {}
# The prompt text means "enter a word"
kw = input('输入单词')
data = {'w': kw}
response = requests.post(url, headers=headers, data=data)
result = ''
try:
    # Parse inside the try block so a non-JSON reply is reported too
    json_data = response.json()
    for i in json_data['content']['word_mean']:
        result += i + '\n'
except (ValueError, KeyError, TypeError) as e:
    print(e)

print(result)

E:\project\python.exe C:/Users/Administrator/Desktop/四阶xpat爬虫系列/Requests/Requests01/Requests01/作业系列/fanyi_jscb_tym.py
输入单词obj
abbr. object 物体;目标;(工程)项目;objection 反对;

Process finished with exit code 0
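
The `try`/`except` above guards against responses that lack `content` or `word_mean`. A sketch of the same extraction as a function that can be exercised without the network; `format_meanings` is a hypothetical name, and the sample dict only mirrors the keys the script reads, not the full iCIBA response:

```python
def format_meanings(json_data):
    """Join the entries under content.word_mean, one per line.

    Returns an empty string when either key is missing or the value is
    not a list, instead of raising.
    """
    content = json_data.get('content') if isinstance(json_data, dict) else None
    meanings = content.get('word_mean') if isinstance(content, dict) else None
    if not isinstance(meanings, list):
        return ''
    return '\n'.join(str(m) for m in meanings)


sample = {'content': {'word_mean': ['adj. 致命的;']}}
print(format_meanings(sample))
print(repr(format_meanings({'content': {}})))  # ''
```

Unlike the `+=` loop in the script, `'\n'.join` leaves no trailing newline after the last entry.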
