Scraping Tiantian Fund (天天基金) with Python: a hands-on guide to picking funds with data (1) - CSDN Blog

Article link: https://blog.csdn.net/weixin_39735288/article/details/112418763

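This post walks you through picking funds with data, step by step.

The tools are Python and Excel.

1. Get basic fund information

Method: use Python to scrape page data from the Tiantian Fund (天天基金) website.

1.1 The following URL returns the information for all funds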

http://fund.eastmoney.com/js/fundcode_search.js

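How do you find this URL? Open the site in Chrome or Firefox, press F12 to open the developer tools, and look for the internal data links the page loads.

1.2 Load the content into a DataFrame with Python. Here is the code.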

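# -*- coding: utf-8 -*-
import json
import time
import urllib.request

from pandas import DataFrame

url = 'http://fund.eastmoney.com/js/fundcode_search.js'

# Retry until the download succeeds; network errors are expected when scraping.
while True:
    try:
        r = urllib.request.urlopen(url, timeout=3)
        content = r.read()
    except Exception as exc:
        print('error %s, wait 5 seconds...' % exc)
        time.sleep(5)
    else:
        break

# Trim the head and tail: keep only the array literal,
# from the first '[' to the last ']'.
text = content.decode('utf-8-sig')
text = text[text.index('['):text.rindex(']') + 1]

# Parse with json.loads instead of eval, so that downloaded text is never
# executed as code (this assumes the array literal is valid JSON).
a = json.loads(text)

# Load into a DataFrame indexed by fund code. The Chinese labels mean:
# fund code 2, fund short name, fund type, fund short name 2.
b = ['code', '基金代码2', '基金简称', '基金类型', '基金简称2']
df = DataFrame(a, columns=b)
df.set_index('code', inplace=True)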
*Why use try-except? When you scrape data over the network, you have to allow for network errors.
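The loop in 1.2 retries forever, which is fine for a quick script but can hang if the site is down for good. Below is a minimal sketch of a bounded variant. The helper name `fetch_with_retry` and its default values are my own assumptions, not part of the original code.

```python
import time


def fetch_with_retry(fetch, retries=5, delay=5.0, sleep=time.sleep):
    """Call fetch() until it succeeds or the retry budget is used up.

    fetch:   zero-argument callable that returns the payload or raises
    retries: maximum number of attempts (hypothetical default)
    delay:   seconds to wait between attempts (hypothetical default)
    sleep:   injectable so the helper can be exercised without waiting
    """
    last_exc = None
    for attempt in range(1, retries + 1):
        try:
            return fetch()
        except Exception as exc:  # broad on purpose, like the loop in 1.2
            last_exc = exc
            print('attempt %d failed: %s' % (attempt, exc))
            if attempt < retries:
                sleep(delay)
    raise RuntimeError('giving up after %d attempts' % retries) from last_exc
```

With Python 3 it could be called as `fetch_with_retry(lambda: urllib.request.urlopen(url, timeout=3).read())`.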

The resulting DataFrame is indexed by code and has four columns: 基金代码2, 基金简称, 基金类型 and 基金简称2.
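To see the row structure without touching the network, here is a small offline sketch. It assumes the payload has the shape `var r = [[...], ...];`, which is my reading of the head/tail trimming in 1.2. The function name `parse_fund_list` and the sample rows are made up for illustration and are not real funds.

```python
import json


def parse_fund_list(js_text):
    """Turn 'var r = [[...], ...];' into a list of row lists."""
    start = js_text.index('[')
    end = js_text.rindex(']') + 1
    return json.loads(js_text[start:end])


# Made-up rows for illustration only; these are not real funds.
sample = ('var r = [["000000","AAA","Sample Fund A","Type X","SAMPLEA"],'
          '["000009","BBB","Sample Fund B","Type Y","SAMPLEB"]];')

rows = parse_fund_list(sample)

# Index the remaining four fields by fund code, like set_index('code') does.
by_code = {row[0]: row[1:] for row in rows}
print(by_code['000009'][1])  # second field after the code
```

Each row has five fields, which is why the column list in 1.2 has five labels. The same `rows` list can be passed straight to `DataFrame(rows, columns=b)`.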

1.3 Next, get the detailed information for each fund.

The following URL returns the details of a single fund

http://fund.eastmoney.com/f10/000001.html

*000001 is the fund code
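Because the code has leading zeros, it needs to stay a string. If it is ever converted to an integer (for example after a round trip through Excel), the URL breaks. Here is a minimal sketch of a URL builder. The name `fund_info_url` is hypothetical, and the six-digit rule is an assumption based on the 000001 example.

```python
BASE_URL = 'http://fund.eastmoney.com/f10/'  # prefix of the detail URL above


def fund_info_url(code):
    """Build the detail-page URL for one fund code."""
    code = str(code).zfill(6)  # assumption: codes are six digits, zero-padded
    if not code.isdigit() or len(code) != 6:
        raise ValueError('unexpected fund code: %r' % code)
    return BASE_URL + code + '.html'


print(fund_info_url('000001'))
print(fund_info_url(1))  # an int that lost its leading zeros is padded back
```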

On to the code:

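import time
import urllib.request

from bs4 import BeautifulSoup

# Take the first fund code from the DataFrame built in 1.2.
cd = df.index[0]
print("Read fund_info (", cd, ") info from web, please wait...")
url = "http://fund.eastmoney.com/f10/" + cd + ".html"
# print("url=", url)
while True:
    try:
        r = urllib.request.urlopen(url, timeout=3)
        content = r.read()
    except Exception as exc:
        print('error %s, wait 5 seconds...' % exc)
        time.sleep(5)
    else:
        break

print("Read OK")

# Parse the page with BeautifulSoup (the 'lxml' parser needs the lxml package).
soup = BeautifulSoup(content, 'lxml')

# The fund details sit in the table with class="info w790".
tb = soup.find_all('table', class_='info w790')
if len(tb) == 0:
    raise SystemExit('info table not found for fund %s' % cd)

# Each row pairs <th> labels with <td> values; store them as key/value pairs.
d = {'code': cd}
for tr in tb[0].find_all('tr'):
    ths = tr.find_all('th')
    tds = tr.find_all('td')
    if len(ths) != len(tds):
        raise SystemExit('unexpected table layout for fund %s' % cd)
    for th, td in zip(ths, tds):
        d[th.getText()] = td.getText()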

d is now a dict holding the fund code plus one entry for each label/value pair in the table.
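To try the th/td pairing offline, here is a sketch that uses the standard library's `html.parser` instead of BeautifulSoup, so it runs without extra packages. The class name `ThTdPairs`, the markup and the labels are all made up for illustration. The real page will differ.

```python
from html.parser import HTMLParser


class ThTdPairs(HTMLParser):
    """Collect <th>/<td> text from each <tr> and pair them up in order."""

    def __init__(self):
        super().__init__()
        self.pairs = {}
        self._cell = None  # 'th' or 'td' while inside such a cell
        self._ths, self._tds = [], []

    def handle_starttag(self, tag, attrs):
        if tag == 'tr':
            self._ths, self._tds = [], []
        elif tag in ('th', 'td'):
            self._cell = tag
            (self._ths if tag == 'th' else self._tds).append('')

    def handle_data(self, data):
        if self._cell == 'th':
            self._ths[-1] += data
        elif self._cell == 'td':
            self._tds[-1] += data

    def handle_endtag(self, tag):
        if tag in ('th', 'td'):
            self._cell = None
        elif tag == 'tr':
            if len(self._ths) != len(self._tds):
                raise ValueError('row has unmatched th/td cells')
            self.pairs.update(zip(self._ths, self._tds))


# Made-up markup and labels, for illustration only.
html = ('<table class="info w790">'
        '<tr><th>label A</th><td>value A</td><th>label B</th><td>value B</td></tr>'
        '<tr><th>label C</th><td>value <a href="#">C</a></td></tr>'
        '</table>')

parser = ThTdPairs()
parser.feed(html)
parser.close()
record = {'code': '000000'}  # placeholder code
record.update(parser.pairs)
print(record)
```

Text inside nested tags (the link in the last cell) is concatenated into the cell value, much like `getText()` does in the BeautifulSoup version.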

1.4 Save to MongoDB

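import pymongo

client = pymongo.MongoClient('localhost', 27017)
collection = client['fund']['fund_info']
# Unique index on the fund code, so each fund is stored only once.
collection.create_index([('code', pymongo.ASCENDING)], unique=True)

# Insert the record into MongoDB, or update it if the code already exists.
# d is keyed by 'code' (set in 1.3), so the filter uses that field.
flt = {'code': d['code']}
collection.update_one(flt, {'$set': d}, upsert=True)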
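One thing to watch: the keys of d come straight from the page's header cells. In a `$set` document a dot in a key is read as a path into a nested document, and the text may carry stray whitespace. Below is a minimal sketch of a clean-up step. The function `normalize_record` and its replacement rules are my own assumptions, so adjust them to whatever the real labels look like.

```python
def normalize_record(raw):
    """Return a copy of raw with tidied keys and values."""
    clean = {}
    for key, value in raw.items():
        # '.' would be treated as nesting in a $set path, so replace it.
        key = key.strip().replace('.', '_')
        # A leading '$' looks like an operator; drop it.
        key = key.lstrip('$')
        if not key:
            continue  # skip cells with an empty label
        clean[key] = value.strip() if isinstance(value, str) else value
    return clean


print(normalize_record({' a.b ': ' x ', '$c': 1, '': 'skip'}))
```

The cleaned dict can then be used in place of d as the `$set` document.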

1.5 Repeat 1.3 and 1.4 to load the information for every fund into the database
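A rough sketch of that loop follows. It assumes steps 1.3 and 1.4 have each been wrapped in a function (`fetch_info` and `save` are hypothetical names) and that a failed fund raises an ordinary exception rather than exiting, so one bad page does not stop the whole run. The pause between requests is also an assumed value.

```python
import time


def import_all(codes, fetch_info, save, pause=1.0, sleep=time.sleep):
    """Run fetch_info and save for each code; return the codes that failed.

    fetch_info: hypothetical callable, code -> dict (step 1.3 as a function)
    save:       hypothetical callable, dict -> None (step 1.4 as a function)
    pause:      seconds to wait between funds (assumed value)
    """
    failed = []
    for code in codes:
        try:
            save(fetch_info(code))
        except Exception as exc:
            print('skip %s: %s' % (code, exc))
            failed.append(code)
        sleep(pause)
    return failed
```

It could be called as `failed = import_all(df.index, fetch_info, save)`, and the returned list can be fed back in later to retry the funds that were skipped.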

(End)

What comes next?

2. Get fund rise and fall data

3. Analyze the data