Python crawler for Tiantian Fund (天天基金), saving results as CSV

小小白的日常闲逛

Category: Crawlers. Tags: python

Article link: https://blog.csdn.net/weixin_43943481/article/details/116425127

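This code implements a class named 'TianTianSpider' that crawls fund data from the Tiantian Fund website. It sends HTTP requests with the requests library and uses UserAgent to imitate a browser so that the site is less likely to block it. Starting from a given URL, the crawler fetches the fund ranking data for each page and saves it to a CSV file, including the fund code, name, net value, and other fields. It walks through multiple pages by setting the page number and waits 1 to 3 seconds between page downloads to reduce the load on the target site.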

Summary generated automatically by CSDN.
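
The paging loop described in the summary (fetch one page, write its rows to CSV, pause 1 to 3 seconds, move on) can be sketched on its own, without any network access. This is only a minimal illustration under stated assumptions, not the code from the post: `crawl_pages`, `fetch_rows`, and the `sleep` parameter are hypothetical names introduced here, and `fetch_rows(page)` is assumed to return a list of rows for a 1-based page number.

```python
import csv
import random
import time


def crawl_pages(fetch_rows, page_count, csv_path,
                min_delay=1.0, max_delay=3.0, sleep=time.sleep):
    """Fetch rows page by page and write them to a CSV file.

    fetch_rows is a hypothetical callable: fetch_rows(page) returns a
    list of rows (each row a list of strings) for a 1-based page number.
    sleep is injectable so the delay can be skipped in tests.
    """
    total = 0
    # newline="" lets the csv module control line endings itself.
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        for page in range(1, page_count + 1):
            rows = fetch_rows(page)
            writer.writerows(rows)
            total += len(rows)
            # Pause a random 1-3 seconds between pages, but not after the last one.
            if page < page_count:
                sleep(random.uniform(min_delay, max_delay))
    return total
```

In a real crawler, `fetch_rows` would wrap the HTTP request and the parsing of the response; keeping it as a parameter makes the pacing and CSV logic easy to check in isolation.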

# @Software:PyCharm

from fake_useragent import UserAgent
import time
import random
import ssl
import requests
import json
import csv
import numpy as np

ssl._create_default_https_context = ssl._create_unverified_context


class TianTianSpider(object):
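    def __init__(self):
        # Ranking endpoint; the `pi={}` placeholder is presumably filled with the page number.
        self.url = "http://fund.eastmoney.com/data/rankhandler.aspx?op=ph&dt=kf&ft=all&rs=&gs=0&sc=6yzf&st=desc&sd=2020-05-05&ed=2021-05-05&qdii=&tabSubtype=,,,,,&pi={}&pn=50&dx=1&v=0.0418420971918686"
        # Send a random browser User-Agent plus a Referer so requests resemble normal page visits.
        ua = UserAgent()
        self.headers = {
            'User-Agent': ua.random,
            'Referer': 'http://fund.eastmoney.com/data/fundranking.html'
        }
        # Page counter, starting from the first page.
        self.page = 1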
    def get_page(self, url):
        html = requests.get(
            url=url,
            headers=self.heade